Saturday, June 4, 2016

What does a completed ANOVA summary table explain or describe? A researcher conducted an experiment on the effects of a new “drug” on...

The central function of Analysis of Variance is to determine if the means and distributions of data sets show a meaningful difference between groups, significant with a certain level of confidence. Let's look at the output of a sample ANOVA, including how it might be applied to a clinical drug study. Included is an image of an ANOVA table from the support documentation for Minitab, a common statistical software package.

In the Minitab example, there are four sample groups of paint being tested for hardness once dry. Arguably the most important output on the ANOVA table is the p-value. Technically, the p-value is the chance that the null hypothesis is correct, the null hypothesis being that there is not a meaningful difference between the groups, that the means would be identical if we had an infinite number of samples in each group. For most scientific studies a p-value of .05 or less, which means 5% or less, is desired for statistical significance.  This can also be expressed as a confidence level of 95% or greater. Disciplines like psychology and economics, with greater variability, require less rigorous confidence intervals. In the Minitab example, we can see that the p-value is .0043, indicating that we can be 99.6% certain that there is a significant difference among the four groups of paint, but we don't know exactly which group differs from which. We would have to do ANOVAs of each pair, or another statistical technique on pairs, to be sure of that.


In the case of the clinical study described, we have three groups of patients and presumably some quantitative measure of depression after they have received their treatment. If we achieve a p-value of less than .05, it implies that there is statistically significant difference in the outcome of their treatment regimens, but we wouldn't know which groups differ from each other. We would have to run three additional paired ANOVAs to be sure of that. In drug studies, oftentimes you will see a placebo give a statistically significant improvement (or even side effects) compared to no treatment at all, but no significant difference between a drug and a placebo. Ideally, the drug company would want to see a statistically significant improvement caused by the drug versus placebo and no treatment in paired analyses.


The other values of the ANOVA table describe features of the data distribution, but I suspect are beyond the scope of your question. The DF refers to the "degrees of freedom" and is equal to the number of groups minus one. The DF for the clinical trial would be 2, as there are three treatment groups. The Adj SS refers to the "adjusted Sum of Squares," which is a measure of the variability within the data samples you present. This is closely tied to the precision of your data gathering as well as the spread of data in your groups. The Adj MS is the "Adjusted Mean Squares", a step towards finding statistical significance, determined by dividing sum of squares by the number of degrees of freedom. The second row of the chart includes the SS and MS for the error, the residuals of your measurements. The f-value gives you the ratio of the Adjusted MS over the MSE, or Mean Squared Error. There are numerous ways to use each of these values statistically. The supporting documentation for programs like Minitab and STATA, which are most widely distributed to students, should be helpful to further illustrate these concepts if you are interested.

No comments:

Post a Comment

What is the Exposition, Rising Action, Climax, and Falling Action of "One Thousand Dollars"?

Exposition A "decidedly amused" Bobby Gillian leaves the offices of Tolman & Sharp where he is given an envelope containing $1...