Verifying the Excel’s one-factor ANOVA results
In a one-factor ANOVA (Analysis of Variance), we check the replicate results of each group under a sole factor. For example, we may be evaluating the performance of four analysts who have each been asked to perform replicate determinations of identical samples drawn from a population. Thus, the factor is “Analyst” of which we have four so-called “groups” – Analyst A, Analyst B, Analyst C and Analyst D. The test data are laid out in a matrix with the four groups of the factor in each column and the replicates (five replicates in this example) in each row, as shown in the following table:
We then use the ANOVA principal inferential statistic, the F-test, to decide whether differences between treatment’s mean values are real in the population, or simply due to random error in the sample analysis. The F-test studies the ratio of two estimates of the variance. In this case, we use the variance between groups, divided by the variance within groups.
The MS Excel installation files have included an add-in that performs several statistical analyses. It is called “Data Analysis” add-in. If you do not find it labelled in the Ribbon of your spreadsheet, you can make it available to Excel by installing it with your Excel 2016 version by clicking “File -> Options -> Add-ins -> Analysis ToolPak” and then pressing “Enter”. You should now find Data Analysis in the Analysis Group on your Excel’s Data tab.
We can then click the “Data Analysis” button to look for “Anova: Single Factor” entry and start its dialog box accordingly.
For the above set of data from Analysts A to D, the Excel’s Data Analysis gives the following outputs, and we shall then verify the outcomes through manual calculations based on the first principles:
We know that variances are defined and calculated using the squared deviations of a single variable. In Excel, we can use the formula “=DEVSQ( )” to calculate each group of results. Also, we can use “=AVERAGE( )” function to calculate the individual mean of each Analyst.
In this univariate ANOVA example, we squared the deviation of a value from a mean, and the word “deviation” referred to the difference between each measurement result from the mean of the Analyst concerned.
The above figure shows the manual calculations using the Excel formulae agree very well with the Excel’s calculated data by its Data Analysis package. In words, we have:
- The Sum of Squares Between SSB uses DEVSQ( ) to take the sum of the squared deviations of each group (Analyst) mean from the grand mean, and multiplies by the number of replicates in each group;
- The Sum of Squares Within SSW uses the replicates of each DEVSQ( ) in J13:M13 to get the sum of the squared deviations of each measured value from the mean of its groups; then, the results from DEVSQ( ) are totaled;
- The Sum of Squares Total SST uses DEVSQ( ) to return the sum of the squared deviations of each measurement from the grand mean value. We can also just add up SSB and SSW to give SST.
Subsequently, we can also verify the Excel’s calculations of the F-value, P-value and the F critical value by using the various formulae as shown above.
We have normally set the level of significance at the P = 0.05 (or 5%), meaning that we are prepared to make a 5% error in rejecting the null hypothesis which states that there are no difference amongst the mean values of these four Analysts. The calculated P-value of 0.008 indicates that our risk of rejecting the null hypothesis is only at a low 0.8%.