Verifying the Excel’s one-factor ANOVA results
In a one-factor ANOVA (Analysis of Variance), we check the replicate results of each group under a sole factor. For example, we may be evaluating the performance of four analysts who have each been asked to perform replicate determinations of identical samples drawn from a population. Thus, the factor is “Analyst” of which we have four so-called “groups” – Analyst A, Analyst B, Analyst C and Analyst D. The test data are laid out in a matrix with the four groups of the factor in each column and the replicates (five replicates in this example) in each row, as shown in the following table:
We then use the ANOVA principal inferential statistic, the F-test, to decide whether differences between treatment’s mean values are real in the population, or simply due to random error in the sample analysis. The F-test studies the ratio of two estimates of the variance. In this case, we use the variance between groups, divided by the variance within groups.
The MS Excel installation files have included an add-in that performs several statistical analyses. It is called “Data Analysis” add-in. If you do not find it labelled in the Ribbon of your spreadsheet, you can make it available to Excel by installing it with your Excel 2016 version by clicking “File -> Options -> Add-ins -> Analysis ToolPak” and then pressing “Enter”. You should now find Data Analysis in the Analysis Group on your Excel’s Data tab.
We can then click the “Data Analysis” button to look for “Anova: Single Factor” entry and start its dialog box accordingly.
For the above set of data from Analysts A to D, the Excel’s Data Analysis gives the following outputs, and we shall then verify the outcomes through manual calculations based on the first principles:
We know that variances are defined and calculated using the squared deviations of a single variable. In Excel, we can use the formula “=DEVSQ( )” to calculate each group of results. Also, we can use “=AVERAGE( )” function to calculate the individual mean of each Analyst.
In this univariate ANOVA example, we squared the deviation of a value from a mean, and the word “deviation” referred to the difference between each measurement result from the mean of the Analyst concerned.
The above figure shows the manual calculations using the Excel formulae agree very well with the Excel’s calculated data by its Data Analysis package. In words, we have:
Subsequently, we can also verify the Excel’s calculations of the F-value, P-value and the F critical value by using the various formulae as shown above.
We have normally set the level of significance at the P = 0.05 (or 5%), meaning that we are prepared to make a 5% error in rejecting the null hypothesis which states that there are no difference amongst the mean values of these four Analysts. The calculated P-value of 0.008 indicates that our risk of rejecting the null hypothesis is only at a low 0.8%.
Statistical Tools in EXCEL
I have come across some course participants who are not aware of the fact that we can add a plug-in of Analysis ToolPak on the popular Microsoft Excel® spreadsheet. This tool package is very useful for statistical analysis. It uses appropriate statistical macro functions to calculate and display the results of your data analysis in an output table. Some tools even generate charts in addition to output tables.
To do so, you can simply open the Excel spreadsheet, first click the “File” tab, click “Options” , then click “Add-ins” which will show up a list of Inactive Application Add-ins under the “Manage” box. Check the “Analysis ToolPak” check box and click “OK”. The spreadsheet will automatically install this statistical package for you. You can find the “Data Analysis” tab when you click the “Data” tab on the spreadsheet tool bar as shown below:
When the “Data Analysis” button is clicked, the menu as shown below shows up. You can just choose the relevant statistical tool you want to work on by select it with a click.
The regular statistical tools useful in analyzing data collected from chemical experiments are analysis of variance (ANOVA), linear regression correlation, descriptive statistics, plotting histogram, F-tests for variances and various Student’s t-tests.