## Training and consultancy for testing laboratories.

### Verifying Excel’s one-factor ANOVA results

In a one-factor ANOVA (Analysis of Variance), we check the replicate results of each group under a sole factor. For example, we may be evaluating the performance of four analysts who have each been asked to perform replicate determinations of identical samples drawn from a population. Thus, the factor is “Analyst”, of which we have four so-called “groups”: Analyst A, Analyst B, Analyst C and Analyst D. The test data are laid out in a matrix with one group of the factor in each column and one replicate (five replicates in this example) in each row, as shown in the following table:

We then use the principal inferential statistic of ANOVA, the F-test, to decide whether the differences between the group mean values are real in the population, or simply due to random error in the sample analysis. The F-test studies the ratio of two estimates of the variance. In this case, we use the variance between groups divided by the variance within groups.

MS Excel ships with an add-in, called “Data Analysis”, that performs several statistical analyses. If you do not find it on the Ribbon of your spreadsheet, you can enable it in Excel 2016 by clicking “File -> Options -> Add-ins -> Analysis ToolPak” and then pressing “Enter”. You should now find Data Analysis in the Analysis group on Excel’s Data tab.

We can then click the “Data Analysis” button, select the “Anova: Single Factor” entry and open its dialog box.

For the above set of data from Analysts A to D, Excel’s Data Analysis gives the following outputs, which we shall then verify through manual calculations based on first principles:

We know that variances are defined and calculated using the squared deviations of a single variable. In Excel, we can use the function “=DEVSQ( )” to calculate the sum of squared deviations for each group of results, and the function “=AVERAGE( )” to calculate the individual mean of each Analyst.

In this univariate ANOVA example, we square the deviation of a value from a mean, where “deviation” refers to the difference between each measurement result and the mean of the Analyst concerned.

The above figure shows that the manual calculations using the Excel formulae agree very well with the values calculated by Excel’s Data Analysis package. In words, we have:

• The Sum of Squares Between, SSB, uses DEVSQ( ) to take the sum of the squared deviations of each group (Analyst) mean from the grand mean, and multiplies it by the number of replicates in each group;
• The Sum of Squares Within, SSW, uses DEVSQ( ) on the replicates of each group (the results are in J13:M13) to get the sum of the squared deviations of each measured value from the mean of its group; the DEVSQ( ) results are then totaled;
• The Sum of Squares Total, SST, uses DEVSQ( ) to return the sum of the squared deviations of each measurement from the grand mean value. We can also just add up SSB and SSW to give SST.
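The three sums of squares described above can also be cross-checked outside Excel. The following is a minimal Python sketch, assuming equally sized groups as in this example. The replicate values are hypothetical, since the original data table is not reproduced here, and the helper names `devsq` and `sums_of_squares` are chosen only for illustration.

```python
from statistics import mean


def devsq(values):
    """Sum of squared deviations from the mean, like Excel's DEVSQ()."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def sums_of_squares(groups):
    """Return (SSB, SSW, SST) for a dict of equally sized groups."""
    group_means = [mean(g) for g in groups.values()]
    n_rep = len(next(iter(groups.values())))        # replicates per group
    all_values = [v for g in groups.values() for v in g]
    ssb = n_rep * devsq(group_means)                # between groups
    ssw = sum(devsq(g) for g in groups.values())    # within groups
    sst = devsq(all_values)                         # total
    return ssb, ssw, sst


# Hypothetical replicate results (illustration only, not the article's data)
results = {
    "Analyst A": [10.1, 10.3, 10.2, 10.4, 10.0],
    "Analyst B": [10.5, 10.6, 10.4, 10.7, 10.3],
    "Analyst C": [10.2, 10.1, 10.3, 10.0, 10.4],
    "Analyst D": [10.6, 10.8, 10.5, 10.7, 10.9],
}

ssb, ssw, sst = sums_of_squares(results)
print(f"SSB = {ssb:.4f}, SSW = {ssw:.4f}, SST = {sst:.4f}")
print(f"SSB + SSW = {ssb + ssw:.4f}")
```

The last line prints SSB + SSW so that it can be compared with SST, mirroring the check in the third bullet.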

Subsequently, we can also verify Excel’s calculations of the F-value, P-value and F critical value by using the various formulae shown above.
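For reference, the quantities being verified are related as follows. This is a sketch in standard ANOVA notation, where k is the number of groups (4 Analysts here) and N is the total number of results (20 here). The symbols for the mean squares are labels introduced for illustration and do not come from the original worksheet.

$$MS_B = \frac{SS_B}{k-1}, \qquad MS_W = \frac{SS_W}{N-k}, \qquad F = \frac{MS_B}{MS_W}$$

With k = 4 and N = 20, there are 3 degrees of freedom between groups and 16 within groups. The null hypothesis is rejected when F exceeds the F critical value or, equivalently, when the P-value is below the chosen level of significance.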

We normally set the level of significance at P = 0.05 (or 5%), meaning that we are prepared to accept a 5% risk of wrongly rejecting the null hypothesis, which states that there is no difference amongst the mean values of these four Analysts. The calculated P-value of 0.008 is well below 0.05, so we reject the null hypothesis; our risk of being wrong in doing so is only a low 0.8%.

### Are your linear regression data homoscedastic or heteroscedastic?

In instrumental analysis, there must be a measurement model: an equation that relates the amount to be measured to the instrument response, such as absorbance, transmittance, peak area, peak height, potential, current, etc. From this model, we can then derive the calibration equation.

It is our usual practice to design the experiment in such a way that the standard concentration of the measurand and the instrument response follow a simple linear relationship, i.e.,

$$y = a + bx$$

where

y is the indication of the instrument (i.e., the instrument response),

x is the independent variable (i.e., mostly for our purpose, the concentration of the measurand)

and,

a and b are the coefficients of the model, known as the intercept and slope (or gradient) of the curve, respectively.

Therefore, for a number of xi values, we will have the corresponding instrument responses, yi. We then fit the above model equation to the data.

As usual, any particular instrumental measurement of yi will be subject to measurement error (ei), that is,

$$y_i = a + b x_i + e_i$$

To get this linear model, we have to find the line that best fits the data points that we have obtained experimentally. We use the ordinary least squares (OLS) approach, which chooses the model parameters that minimize the residual sum of squares (RSS) between the predicted y values and the actual (experimental) y values. The residual (sometimes called the error), in this case, means the difference between the predicted yi value derived from the above equation and the experimental yi value.
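The OLS fit described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the calibration data are hypothetical, the function name `ols_fit` is invented for this example, and the residual is taken here as the experimental value minus the predicted value (the sign convention does not affect the RSS).

```python
def ols_fit(x, y):
    """Unweighted least-squares line y = a + b*x; returns (a, b, residuals)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx                  # slope
    a = y_mean - b * x_mean        # intercept
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return a, b, residuals


# Hypothetical calibration data: concentration vs. absorbance
conc = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
absorbance = [0.002, 0.101, 0.198, 0.305, 0.397, 0.502]

a, b, residuals = ols_fit(conc, absorbance)
rss = sum(r ** 2 for r in residuals)
print(f"intercept a = {a:.5f}, slope b = {b:.5f}")
print(f"RSS = {rss:.3e}, sum of residuals = {sum(residuals):.1e}")
```

The residuals are returned alongside the coefficients so that their sum and the RSS can be inspected directly.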

So, for a least-squares line fitted with an intercept, the sum of all the residuals from all the points (x, y) on the plot is arithmetically equal to zero.

It must be stressed, however, that the OLS method makes an important assumption: the uncertainty in the independent variable, xi, is very much smaller than that in the instrument response. Hence only one error term, ei, in yi is considered, the uncertainty in xi being sufficiently small to be neglected. This assumption is generally valid for our laboratory analytical purposes, and it greatly simplifies the estimation of measurement error.

What is another important assumption made in this OLS method?

It is that the data are known to be homoscedastic, which means that the errors in y are assumed to be independent of the concentration.  In other words, the variance of y remains constant and does not change for each xi value or for a range of x values.   This also means that all the points have equal weight when the slope and intercept of the line are calculated. The following plot illustrates this important point.

However, in many of our chemical analyses, this assumption is not likely to be valid. In fact, many data are heteroscedastic, i.e. the standard deviation of the y-values increases with the concentration of the analyte rather than remaining constant at all concentrations. In other words, the errors are approximately proportional to the analyte concentration, so that the relative standard deviations (the standard deviations divided by the mean values) are roughly constant. The following plot illustrates this particular scenario.
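A simple numerical way to see which situation applies is to tabulate the standard deviation and the relative standard deviation of replicate responses at each calibration level. The sketch below assumes that replicate measurements are available at every level and that the mean responses are non-zero; the data and the function name `spread_by_level` are hypothetical.

```python
from statistics import mean, stdev


def spread_by_level(replicates):
    """Return (concentration, mean, SD, RSD) for each calibration level."""
    rows = []
    for conc in sorted(replicates):
        values = replicates[conc]
        m = mean(values)
        sd = stdev(values)              # sample standard deviation
        rows.append((conc, m, sd, sd / m))   # RSD assumes m is non-zero
    return rows


# Hypothetical replicate responses at three concentration levels
calibration = {
    1.0: [0.098, 0.100, 0.102],
    5.0: [0.490, 0.500, 0.510],
    10.0: [0.980, 1.000, 1.020],
}

for conc, m, sd, rsd in spread_by_level(calibration):
    print(f"conc {conc:5.1f}: mean {m:.3f}  SD {sd:.4f}  RSD {rsd:.2%}")
```

A standard deviation that grows with concentration while the RSD stays roughly constant points to heteroscedastic data; a roughly constant standard deviation points to homoscedastic data.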

In this case, the weighted regression method should be applied. The regression line must be calculated so as to give additional weight to those points where the errors are smallest, i.e. it is more important for the calculated line to pass close to such points than to pass close to the points representing higher concentrations, which have the largest errors.

This is achieved by giving each point a weighting inversely proportional to the corresponding y-direction variance, $s_i^2$. We will not go through the details of the calculations, which are rather more tedious and complex than those of the unweighted method. Suffice it to say that in instrumental calibration, where the experimental points normally fit a straight line very well, the slope (b) and y-intercept (a) of the weighted line are remarkably similar to those of the unweighted line, and the two approaches give very similar values for the concentrations of samples within the linear range of the calibration line.
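As an illustration of the weighting idea only, the sketch below fits a weighted straight line with weights taken as the reciprocal of the variance at each point. It is a minimal example and not the full treatment (confidence limits are not computed); the data, the standard deviations and the function name `weighted_linear_fit` are hypothetical.

```python
def weighted_linear_fit(x, y, s):
    """Weighted least-squares line y = a + b*x with weights w_i = 1/s_i**2."""
    w = [1.0 / si ** 2 for si in s]
    sw = sum(w)
    # Weighted centroid of the data
    xw = sum(wi * xi for wi, xi in zip(w, x)) / sw
    yw = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xw) * (yi - yw) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xw) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx                  # weighted slope
    a = yw - b * xw                # weighted intercept
    return a, b


# Hypothetical calibration data with SD growing with concentration
conc = [2.0, 4.0, 6.0, 8.0, 10.0]
signal = [0.101, 0.198, 0.305, 0.397, 0.502]
sd = [0.001, 0.002, 0.003, 0.004, 0.005]

a_w, b_w = weighted_linear_fit(conc, signal, sd)
a_u, b_u = weighted_linear_fit(conc, signal, [1.0] * len(conc))  # equal weights
print(f"weighted:   a = {a_w:.5f}, b = {b_w:.5f}")
print(f"unweighted: a = {a_u:.5f}, b = {b_u:.5f}")
```

Passing equal standard deviations reduces the calculation to the unweighted line, which makes it easy to compare the two sets of coefficients side by side.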

So, does this mean that, on the face of it, the weighted regression calculations have little value to us?

Although the results are very similar to those obtained from the simpler unweighted regression method, the value of weighted regression lies in giving more realistic estimates of the errors or confidence limits of the sample concentrations under study. It can be shown by calculation that the confidence limits are narrower at low concentrations in the weighted regression, and that they widen with increasing instrumental signal, such as absorbance. A general form of the confidence limits for a concentration determined using a weighted regression line is shown in the sketch below:

These observations emphasize the particular importance of using weighted regression when the results of interest include those at low concentrations.  Similarly, detection limits may be more realistically assessed using the intercept and standard deviation obtained from a weighted regression graph.

### Decision rule in conformance testing with a given tolerance limit

Today there is a dilemma for an ISO/IEC 17025 accredited laboratory service provider in issuing a statement of conformity with specification to the clients after testing, particularly when the analysis result of the test sample is close to the specified value with its upper or lower measurement uncertainty crossing over the limit. The laboratory manager has to decide on the level of risk he is willing to take in stating such conformity.

However, certain trades buy goods and commodities with a given tolerance allowance against the buying specification. A good example is the trading of granular or pelletized compound fertilizers, which contain multiple primary nutrients (e.g. N, P, K) in each individual granule. A buyer usually allows a permissible tolerance of 2 to 5% on the buying specification, as a lower limit to the declared value, to allow for variation in the manufacturing process. Some government departments of agriculture even allow a lower tolerance limit of up to 10% in their procurement of compound fertilizers, which are re-sold to their farmers at a discount.

Given the permissible lower tolerance limit, the fertilizer buyer has taken his own risk of receiving a consignment that might be below his buying specification. As Eurolab’s Technical Report No. 01/2017 “Decision rule applied to conformity assessment” rightly points out, giving a tolerance limit above the upper specification limit, or below the lower specification limit, can be classified as the customer’s or consumer’s risk. In the context of hypothesis testing, this is a type II (β) error.

What should the test laboratory’s decision rule be in issuing its conformity statement in such a situation?

Let’s discuss this through an example.

A government procurement department purchased a consignment of 3000 bags of granular compound fertilizer with a guarantee of available plant nutrients expressed as a percentage by weight in it, e.g. a NPK of 15-15-15 marking on its bag indicates the presence of 15% nitrogen (N), 15% phosphorus (P2O5) and 15% potash (K2O) nutrients.  Representative samples were drawn and analyzed in its own fertilizer laboratory.

In the case of potash (K2O) content of 15% w/w, a permissible tolerance limit of 13.5% w/w is stated in the tender document, indicating that a fertilizer chemist can declare conformity at this tolerance level. The successful supplier of the tender will be charged a calculated fee for any specification non-conformity.

Our conventional decision rules have been based on comparing a single measurement result with a single conformity limit or an interval of conformity limits. Today, we recognize that each test result has its own measurement variability, normally expressed as a measurement uncertainty at the 95% confidence level.

Therefore, the conventional approach of stating conformity on the basis of a single measurement result exposes the laboratory to a risk of up to 50% (when the result lies on the limit itself) that the true (actual) value of the test parameter falls outside the given tolerance limit, rendering the sample non-conforming! Is a 50% risk bearable for the test laboratory?

Let us say the average test result for the K2O content of this fertilizer sample was found to be 13.8 ± 0.55% w/w. What is the critical value for deciding on conformity in this particular case at the usual 95% confidence level? Can we declare the result of 13.8% w/w to be in conformity with the specification with reference to the given tolerance limit of 13.5% w/w?

Let us first see how the critical value is estimated.  In hypothesis testing, we make the following hypotheses:

H0 : Target tolerance value ≥ 13.5% w/w

H1 : Target tolerance value < 13.5% w/w

Use the following equation, assuming that the variation of the laboratory analysis result follows the normal (Gaussian) probability distribution:

$$z = \frac{\mu - \bar{x}}{u}, \qquad \text{i.e.} \qquad \bar{x} = \mu - z\,u$$

where

$\mu$ is the tolerance value for the specification, i.e. 13.5%,

$\bar{x}$ is the critical value with 95% confidence (α = 0.05),

$z$ is the z-score of -1.645 for the one-tailed test of H1, and

$u$ is the standard uncertainty of the test, i.e. U/2 = 0.55/2 or 0.275.

By calculation, we have the critical value $\bar{x}$ = 13.5 - (-1.645 × 0.275) = 13.95% w/w, which, statistically speaking, is not significantly different from 13.5% w/w with 95% confidence.
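The critical value can be reproduced with a short Python sketch using only the standard library. It assumes, as in the text, a normal distribution, a coverage factor of 2 for converting the expanded uncertainty U to the standard uncertainty u, and a one-tailed α of 0.05. The function and parameter names are hypothetical.

```python
from statistics import NormalDist


def critical_value(limit, expanded_u, coverage_factor=2.0, alpha=0.05):
    """Critical value for a lower tolerance limit: limit - z * u."""
    u = expanded_u / coverage_factor        # standard uncertainty
    z = NormalDist().inv_cdf(alpha)         # one-tailed z-score, about -1.645
    return limit - z * u


x_crit = critical_value(limit=13.5, expanded_u=0.55)
print(f"critical value = {x_crit:.2f} % w/w")
```

Changing `alpha` or `coverage_factor` shows how sensitive the critical value is to the chosen confidence level and to how the uncertainty was expressed.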

Assuming the measurement uncertainty remains constant in this measurement region, the critical value of 13.95% w/w minus its expanded uncertainty U of 0.55% w/w gives 13.40% w/w, which lies (13.5 - 13.4) or 0.1% w/w K2O below the lower tolerance limit, thus exposing a risk of some 0.1/(2×0.55) or 9.1%.

For the reported test result of 13.8% w/w with an expanded uncertainty U of 0.55% w/w, the range of values is 13.25 to 14.35% w/w, so that (13.50 - 13.25) or 0.25% w/w of K2O lies below the lower tolerance limit, thus exposing a risk of some 0.25/(2×0.55) or 22.7% in claiming conformity to the specification with reference to the given tolerance limit.
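The risk figures above follow the same arithmetic each time: the part of the uncertainty interval lying below the tolerance limit, divided by the full width 2U of the interval. A minimal Python sketch of that arithmetic is given below; the function name `fraction_below_limit` is hypothetical, and the calculation treats the interval exactly as the text does, as a simple proportion of its width.

```python
def fraction_below_limit(result, expanded_u, lower_limit):
    """Share of the interval result +/- U that lies below the lower limit."""
    low = result - expanded_u
    high = result + expanded_u
    if low >= lower_limit:
        return 0.0                  # whole interval is above the limit
    if high <= lower_limit:
        return 1.0                  # whole interval is below the limit
    return (lower_limit - low) / (high - low)


for reported in (13.95, 13.8):
    risk = fraction_below_limit(reported, expanded_u=0.55, lower_limit=13.5)
    print(f"result {reported:.2f} % w/w -> risk {risk:.1%}")
```

The two cases in the loop correspond to the critical value and the reported result discussed above; other reporting values can be tried in the same way.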

Visually, we can present these situations in the following sketch with U = 0.55%w/w:

The fertilizer laboratory manager thus has to make an informed decision on what level of risk is bearable in making a statement of conformity. Even the critical value of 13.95% w/w estimated by the hypothesis test carries a 9.1% risk instead of the expected 5% error or risk. Why?

The reason is that the measurement uncertainty is traditionally evaluated as a two-tailed interval (α/2 = 0.025 in each tail) under the normal probability distribution with a coverage factor of 2, whilst the hypothesis test is based on a one-tailed test (α = 0.05) with a z-score of 1.645.

To reduce the testing laboratory’s risk in issuing a statement of conformity to zero, the laboratory manager may want to play safe by setting the critical reporting value at (13.5% + 0.55%) or 14.05% w/w, so that the lower end of its uncertainty interval is exactly 13.5% w/w. Barring any error in the evaluation of the measurement uncertainty, this conservative approach leaves the test laboratory with practically zero risk in issuing its conformity statement.

It may be noted that ISO/IEC 17025:2017 requires the laboratory to communicate with its customers and clearly agree its decision rule with them before undertaking the analytical task. This avoids any unnecessary misunderstanding after the issuance of a test report with a statement of conformity or non-conformity.