### Outlier test statistics in analytical data


When an analytical method is repeated several times on a given sample, measured values near the mean (or average) of the data set tend to occur more often than values further away from it. This is characteristic of analytical data that follow the normal probability distribution, and the clustering of results around the mean is known as central tendency.
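This clustering can be seen with a quick simulation. The sketch below (true value, standard deviation and seed are all illustrative choices, not from the text) draws replicate "measurements" from a normal distribution and counts how many fall within one standard deviation of the mean; for normal data the proportion should be roughly 68%.

```python
import random
import statistics

# Simulate 1000 replicate measurements of one sample as normally
# distributed values (true value 10.0, sd 0.2 -- illustrative numbers).
random.seed(42)
values = [random.gauss(10.0, 0.2) for _ in range(1000)]

mean = statistics.mean(values)
sd = statistics.stdev(values)

# Count results within one standard deviation of the mean;
# for a normal distribution this should be close to 68%.
within_1sd = sum(1 for v in values if abs(v - mean) <= sd)
print(f"{within_1sd / len(values):.0%} of results lie within 1 sd of the mean")
```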

From time to time, however, we notice some extremely low or high value(s) that are visibly distant from the rest of the data. These values are suspected outliers, which may be defined as observations in a set of data that appear to be inconsistent with the remainder of that set.

Outlying values generally have an appreciable influence on the calculated mean and an even greater influence on the calculated standard deviation, so they must be examined carefully and removed only if necessary.

We must remember, however, that random variation in analysis does generate occasional extreme values by chance. Such values are a genuine part of the valid data and should generally be included in any statistical calculations. On the other hand, human error or other faults in the analytical process, such as instrument failure, can produce true outliers. Hence, it is important to minimize the effect of such outliers.

To minimize this effect, we need ways to identify outliers and distinguish them from chance variation. Many outlier tests are available that allow analysts to inspect suspect data and, if necessary, correct or remove erroneous values. These test statistics assume an underlying normal distribution and a relatively homogeneous test sample.
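As an illustration, here is a minimal sketch of one such test, Grubbs' test for a single suspect value. The function names and replicate results are my own, and the critical values are taken from standard published tables; verify them against your preferred reference before use.

```python
import math

# Two-sided Grubbs critical values at the 95% confidence level for n = 3..10,
# from standard tables (check against ISO 5725-2 or a textbook before use).
G_CRIT_95 = {3: 1.155, 4: 1.481, 5: 1.715, 6: 1.887,
             7: 2.020, 8: 2.126, 9: 2.215, 10: 2.290}

def grubbs_test(data, crit_table=G_CRIT_95):
    """Return (G, G_crit, flagged) for the value furthest from the mean."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    g = max(abs(x - mean) for x in data) / sd
    g_crit = crit_table[n]
    return g, g_crit, g > g_crit

# Six replicate results (illustrative numbers); 10.61 looks suspect.
g, g_crit, flagged = grubbs_test([10.05, 10.10, 10.08, 10.03, 10.07, 10.61])
print(f"G = {g:.3f}, G_crit = {g_crit}, flagged: {flagged}")
```

Here G exceeds the tabulated critical value, so the suspect value is flagged for closer inspection, not automatic rejection.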

Furthermore, outlier testing needs careful consideration where the population characteristics are not known, or, worse, known to be non-normal. For example, if the data were Poisson distributed, many valid high values might be incorrectly rejected because they appear inconsistent with a normal distribution. It is also crucial to consider whether outlying values might represent genuine features of the population.

Another approach is to use robust statistics, which are not greatly affected by the presence of occasional extreme values and still perform well when no outliers are present.
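The point is easy to demonstrate: the median and the median absolute deviation (MAD) barely move when a gross error is added, while the mean and standard deviation shift noticeably. The data below are illustrative values of my own choosing.

```python
import statistics

clean = [10.05, 10.10, 10.08, 10.03, 10.07]
with_outlier = clean + [12.0]  # one gross error added

for label, data in [("clean", clean), ("with outlier", with_outlier)]:
    mean = statistics.mean(data)
    median = statistics.median(data)
    # Median absolute deviation: a robust estimate of spread.
    mad = statistics.median(abs(x - median) for x in data)
    print(f"{label:>12}: mean={mean:.3f} median={median:.3f} mad={mad:.3f}")
```

The mean shifts by more than 0.3 units while the median moves by only 0.005, which is exactly the insensitivity that robust estimators provide.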

Such tests are aplenty at your disposal: Dixon's, Grubbs', Cochran's and Thompson's for outlying values or variances, along with related variance-homogeneity tests such as Levene's, Bartlett's, Hartley's and Brown-Forsythe's, *etc*. They are quite simple to apply to a set of analytical data. However, for the outcome to be meaningful, the number of data points examined should be large rather than just a few.
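Dixon's Q test is perhaps the simplest of these: it compares the gap between the suspect value and its nearest neighbour to the overall range. A minimal sketch follows; the critical values are as commonly tabulated in analytical chemistry texts (verify against your own reference), and the data are illustrative.

```python
# Dixon's Q critical values (two-tailed, 95% confidence) for n = 3..10,
# as commonly tabulated; verify against your reference before use.
Q_CRIT_95 = {3: 0.970, 4: 0.829, 5: 0.710, 6: 0.625,
             7: 0.568, 8: 0.526, 9: 0.512, 10: 0.479}

def dixon_q(data):
    """Return (Q, Q_crit, flagged) for the most extreme value."""
    s = sorted(data)
    data_range = s[-1] - s[0]
    # Gap between the suspect value and its nearest neighbour,
    # checked at both the low and the high end of the sorted data.
    q_low = (s[1] - s[0]) / data_range
    q_high = (s[-1] - s[-2]) / data_range
    q = max(q_low, q_high)
    q_crit = Q_CRIT_95[len(s)]
    return q, q_crit, q > q_crit

q, q_crit, flagged = dixon_q([10.05, 10.10, 10.08, 10.03, 10.07, 10.61])
print(f"Q = {q:.3f}, Q_crit = {q_crit}, flagged: {flagged}")
```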

Outlier tests, therefore, only provide objective criteria, or a signal, to investigate the cause; outliers should not normally be removed from the data set solely on the result of a statistical test. Instead, the tests highlight the need to inspect the data more closely in the first instance.

The general guidelines for acting on outlier tests on analytical data, based on the outlier testing and inspection procedure listed in ISO 5725-2 *Accuracy (trueness and precision) of measurement methods and results — Part 2: Basic method for the determination of repeatability and reproducibility of a standard measurement method*, are as follows:

- Test at both the 95% and the 99% confidence levels
- All outliers should be investigated and any errors corrected
- Outliers significant at the 99% level may be rejected unless there is a technical reason to retain them
- Outliers significant only at the 95% level (normally called ‘stragglers’) should be rejected only if there is an additional technical reason to do so
- Successive testing and rejection are permissible, but not to the extent of rejecting a large proportion of the data.
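The two-level classification above can be sketched as follows, here using Grubbs' test at both confidence levels. The function name and data are my own, and the critical values come from standard published tables; check them against ISO 5725-2 before relying on them.

```python
import math

# Two-sided Grubbs critical values from standard tables (n = 3..10);
# verify against ISO 5725-2 or your preferred reference before use.
G_CRIT = {
    0.05: {3: 1.155, 4: 1.481, 5: 1.715, 6: 1.887,
           7: 2.020, 8: 2.126, 9: 2.215, 10: 2.290},
    0.01: {3: 1.155, 4: 1.496, 5: 1.764, 6: 1.973,
           7: 2.139, 8: 2.274, 9: 2.387, 10: 2.482},
}

def classify(data):
    """Classify the most extreme value as 'correct', 'straggler' or 'outlier'."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    g = max(abs(x - mean) for x in data) / sd
    if g > G_CRIT[0.01][n]:
        return "outlier"    # significant at the 99% level
    if g > G_CRIT[0.05][n]:
        return "straggler"  # significant only at the 95% level
    return "correct"

print(classify([10.05, 10.10, 10.08, 10.03, 10.07, 10.61]))  # outlier
print(classify([10.05, 10.10, 10.08, 10.03, 10.07, 10.28]))  # straggler
```

A value classed as a "straggler" is retained unless there is an additional technical reason to reject it, while a 99%-level outlier may be rejected unless there is a reason to retain it.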

This procedure leads to results that are not seriously biased by the rejection of chance extreme values, yet are relatively insensitive to outliers at the frequency commonly encountered in measurement work. Even so, the application of robust statistics might be a better choice.
