Why is measurement uncertainty important in analytical chemistry?
The purpose of conducting a laboratory analysis is to make informed decisions about the samples drawn. The result of an analytical measurement is incomplete without a statement (or at least an implicit knowledge) of its uncertainty, because we cannot make a valid decision based on the result alone, and nearly all analysis is conducted to inform a decision.
We know that the uncertainty of a result is a parameter that describes a range within which the value of the quantity being measured is expected to lie, taking into account all sources of error, with a stated degree of confidence (usually 95%). It characterizes the extent to which the unknown value of the targeted analyte is known after measurement, taking account of the information given by the measurement.
With a knowledge of uncertainty in hand, we can make the
following typical decisions based on analysis:
Does this particular laboratory have the capacity
to perform analyses of legal and statutory significance?
Does this batch of pesticide formulation contain
less than the maximum allowed concentration of an impurity?
Does this batch of animal feed contain at least the minimum required concentration of profat (protein + fat)?
How pure is this batch of precious metal?
The figure below shows a variety of situations affecting decisions about compliance with externally imposed limits or specifications. The error bars can be taken as expanded uncertainties, i.e. intervals containing the true value of the analyte concentration with 95% confidence.
We can make the following observations from the above illustration:
Result A clearly indicates that the test result is below the limit, as even the upper extremity of the uncertainty interval is below the limit.
Result B is below the limit, but the upper end of its uncertainty interval is above the limit, so we are not sure that the true value is below the limit.
Result C is above the limit, but the lower end of its uncertainty interval is below the limit, so we are not sure that the true value is above it.
What conclusions can we draw from the equal results D and E? Both results are above the limit but, while D is clearly above the limit, E is not, because its greater uncertainty interval extends below the limit.
In short, we have to make decisions on how to act upon
results B, C and E.
What level of risk can we afford in assuming that the test result is in conformity with the stated specification or in compliance with the regulatory limit?

In making such a decision rule, we must be serious in the evaluation of measurement uncertainty, making sure that the uncertainty obtained is reasonable. If not, any decision made on conformity or compliance will be meaningless.
Data analysis is a systematic process of examining datasets in order to draw valid conclusions about the information they contain, increasingly with the aid of specialized systems and software. It leads to discovering useful information for making informed decisions to verify or disprove scientific or business models, theories or hypotheses.
As researchers or laboratory analysts, we must have the drive to obtain quality data in our work. A careful plan for database design and statistical analysis, with variable definitions, plausibility checks, data quality checks and the ability to identify likely errors in the data and resolve data inconsistencies, has to be established before embarking on the full data collection. More importantly, the plan should not be altered without the agreement of the project steering team, in order to reduce the extent of data dredging or hypothesis fishing leading to false positive studies. Shortcomings in initial data analysis may result in adopting inappropriate statistical methods or drawing incorrect conclusions.
Our first step of initial data analysis is to check the consistency and accuracy of the data, for example by looking for any outlying values. This can be visualized by plotting the data against the time of data collection or other independent parameters. This should be done before embarking on more complex analyses.
After being satisfied that the data are reasonably error-free, we should get familiar with the collected data and examine them for consistency of data formats, the number and patterns of missing data, the probability distributions of the continuous variables, etc. For more advanced initial analysis, decisions have to be made about the way variables are used in further analyses, with the aid of data analytics technologies or statistical techniques. These variables can be studied in their raw form, transformed to some standardized format, or categorized or stratified into groups for modeling.
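As a minimal sketch of such initial checks in Python (the data values and column names here are hypothetical; in practice the table would be read from a file), the pandas library can be used:

import pandas as pd

# Hypothetical results table; in practice, e.g.
# df = pd.read_csv("results.csv", parse_dates=["timestamp"]).
df = pd.DataFrame({
    "timestamp": pd.date_range("2023-01-01", periods=8, freq="D"),
    "result": [10.2, 10.4, 10.1, 10.3, 12.9, 10.2, 10.3, 10.1],  # one suspect value
})

# Consistency checks: data types and missing-value patterns.
print(df.dtypes)
print(df.isna().sum())

# Distribution of the continuous variable.
print(df["result"].describe())

# Flag possible outliers, e.g. beyond mean +/- 2 standard deviations.
m, s = df["result"].mean(), df["result"].std()
print(df[(df["result"] - m).abs() > 2 * s])

# Plotting against time of collection, e.g. df.plot(x="timestamp", y="result"),
# helps to spot drifts or gross errors before more complex analyses.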
Replication and successive dilution in constructing a calibration curve
An analytical instrument generally needs to be calibrated before measurements are made on prepared sample solutions, through construction of a linear regression between the analytical responses and the concentrations of the standard analyte solutions. A linear regression is favored over a quadratic or exponential curve as it incurs minimum error.
Replication in standard calibration is found to be useful if the replicates are genuinely independent. The calibration precision is improved by increasing the number of replicates, n, and replication provides additional checks on the calibration solution preparation and on the precision at different concentrations.
An indication of its precision can be read from the variance of the calibration points. One calibration curve might be found to have roughly constant standard deviations at all the plotted points, whilst another may show a proportional increase in standard deviation with increasing analyte concentration. The former behavior is known as “homoscedasticity” and the latter, “heteroscedasticity”.
It should be noted that increasing the number of independent concentration points has little benefit beyond a certain extent. In fact, after about six calibration points, it can be shown that any further increase in the number of observations in calibration has a relatively modest effect on the standard error of prediction for a predicted x value, unless the number of points increases very substantially, say to 30, which of course is not practical.
Hence, independent replication at each calibration point can be recommended as a method of improving uncertainties. Indeed, independent replication is a viable method of increasing n when the best performance is desired.
However, replication suffers from an important drawback. Many analysts are inclined simply to inject a calibration standard solution twice, instead of preparing duplicate standard solutions separately for injection. When the same standard solution is injected twice into the analytical instrument, the plotted residuals will appear in close pairs and are clearly not independent. This is essentially useless for improving precision. Worse, it artificially increases the number of degrees of freedom for the simple linear regression, giving a misleadingly small prediction interval.
Therefore, ideally, replicated observations should be entirely independent, using different stock calibration solutions if at all possible. Otherwise, it is best first to examine the replicated injections to check for outlying differences and then to calculate the calibration based on the mean value of y at each distinct concentration.
There is one side effect of replication that may be useful. If means of replicates are taken, the distribution of errors in the mean tends toward the normal distribution as the number of replicates increases, regardless of the parent distribution. The distribution of the mean of as few as three replicates is fairly close to the normal distribution even with quite extreme departures from normality. Averaging three or more replicates can therefore provide more reliable statistical inference in critical cases where non-normality is suspected.
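A small simulation illustrates the effect (a sketch, using a deliberately skewed exponential parent distribution as an assumed worst case):

import numpy as np

rng = np.random.default_rng(1)

# 100 000 sets of 3 replicates from a strongly skewed parent distribution.
replicates = rng.exponential(scale=1.0, size=(100_000, 3))
means = replicates.mean(axis=1)

def skewness(x):
    """Sample skewness: zero for a perfectly symmetric distribution."""
    return ((x - x.mean()) ** 3).mean() / x.std() ** 3

# Skewness of single values is about 2; averaging 3 replicates cuts it
# by a factor of sqrt(3), pulling the distribution toward normality.
print(skewness(replicates.ravel()), skewness(means))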
A common pattern of calibration that we usually practice is serial dilution, resulting in logarithmically decreasing concentrations (for example, 16, 8, 4, 2 and 1 mg/L). This is simple and has the advantage of providing a high upper calibrated level, which may be useful in analyzing routine samples that occasionally show high values.
However, this layout has several disadvantages. First, errors in dilution are multiplied at each step, increasing the volume uncertainties, and perhaps worse, increasing the risk of any undetected gross dilution error (especially if the analyst commits the cardinal sin of using one of the calibration solutions as a QC sample as well!).
Secondly, the highest concentration point has high leverage, strongly affecting the gradient of the fitted line; errors at the high concentration will cause potentially large variation in results.
Thirdly, departures from linearity are easier to detect with fairly evenly spaced points. In general, therefore, equally spaced calibration points across the range of interest are much to be preferred.
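The leverage argument can be checked numerically; in this sketch the serial-dilution levels from the example above are compared with equally spaced levels over the same range:

import numpy as np

def leverages(x):
    """Leverage h_i of each point in a simple linear regression:
    h_i = 1/n + (x_i - xbar)^2 / sum_j (x_j - xbar)^2."""
    x = np.asarray(x, dtype=float)
    d2 = (x - x.mean()) ** 2
    return 1 / len(x) + d2 / d2.sum()

serial = [1, 2, 4, 8, 16]          # serial dilution (mg/L)
even = [1, 4.75, 8.5, 12.25, 16]   # equally spaced over the same range

print(leverages(serial))  # top point dominates: h ~ 0.85
print(leverages(even))    # leverage shared more evenly: max h = 0.6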
1. What is sampling?
Sampling is a process of selecting a portion of material to represent or provide information about a larger body of material (statistically termed the ‘population’). It is essential to the whole testing and calibration process.
The old ISO/IEC 17025:2005 standard defines sampling as “a defined
procedure whereby a part of a substance, material or product is taken to
provide for testing or calibration of a representative sample of the whole. Sampling may also be required by the
appropriate specification for which the substance, material or product is to be
tested or calibrated. In certain cases (e.g. forensic analysis), the sample may
not be representative but is determined by availability.”
In other words, sampling should in general be carried out in a random manner, but so-called judgement sampling is also allowed in specific cases. Judgement sampling involves using knowledge about the material to be sampled, and about the reason for sampling, to select specific samples for testing. For example, an insurance loss adjuster acting on behalf of a cargo insurance company to inspect a shipment of cargo damaged in transit will apply a judgement sampling procedure, selecting the worst-damaged samples from the lot in order to determine the cause of damage.
2. Types of samples to be differentiated
Field sample: Random sample(s) taken from the material in the field. Several random samples may be drawn and composited in the field before being sent to the laboratory for analysis.
Laboratory sample: Sample(s) as prepared for sending to the laboratory, intended for inspection or testing.
Test sample: A sub-sample, i.e. a selected portion of the laboratory sample, taken for laboratory analysis.
3. Principles of sampling
Statistically speaking, random sampling is a method of selection whereby each possible member of a population has an equal chance of being selected, so that unintended bias can be minimized. It provides an unbiased estimate of the population parameters of interest (e.g. the mean), normally in terms of analyte concentration.
A representative sample refers to something “sufficiently like the population to allow inferences about the population”. Taking a single sample through a random process will not necessarily give a representative composition of the bulk. It is entirely possible that the composition of a particular randomly selected sample is completely unlike the bulk composition, unless the population is very homogeneous in its composition.
Hence the saying that a test result is no better than the sample on which it is based. The sample taken for analysis should be as representative of the sampling target as possible. Therefore, we must take the sampling variance into serious consideration: the larger the sampling variance, the more likely it is that individual samples will be very different from the bulk.
In practice, we must carry out representative sampling, which involves obtaining samples that are not only unbiased but also have sufficiently small variance for the task in hand. In other words, we need to decide on the number of random samples to be collected in the field to provide a small enough sampling variance, in addition to choosing randomization procedures that provide unbiased results. This is normally decided upon information such as the specification limits and the uncertainty expected.
It is often useful to combine a collection of field samples into a single homogenized laboratory sample for analysis. The measured value for the composite laboratory sample is then taken as an estimate of the mean value for the bulk material; a simple numerical sketch of how the number of field samples controls the sampling contribution follows below.
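As a simple numerical sketch (the standard deviation and target below are assumed values), the standard error of the composite mean falls with the square root of the number of field increments:

import math

s_sampling = 2.0  # assumed standard deviation between field increments (% w/w)
target_se = 0.5   # required standard error of the composite mean (% w/w)

# Standard error of the mean of n increments is s/sqrt(n), so:
n = math.ceil((s_sampling / target_se) ** 2)
print(n)  # 16 increments would be needed in this hypothetical case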
It is also important to note that the importance of a sound sub-sampling process in the laboratory cannot be overemphasized. Hence, there must be an SOP prepared to guide the laboratory analyst in drawing the test sample for measurement from the sample that arrives at the laboratory.
4. Sampling uncertainty
Sampling uncertainty is recognized as an important contributor to the measurement uncertainty associated with the reported results.
In the analysis of variance (ANOVA), we study the between-group and within-group variations in terms of their respective mean squares (MS), which are calculated by dividing each sum of squares by its associated degrees of freedom. The result, although termed a mean square, is actually a measure of variance, i.e. the squared standard deviation.
The F-ratio is then obtained by dividing MS(between) by MS(within). Even if the population means are all equal to one another, you may get an F-ratio substantially larger than 1.0, simply because sampling error causes a large variation between the groups. Such an F-value may even exceed the F-critical value from the F-probability distribution at the degrees of freedom associated with the two MS values and a set significance (Type I, alpha) level of error. Indeed, by referring to the distribution of F-ratios with different degrees of freedom, you can determine the probability of observing an F-ratio as large as the one you calculated even if the populations have the same mean values.
So, the P-value is the probability of obtaining an F-ratio as large as or larger than the one observed, assuming that the null hypothesis of no difference amongst group means is true.
However, under the ground rules followed for many years in inferential statistics, this probability must be equal to or smaller than the significance (Type I, alpha) error level established at the start of the experiment; this alpha-level is normally set at 0.05 (or 5%) for test laboratories. Using this level of significance, there is, on average, a 1 in 20 chance that we shall reject the null hypothesis when it is in fact true.
Hence, if we were to analyze a set of data by ANOVA and the calculated P-value were 0.008, which is much smaller than the alpha-value of 0.05, we could confidently say that we would be taking a risk of only 0.8% in rejecting a true null hypothesis. In other words, we are 99.2% confident in rejecting the hypothesis which states that there is no difference among the group means.
In a one-factor ANOVA (analysis of variance), we examine the replicate results of each group under a single factor. For example, we may be evaluating the performance of four analysts who have each been asked to perform replicate determinations on identical samples drawn from a population. Thus, the factor is “Analyst”, of which we have four so-called “groups”: Analyst A, Analyst B, Analyst C and Analyst D. The test data are laid out in a matrix with the four groups of the factor in columns and the replicates (five in this example) in rows, as shown in the following table:
We then use ANOVA's principal inferential statistic, the F-test, to decide whether differences between the group mean values are real in the population or simply due to random error in the sample analysis. The F-test studies the ratio of two estimates of the variance: in this case, the variance between groups divided by the variance within groups.
The MS Excel installation includes an add-in that performs several statistical analyses, called the “Data Analysis” add-in. If you do not find it in the Ribbon of your spreadsheet, you can make it available in Excel 2016 by clicking “File -> Options -> Add-ins -> Analysis ToolPak” and confirming. You should then find Data Analysis in the Analysis group on Excel's Data tab.
We can then click the “Data Analysis” button, look for the “Anova: Single Factor” entry and open its dialog box.
For the above set of data from Analysts A to D, Excel's Data Analysis gives the following outputs, which we shall then verify through manual calculations from first principles:
We know that variances are defined and calculated using the squared deviations of a single variable. In Excel, we can use the formula “=DEVSQ( )” to calculate the sum of squared deviations for each group of results, and the “=AVERAGE( )” function to calculate the mean for each Analyst.
In this univariate ANOVA example, we square the deviation of each value from a mean, where “deviation” refers to the difference between each measurement result and the mean in question. The figure shows that the manual calculations using these Excel formulae agree very well with the output of Excel's Data Analysis package. In words, we have:
Sum of Squares Between (SSB): uses DEVSQ( ) to take the sum of the squared deviations of each group (Analyst) mean from the grand mean, multiplied by the number of replicates in each group;
Sum of Squares Within (SSW): takes the DEVSQ( ) results for each group's replicates (cells J13:M13), i.e. the sum of the squared deviations of each measured value from the mean of its group, and totals them;
Sum of Squares Total (SST): uses DEVSQ( ) to return the sum of the squared deviations of each measurement from the grand mean. We can also simply add SSB and SSW to give SST.
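The same quantities can also be verified outside Excel; this is a sketch in Python with hypothetical replicate data (the actual values of the worked example are not reproduced here):

import numpy as np
from scipy import stats

# Hypothetical results: five replicates for each of four analysts.
groups = [
    np.array([10.2, 10.4, 10.1, 10.3, 10.2]),  # Analyst A
    np.array([10.6, 10.7, 10.5, 10.8, 10.6]),  # Analyst B
    np.array([10.1, 10.0, 10.2, 10.1, 10.3]),  # Analyst C
    np.array([10.4, 10.5, 10.6, 10.4, 10.5]),  # Analyst D
]
k, n = len(groups), len(groups[0])
grand_mean = np.concatenate(groups).mean()

# Excel DEVSQ( ) equivalents:
ssb = n * sum((g.mean() - grand_mean) ** 2 for g in groups)  # between groups
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)       # within groups
sst = ssb + ssw                                              # total

msb, msw = ssb / (k - 1), ssw / (k * (n - 1))                # mean squares
f_ratio = msb / msw
p_value = stats.f.sf(f_ratio, k - 1, k * (n - 1))
print(f_ratio, p_value)

# Cross-check with scipy's built-in one-way ANOVA:
print(stats.f_oneway(*groups))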
Similarly, we can verify Excel's calculations of the F-value, P-value and F-critical value by using the various formulae shown above.
We normally set the level of significance at P = 0.05 (or 5%), meaning that we are prepared to accept a 5% error in rejecting the null hypothesis that there is no difference amongst the mean values of the four Analysts. The calculated P-value of 0.008 indicates that our risk in rejecting the null hypothesis is in fact only a low 0.8%.
In instrumental analysis, there must be a measurement model, an equation that relates the amount to be measured to an instrument response such as absorbance, transmittance, peak area, peak height, potential, current, etc. From this model, we can then derive the concentration of the measurand from the response.
It is our usual practice to perform the experiment in such a way as to fix the influence quantities, so that the standard concentration of the measurand and the instrument response follow a simple linear relationship, i.e.,

y = a + bx …….. (1)

where:
y is the indication of the instrument (i.e., the instrument response),
x is the independent variable (i.e., mostly for our purpose, the concentration of the measurand), and
a and b are the coefficients of the model, known as the intercept and slope (or gradient) of the curve, respectively.
So, for a number of xi values we will have the corresponding instrument responses, yi. We then fit the above model to the data.

However, any particular instrumental measurement of yi will be subject to a measurement error (ei), that is,

yi = a + bxi + ei …….. (2)
To get this linear model, we have to find the line that best fits the data points obtained experimentally. We use the ordinary least squares (OLS) approach, which chooses the model parameters that minimize the residual sum of squares (RSS) of the predicted y values versus the actual, experimental y values. The residual (sometimes called the error) here means the difference between the predicted yi value derived from the above equation and the experimental yi value.
So, when such a line is fitted by OLS, the sum of all the residuals over the points (x, y) on the plot is arithmetically equal to zero, as illustrated below.
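A quick numerical check (with made-up calibration data) confirms this property of an OLS fit that includes an intercept:

import numpy as np

# Made-up calibration data: concentration (mg/L) vs instrument response.
x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
y = np.array([0.11, 0.20, 0.41, 0.79, 1.62])

# OLS fit: np.polyfit returns the slope b and intercept a that
# minimize the residual sum of squares.
b, a = np.polyfit(x, y, 1)
residuals = y - (a + b * x)

print(a, b)
print(residuals.sum())  # essentially zero (floating-point precision)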
It must be stressed, however, that for the above to hold we make an important assumption: the uncertainty of the independent variable, xi, is very much less than that of the instrument response, so only one error term, ei in yi, is considered; the uncertainty in xi is sufficiently small to be neglected. This assumption is indeed valid for our laboratory analytical purposes, and the estimation of measurement error is then very much simplified.
What is another important assumption made in this OLS method?
It is that the data are known to be homoscedastic, meaning that the errors in y are assumed to be independent of the concentration. In other words, the variance of y remains constant and does not change with each xi value or across a range of x values. This also means that all the points have equal weight when the slope and intercept of the line are calculated. The following plot illustrates this important point.
However, in much of our chemical analysis this assumption is not likely to be valid. In fact, many data are heteroscedastic, i.e. the standard deviation of the y-values increases with the concentration of the analyte rather than remaining constant at all concentrations. In other words, the errors are approximately proportional to the analyte concentration. Indeed, we find that their relative standard deviations (standard deviations divided by the mean values) are roughly constant. The following plot illustrates this particular scenario.
In this case, the weighted regression method should be applied. The regression line must be calculated to give additional weight to those points where the errors are smallest, i.e. it is more important for the calculated line to pass close to those points than to pass close to the points representing higher concentrations with the largest errors. This is achieved by giving each point a weighting inversely proportional to the corresponding y-direction variance, si2.
Without going through the details of the calculations, which can be quite tedious and complex compared with the unweighted ones, it suffices to say that in our case of instrumental calibration, which normally sees the experimental points fit a straight line very well, we would find the slope (b) and y-intercept (a) of the weighted line remarkably similar to those of the unweighted line, and the two approaches give very similar values for the concentrations of samples within the linear calibration range.
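A sketch of the comparison (the data and the assumed heteroscedastic standard deviations s_i are made up; note that numpy's polyfit minimizes the sum of (w_i * residual_i)^2, so passing w = 1/s_i weights each point by 1/s_i^2):

import numpy as np

x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])      # concentrations (mg/L)
y = np.array([0.10, 0.22, 0.40, 0.81, 1.58])  # instrument responses
s = 0.02 * y + 0.002                          # assumed s_i growing with signal

# Unweighted OLS fit:
b_u, a_u = np.polyfit(x, y, 1)

# Weighted fit: w = 1/s_i gives each point a weight proportional to 1/s_i^2.
b_w, a_w = np.polyfit(x, y, 1, w=1.0 / s)

print(a_u, b_u)  # the two slopes and intercepts are typically
print(a_w, b_w)  # very similar when the points fit a line well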
So does that mean that, on the face of it, the weighted regression calculations have little value for us?

The answer is no.
In addition to providing results very similar to those of the simpler unweighted regression method, weighted regression gives more realistic estimates of the errors or confidence limits of the sample concentrations under study. It can be shown by calculation that the weighted regression gives narrower confidence limits at low concentrations, with the confidence limits increasing with increasing instrumental signal, such as absorbance. A general form of the confidence limits for a concentration determined using a weighted regression line is shown in the sketch below:
These observations emphasize the particular importance of using weighted regression when the results of interest include those at low concentrations. Similarly, detection limits may be more realistically assessed using the intercept and standard deviation obtained from a weighted regression line.
Today there is a dilemma for an ISO/IEC 17025 accredited laboratory service provider in issuing a statement of conformity with a specification to clients after testing, particularly when the analysis result of the test sample is close to the specified value, with the upper or lower end of its measurement uncertainty crossing the limit. The laboratory manager has to decide on the level of risk he is willing to take in stating such conformity.
However, there are certain trades which buy goods and commodities with a given tolerance allowance against the buying specification. A good example is the trading of granular or pelletized compound fertilizers which contain multiple primary nutrients (e.g. N, P, K) in each individual granule. A buyer usually allows a permissible 2 to 5% tolerance below the buying specification, as a lower limit to the declared value, to allow for variation in the manufacturing process. Some government departments of agriculture even allow a lower tolerance limit of up to 10% in their procurement of compound fertilizers, which are re-sold to their farmers at a discount.
By granting a permissible lower tolerance limit, the fertilizer buyer has taken on his own risk of receiving a consignment that might be below his buying specification. As rightly pointed out in Eurolab Technical Report No. 01/2017, “Decision rule applied to conformity assessment”, by giving a tolerance limit above the upper specification limit, or below the lower specification limit, we can classify this as the customer's or consumer's risk. In the hypothesis-testing context, we say this is a Type II (beta) error.
What, then, should be the decision rule of a test laboratory in issuing its conformity statement in such a situation? Let us discuss this through an example.
A government procurement department purchased a consignment of 3000 bags of granular compound fertilizer with a guarantee of available plant nutrients expressed as a percentage by weight; e.g. an NPK marking of 15-15-15 on the bag indicates the presence of 15% nitrogen (N), 15% phosphorus (P2O5) and 15% potash (K2O) nutrients. Representative samples were drawn and analyzed in the department's own fertilizer laboratory.
In the case of the potash (K2O) content of 15% w/w, a permissible tolerance limit of 13.5% w/w is stated in the tender document, indicating that a fertilizer chemist can declare conformity at this tolerance level. The successful supplier of the tender will be charged a calculated fee for any shortfall found.
The conventional approach to decision rules has been based on the comparison of single conformity limits, or intervals, with single measurement results. Today, we realize that each test result has its own measurement variability, normally expressed as a measurement uncertainty at the 95% confidence level.
Hence, it is obvious that the conventional approach of stating conformity based on a single measurement result has exposed the laboratory to a 50% risk of having the true (actual) value of the test parameter fall outside the given tolerance limit, rendering the consignment non-conforming! Is a 50% risk bearable by the test laboratory?

Let us say the average test result for the K2O content of this fertilizer sample was found to be 13.8 ± 0.55% w/w.
What is the critical value for deciding on conformity in this particular case at the usual 95% confidence level? Can we declare the result of 13.8% w/w to be in conformity with the specification, with reference to its given tolerance limit of 13.5% w/w?

Let us first see how the critical value is estimated. In hypothesis testing, we make the following hypotheses:
Ho : target tolerance value ≥ 13.5% w/w
H1 : target tolerance value < 13.5% w/w
We then apply the following equation, assuming that the variation of the laboratory analysis result follows the normal (Gaussian) probability distribution:

x(bar) = mu − z·u

where:
mu is the tolerance value of the specification, i.e. 13.5% w/w,
x(bar) is the critical value with 95% confidence (alpha = 0.05),
z is the z-score of −1.645 for H1's one-tailed test, and
u is the standard uncertainty of the test, i.e. U/2 = 0.55/2 = 0.275% w/w.
By calculation, the critical value is x(bar) = 13.95% w/w: a reported result at or above this value carries, statistically speaking, no more than a 5% risk that the true value lies below the 13.5% w/w tolerance limit.
If the measurement uncertainty remains constant in this measurement region, the critical value of 13.95% w/w minus its expanded uncertainty U of 0.55% w/w gives 13.40% w/w, i.e. (13.5 − 13.4) or 0.1% w/w of K2O below the lower tolerance limit, exposing some 0.1/(2 × 0.55) or 9.1% risk.
Since the reported test result of 13.8% w/w has an expanded uncertainty U of 0.55% w/w, the range of measured values would be 13.25 to 14.35% w/w, indicating (13.50 − 13.25) or 0.25% w/w of K2O below the lower tolerance limit, and thus exposing some 0.25/(2 × 0.55) or 22.7% risk in claiming conformity to the specification with reference to the given tolerance limit.
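These figures can be reproduced with a few lines of arithmetic, following the interval-fraction risk estimate used above (a sketch, not a formal probabilistic treatment):

mu = 13.5   # lower tolerance limit (% w/w K2O)
U = 0.55    # expanded uncertainty (% w/w)
u = U / 2   # standard uncertainty
z = 1.645   # one-tailed z-score at 95% confidence

# Critical reporting value from the hypothesis test:
x_crit = mu + z * u
print(round(x_crit, 2))  # 13.95

def risk(x_bar):
    """Fraction of the expanded-uncertainty interval below the limit."""
    shortfall = max(0.0, mu - (x_bar - U))
    return shortfall / (2 * U)

print(round(100 * risk(13.95), 1))  # ~9.1% risk at the critical value
print(round(100 * risk(13.8), 1))   # ~22.7% risk at the reported result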
Visually, we can present these situations in the following sketch with U = 0.55%w/w:
The fertilizer laboratory manager thus has to make an informed decision rule on what level of risk is bearable when making a statement of conformity. Even the critical value of 13.95% w/w estimated by hypothesis testing has an exposure of 9.1% risk instead of the expected 5% error or risk. Why?
The reason is that the measurement uncertainty was traditionally evaluated as a two-tailed interval (alpha = 0.025 in each tail) under the normal probability distribution with a coverage factor of 2, whilst the hypothesis testing was based on a one-tailed test (alpha = 0.05) with a z-score of 1.645.
To reduce the test laboratory's risk in issuing a statement of conformity to virtually zero, the laboratory manager may want to take a safe bet by setting the critical reporting value at (13.5% + 0.55%) or 14.05% w/w, so that its lower uncertainty bound is exactly 13.5% w/w. Barring any error in the evaluation of its measurement uncertainty, this conservative approach lets the test laboratory carry practically zero risk in issuing its conformity statement.
It may be noted that ISO/IEC 17025:2017 requires the laboratory to communicate with its customers and clearly spell out its decision rule to clients before undertaking the analytical task. This is to avoid any unnecessary misunderstanding after the issuance of a test report with a statement of conformity.
Risks in making decision rules for conformance testing
In carrying out routine testing on samples of commodities and products, we normally encounter requests by clients to issue a statement on the conformity of the test results with their stated specification limits or regulatory limits, in addition to the standard test report.
Conformance testing, as the term suggests, is testing to determine whether a product or medium complies with the requirements of a product specification, contract, standard or safety regulation limit. It refers to the issuance of a compliance statement to customers by the test or calibration laboratory after testing.
Examples of statement can be:
Pass/Fail; Positive/Negative; On specification/Off specification.
Generally, such statements of conformance are issued after testing against a target value with a certain degree of confidence. This is because there is always an element of measurement uncertainty associated with the test result obtained, normally expressed as X ± U with 95% confidence.
It has been our usual practice over the years to make a direct comparison of the measured value with the specification or regulatory limits, without realizing the risk involved in making such a conformance statement.
For example, if the specification minimum limit for the fat content of a product is 10% m/m, we would without hesitation issue a statement of conformity to the client when the sample test result is reported as exactly 10.0% m/m, little realizing that there is a 50% chance that the true value of the analyte in the sample analyzed lies outside the limit! See Figure 1 below.
Here, we might have assumed that the specification limit has taken measurement uncertainty into account (which is not normally true), or that our measured value has zero uncertainty, which is also untrue. Hence, knowing that uncertainty is present in all measurements, we are actually taking a roughly 50% risk that the true value of the test parameter lies outside the specification when making such a conformity statement.
Various guides published by learned professional organizations such as ILAC, EuroLab and Eurachem have suggested ways to set decision rules for this situation. Some have proposed adding a certain estimated amount of error to the measurement uncertainty of a test result, and stating the result as a pass only when the result, less this enlarged uncertainty, is above the minimum acceptance limit. Similarly, a “fail” statement is made only when the result, plus the enlarged uncertainty, is below the minimum acceptance limit.
The aim of adding this additional estimated error is to ensure “safe” conclusions as to whether measurement errors are within acceptable limits. See Figure 2 below.
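A minimal sketch of such a guarded decision rule for a minimum specification limit, using the expanded uncertainty U itself as the guard band (the cited guides allow other choices of guard band):

def decide(result, limit_min, U):
    """Guarded pass/fail against a minimum limit, guard band = U."""
    if result - U >= limit_min:
        return "pass"          # whole uncertainty interval is above the limit
    if result + U < limit_min:
        return "fail"          # whole uncertainty interval is below the limit
    return "inconclusive"      # interval straddles the limit: decision rule needed

print(decide(10.50, 10.0, 0.45))  # pass
print(decide(9.40, 10.0, 0.45))   # fail
print(decide(10.30, 10.0, 0.45))  # inconclusive: 9.85-10.75 straddles 10.0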
Others have suggested basing the decision only on the measurement uncertainty associated with the test result, without adding an estimated error. See Figure 3 below:
This is to ensure that if another laboratory is tasked with taking the same measurements and using the same decision rule, it will come to a similar conclusion about a “pass” or “fail”, avoiding any undesirable implications.
However, by doing so, we are faced with a dilemma: how do we explain the rationale for such a pass/fail statement to a client who is a layman?
For discussion's sake, let us say we have obtained a mean fat content of 10.30 ± 0.45% m/m, indicating that the true value of the fat content lies in the range 9.85 to 10.75% m/m with 95% confidence. A simple calculation tells us that there is about a 15% chance that the true value lies below the 10% m/m minimum mark. Do we want to take this risk by stating that the result conforms with the specification? In the past, we used to do so.
In fact, if we were to carry out a hypothesis (or significance) test, we would find that the mean value of 10.30% m/m, with a standard uncertainty of 0.225% (obtained by dividing 0.45% by a coverage factor of 2), is not significantly different from the target value of 10.0% m/m at a set Type I error (alpha) of 0.05. So, statistically speaking, this is a pass situation. In this sense, are we safe to make the conformity statement? The decision is yours!
Now, the opposite situation also holds.
Still with the same example, a hypothesis test would show that an average result of 9.7% m/m with a standard uncertainty of 0.225% m/m is also not significantly different from the target value of 10.0% m/m at 95% confidence. But do you want to declare that this test result conforms with the specification limit of 10.0% m/m minimum? Traditionally we do not, and that is a very safe statement on your side. But if you claim it to be off-specification, your client may not be happy with you if he understands hypothesis testing. He may even challenge you for failing his shipment.
In fact, hypothesis testing shows that the critical value is 9.63% m/m, below which the sample analyzed is significantly different from 10.0%. That means any figure lower than 9.63% m/m can confidently be claimed to be off-specification!
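Both figures follow directly from the one-tailed test; here is a short sketch reproducing the arithmetic:

target = 10.0  # specification minimum (% m/m fat)
u = 0.225      # standard uncertainty, U/2 = 0.45/2
z = 1.645      # one-tailed critical z-score at alpha = 0.05

# Test statistic for the mean of 10.30%: 1.33 < 1.645, not significant.
print((10.30 - target) / u)

# Critical value below which a result is significantly off-specification:
print(round(target - z * u, 2))  # 9.63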
Indeed, these are the challenges faced by third party testing providers today with the implementation of new ISO/IEC 17025:2017 standard.
To ‘inch’ the mean measured result nearer to the specification limit from either direction, you may want to review the evaluation of the measurement uncertainty associated with the measurement. If you can ‘improve’ the uncertainty by narrowing the uncertainty range, your mean value can come closer to the target value. Of course, there is always a limit to doing so.
Therefore, you have to set decision rules that address the risk you can afford to take in making the requested statements of conformance or compliance. Also, before starting the sample analysis and implementing these rules, you must communicate with your client and obtain a written agreement, as required by the revised ISO/IEC 17025 accreditation standard.