Training and consultancy for testing laboratories.

Posts tagged ‘hypothesis testing’

Decision rule in conformance testing with a given tolerance limit

Today, an ISO/IEC 17025 accredited laboratory faces a dilemma when issuing a statement of conformity with specification to clients after testing, particularly when the analysis result of the test sample is close to the specified value and its upper or lower measurement uncertainty crosses the limit. The laboratory manager has to decide what level of risk he is willing to take in stating such conformity.

However, certain trades buy goods and commodities with a given tolerance allowance against the buying specification. A good example is the trading of granular or pelletized compound fertilizers, which contain multiple primary nutrients (e.g. N, P, K) in each individual granule. A buyer usually allows a permissible tolerance of 2 to 5% below the declared value as a lower limit, to accommodate variation in the manufacturing process. Some government departments of agriculture even allow a lower tolerance limit of up to 10% when procuring compound fertilizers that will be re-sold to their farmers at a discount.

Given the permissible lower tolerance limit, the fertilizer buyer has taken on his own risk of receiving a consignment that might be below his buying specification. Eurolab's Technical Report No. 01/2017, "Decision rule applied to conformity assessment", rightly points out that by setting a tolerance limit above the upper specification limit, or below the lower specification limit, we can classify this as the customer's or consumer's risk. In the context of hypothesis testing, we call this a type II (beta) error.

What, then, should the test laboratory's decision rule be when issuing its conformity statement in such a situation?

Let’s discuss this through an example. 

A government procurement department purchased a consignment of 3000 bags of granular compound fertilizer with a guarantee of available plant nutrients expressed as a percentage by weight in it, e.g. a NPK of 15-15-15 marking on its bag indicates the presence of 15% nitrogen (N), 15% phosphorus (P2O5) and 15% potash (K2O) nutrients.  Representative samples were drawn and analyzed in its own fertilizer laboratory. 

In the case of potash (K2O) content of 15% w/w, a permissible tolerance limit of 13.5% w/w is stated in the tender document, indicating that a fertilizer chemist can declare conformity at this tolerance level. The successful supplier of the tender will be charged a calculated fee for any specification non-conformity.

Our conventional approach to decision rules has been to compare single measurement results against single or interval conformity limits. Today, we realize that each test result has its own measurement variability, normally expressed as a measurement uncertainty at the 95% confidence level.

Therefore, it is obvious that the conventional approach of stating conformity based on a single measurement result exposes the laboratory to a 50% risk that the true (actual) value of the test parameter falls outside the given tolerance limit, rendering it non-conforming! Is a 50% risk bearable by the test laboratory?

Let us say the average K2O content of this fertilizer sample was found to be 13.8 ± 0.55%w/w. What is the critical value for deciding on conformity in this particular case at the usual 95% confidence level? Can we declare the result of 13.8%w/w to be in conformity with the specification, referencing its given tolerance limit of 13.5%w/w?

Let us first see how the critical value is estimated.  In hypothesis testing, we make the following hypotheses:

Ho :  True (target) value ≥ 13.5%w/w (conforming)

H1 :  True (target) value < 13.5%w/w (non-conforming)

Assuming that the variation of the laboratory analysis results follows the normal (Gaussian) probability distribution, use the following equation:

x(bar) = mu - z*u

where

mu is the tolerance value for the specification, i.e. 13.5%, 

x(bar), the critical value with 95% confidence (alpha = 0.05),

z, the z -score of -1.645 for H1’s one-tailed test, and

u, the standard uncertainty of the test, i.e. U/2 = 0.55/2 or 0.275

By calculation, we have the critical value x(bar) = 13.95%w/w: any reported mean below this value is, statistically speaking, not significantly higher than 13.5%w/w at the 95% confidence level.
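Under the same assumptions (normal distribution, u = U/2 = 0.275%w/w), this calculation can be sketched in a few lines of Python; the variable names are illustrative only:

```python
from statistics import NormalDist

mu = 13.5                        # lower tolerance limit (%w/w)
u = 0.55 / 2                     # standard uncertainty u = U/2
z = NormalDist().inv_cdf(0.95)   # one-tailed z-score, about 1.645

# Critical mean: the reported result whose one-tailed test against
# 13.5%w/w sits exactly at the 95% confidence boundary
x_crit = mu + z * u
print(round(x_crit, 2))          # prints 13.95
```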

Assuming the measurement uncertainty remains constant in this measurement region, 13.95%w/w minus its lower expanded uncertainty U of 0.55%w/w gives 13.40%w/w, which is (13.5-13.4) or 0.1%w/w of K2O below the lower tolerance limit, thus exposing some 0.1/(2×0.55) or 9.1% risk.

When the reported test result of 13.8%w/w carries an expanded uncertainty U of 0.55%w/w, the range of measured values is 13.25 to 14.35%w/w, indicating that (13.50-13.25) or 0.25%w/w of K2O lies below the lower tolerance limit, thus exposing some 0.25/(2×0.55) or 22.7% risk in claiming conformity to the specification with reference to the given tolerance limit.
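The two risk figures above follow a simple interval-fraction approximation: the shortfall below the limit divided by the full 2U-wide uncertainty range. A minimal sketch of that arithmetic (the function name is illustrative):

```python
U = 0.55       # expanded uncertainty (%w/w)
limit = 13.5   # lower tolerance limit (%w/w)

def interval_risk(result):
    # Fraction of the result's 2U-wide uncertainty interval that
    # falls below the tolerance limit (a linear approximation)
    shortfall = max(limit - (result - U), 0.0)
    return shortfall / (2 * U)

print(round(interval_risk(13.95), 3))  # 0.091, i.e. 9.1% risk
print(round(interval_risk(13.80), 3))  # 0.227, i.e. 22.7% risk
```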

Visually, we can present these situations in the following sketch with U = 0.55%w/w:

The fertilizer laboratory manager thus has to make an informed decision rule on what level of risk is bearable when issuing a statement of conformity. Even the critical value of 13.95%w/w estimated by hypothesis testing carries a 9.1% risk exposure instead of the expected 5% error. Why?

The reason is that the measurement uncertainty is traditionally evaluated as a two-tailed interval (alpha = 0.05, i.e. 0.025 in each tail) under the normal probability distribution with a coverage factor of 2, whilst the hypothesis test is one-tailed (alpha = 0.05) with a z-score of 1.645.

To reduce the testing laboratory's risk in issuing a statement of conformity to practically zero, the laboratory manager may take a safe bet by setting the critical reporting value at (13.5% + 0.55%) or 14.05%w/w, so that its lower uncertainty bound is exactly 13.5%w/w. Barring any error in evaluating the measurement uncertainty, this conservative approach leaves the test laboratory with practically zero risk in issuing its conformity statement.

It may be noted that ISO/IEC 17025:2017 requires the laboratory to communicate with its customers and clearly spell out its decision rule before undertaking the analytical task. This avoids unnecessary misunderstanding after issuance of a test report bearing a statement of conformity or non-conformity.

Dilemmas in making decision rules for conformance testing


In carrying out routine testing on samples of commodities and products, we normally encounter requests by clients to issue a statement on the conformity of the test results against their stated specification limits or regulatory limits, in addition to standard reporting.

Conformance testing, as the term suggests, is testing to determine whether a product or a medium complies with the requirements of a product specification, contract, standard or safety regulation limit. It refers to the issuance of a compliance statement to customers by the test / calibration laboratory after testing. Examples of such statements are: Pass/Fail; Positive/Negative; On specification/Off specification.

Generally, such statements of conformance are issued after testing, against a target value with a certain degree of confidence.  This is because there is always an element of measurement uncertainty associated with the test result obtained, normally expressed as X +/- U with 95% confidence.

It has been our usual practice all these years to compare the measured value directly with the specification or regulatory limits, without realizing the risk involved in making such a conformance statement.

For example, if the specification minimum limit of the fat content in a product is 10%m/m, we would without hesitation issue a statement of conformity to the client when the sample test result is reported exactly as 10.0%m/m, little realizing that there is a 50% chance that the true value of the analyte in the sample analyzed lies outside the limit!  See Figure 1 below.

Here, we may have assumed either that the specification limit already takes measurement uncertainty into account (normally untrue), or that our measured value has zero uncertainty (also untrue). Hence, knowing that all measurements carry uncertainty, we are actually taking roughly a 50% risk that the actual true value of the test parameter lies outside the specification when making such a conformity statement.

Various guides published by learned professional organizations such as ILAC, EuroLab and Eurachem have suggested ways to set decision rules for such situations. Some propose adding a certain estimated amount of error to the measurement uncertainty of a test result, and then stating the result as a 'pass' only when the result together with this enlarged uncertainty lies above the minimum acceptance limit. Similarly, a 'fail' statement is made when the result with its enlarged uncertainty lies below the minimum acceptance limit.

The aim of adding this estimated error is to ensure "safe" conclusions about whether measurement errors are within acceptable limits. See Figure 2 below.

Others have suggested basing the decision only on the measurement uncertainty associated with the test result, without adding an estimated error. See Figure 3 below:

This is to ensure that if another laboratory is tasked with taking the same measurements and using the same decision rule, it will come to a similar conclusion about a "pass" or "fail", in order to avoid any undesirable implication.

However, by doing so, we face a dilemma: how do we explain the rationale for such a pass/fail statement to a client who is a layman?

For discussion's sake, let us say we obtained a mean fat content of 10.30 +/- 0.45%m/m, indicating that the true value lies in the range 9.85 - 10.75%m/m with 95% confidence. A simple calculation tells us there is roughly a 15% chance that the true value lies below the 10%m/m minimum mark. Do we want to take this risk by stating that the result conforms with the specification? In the past, we used to do so.

In fact, if we were to carry out a hypothesis (or significance) test, we would find that the mean value of 10.30%m/m, with a standard uncertainty of 0.225% (obtained by dividing 0.45% by a coverage factor of 2), is not significantly different from the target value of 10.0%m/m at a set type I error (alpha) of 0.05. So, statistically speaking, this is a pass situation. In this sense, are we safe to make this conformity statement? The decision is yours!

Now, the opposite is also very true.

Still on the same example, a hypothesis test would show that an average result of 9.7%m/m with a standard uncertainty of 0.225%m/m is also not significantly different from the target specification of 10.0%m/m at 95% confidence. But do you want to declare that this result conforms with the 10.0%m/m minimum? Traditionally we do not, and that is a very safe statement on your side. But if you claim it to be off-specification, a client who understands hypothesis testing may not be happy; he may even challenge you for failing his shipment.

In fact, hypothesis testing gives a critical value of 9.63%m/m, below which the sample analyzed is significantly different from 10.0%. That means any figure lower than 9.63%m/m can confidently be claimed to be off specification!
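Both results above can be checked with the same one-tailed z-calculation used earlier (assuming normality and u = 0.225%m/m); a brief sketch:

```python
from statistics import NormalDist

mu = 10.0      # specification minimum (%m/m)
u = 0.45 / 2   # standard uncertainty (expanded U = 0.45, k = 2)
z95 = NormalDist().inv_cdf(0.95)   # about 1.645

# 10.30%m/m: observed z of about 1.33 lies inside +/-1.645,
# so it is not significantly different from 10.0%m/m
z_obs = (10.30 - mu) / u
print(abs(z_obs) < z95)            # prints True

# Critical value below which a result is significantly off-spec
x_crit = mu - z95 * u
print(round(x_crit, 2))            # prints 9.63
```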

Indeed, these are the challenges faced by third party testing providers today with the implementation of new ISO/IEC 17025:2017 standard.

To ‘inch’ the mean measured result nearer to the specification limit from either direction, you may want to review your measurement uncertainty evaluation associated with the measurement. If you can ‘improve’ the uncertainty by narrowing the uncertainty range, your mean value will come closer to the target value. Of course, there is always a limit for doing so.

Therefore, you have to set decision rules that address the risk you can afford to take in making such statements of conformance or compliance. Also, before starting the sample analysis and implementing these rules, you must communicate with your client and obtain a written agreement, as required by the revised ISO/IEC 17025 accreditation standard.

Decision rule and conformity testing

What is conformity testing?

Conformance testing is testing to determine whether a product, system or just a medium complies with the requirements of a product specification, contract, standard or safety regulation limit.  It refers to the issuance of a compliance statement to customers after testing.  Examples are:  Pass/Fail; Positive/Negative; On specs/Off specs, etc. 

Generally, statements of conformance are issued after testing, against a target value of the specification with a certain degree of confidence. Conformity testing is usually applied in the forensic, food, medical, pharmaceutical and manufacturing fields. Most QC laboratories in manufacturing industries (such as petroleum oils, foods and pharmaceutical products) and the laboratories of government regulatory bodies regularly check the quality of an item against stated specification and regulatory safety limits.

Decision rule involves measurement uncertainty

Why must measurement uncertainty be involved in the discussion of decision rule? 

To answer this, let us first be clear about the ISO definition of a decision rule. ISO/IEC 17025:2017 clause 3.7 defines it as a "rule that describes how measurement uncertainty is accounted for when stating conformity with a specified requirement."

Therefore, a decision rule gives a prescription for the acceptance or rejection of a product based on the measurement result, its associated uncertainty, and the specification limit or limits. Where product testing and calibration provide for reporting measured values, levels of measurement decision risk acceptable to both the customer and the supplier must be established. Statistical tools such as hypothesis testing, covering both type I and type II errors, are applied in such decision risk assessment.
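As a rough sketch of how the two risks might be quantified for a lower limit, assuming a normally distributed measurement with an illustrative standard uncertainty (the numbers and function names here are assumptions, not from any standard):

```python
from statistics import NormalDist

u = 0.275      # assumed standard uncertainty of the method
limit = 13.5   # assumed lower acceptance limit (%w/w)

def false_accept_risk(true_value):
    # Type II (consumer's) risk: probability that an item whose
    # true value sits below the limit is measured at or above it
    return 1 - NormalDist(true_value, u).cdf(limit)

def false_reject_risk(true_value):
    # Type I (producer's) risk: probability that an item whose
    # true value sits above the limit is measured below it
    return NormalDist(true_value, u).cdf(limit)

# An item truly just below the limit is wrongly accepted quite often
print(round(false_accept_risk(13.4), 2))
```

Guard bands, as discussed in the posts above, shift the acceptance limit away from the specification limit to trade one risk against the other.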

Hypothesis testing – comparison of two means

One of the most important properties of an analytical method is that it should be free from bias. That is to say, the test result it gives for the amount of analyte is accurate, close to the true value. This property can be verified by applying the method to a certified reference material or a spiked standard solution containing a known amount of analyte. We can also verify it by carrying out two parallel experiments and comparing their means….

Hypothesis testing – comparison of two experimental means

 

7 practical steps of hypothesis testing

This is a follow-up of the last blog.  Read on ……

7 Steps of hypothesis testing

Revisiting hypothesis testing


A few course participants have expressed the opinion that the subject of hypothesis testing is quite abstract, and that they find its concept and application hard to grasp. I thought otherwise. Perhaps let's go through its basics again.

We know that the study of statistics can be broadly divided into descriptive statistics and inferential (or analytical) statistics. Descriptive statistical techniques (frequency distributions, mean, standard deviation, variance, central tendency, etc.) are useful for summarizing data obtained from samples, while inferential techniques use probability theory, sampling distributions and confidence intervals to draw a broader picture of the population from which the samples are drawn. We use the variation in the collected sample data to infer what the population parameter is likely to be.

A hypothesis is an educated guess about something around us, as long as we can put it to the test by experiment or observation. Hypothesis testing is thus a statistical method used to make decisions from experimental data; it is basically an assumption that we make about a population parameter. In a nutshell, we want to:

  • make a statement about something
  • collect sample data relating to the statement
  • if, given that the statement is true, the sample outcome is unlikely, conclude that the statement is probably not true.

In short, we have to make a decision about the hypothesis: whether to accept the null hypothesis or to reject it at a certain level of significance. Every hypothesis test therefore produces a significance (p) value. If this value is greater than the predetermined significance level, we accept the null hypothesis; if it is less, we reject the null hypothesis.

Let us have a simple illustration.

Assume we want to know if a particular coin is fair.  We can give a statistical statement (null hypothesis, Ho) that it is a fair coin.  The alternative hypothesis, H1 or Ha, of course, is that the coin is not a fair coin.

Suppose we toss the coin 30 times and get heads 25 times. Since this is an unlikely outcome if the coin is fair, we can reject the null hypothesis that it is a fair coin.
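How unlikely is that outcome? A quick binomial calculation, sketched in Python, makes it concrete:

```python
from math import comb

n, heads = 30, 25

# P(25 or more heads in 30 tosses of a fair coin)
p_tail = sum(comb(n, k) for k in range(heads, n + 1)) / 2**n
print(p_tail)   # about 0.00016, far below 0.05, so we reject Ho
```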

In the next article, we shall discuss the steps to be taken in carrying out such hypothesis testing with a set of laboratory data.


Excel functions in Hypothesis Testing

Data analysis allows us to answer questions about the data or about the population that the sample data describes.

When we ask questions like “is the alcohol level in the suspect’s blood sample significantly greater than 50 mg/100 ml?” or “does my newly developed TEST method give the same results as the standard method?”, we need to determine the probability of finding the test data given the truth of a stated hypothesis (e.g. no significant difference) – hence “hypothesis testing” or also known as “significance testing”.

A hypothesis, therefore, is an assumptive statement that may or may not be true. We test the truth of a hypothesis, known as a null hypothesis, Ho, using parameter estimates (such as the mean, µ, or standard deviation, s) and a calculated probability, deciding whether the hypothesis is to be accepted (high p-value) or rejected (low p-value) against a pre-set significance level, such as alpha = 0.05 for 95% confidence.

Whilst making a null hypothesis, we must also be prepared with an alternative hypothesis, H1, to fall back on in case Ho is rejected after a statistical test such as the F-test or Student's t-test. The H1 hypothesis can be one of the following statements:

H1:  sa ≠ sb (two-sided or two-tailed)

H1:  sa > sb (one-sided, right-tailed)

H1:  sa < sb (one-sided, left-tailed)

Generally, a simple hypothesis test determines whether or not the difference between two values is significant. These values can be means, standard deviations, or variances. For this case, we put forward the null hypothesis Ho that there is no real difference between the two values and that the observed difference arises from random effects only. If the probability that the data are consistent with the null hypothesis falls below a pre-determined low value (e.g. p = 0.05 or 0.01), the hypothesis is rejected at that probability.

For illustration, say we have obtained an observed t-value from a Student's t-test. If the calculated p-value is small, the observed t-value is higher than the critical t-value at the pre-determined significance level, so we do not believe the null hypothesis and reject it. If, on the other hand, the p-value is large, the observed t-value falls below the critical t-value for the given degrees of freedom at the set confidence level, so we cannot reject the null hypothesis.

We can use the MS Excel built-in functions to find the critical values of F– and t-tests at prescribed probability level, instead of checking them from their respective tables.

In the F-test with p = 0.05 and degrees of freedom v = 7 and 6, the following one-tail inverse functions all return the same critical value (4.207) in every version of the MS Excel spreadsheet since 2010:

“=FINV(0.05,7,6)”

“=F.INV(0.95,7,6)”

“=F.INV.RT(0.05,7,6)”

But for the t-test, the old Excel function "=TINV" is a bit awkward for one-tail significance testing, because the function assumes a two-tail probability in its algorithm.

To get a one-tail inverse value, we need to double the probability, as in "=TINV(0.05*2, v)". This makes the explanation difficult to grasp for someone with less knowledge of statistics.

For example, if we want to find a t-value at p=0.05 with v = 5 degrees of freedom, we can have the following options:

=TINV(0.05,5)        2.5706
=TINV(0.05*2,5)      2.0150
=T.INV(0.05,5)      -2.0150
=T.INV(0.95,5)       2.0150
=T.INV.2T(0.05*2,5)  2.0150

So, it seems better to use the new function "=T.INV(0.95,5)", or the absolute value of "=T.INV(0.05,5)", for a one-tail test at 95% confidence.

The following thus summarizes the use of T.INV for one- or two-tail hypothesis testing:

  1. To find the t-value for a right-sided or greater than H1 test, use =T.INV(0.95, v)
  2. To find the t-value for a left-sided or less than H1 test, use =T.INV(0.05, v)
  3. To find the t-value for a two-sided H1 test, use =T.INV.2T(0.05, v)
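For readers working outside Excel, the same critical values can be reproduced in Python. This sketch assumes the scipy library is available:

```python
from scipy.stats import f, t

# Excel =FINV(0.05,7,6) and =F.INV.RT(0.05,7,6)
print(round(f.ppf(0.95, 7, 6), 3))   # 4.207

# Excel =T.INV(0.95,5): one-tail t at 95% confidence, v = 5
print(round(t.ppf(0.95, 5), 4))      # 2.015

# Excel =T.INV.2T(0.05,5): two-tail t-value
print(round(t.ppf(0.975, 5), 4))     # 2.5706
```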