Training and consultancy for testing laboratories.

Excel functions in Hypothesis Testing

Data analysis allows us to answer questions about the data or about the population that the sample data describes.

When we ask questions like “is the alcohol level in the suspect’s blood sample significantly greater than 50 mg/100 ml?” or “does my newly developed test method give the same results as the standard method?”, we need to determine the probability of finding the test data given the truth of a stated hypothesis (e.g. no significant difference). This is “hypothesis testing”, also known as “significance testing”.

A hypothesis, therefore, is an assumptive statement which might, or might not, be true. We test the truth of a hypothesis, known as the null hypothesis, Ho, by estimating a parameter (such as the mean, µ, or the standard deviation, s) and calculating a probability (the p-value). The decision is then made against a pre-set significance level, such as p = 0.05 for 95% confidence: the hypothesis is retained if the p-value is above that level and rejected if it is below.

Whilst making a null hypothesis, we must also prepare an alternative hypothesis, H1, to fall back on in case Ho is rejected after a statistical test, such as the F-test or Student’s t-test. The H1 hypothesis can be one of the following statements:

H1:  sa ≠ sb (two-sided or two-tailed)

H1:  sa > sb (one-sided, right-tailed)

H1:  sa < sb (one-sided, left-tailed)

Generally, a simple hypothesis test is one that determines whether or not the difference between two values is significant.  These values can be means, standard deviations, or variances.  So, for this case, we actually put forward the null hypothesis Ho that there is no real difference between the two s’s, and that the observed difference arises from random effects only.  If the probability that the data are consistent with the null hypothesis falls below a pre-determined low value (e.g. p = 0.05 or 0.01), then the hypothesis is rejected at that probability.

For an illustration, let’s say we have obtained an observed t-value from a Student’s t-test. If the calculated p-value is small (below the pre-determined level), then the observed t-value is higher than the critical t-value at that level. So, we do not believe in the null hypothesis and reject it.  If, on the other hand, the p-value is large, then the observed t-value lies below the critical t-value for the given degrees of freedom at the set confidence level, so we cannot reject the null hypothesis.
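The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the six replicate results are invented, the test is a one-sample, right-tailed t-test against 50 mg/100 ml, and the critical value 2.0150 is the one-tail value for p = 0.05 and v = 5 quoted later in this article.

```python
import math
import statistics

# Hypothetical replicate results (mg/100 ml), invented for illustration only.
results = [50.8, 51.6, 50.2, 52.1, 51.3, 50.9]
limit = 50.0          # value stated in Ho: the true mean equals 50
t_critical = 2.0150   # one-tail critical t for p = 0.05, v = 5

n = len(results)
mean = statistics.mean(results)
s = statistics.stdev(results)  # sample standard deviation (n - 1 divisor)

# Observed t statistic for a one-sample test against the limit.
t_observed = (mean - limit) / (s / math.sqrt(n))

# Right-tailed H1 (mean > 50): reject Ho only if t exceeds the critical value.
reject_null = t_observed > t_critical

print(f"mean = {mean:.3f}, s = {s:.3f}, t observed = {t_observed:.3f}")
print("Reject Ho" if reject_null else "Cannot reject Ho")
```

With these made-up figures the observed t-value lies well above the critical value, so Ho would be rejected; with a smaller difference or a larger spread it would not be.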

We can use the MS Excel built-in functions to find the critical values for the F- and t-tests at a prescribed probability level, instead of looking them up in their respective tables.

In the F-test for p = 0.05 and degrees of freedom v = 7 and 6, the old function and the new functions available in MS Excel since the 2010 version all return the same one-tail critical value (4.207):

“=FINV(0.05,7,6)”

“=F.INV(0.95,7,6)”

“=F.INV.RT(0.05,7,6)”
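As a cross-check outside Excel, the same critical value can be approximated with the Python standard library alone. The sketch below is not how Excel computes it; it simply integrates the F density numerically (Simpson's rule) and inverts the result by bisection. The function names f_pdf, f_cdf and f_inv are hypothetical helpers, and the density helper assumes more than 2 numerator degrees of freedom.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution (assumes d1 > 2, so the density is 0 at x = 0)."""
    if x <= 0.0:
        return 0.0
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    log_pdf = ((d1 / 2) * math.log(d1 / d2)
               + (d1 / 2 - 1) * math.log(x)
               - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2)
               - log_beta)
    return math.exp(log_pdf)

def f_cdf(x, d1, d2, steps=4000):
    """P(F <= x), approximated by Simpson's rule on [0, x]; steps must be even."""
    h = x / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return total * h / 3

def f_inv(prob, d1, d2):
    """Left-tail inverse by bisection, analogous to =F.INV(prob, d1, d2)."""
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if f_cdf(mid, d1, d2) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# One-tail critical value for p = 0.05 with 7 and 6 degrees of freedom.
f_critical = f_inv(0.95, 7, 6)
print(f"{f_critical:.3f}")
```

The left-tail probability 0.95 is used here for the same reason that =F.INV(0.95,7,6) matches =F.INV.RT(0.05,7,6): the right-tail area is 1 minus the left-tail area.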

But, for the t-test, the old Excel function “=TINV” is a bit awkward for one-tail significance testing, because it treats the probability supplied as a two-tail probability when returning the t-value.

To get a one-tail inverse value, we need to double the probability value, in the form of “=TINV(0.05*2, v)”.  This is difficult to explain to someone with less knowledge of statistics.

For example, if we want to find a t-value at p=0.05 with v = 5 degrees of freedom, we can have the following options:

=TINV(0.05,5)          2.5705
=TINV(0.05*2,5)        2.0150
=T.INV(0.05,5)        -2.0150
=T.INV(0.95,5)         2.0150
=T.INV.2T(0.05*2,5)    2.0150

So, it looks better to use the new function “=T.INV(0.95,5)”, or the absolute value of “=T.INV(0.05,5)”, for the one-tail test at 95% confidence.

The following thus summarizes the use of T.INV for one- or two-tail hypothesis testing:

1. To find the t-value for a right-sided or greater than H1 test, use =T.INV(0.95, v)
2. To find the t-value for a left-sided or less than H1 test, use =T.INV(0.05, v)
3. To find the t-value for a two-sided H1 test, use =T.INV.2T(0.05, v)
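The relationship between these functions can be checked with a short Python sketch that uses only the standard library. It is an approximation for illustration, not Excel's own algorithm: the t density is integrated with Simpson's rule and inverted by bisection, and the names t_pdf, t_cdf, t_inv and t_inv_2t are hypothetical helpers chosen to mirror the Excel functions.

```python
import math

def t_pdf(t, v):
    """Density of Student's t distribution with v degrees of freedom."""
    log_c = (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
             - 0.5 * math.log(v * math.pi))
    return math.exp(log_c - ((v + 1) / 2) * math.log(1 + t * t / v))

def t_cdf(t, v, steps=2000):
    """P(T <= t), using symmetry about 0 and Simpson's rule on [0, |t|]."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, v) + t_pdf(a, v)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, v)
    half = total * h / 3
    return 0.5 + half if t >= 0 else 0.5 - half

def t_inv(prob, v):
    """Left-tail inverse by bisection, analogous to =T.INV(prob, v)."""
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, v) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_inv_2t(prob, v):
    """Two-tail inverse, analogous to =T.INV.2T(prob, v) and the old =TINV(prob, v)."""
    return t_inv(1 - prob / 2, v)

v = 5
right_tail = t_inv(0.95, v)     # right-sided H1
left_tail = t_inv(0.05, v)      # left-sided H1 (negative value)
two_tail = t_inv_2t(0.05, v)    # two-sided H1
doubled = t_inv_2t(0.05 * 2, v) # the old "double the probability" workaround

print(f"{right_tail:.4f} {left_tail:.4f} {two_tail:.4f} {doubled:.4f}")
```

The sketch makes the point of the summary explicit: splitting the two-tail probability equally between both tails is what makes the doubled two-tail call agree with the one-tail value, while the left-tail value is simply its negative.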
