Training and consultancy for testing laboratories. Hypothesis testing – comparison of two means

One of the most important properties of an analytical method is that it should be free from bias.  That is to say, the test result it gives for the amount of analyte is accurate, close to the true value.  This property can be verified by applying the method to a certified reference material or a spiked standard solution containing a known amount of analyte.  We can also verify it by carrying out two parallel experiments and comparing their means.

Hypothesis testing – comparison of two experimental means

7 practical steps of hypothesis testing

This is a follow-up to the last blog.  Read on.


Revisiting Hypothesis Testing

A few course participants have expressed the opinion that the subject of hypothesis testing is quite abstract and that they found it hard to grasp its concept and application.  I thought otherwise.  Perhaps let's go through its basics again.

We know that the study of statistics can be broadly divided into descriptive statistics and inferential (or analytical) statistics.  Descriptive statistical techniques (such as frequency distributions, mean, standard deviation, variance and other measures of central tendency and spread) are useful for summarizing data obtained from samples, but they also provide tools for more advanced analysis relating to the broader population from which the samples are drawn, through the application of probability theory to sampling distributions and confidence intervals.  We use the variation in the sample data collected to infer what the corresponding population parameter is likely to be.

A hypothesis is an educated guess about something around us, provided we can put it to the test, either by experiment or by observation.  Hypothesis testing is thus a statistical method for making decisions from experimental data.  It is basically an assumption that we make about a population parameter.  In a nutshell, we want to:

• make a statement about something
• collect sample data relating to the statement
• conclude, if the sample outcome is unlikely given that the statement is true, that the statement is probably not true.

In short, we have to make a decision about the hypothesis: either to reject the null hypothesis at a certain level of significance, or to fail to reject it.  Every hypothesis test therefore produces a significance value (p-value) for that particular test.  If the p-value is greater than the predetermined significance level, we do not reject the null hypothesis.  If the p-value is less than that predetermined level, we reject the null hypothesis.

Let us have a simple illustration.

Assume we want to know whether a particular coin is fair.  We can make a statistical statement (the null hypothesis, Ho) that it is a fair coin.  The alternative hypothesis, H1 (or Ha), of course, is that the coin is not a fair coin.

Suppose we tossed the coin 30 times and got heads 25 times.  Since this is an unlikely outcome if the coin were fair, we can reject the null hypothesis that it is a fair coin.
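The arithmetic behind this judgement can be sketched with an exact binomial calculation.  The helper below is purely illustrative (it is not from the original post): it sums the probabilities of all outcomes at least as unlikely as 25 heads in 30 tosses of a fair coin, giving a two-sided p-value.

```python
from math import comb

def binomial_two_sided_p(n, k, p=0.5):
    """Exact two-sided binomial p-value: the probability, under the null
    hypothesis, of any outcome no more likely than the observed k."""
    probs = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    observed = probs[k]
    # Sum the probabilities of all outcomes at least as "extreme" as k
    return sum(pr for pr in probs if pr <= observed + 1e-12)

# 25 heads in 30 tosses of a supposedly fair coin
p_value = binomial_two_sided_p(30, 25)
print(f"p-value = {p_value:.5f}")   # far below a 0.05 significance level
```

Because the p-value is far smaller than the usual 0.05 significance level, the "fair coin" hypothesis is rejected, exactly as argued above.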

In the next article, we shall discuss the steps to be taken in carrying out such hypothesis testing with a set of laboratory data.

R and Student’s t-distribution

W.S. Gosset (aka Student) and t-distribution

During a Eurachem Scientific Workshop held on June 14-15, 2018 in Dublin, Ireland, the organizers arranged the Workshop Banquet at the renowned Guinness St. James's Gate Brewery, one of whose employees had been William Sealy Gosset, a chemist cum statistician.

Gosset was interested in analyzing quality data obtained from small sample sizes in his routine work on quality control of raw materials, as he had noticed that it was neither practical nor economical to analyze hundreds of data points.

At that time, making statistical inferences about a population from small-sample data was unthinkable.  The generally accepted idea was that if you had a large sample size, say well over 30 observations, you could use the Gaussian (normal) distribution to describe your data.

In 1906, Gosset was sent on sabbatical to Karl Pearson's laboratory at University College London.  Pearson was then one of the best-known scientific figures of his time, and was later credited with establishing the field of statistics.

At the laboratory, Gosset developed the "Student's t-distribution", which is an important pillar of modern statistics: it allows small-sample data to be used to infer what we can expect of the population out there.  It is the origin of the concept of "statistical significance testing".

Why didn't Gosset name the distribution Gosset's instead of Student's?

It is interesting to note that this was because his employer, Guinness, objected to his proposal to publish the findings: the brewery did not want competitors to learn of the advantage it had gained by using this unique procedure to select the best varieties of barley and hops for its popular beer, in a way that no other business could.

So Gosset finally published his article in Pearson's journal Biometrika in 1908 under the pseudonym "Student", leading to the famous "Student's t-distribution".

In statistics and probability studies, the t-distribution is a probability distribution for dealing with a normally distributed population when the sample size is not large.  It uses the sample standard deviation (s) to estimate the population standard deviation (σ), which is unknown.  For small samples, the confidence limits of the population mean are given by:

    μ = x̄ ± t·s/√n

where x̄ is the sample mean, n is the number of observations, and t is the Student's t value for n − 1 degrees of freedom at the chosen confidence level.

As the story goes, Gosset's published paper was then mostly ignored by statistical researchers until a young mathematician called R.A. Fisher discovered its importance and popularized it, particularly for estimating the random chance of considering a result "significant".
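The small-sample confidence limits just mentioned (x̄ ± t·s/√n) can be sketched as follows.  The ten replicate results below are invented for illustration; the critical value t = 2.262 is the tabulated Student's t for 9 degrees of freedom at 95% confidence.

```python
import statistics

# Hypothetical replicate results (e.g. mg/L) -- invented for illustration
data = [10.2, 10.4, 9.9, 10.1, 10.3, 10.0, 10.2, 10.5, 9.8, 10.1]

n = len(data)
x_bar = statistics.mean(data)
s = statistics.stdev(data)       # sample standard deviation

# Tabulated Student's t for 95% confidence and n - 1 = 9 degrees of freedom
t_crit = 2.262

half_width = t_crit * s / n ** 0.5
lower, upper = x_bar - half_width, x_bar + half_width
print(f"mean = {x_bar:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

With more observations, both s/√n and the critical t value shrink, so the interval narrows, which is why the sample size matters so much for small-sample work.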

Today, the t-distribution is routinely used in t-statistic tests for checking results for significant bias from a true value, or for comparing two sets of measurement results and their means, and it is also important for calculating confidence intervals.
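Since comparing two means is the theme of this series, here is a minimal sketch of the pooled-variance (Student's) two-sample t statistic.  The two sets of "method" results are invented for illustration, and roughly equal variances are assumed.

```python
import statistics
from math import sqrt

def two_sample_t(x, y):
    """Pooled-variance (Student's) t statistic for comparing two means,
    assuming the two populations have roughly equal variances."""
    nx, ny = len(x), len(y)
    # Pooled estimate of the common variance
    sp2 = ((nx - 1) * statistics.variance(x) +
           (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    t = (statistics.mean(x) - statistics.mean(y)) / sqrt(sp2 * (1 / nx + 1 / ny))
    return t, nx + ny - 2          # t statistic, degrees of freedom

# Hypothetical duplicate analyses of one sample by two methods
method_a = [20.1, 20.4, 19.9, 20.2, 20.0]
method_b = [20.6, 20.8, 20.5, 20.9, 20.7]
t_stat, df = two_sample_t(method_a, method_b)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
# Compare |t| with the tabulated two-tailed value t(0.05, 8) = 2.306
```

Here |t| exceeds the tabulated critical value, so the two method means would be judged significantly different at the 95% confidence level.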

The t-distribution is symmetric and resembles the normal distribution, except that its "tails" are heavier: it is more spread out because of the extra variability associated with small sample sizes.

Confidence intervals – how many measurements should you take?