Training and consultancy for testing laboratories.

Posts tagged ‘Median’

Mean or Median?

Figure (histogram): Mean or median? You decide

The mean and the median are two of the three common kinds of “average”; the third is the mode. Both the mean and the median are measures of the central tendency of a dataset, but they have different meanings, and each has its own advantages and disadvantages in application.

The sample mean is the arithmetic average of all the data: we add up all the observations and divide by the number of observations. The median, on the other hand, partitions the data into two parts such that there is an equal number of observations on either side of it. So, if we have a set of 5 observations arranged in ascending order, the middle value, in 3rd place, is the median. If we have a set of 6 observations in ascending order, the median is the average of the 3rd and 4th values.
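As a minimal sketch of the two calculations described above, the Python below computes the mean and the median for a set of 5 and a set of 6 observations. The data values and the function name `median_by_position` are made up purely for illustration.

```python
def median_by_position(values):
    """Middle value of the sorted data, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: e.g. the 3rd of 5 observations (index 2).
        return ordered[mid]
    # Even count: e.g. the average of the 3rd and 4th of 6 observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical replicate results, already in ascending order.
five = [10.1, 10.3, 10.4, 10.6, 10.9]
six = [10.1, 10.3, 10.4, 10.6, 10.9, 11.0]

mean_five = sum(five) / len(five)        # add up, then divide by the count
median_five = median_by_position(five)   # the 3rd value
median_six = median_by_position(six)     # average of the 3rd and 4th values

print(mean_five, median_five, median_six)
```

Python's standard library offers `statistics.mean` and `statistics.median`, which give the same results; the hand-written version is shown only to make the "middle position" rule explicit.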

One important advantage of the median is that it is largely unaffected by extreme values (outliers, statistically speaking) in the dataset. Only the middle observation, or the average of the two middle observations, enters the calculation; the remaining data matter only through their rank order, not their actual values. For this reason the median is commonly used in proficiency testing programs to assess inter-laboratory comparison data. Robust statistics also uses the MAD (median absolute deviation) as a measure of the variability of a univariate quantitative dataset.
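To make the MAD concrete, here is a small Python sketch that computes it as the median of the absolute deviations from the median. The laboratory results are invented for illustration. The scaling factor 1.4826, which makes the MAD comparable to a standard deviation for normally distributed data, is a common convention and is assumed here rather than taken from this post.

```python
from statistics import median


def mad(values):
    """Median absolute deviation: median of |x - median(x)|."""
    centre = median(values)
    return median([abs(x - centre) for x in values])


# Hypothetical results from six laboratories; the last one is an outlier.
results = [4.9, 5.0, 5.1, 5.2, 5.3, 9.8]

centre = median(results)          # average of the two middle values
raw_mad = mad(results)            # spread measured around the median
scaled_mad = 1.4826 * raw_mad     # assumed convention: normal-consistent scaling

print(centre, raw_mad, scaled_mad)
```

Note that the outlying result of 9.8 changes neither the median nor the MAD by much, because it only contributes one large deviation at the end of the sorted list.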

The mean, on the other hand, is sensitive to every value in the dataset: each observation contributes to it, so extreme observations can have a substantial influence on the calculated mean.
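A quick way to see this sensitivity is to change a single observation into an extreme value and compare the two averages before and after. The Python sketch below does this with made-up numbers, using the standard library's `statistics.mean` and `statistics.median`.

```python
from statistics import mean, median

# Hypothetical results without and with one extreme value.
clean = [5.0, 5.1, 5.2, 5.3, 5.4]
with_outlier = [5.0, 5.1, 5.2, 5.3, 9.4]   # last value replaced by an outlier

mean_shift = mean(with_outlier) - mean(clean)        # how far the mean moves
median_shift = median(with_outlier) - median(clean)  # how far the median moves

print("shift in mean:  ", mean_shift)
print("shift in median:", median_shift)
```

In this example the single extreme value pulls the mean upwards noticeably, while the median stays at the same middle observation.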

Generally speaking, the mean has some very important mathematical properties that make it possible to prove theorems about it, such as the Central Limit Theorem. Many useful results in statistics and inference methods naturally give rise to the mean as a parameter estimate.

It is much harder to prove mathematical results about the median, even though it is more robust to extreme observations. So the mean is used for systematic quantitative data, unless the dataset contains extreme values, in which case the median is used instead.

Therefore, you have to make a professional judgement about which one best suits your purpose.


How to find quartile values?


A short note on IDA


The Median and the IQR


Robust statistics – MAD method
