Training and consultancy for testing laboratories.

Archive for the ‘R Language’ Category

R evaluation of Measurement uncertainty

At the recent Eurachem/PUC ISO 17025 training course in Nicosia, Cyprus on 20-21 February 2020, I learnt something new from Dr Stephen Ellison’s presentation.

There is a measurement uncertainty package for the R language, named “metRology”. You can install it from within the R environment with install.packages("metRology").

For example, if we were asked to evaluate the uncertainty of the following expression:

expr = A + 2×B + 3×C + D/2

where A = 1, B = 3, C = 2, D = 11. The sensitivity coefficients, c, of the above expression are thus 1, 2, 3 and ½ for A, B, C and D, respectively.

Assuming the standard uncertainties of these parameters are constant at 1/10th of their values, the following steps demonstrate how the combined standard uncertainty can be evaluated.

> library("metRology")

Attaching package: ‘metRology’

The following objects are masked from ‘package:base’:

    cbind, rbind

> expr<-expression(A+B*2+C*3+D/2)
> x=list(A=1,B=3,C=2,D=11)
> u=lapply(x,function(x) x/10)
> u
$A
[1] 0.1

$B
[1] 0.3

$C
[1] 0.2

$D
[1] 1.1

> u.expr<-uncert(expr,x,u,method="NUM")
> u.expr

Uncertainty evaluation

Call:
  uncert.expression(obj = expr, x = x, u = u, method = "NUM")

Expression: A + B * 2 + C * 3 + D/2

Evaluation method:  NUM

Uncertainty budget:
    x    u    c   u.c
A   1  0.1  1.0  0.10
B   3  0.3  2.0  0.60
C   2  0.2  3.0  0.60
D  11  1.1  0.5  0.55

   y:  18.5
u(y):  1.01612
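
As a cross-check, the same budget can be reproduced in base R without metRology. The sketch below is our own illustration, not part of the package: it assumes the inputs are uncorrelated, takes the sensitivity coefficients from the symbolic derivatives returned by D( ), and combines the contributions in quadrature. The names c_i, u_y and y are introduced here for illustration only.

```r
# Measurement model and input estimates, as in the example above
expr <- expression(A + B * 2 + C * 3 + D / 2)
x <- list(A = 1, B = 3, C = 2, D = 11)

# Standard uncertainties: one tenth of each value
u <- sapply(x, function(v) v / 10)

# Sensitivity coefficients: partial derivative of the model for each input,
# evaluated at the input estimates
c_i <- sapply(names(x), function(nm) eval(D(expr[[1]], nm), x))

# Combined standard uncertainty for uncorrelated inputs:
# square root of the sum of squared contributions c_i * u_i
u_y <- sqrt(sum((c_i * u)^2))

# Value of the measurand at the input estimates
y <- eval(expr[[1]], x)

print(c_i)
print(y)
print(u_y)
```

The values of c_i, y and u_y should agree with the c column, y and u(y) in the uncertainty budget printed by uncert( ).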

R in testing sample variances

Before carrying out a statistical test to compare two sample means in a comparative study, we first have to test whether the sample variances are significantly different. The inferential test used is Fisher’s F-test (variance ratio test), devised by Sir Ronald Fisher, a famous statistician. It is widely used as a test of statistical significance.

R and F-test
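
As a minimal sketch of this workflow, the base R function var.test( ) performs the F-test on two samples, and its outcome can guide the choice of t-test. The two data vectors below are hypothetical values made up for illustration, and the 0.05 significance level is an assumption.

```r
# Hypothetical replicate results from two methods (illustrative values only)
method_a <- c(10.2, 10.4, 9.9, 10.1, 10.3, 10.0)
method_b <- c(10.6, 9.5, 10.9, 9.8, 10.4, 9.7)

# F-test: ratio of the two sample variances
f_result <- var.test(method_a, method_b)
print(f_result$statistic)  # F = var(method_a) / var(method_b)
print(f_result$p.value)

# Treat the variances as equal only if the F-test is not significant
# at the assumed 5% level, then compare the means accordingly
equal_var <- f_result$p.value > 0.05
print(t.test(method_a, method_b, var.equal = equal_var))
```

When var.equal is FALSE, t.test( ) falls back on the Welch approximation, which does not assume equal variances.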

 

R and Student’s t-distribution

R and Student t distribution

 

Application of R in standardizing normal distribution

In the last blog post, we discussed how to use R to plot a normal distribution with actual data in hand. There are, of course, many different possible normal distributions, since the mean can take any value and so can the standard deviation. It is therefore useful to find a way to standardize the normal distribution, so that several normal distributions can be compared on the same basis…

Using R to standardize normal distribution.docx
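
To illustrate the idea, here is a short sketch of standardization in R, assuming a normal distribution with mean 8 and standard deviation 2 (values chosen by us for illustration). Each x is converted to a z-score, z = (x - mean)/sd, and the probabilities from the standard normal distribution are compared with those from the original distribution.

```r
mu <- 8      # assumed mean
sigma <- 2   # assumed standard deviation

x <- c(2, 4, 8, 11.92)

# Standardize: z-scores measure distance from the mean in units of sd
z <- (x - mu) / sigma

# Cumulative probabilities computed two ways
p_direct <- pnorm(x, mean = mu, sd = sigma)  # on the original scale
p_standard <- pnorm(z)                       # on the N(0,1) scale

print(z)
print(all.equal(p_direct, p_standard))
```

Both routes give the same probabilities, which is why a single table of the standard normal distribution serves every normal distribution.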

 

Using R in Normal Distribution study

Using R to study the Normal Probability Distribution

 

R computations with normal distribution

R computations with normal distributions

There are various R functions which are useful for computation with normal distributions, such as pnorm( ), qnorm( ), and dnorm( ).

The pnorm( ) function gives the cumulative distribution function; the letter ‘p’ stands for probability. The qnorm( ) function gives the quantiles, whilst the dnorm( ) function gives the density.

Let’s use the statistical notation for the normal distribution: X ~ N(µ, σ²), where µ is the mean and σ² is the variance. Note that the R functions take the standard deviation σ, not the variance, as their argument. We shall illustrate the usage of these R functions.

R function pnorm( )

For example, let X ~ N(8,4), then

(a)  the probability P(X < 2) can be computed via pnorm( ) in several different ways:

> pnorm(2,mean=8,sd=2)  #P(X<=2) in N(8,4)
[1] 0.001349898
> pnorm(2,8,2)  #P(X<=2) in N(8,4) simplified
[1] 0.001349898

(b)  the probability P(X < 1.96) for X ~ N(8,4) is:

> pnorm(1.96,8,2)  #P(X<=1.96) in N(8,4)
[1] 0.001263873

Recall from the statistical tables that Φ(1.96) = 0.975 and Φ(1.645) = 0.950 for the standard normal distribution. R gives the same answers:

> pnorm(1.96,0,1)  #P(X<=1.96) in N(0,1)
[1] 0.9750021
> pnorm(1.645,0,1)  #P(X<=1.645) in N(0,1)
[1] 0.9500151
> pnorm(1.645)  #P(X<=1.645) in N(0,1) simplified
[1] 0.9500151

For a negative value, such as P(X < -1.645), the R result is the area in the left-hand tail of the normal distribution curve:

> pnorm(-1.645)  #P(X<=-1.645) in N(0,1) simplified
[1] 0.04998491
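
As a small extension of the examples above, pnorm( ) can also give the right-hand tail and the probability between two values. The sketch below uses the standard normal distribution; the variable names are ours.

```r
# Right-hand tail P(X > 1.645), computed two equivalent ways
p_upper <- pnorm(1.645, lower.tail = FALSE)
p_upper_alt <- 1 - pnorm(1.645)

# Probability between two values: P(-1.96 < X < 1.96)
p_between <- pnorm(1.96) - pnorm(-1.96)

print(p_upper)
print(p_upper_alt)
print(p_between)
```

By symmetry, the right-hand tail beyond 1.645 equals the left-hand tail below -1.645, and the central area between -1.96 and 1.96 is close to 95%.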

 R function qnorm( )

In layman’s language, quantiles are the cut points that sub-divide a series of sample data into equal proportions. In statistics, they divide a probability distribution into areas of equal probability. The simplest division that can be envisioned is into two equal halves of 50% each, with the median as the cut point.

The R function qnorm( ) computes the quantiles of the normal distribution, by default the standard normal N(0,1). It is the inverse of the cumulative distribution function computed by pnorm( ).

For example,

> qnorm(0.95)  #95.0% quantile of N(0,1)
[1] 1.644854
> qnorm(0.975)  #97.5% quantile of N(0,1)
[1] 1.959964
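
The relationship between quantiles and cumulative probabilities can be checked with a short round trip. This is a sketch with probabilities chosen by us: qnorm( ) turns a probability into a quantile, and pnorm( ) turns that quantile back into the probability.

```r
p <- c(0.5, 0.95, 0.975)

# Probability -> quantile of N(0,1)
q <- qnorm(p)

# Quantile -> probability, recovering the starting values
p_back <- pnorm(q)

print(q)
print(all.equal(p_back, p))
```

The 50% quantile is 0, the median of the standard normal distribution.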

 R function dnorm( )

From the Gaussian density formula, the density of the standard normal distribution at x = 0 is 1/√(2π), which is close to 0.4.

The R function dnorm(0) indeed gives this result:

> dnorm(0)  # Density of N(0,1) evaluated at x= 0
[1] 0.3989423
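
To see where this value comes from, the Gaussian density formula can be written out by hand and compared with dnorm( ). The helper gauss_density( ) below is our own illustrative function, not part of R.

```r
# Gaussian density written out from the formula
# f(x) = exp(-(x - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi))
gauss_density <- function(x, mu = 0, sigma = 1) {
  exp(-(x - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi))
}

d_formula <- gauss_density(0)  # reduces to 1 / sqrt(2 * pi) at x = 0
d_builtin <- dnorm(0)

print(d_formula)
print(all.equal(d_formula, d_builtin))
```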

Further remarks

Like pnorm( ), the functions qnorm( ) and dnorm( ) can also be used for normal distributions with a non-zero mean and a standard deviation other than 1, simply by supplying the mean and standard deviation as extra arguments.

For example, for the N(8,4) distribution,  the results are self-explanatory:

> qnorm(0.975,8,2)  # 97.5% quantile of N(8,4)
[1] 11.91993
> dnorm(1,8,2)  #Density of N(8,4) at x=1
[1] 0.0004363413
> dnorm(4,8,2)  #Density of N(8,4) at x=4
[1] 0.02699548
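
These results tie back to the standard normal distribution. As a sketch, a quantile of N(8,4) is the standard normal quantile shifted by the mean and stretched by the standard deviation, and the density is the standard normal density at the z-score divided by the standard deviation. The variable names are ours.

```r
mu <- 8
sigma <- 2

# Quantile: scale and shift the N(0,1) quantile
q_direct <- qnorm(0.975, mu, sigma)
q_scaled <- mu + sigma * qnorm(0.975)

# Density: evaluate N(0,1) at the z-score, then divide by sigma
d_direct <- dnorm(4, mu, sigma)
d_scaled <- dnorm((4 - mu) / sigma) / sigma

print(all.equal(q_direct, q_scaled))
print(all.equal(d_direct, d_scaled))
```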

 

Using R to generate a random sampling table

The open-source R programming language is a free software environment for statistical computing and graphics, and is easy to master. The official website is https://www.r-project.org/. It runs on a wide variety of UNIX platforms, Windows and macOS.

On September 24, 2016, this blog published an article on how to use R to generate random numbers (https://consultglp.com/2016/09/24/how-to-use-r-to-generate-random-numbers/). Now that the newly revised ISO/IEC 17025 accreditation standard embraces sampling as another important criterion for technical competence assessment, the random number function of R becomes very handy for cargo surveyors and samplers preparing a sampling plan for a cargo shipment.

We can use the random number function of R to create a random number table to suit the needs in randomly selecting samples for laboratory quality analysis.

For example, a shipment of 1000 bags of coffee beans in a warehouse is to be surveyed prior to being dispatched to port. The buyer requires 5% sampling for laboratory quality testing. That means some 50 bags have to be randomly selected, and a portion from each bag is then composited and reduced to a suitably sized test sample through a quartering sub-sampling process.

The sampling plan, therefore, can be the following process:

1.  Label each bag with a sequential number

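2.  Create 50 numbers in a random number table with R. Note that the call sample(500,50) as recorded here draws bag numbers from 1 to 500 only; to cover the whole shipment of 1000 bags, the first argument should be 1000: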

> RandSampling=sample(500,50)
> dim(RandSampling)=c(10,5)
> RandSampling
      [,1] [,2] [,3] [,4] [,5]
 [1,]  154  424   84  486   82
 [2,]   78  214  275  498  388
 [3,]   93  104  478  148  258
 [4,]  229  283   96  479  489
 [5,]  487  211  216   59  263
 [6,]   94  450   47  201  105
 [7,]  330  121  130  276   56
 [8,]   11  415  303  240  407
 [9,]  427   60   71  142  409
[10,]  101  238  228  441  355

3.  Sample a portion (say, 500g) of the coffee beans from the bags with these selected numbers into a large sampling bag.

4.  Conduct a sample quartering process on site to reduce the test sample size to about 2.5 kg before sending to the laboratory for analysis.
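
The sampling plan above can be sketched as a short reproducible script. It assumes the bags are labelled 1 to 1000 and fixes an arbitrary seed, so that the same table can be regenerated later for audit purposes. The seed value and the variable names are our own choices.

```r
set.seed(2024)  # arbitrary seed, assumed here only for reproducibility

n_bags <- 1000                     # bags in the shipment
n_sample <- round(0.05 * n_bags)   # 5% sampling

# Draw bag numbers without replacement and sort them
# so the sampler can walk the warehouse in label order
selected <- sort(sample(n_bags, n_sample))

# Lay the numbers out as a 10 x 5 random number table
sampling_table <- matrix(selected, nrow = 10, ncol = 5)
print(sampling_table)
```

Since sample( ) draws without replacement by default, no bag number appears twice in the table.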

Ensuring your random sampling process is really random

How to ensure your simple random sampling is really random

R techniques in generating random numbers

r-techniques-in-generating-random-numbers

Using R to perform simple linear regression

using-r-to-perform-simple-linear-regression