## Training and consultancy for testing laboratories.

### R computations with normal distribution


There are various R functions which are useful for computation with normal distributions, such as pnorm( ), qnorm( ), and dnorm( ).

The pnorm( ) function gives the cumulative distribution function; the letter ‘p’ stands for probability. The qnorm( ) function gives quantiles, and the dnorm( ) function gives the density.

Let’s use the statistical notation for the normal distribution: X ~ N(µ, σ²), where µ is the mean and σ² is the variance. Note that the R functions take the standard deviation σ, not the variance, as their argument. We shall illustrate the usage of these R functions.

#### R function pnorm( )

For example, let X ~ N(8,4), then

(a)  the probability P(X < 2), which equals P(X <= 2) for a continuous distribution, can be computed via pnorm( ) with either named or positional arguments:

> pnorm(2,mean=8,sd=2)  #P(X<=2) in N(8,4)

[1] 0.001349898

> pnorm(2,8,2)  #P(X<=2) in N(8,4) simplified

[1] 0.001349898

(b)  the probability P(X < 1.96) for X ~ N(8,4) is computed in R as:

> pnorm(1.96,8,2)  #P(X<=1.96) in N(8,4)

[1] 0.001263873

Recall that statistical tables for the standard normal distribution give Φ(1.96) = 0.975 and Φ(1.645) = 0.950, where Φ is the cumulative distribution function. R gives us the same answers:

> pnorm(1.96,0,1)  #P(X<=1.96) in N(0,1)

[1] 0.9750021

> pnorm(1.645,0,1)  #P(X<=1.645) in N(0,1)

[1] 0.9500151

> pnorm(1.645)  #P(X<=1.645) in N(0,1) simplified

[1] 0.9500151


For P(X < -1.645), the R result is the area in the left-hand tail of the standard normal distribution curve:

> pnorm(-1.645)  #P(X<=-1.645) in N(0,1) simplified

[1] 0.04998491

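Because pnorm( ) returns the lower-tail area P(X <= x), upper-tail and interval probabilities follow by subtraction. The following is a minimal sketch for the same N(8,4) example; the variable names and the cut-off values 6 and 10 are illustrative choices, while lower.tail is a standard argument of pnorm( ).

```r
# N(8,4): mean 8, variance 4, so sd = sqrt(4) = 2
mu <- 8
sigma <- sqrt(4)

# Upper tail P(X > 10), computed in two equivalent ways
p_upper_a <- 1 - pnorm(10, mu, sigma)
p_upper_b <- pnorm(10, mu, sigma, lower.tail = FALSE)

# Interval probability P(6 < X < 10) as a difference of two lower-tail areas
p_interval <- pnorm(10, mu, sigma) - pnorm(6, mu, sigma)

# Standardising first gives the same answer as passing mean and sd
p_direct <- pnorm(2, mu, sigma)
p_standardised <- pnorm((2 - mu) / sigma)

print(c(p_upper_a, p_upper_b, p_interval))
print(all.equal(p_direct, p_standardised))
```

Here 6 and 10 lie one standard deviation either side of the mean, so the interval probability is the familiar value of about 0.683 and each tail is about 0.159.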

#### R function qnorm( )

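In layman’s language, quantiles are cut points that sub-divide a series of sample data into equal proportions. In statistics, we divide a probability distribution into areas of equal probability. The simplest division that can be envisioned is into two equal halves, i.e., at the 50% quantile (the median).

The R function qnorm( ) computes quantiles of the normal distribution: for a given probability p, it returns the value x for which P(X <= x) = p. It is therefore the inverse of the cumulative distribution function given by pnorm( ), not of the density. With no extra arguments it uses the standard normal distribution.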

For example,

> qnorm(0.95)  #95.0% quantile of N(0,1)

[1] 1.644854

> qnorm(0.975)  #97.5% quantile of N(0,1)

[1] 1.959964

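Since a quantile is the value whose lower-tail probability equals a given p, feeding the output of qnorm( ) back into pnorm( ) should recover p. A short sketch of that round trip, with an arbitrarily chosen set of probabilities:

```r
# Probabilities of interest (illustrative choice)
p <- c(0.25, 0.50, 0.75, 0.95, 0.975)

# Quantiles of N(0,1): the values z with P(Z <= z) = p
z <- qnorm(p)

# Feeding the quantiles back into pnorm( ) recovers the probabilities
p_back <- pnorm(z)

print(round(z, 4))
print(all.equal(p, p_back))

# By symmetry, the 5% quantile is the negative of the 95% quantile
print(all.equal(qnorm(0.05), -qnorm(0.95)))
```

The 50% quantile comes out as 0, the median of the standard normal distribution.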

#### R function dnorm( )

For the standard normal distribution, the density at x = 0 is 1/√(2π), which is close to 0.4.

The R function dnorm(0) gives this value:

> dnorm(0)  # Density of N(0,1) evaluated at x= 0

[1] 0.3989423

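To see where this value comes from, the normal density can be written out from its formula, f(x) = exp(-(x - µ)² / (2σ²)) / (σ√(2π)), and compared with dnorm( ). This is only a sketch: normal_density is a hypothetical helper name introduced here, not an R function.

```r
# Normal density written out from its formula
# (normal_density is a hypothetical helper name, not part of R)
normal_density <- function(x, mu = 0, sigma = 1) {
  exp(-(x - mu)^2 / (2 * sigma^2)) / (sigma * sqrt(2 * pi))
}

# At x = 0 for N(0,1) the formula reduces to 1 / sqrt(2 * pi)
print(1 / sqrt(2 * pi))
print(normal_density(0))
print(dnorm(0))

# The hand-written formula agrees with dnorm( ) across a range of x
x <- seq(-3, 3, by = 0.5)
print(all.equal(normal_density(x), dnorm(x)))
```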

#### Further remarks

Like pnorm( ), the functions qnorm( ) and dnorm( ) can also be used for normal distributions with a mean other than 0 and a standard deviation other than 1, simply by supplying the mean and the standard deviation (not the variance) as extra arguments.

For example, for the N(8,4) distribution, where the standard deviation is 2:

> qnorm(0.975,8,2)  # 97.5% quantile of N(8,4)

[1] 11.91993

> dnorm(1,8,2)  #Density of N(8,4) at x=1

[1] 0.0004363413

> dnorm(4,8,2)  #Density of N(8,4) at x=4

[1] 0.02699548

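These results can be cross-checked against the standard normal distribution: a quantile of N(µ, σ²) is µ plus σ times the standard normal quantile, and the density is the standard normal density at the standardised value, divided by σ. A minimal sketch, with illustrative variable names:

```r
mu <- 8
sigma <- sqrt(4)   # N(8,4) is stated with variance 4, but R expects sd = 2

# Quantile: shift and scale the standard normal quantile
q_direct <- qnorm(0.975, mu, sigma)
q_scaled <- mu + sigma * qnorm(0.975)

# Density: standardise x, then divide by sigma
d_direct <- dnorm(4, mu, sigma)
d_scaled <- dnorm((4 - mu) / sigma) / sigma

print(c(q_direct, q_scaled))
print(c(d_direct, d_scaled))
```

A common slip is to pass the variance 4 as the third argument; qnorm(0.975,8,4) would describe N(8,16), not N(8,4).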

### Using R to generate a random sampling table

The open-source R programming language is a free software environment for statistical computing and graphics, and is easy to master. The official website is https://www.r-project.org/. It runs on a wide variety of UNIX platforms, Windows and macOS.

On September 24, 2016, this blog site published an article on how to use R to generate random numbers (https://consultglp.com/2016/09/24/how-to-use-r-to-generate-random-numbers/). Now that the newly revised ISO/IEC 17025 accreditation standard includes sampling as another important criterion in assessing technical competence, the random number function of R is very handy for cargo surveyors and samplers preparing sampling plans for cargo shipments.

We can use the random number function of R to create a random number table to suit the needs in randomly selecting samples for laboratory quality analysis.

For example, there is a shipment of 1000 bags of coffee beans in a warehouse to be surveyed prior to being dispatched to port. The buyer requires 5% sampling for laboratory quality testing. That means 50 bags have to be randomly selected, and a portion from each bag composited and then reduced to a suitably sized test sample through a quartering sub-sampling process.

The sampling plan, therefore, can be the following process:

1.  Label each bag with a sequential number

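2.  Create a table of 50 random numbers with the R function sample( ), whose first argument is the size of the population to draw from. The call recorded below uses 500, so it selects only from bags 1 to 500; to give all 1000 bags a chance of selection, use sample(1000,50) instead. The session as originally run was: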

> RandSampling=sample(500,50)

> dim(RandSampling)=c(10,5)

> RandSampling

[,1]   [,2]   [,3]   [,4]   [,5]

[1,]  154  424   84  486   82

[2,]   78  214  275  498  388

[3,]   93  104  478  148  258

[4,]  229  283   96  479  489

[5,]  487  211  216   59  263

[6,]   94  450   47  201  105

[7,]  330  121  130  276   56

[8,]   11  415  303  240  407

[9,]  427   60   71  142  409

[10,]  101  238  228  441  355


3.  Sample a portion (say, 500g) of the coffee beans from the bags with these selected numbers into a large sampling bag.

4.  Conduct a sample quartering process on site to reduce the test sample size to about 2.5 kg before sending to the laboratory for analysis.
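The sampling plan above can be sketched as a short script. This is an illustration under stated assumptions rather than a prescribed procedure: the population size of 1000 and the 5% rate come from the example, while the seed value, the variable names and the choice to sort the selected numbers are introduced here for convenience.

```r
# Figures taken from the example: 1000 bags, 5% sampling rate
n_bags <- 1000
sampling_rate <- 0.05
n_select <- round(n_bags * sampling_rate)   # 50 bags

# Fixing the seed (an arbitrary value) makes the plan reproducible for audit
set.seed(2024)

# Draw 50 distinct bag numbers from 1..1000 without replacement
selected <- sample(n_bags, n_select)

# Sort for easier use in the warehouse and lay out as a 10 x 5 table
sampling_table <- matrix(sort(selected), nrow = 10, ncol = 5)
print(sampling_table)

# Mass of the composite before quartering, at 500 g (0.5 kg) per bag
composite_kg <- n_select * 0.5
print(composite_kg)
```

At 500 g per bag the composite weighs 25 kg, which the on-site quartering then reduces to the laboratory test sample. Recording the seed alongside the table lets the same selection be regenerated later if the plan is questioned.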

### R techniques in generating random numbers


### Using R to perform simple linear regression


### Having a taste of R language


### Interesting graph plotting with R
