Training and consultancy for testing laboratories.


R computations with normal distribution


There are various R functions which are useful for computation with normal distributions, such as pnorm( ), qnorm( ), and dnorm( ).

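The pnorm( ) function gives the cumulative distribution function; the letter ‘p’ stands for probability.  The qnorm( ) function gives quantiles, and the dnorm( ) function gives the density.

Let’s use the standard statistical notation for a normal distribution: X ~ N(µ, σ²), where µ is the mean and σ² is the variance.  Note that the R functions take the standard deviation σ rather than the variance, so N(8,4) is entered with sd = 2.  We shall illustrate the usage of these R functions.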

R function pnorm( )

For example, let X ~ N(8,4), then

(a)  the probability P(X < 2) can be computed via pnorm( ) in two equivalent ways, either naming the arguments or passing them by position:

> pnorm(2,mean=8,sd=2)  #P(X<=2) in N(8,4)

[1] 0.001349898

> pnorm(2,8,2)  #P(X<=2) in N(8,4) simplified

[1] 0.001349898

(b)  the probability P(X < 1.96) for X ~ N(8,4) is:

> pnorm(1.96,8,2)  #P(X<=1.96) in N(8,4)

[1] 0.001263873

Recall from the standard normal table that Φ(1.96) = 0.975 and Φ(1.645) = 0.950, where Φ is the cumulative distribution function of N(0,1).  R gives the same answers:

> pnorm(1.96,0,1)  #P(X<=1.96) in N(0,1)

[1] 0.9750021

> pnorm(1.645,0,1)  #P(X<=1.645) in N(0,1)

[1] 0.9500151

> pnorm(1.645)  #P(X<=1.645) in N(0,1) simplified

[1] 0.9500151

>

For P(X < -1.645), R returns the area in the lower (left-hand) tail of the standard normal curve:

> pnorm(-1.645)  #P(X<=-1.645) in N(0,1) simplified

[1] 0.04998491

>
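Upper-tail and interval probabilities follow from the same function. The lines below are a minimal sketch rather than part of the original session; the variable names (mu, sigma, p_between and so on) and the cut-off values 6 and 10 are chosen here purely for illustration.

```r
# Illustrative sketch for X ~ N(8, 4): mean 8, standard deviation 2.
mu <- 8
sigma <- 2

# Upper tail P(X > 10): take the complement, or set lower.tail = FALSE.
p_upper_a <- 1 - pnorm(10, mu, sigma)
p_upper_b <- pnorm(10, mu, sigma, lower.tail = FALSE)

# Interval P(6 < X < 10): difference of two cumulative probabilities.
p_between <- pnorm(10, mu, sigma) - pnorm(6, mu, sigma)

# Standardising gives the same answer: P(X <= 2) = P(Z <= (2 - mu) / sigma).
p_direct <- pnorm(2, mu, sigma)
p_standardised <- pnorm((2 - mu) / sigma)

print(c(p_upper_a, p_upper_b, p_between))
print(c(p_direct, p_standardised))
```

The interval from 6 to 10 is one standard deviation either side of the mean, so p_between should be close to the familiar 68% figure.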

 R function qnorm( )

In layman’s language, quantiles are cut points that divide a set of sample data into groups of equal size. In statistics, they divide a probability distribution into areas of equal probability. The simplest division is into two equal halves of 50% each, and the cut point is then the median.

The R function qnorm( ) computes quantiles of a normal distribution (the standard normal by default). It is the inverse of the cumulative distribution function: given a probability p, it returns the value x for which P(X ≤ x) = p.

For example,

> qnorm(0.95)  #95.0% quantile of N(0,1)

[1] 1.644854

> qnorm(0.975)  #97.5% quantile of N(0,1)

[1] 1.959964

>
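Because qnorm( ) is the inverse of pnorm( ), feeding its output back into pnorm( ) should recover the original probabilities. The following is a small sketch of that round trip; the variable names are illustrative only.

```r
# qnorm() inverts pnorm(): probability -> quantile -> probability.
p <- c(0.5, 0.95, 0.975)
z <- qnorm(p)        # quantiles of N(0,1)
p_back <- pnorm(z)   # should recover p

# The 50% quantile (the median) of N(0,1) is 0.
median_z <- qnorm(0.5)

# By symmetry, the 2.5% quantile is the negative of the 97.5% quantile.
z_lower <- qnorm(0.025)
z_upper <- qnorm(0.975)

print(z)
print(p_back)
print(c(median_z, z_lower, z_upper))
```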

 R function dnorm( )

For the standard normal distribution, the Gaussian density formula gives f(0) = 1/√(2π), which is close to 0.4.

The R function dnorm(0) confirms this:

> dnorm(0)  # Density of N(0,1) evaluated at x= 0

[1] 0.3989423

>
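The density can also be checked against the formula directly, and integrating it links dnorm( ) back to pnorm( ). This is a hedged sketch using base R's integrate( ) for numerical integration; the variable names are assumptions made for the example.

```r
# The N(0,1) density at x is exp(-x^2 / 2) / sqrt(2 * pi).
x <- 0
by_formula <- exp(-x^2 / 2) / sqrt(2 * pi)
by_dnorm <- dnorm(x)

# A density value is not a probability, but the area under the whole
# curve is 1 (computed numerically, so only approximately).
total_area <- integrate(dnorm, lower = -Inf, upper = Inf)$value

# The area up to 1.96 should agree with pnorm(1.96).
area_to_196 <- integrate(dnorm, lower = -Inf, upper = 1.96)$value

print(c(by_formula, by_dnorm))
print(c(total_area, area_to_196, pnorm(1.96)))
```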

Further remarks

Like pnorm( ), the functions qnorm( ) and dnorm( ) can also be used for normal distributions with a mean other than 0 and a standard deviation other than 1, simply by supplying the mean and standard deviation (not the variance) as extra arguments.

For example, for the N(8,4) distribution,  the results are self-explanatory:

> qnorm(0.975,8,2)  # 97.5% quantile of N(8,4)

[1] 11.91993

> dnorm(1,8,2)  #Density of N(8,4) at x=1

[1] 0.0004363413

> dnorm(4,8,2)  #Density of N(8,4) at x=4

[1] 0.02699548

>
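These results can be cross-checked against the standard normal distribution, since any normal variable is a shifted and scaled N(0,1). The sketch below assumes the same N(8,4) example; the variable names are illustrative.

```r
# N(8, 4) has variance 4, so the standard deviation passed to R is sqrt(4) = 2.
mu <- 8
sigma <- sqrt(4)

# Quantiles shift and scale: q = mu + sigma * z.
q_direct <- qnorm(0.975, mu, sigma)
q_scaled <- mu + sigma * qnorm(0.975)

# Densities are rescaled by 1 / sigma after standardising x.
d_direct <- dnorm(4, mu, sigma)
d_scaled <- dnorm((4 - mu) / sigma) / sigma

print(c(q_direct, q_scaled))
print(c(d_direct, d_scaled))
```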

 

Using R to generate a random sampling table


The open source R programming language is a free software environment for statistical computing and graphics, and is easy to master. The official website is https://www.r-project.org/. It runs on a wide variety of UNIX platforms, Windows and macOS.

On September 24, 2016, this blog site published an article on how to use R to generate random numbers (https://consultglp.com/2016/09/24/how-to-use-r-to-generate-random-numbers/).  Now that the newly revised ISO/IEC 17025 accreditation standard embraces sampling as another important criterion for technical competence assessment, the random number function of R is very handy for cargo surveyors and samplers preparing a sampling plan for a cargo shipment.

We can use the random number function of R to build a random number table for randomly selecting samples for laboratory quality analysis.

For example, a shipment of 1000 bags of coffee beans in a warehouse is to be surveyed prior to being dispatched to port. The buyer requires 5% sampling for laboratory quality testing.  That means 50 bags have to be randomly selected, and a portion from each bag is then composited and reduced to a suitably sized test sample through a quartering sub-sampling process.

The sampling plan, therefore, can be the following process:

1.  Label each bag with a sequential number

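2.  Create 50 numbers in a random number table with R. The first argument of sample( ) should equal the number of bags; the session recorded below used sample(500,50), which draws only from bags 1 to 500, so for the full shipment of 1000 bags use sample(1000,50) instead: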

> RandSampling=sample(500,50)

> dim(RandSampling)=c(10,5)

> RandSampling

[,1]   [,2]   [,3]   [,4]   [,5]

[1,]  154  424   84  486   82

[2,]   78  214  275  498  388

[3,]   93  104  478  148  258

[4,]  229  283   96  479  489

[5,]  487  211  216   59  263

[6,]   94  450   47  201  105

[7,]  330  121  130  276   56

[8,]   11  415  303  240  407

[9,]  427   60   71  142  409

[10,]  101  238  228  441  355

>

3.  Sample a portion (say, 500g) of the coffee beans from the bags with these selected numbers into a large sampling bag.

4.  Conduct a sample quartering process on site to reduce the test sample size to about 2.5 kg before sending to the laboratory for analysis.
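Step 2 of the plan can be sketched so that it covers all 1000 bags and can be reproduced later, for example during an audit. This is a minimal sketch under stated assumptions: the seed value, the variable names and the choice to sort the table are illustrative and not part of the original session.

```r
# Assumption: an arbitrary seed, recorded so the same table can be regenerated.
set.seed(2024)

n_bags <- 1000                   # bags labelled 1 to 1000 in step 1
n_sample <- n_bags * 5 / 100     # 5% sampling gives 50 bags

# sample() draws without replacement by default, so no bag is picked twice.
selected <- sample(n_bags, n_sample)

# Sort for convenience in the warehouse and lay out as a 10 x 5 table.
sampling_table <- matrix(sort(selected), nrow = 10, ncol = 5)
print(sampling_table)
```

Sorting does not affect which bags are chosen; it only makes the table easier to follow when walking through the stacks in label order.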

 

 

 

How to ensure your random sampling process is really random?


R techniques in generating random numbers


Using R to perform simple linear regression


Having a taste of R language


Interesting graph plotting with R
