Some essential terminology and elements in DOE
Let’s look at some key operational characteristics that a good experiment should have in order to be effective. These certainly include, but are not limited to, subject matter expertise, Good Laboratory Practice (GLP), proper equipment maintenance and calibration, and so forth, none of which has anything to do with statistics.
But there are other operational aspects of experimental practice that do intersect statistical design and analysis, and which, if not followed, can also lead to inconsistent, irreproducible results. Among these are randomization, blinding, proper replication, blocking and split plotting. We shall discuss them in due course.
To begin with, we need to define and understand some terminology. It should be emphasized that the DOE terminology is not uniform across disciplines and even across textbooks within a discipline. Common terms are described below.
- A response variable is an outcome of an experiment. It may be a quantitative measurement, such as the percentage by volume of mercury in a sample of river water, or a qualitative result, such as the absence or presence of sugar in urine.
- A factor is an experimental variable that is being investigated to determine its effect on a response. It is important to realize that a factor is considered controllable by the experimenter, i.e. the values, or “levels”, of the factor can be determined prior to the beginning of the test program and can be set as stipulated in the experimental design. Examples of factors include temperature, pressure and catalyst.
- We use “level” to refer to the values of both qualitative and quantitative factors. For example, the levels of a temperature factor might be a control temperature CTRL and CTRL + 20°.
- Additional variables that may affect the response but cannot be controlled in an experiment are called covariates. Covariates are not additional responses, that is, their values are not affected by the factors in the experiment. Rather, covariates and the experimental factors jointly influence the response.
For example, in an experiment involving temperature and humidity, we can control the temperature of the laboratory equipment, but humidity can only be measured, not controlled. In such an experiment, temperature would be regarded as an experimental factor and humidity as a covariate.
- A test run is a single factor-level combination for which an observation (response) is obtained.
- Repeat tests are two or more observations that are obtained for a specified combination of levels of the factors. Repeat tests are actually distinct test runs, conducted under experimental conditions that are as identical as possible, but they need not be obtained in back-to-back test runs.
- Replications are repetitions of a portion of the experiment (or the entire experiment) under two or more different conditions, for example, on two or more different days.
Note that repeat tests and replications increase precision by reducing the standard deviation of the statistics used to estimate effects (see the discussion of factor effects below).
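The precision gain from repeat tests can be seen in a small simulation. This is a minimal sketch with made-up numbers: we assume a response with true mean 10 and measurement standard deviation 2, and compare the spread of single observations with the spread of means of 5 repeat tests.

```python
import random
import statistics

random.seed(42)

# Hypothetical measurement process: true mean 10, measurement SD 2.
def observe():
    return random.gauss(10, 2)

# Spread of single observations vs. means of 5 repeat tests.
singles = [observe() for _ in range(1000)]
means_of_5 = [statistics.mean(observe() for _ in range(5)) for _ in range(1000)]

sd_single = statistics.stdev(singles)   # close to 2
sd_mean5 = statistics.stdev(means_of_5) # close to 2 / sqrt(5) ≈ 0.89

print(round(sd_single, 2), round(sd_mean5, 2))
```

Averaging n repeat tests shrinks the standard deviation of the estimate by roughly a factor of √n, which is exactly why repeat testing sharpens effect estimates.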
- Experimental responses are only comparable when they result from observations taken on homogeneous samples or experimental units. Homogeneous samples do not differ from one another in any systematic manner and are as alike as possible on all characteristics that might affect the response.
- If experimental units (or samples) produced by one manufacturer are compared with similar units produced by a second manufacturer, any differences noted in the responses at one level of a factor could be due to the different levels of the factor, to the different manufacturers, or to both. In this situation, the effect of the factor is said to be confounded with the effect due to the manufacturers.
- When a sufficient number of homogeneous experimental units cannot be obtained, statistically designed experiments are often blocked so that homogeneous experimental units receive each level of the factor(s). Blocking divides the total number of samples into two or more groups, or blocks (e.g. manufacturers), of homogeneous experimental units, so that the units within each block are more alike than units in different blocks. Hence, blocking increases the precision of the response (decreases variability) by controlling the systematic variation attributable to non-homogeneous experimental units or test conditions.
A good example: in a study of social behaviour, participants from different ethnic groups can be placed in different blocks.
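The variance-reduction idea behind blocking can be illustrated with hypothetical numbers. Suppose the same response is measured on units from two manufacturers (the blocks), and the manufacturers differ systematically; the spread within each block is then much smaller than the pooled spread that ignores the blocks.

```python
import statistics

# Hypothetical responses for units from two manufacturers (blocks).
block_a = [10.1, 10.3, 9.9, 10.2]   # manufacturer A
block_b = [12.0, 12.2, 11.9, 12.1]  # manufacturer B

pooled_sd = statistics.stdev(block_a + block_b)  # mixes noise + block shift
within_a = statistics.stdev(block_a)             # unit-to-unit noise only
within_b = statistics.stdev(block_b)

print(round(pooled_sd, 2), round(within_a, 2), round(within_b, 2))
```

Comparing factor levels within each block removes the manufacturer-to-manufacturer shift from the comparison, which is the precision gain the bullet above describes.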
- The terms design and layout are often used interchangeably when referring to experimental designs. The layout or design of the experiment includes the choice of factor-level combinations to be examined, the number of repeat tests or replications (if any), blocking (if any), the assignment of factor-level combinations to experimental units, and the sequence of the test runs.
- An effect of a design factor on the response is measured by the change in the average response between two or more factor-level combinations. In its simplest form, the effect of a single two-level factor on a response is measured as the difference in the average response for the two levels of the factor; that is,
Factor effect = average response at one level − average response at the second level
In other words, factor effects measure the influence of different levels of a factor on the value of the response. An observed effect is said to be sufficiently precise if the standard deviation (or, equivalently, the variance) of this statistic is made sufficiently small through repeat testing or replication. Factors may also have joint effects; we will discuss these in future blogs.
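The formula above is easy to compute directly. This sketch uses hypothetical responses for a single two-level temperature factor (CTRL and CTRL + 20°, four repeat tests per level) and also estimates the standard error of the effect, the quantity that repeat testing drives down.

```python
import statistics

# Hypothetical responses at each level of a two-level factor.
low_level = [8.2, 8.5, 8.1, 8.4]    # responses at CTRL
high_level = [9.1, 9.4, 9.0, 9.3]   # responses at CTRL + 20°

# Factor effect = average response at one level − average at the other.
effect = statistics.mean(high_level) - statistics.mean(low_level)
print(round(effect, 2))  # 0.9

# Standard error of the effect: shrinks as the number of repeat tests grows.
se = (statistics.variance(high_level) / len(high_level)
      + statistics.variance(low_level) / len(low_level)) ** 0.5
print(round(se, 2))
```

An effect is judged against this standard error: doubling the number of repeat tests at each level cuts the standard error by roughly √2, making the same observed difference more precise.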
- Randomization of the sequence of test runs, or of the assignment of factor-level combinations to experimental units, protects against unknown or unmeasured sources of possible bias. Randomization also helps validate the assumptions needed to apply certain statistical techniques.
For example, analytical instrument drift is a common problem. If, during a series of experiments, instrument drift builds up over time, later tests will be biased. If all tests involving one level of a factor are run first and all tests involving the second level are run last, comparisons of the factor levels will be biased by this drift and will not provide a true measure of the effect of the factor.
In fact, randomization of the test runs cannot prevent instrument drift, but it can help ensure that all levels of a factor have an equal chance of being affected by it. If so, differences in the response between pairs of factor levels will likely reflect the effects of the factor levels and not the effect of the drift.
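Randomizing a run sequence is a one-liner in practice. This minimal sketch assumes a hypothetical design with one two-level factor (CTRL and CTRL + 20°) and four repeat tests per level, and shuffles the run order so that neither level is systematically early or late relative to any drift.

```python
import random

random.seed(0)  # fixed seed only so the example is reproducible

# Hypothetical design: two levels, four repeat tests each.
runs = [("CTRL", i) for i in range(4)] + [("CTRL+20", i) for i in range(4)]

# Running all CTRL tests first would let instrument drift masquerade as
# a factor effect; shuffling gives every level an equal chance of early
# and late time slots.
random.shuffle(runs)

for order, (level, repeat) in enumerate(runs, start=1):
    print(f"run {order}: level={level}, repeat={repeat}")
```

In a real study the randomized sequence would be generated once, before any testing begins, and then executed exactly as drawn.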