Ideally, an experimental design would cover many of the factors affecting the desired result in relatively few runs. However, it is impossible to accurately control and manipulate “too many” experimental factors at a time. Worse still, once we take into account the number of levels of each factor, the number of experimental runs required grows exponentially with the number of factors.
We can illustrate this point with the following example.
Assume that we have a 2-factor experiment in which factor A has 3 levels (say, temperature at 30°C, 60°C and 90°C) and factor B has 2 levels (say, catalyst X and Y). Then there are 3 x 2 = 6 possible combinations of these two factors:
A(30) X
A(60) X
A(90) X
A(30) Y
A(60) Y
A(90) Y
Similarly, if there were a third experimental factor C with 4 levels (say, pressure at 1 bar, 2 bar, 3 bar and 4 bar), then there would be 3 x 2 x 4 = 24 possible combinations:
A(30) X 1bar
A(60) X 1bar
A(90) X 1bar
A(30) Y 1bar
A(60) Y 1bar
A(90) Y 1bar
A(30) X 2bar
A(60) X 2bar
A(90) X 2bar
A(30) Y 2bar
A(60) Y 2bar
A(90) Y 2bar
A(30) X 3bar
A(60) X 3bar
A(90) X 3bar
A(30) Y 3bar
A(60) Y 3bar
A(90) Y 3bar
A(30) X 4bar
A(60) X 4bar
A(90) X 4bar
A(30) Y 4bar
A(60) Y 4bar
A(90) Y 4bar
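The two listings above can also be generated rather than written out by hand. The following is a minimal Python sketch, assuming the levels from the example; the variable names and the helper full_factorial are illustrative, not part of any standard tool. Note that itertools.product varies the last factor fastest, so the rows come out in a different order from the listing above, but the set of combinations is the same.

```python
from itertools import product

# Factor levels taken from the example above.
temperature_c = [30, 60, 90]   # factor A, in degrees Celsius
catalyst = ["X", "Y"]          # factor B
pressure_bar = [1, 2, 3, 4]    # factor C

def full_factorial(*factors):
    """Return every combination that takes one level from each factor."""
    return list(product(*factors))

two_factor_runs = full_factorial(temperature_c, catalyst)
three_factor_runs = full_factorial(temperature_c, catalyst, pressure_bar)

print(len(two_factor_runs))    # 3 x 2 = 6
print(len(three_factor_runs))  # 3 x 2 x 4 = 24

# Print the 24 runs in the same notation as the listing above.
for temp, cat, pres in three_factor_runs:
    print(f"A({temp}) {cat} {pres}bar")
```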
The general pattern is clear: if we have m factors F1, F2, …, Fm with k1, k2, …, km levels respectively, then there are k1 x k2 x … x km possible runs in total. Note that if every factor has the same number of levels k, this product is just k x k x … x k (m times), or k^m.
Let’s see how quickly the number of experimental runs grows as the number of factors and levels increases:
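As a rough illustration, the run counts k^m can be tabulated with a short Python sketch. The particular factor counts and level counts below are chosen only for illustration, and the function name full_factorial_runs is hypothetical.

```python
def full_factorial_runs(levels, factors):
    """Number of runs when each of `factors` factors has `levels` levels (k^m)."""
    return levels ** factors

# Illustrative choices; any other values can be substituted.
factor_counts = [2, 3, 4, 5, 10]
level_counts = [2, 3, 4, 5]

# Header row: one column per number of levels.
header = "factors" + "".join(f"{str(k) + ' levels':>12}" for k in level_counts)
print(header)

# One row per number of factors.
for m in factor_counts:
    row = f"{m:>7}" + "".join(
        f"{full_factorial_runs(k, m):>12}" for k in level_counts
    )
    print(row)
```

For instance, 5 factors need 32 runs at 2 levels but 243 runs at 3 levels, and 10 factors at 2 levels already need 1024 runs.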
Clearly, if we want to run all possible combinations in an experimental study, a so-called full-factorial experiment, the number of runs becomes too large to be practical beyond 4 or 5 factors at 2 levels, and even sooner at 3 levels: 5 factors at 3 levels already require 3^5 = 243 runs.
You may then ask: if I were to stick to 2-level designs, could I find ways to control the number of experimental runs?
The answer is yes, provided you can cleverly select a subset of the possible combinations so that “most” of the important information that would be obtained by running all combinations of factor settings is still gained, but with a drastically reduced number of runs. One may do so by drawing on previous scientific knowledge, theoretical inference and/or experience. But this will not be easy.
Indeed, there is a way to do so: follow the Pareto Principle, also known as the 80-20 rule, which is defined as follows:
The Pareto Principle: For processes with many possible causes of variation, adequately controlling just the few most important is all that is required to produce consistent results. Or, to express it more quantitatively (but only as a rough approximation), controlling the vital 20% of the causes achieves 80% of the desired effects.
In other words, in order to get the most bang for your buck, you should focus on the important stuff and ignore the unimportant. Of course, the challenge is determining what is important and what is not.
If you can do so, you can happily use a 2-level factorial design that starts from, say, 10 candidate factors but studies only the 3 important ones, in 2^3 = 8 runs instead of the 2^10 = 1024 runs a full factorial on all 10 factors would need!
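The saving can be checked with a small Python sketch. It rests on several assumptions that are not in the text above: the factor names F1 to F10 are hypothetical, F1, F2 and F3 are simply assumed to be the important ones, the two levels are coded as -1 (low) and +1 (high), and the remaining seven factors are assumed to be held at fixed settings.

```python
from itertools import product

# Hypothetical factor names; the text only says "10 factors".
all_factors = [f"F{i}" for i in range(1, 11)]

# Assumed outcome of the screening step: the "vital few" factors.
important = ["F1", "F2", "F3"]

# Two levels per factor, coded as low (-1) and high (+1).
levels = (-1, +1)

# Full 2-level factorial on the important factors only.
design = [
    dict(zip(important, run))
    for run in product(levels, repeat=len(important))
]

# Size of a full 2-level factorial on all ten factors, for comparison.
full_runs = len(levels) ** len(all_factors)

print(len(design))  # 2^3 = 8
print(full_runs)    # 2^10 = 1024
for run in design:
    print(run)
```

Each printed dictionary is one run, giving the coded setting of each important factor.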