Design of Experiments
What is design of experiments?
Design of experiments (DOE) is a systematic approach that scientists and engineers use to study how the inputs to a process (e.g., speed, temperature, or vendor) affect the outputs of that process (e.g., yield, impurity, or cost). DOE is a powerful and efficient framework for understanding complex systems and making decisions in a reliable and data-driven way.
When to use DOE?
Broadly speaking, a designed experiment can help when you want to:
- Determine whether a factor, or group of factors, influences a response of interest.
- Understand the potential relationships among factors and responses.
- Optimize one or more responses by identifying which factor settings (e.g., a specific temperature and vendor) produce a desired outcome (e.g., greatest yield).
Why not change one factor at a time?
While it might seem more intuitive or straightforward to perform an experiment in which you only change one factor at a time, let’s look at a simple example to illustrate the power of using DOE.
In the one-factor-at-a-time (OFAT) approach, you test a single factor by changing the setting, or level, of that factor while keeping all other factors at a constant level. Then you repeat this process for each factor in your experiment.
OFAT example
Suppose, for example, you’re interested in maximizing the Yield of a chemical process and you know that Temperature and pH are key drivers of Yield. At the current settings of your factors (Temperature = 25°C and pH = 5.5), Yield is 83%.
To determine if you can increase the Yield, you decide to keep pH at its current setting and vary Temperature. You know from previous experience that at temperatures below 15°C or above 45°C the process runs poorly, so you decide to vary Temperature across that range in 5°C increments and record the results.
When pH is held constant at 5.5, you discover that the maximum Yield of 85% occurs when Temperature is set at 30°C, which is a small improvement from the current settings.
Next, you decide to keep the Temperature fixed at 30°C and vary pH. Based on past experience, you decide to vary pH from 5 to 8 in increments of 0.5 and record the results.
From the outcome of these 13 tests (seven Temperature settings and seven pH settings, with the run at 30°C and pH 5.5 shared between the two series), you conclude that Yield is maximized at 86% when Temperature is set to 30°C and pH is 6. It also appears that Yield decreases as you go above or below those values. In other words, there appears to be curvature in the relationship between each factor and the response.
But can you really be sure that this OFAT experiment has captured the true relationships between Temperature, pH, and Yield? Since you did not vary Temperature and pH together in a systematic way, you can’t investigate the possibility of an interaction between these factors. That is, you can’t determine if the effect of Temperature on Yield changes depending on the level of pH and vice versa. If there is, in fact, an interaction, the shape of the response, Yield, might look very different than what you concluded after your OFAT experiment.
What kind of experiment would you need to do to assess whether these two factors interact with each other and to understand how the response truly behaves across the ranges of the factors (the experimental region)? You could test every possible combination of Temperature and pH across their ranges, but that would be time-consuming and costly. In this example, you would need to conduct 49 tests (7 Temperature settings × 7 pH settings) to cover the entire experimental region, assuming you change the factors by the same increments.
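The run counts in the OFAT example can be checked by enumerating the settings. The sketch below assumes the levels described above (15°C to 45°C in 5°C steps, pH 5 to 8 in steps of 0.5); the variable names are illustrative only.

```python
from itertools import product

# Factor levels from the example: 15-45 °C in 5 °C steps, pH 5-8 in 0.5 steps.
temperatures = [15 + 5 * i for i in range(7)]   # 15, 20, ..., 45
ph_levels = [5.0 + 0.5 * i for i in range(7)]   # 5.0, 5.5, ..., 8.0

# OFAT: vary Temperature at pH 5.5, then vary pH at Temperature 30 °C.
# Using a set counts the shared run (30 °C, pH 5.5) only once.
ofat_runs = {(t, 5.5) for t in temperatures} | {(30, ph) for ph in ph_levels}

# Full grid: every Temperature level paired with every pH level.
grid_runs = set(product(temperatures, ph_levels))

print(len(ofat_runs), len(grid_runs))
```

The OFAT runs form a cross through the grid, one row and one column, so most of the experimental region is never visited.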
DOE provides a better way to learn how Temperature and pH affect Yield and whether they interact. Let’s take a look.
Using a designed experiment
For this two-factor example, the experimental region is a square, and we start with its four corners: both factors at their low levels, both factors at their high levels, and the two combinations where one factor is low and the other is high.
Testing these treatment combinations allows you to estimate the individual effects of each factor on Yield, as well as their possible interaction. Adding tests with the factors at their middle levels allows you to estimate any curvature in the shape of the response.
In all, there are nine treatment combinations: three levels of Temperature (low, middle, and high) crossed with three levels of pH. Replicating (repeating) at least one of the treatment combinations when possible gives you an estimate of run-to-run variation, which allows you to test the statistical significance of the terms in your model. In this example, the experiment consists of 12 tests, or runs: nine treatment combinations plus three replicates. To keep sources of variation outside of the experiment from biasing the results, you perform the runs in random order and record the results.
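A run sheet for a design like this can be sketched in a few lines. This sketch makes two assumptions that the text does not state. It takes the middle levels to be the midpoints of the ranges (30°C and pH 6.5), and it draws the three replicated combinations at random. The function name `build_design` is hypothetical.

```python
import random
from itertools import product

def build_design(seed=0):
    """Return 12 runs in random order: a 3x3 design plus three replicates."""
    temperatures = [15, 30, 45]    # low, middle (assumed midpoint), high
    ph_levels = [5.0, 6.5, 8.0]    # low, middle (assumed midpoint), high
    combos = list(product(temperatures, ph_levels))   # nine treatment combinations

    rng = random.Random(seed)      # seeded so the run sheet is reproducible
    # Assumption: which combinations get replicated is not specified,
    # so three distinct ones are drawn at random here.
    replicates = rng.sample(combos, 3)

    runs = combos + replicates
    rng.shuffle(runs)              # randomize the run order
    return runs

runs = build_design()
for order, (temp, ph) in enumerate(runs, start=1):
    print(order, temp, ph)
```

In practice the replicated points are often chosen deliberately (for example, the center of the region) rather than at random; the random choice here only keeps the sketch short.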
The results of this experiment indicate that the maximum Yield (91%) occurs when Temperature is at 45°C and pH is at 8, which is an improvement over the best results of the OFAT experiment (where the maximum Yield was 86%). Is it possible that there is an untested combination of settings for Temperature and pH within the experimental region that would produce an even higher Yield? How would you answer that question without performing more tests?
When you analyze the data, you’re building a statistical model that describes the relationship between Temperature, pH, and Yield. This is an interpolating model, meaning you can use it to make predictions at untested combinations of the factors within the experimental region. The model includes terms for the individual effects of Temperature and pH, their interaction, and their quadratic effects, where the βs are the estimated coefficients:
$$ \text{Predicted Yield} = \beta_0 + \beta_1\,\text{Temp} + \beta_2\,\text{pH} + \beta_{12}\,\text{Temp} \times \text{pH} + \beta_{11}\,\text{Temp}^2 + \beta_{22}\,\text{pH}^2 $$
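One common way to estimate the six coefficients in a model of this form is ordinary least squares. The sketch below uses NumPy and is not necessarily how any particular statistics package fits the model. The data are placeholders generated from made-up coefficients, not the results of the experiment described here, and the helper names `model_matrix` and `fit_quadratic` are hypothetical.

```python
import numpy as np

def model_matrix(temp, ph):
    """Columns: intercept, Temp, pH, Temp*pH, Temp^2, pH^2 (same order as the equation)."""
    temp = np.asarray(temp, dtype=float)
    ph = np.asarray(ph, dtype=float)
    return np.column_stack([np.ones_like(temp), temp, ph, temp * ph, temp**2, ph**2])

def fit_quadratic(temp, ph, yield_pct):
    """Ordinary least-squares estimates of the six coefficients."""
    X = model_matrix(temp, ph)
    coef, *_ = np.linalg.lstsq(X, np.asarray(yield_pct, dtype=float), rcond=None)
    return coef

# Placeholder data, NOT the article's results: a 3x3 grid of settings with
# yields computed from made-up coefficients, so the fit can be checked.
temp, ph = np.meshgrid([15, 30, 45], [5.0, 6.5, 8.0])
temp, ph = temp.ravel(), ph.ravel()
true_coef = np.array([40.0, 0.5, 6.0, 0.05, -0.01, -0.5])   # hypothetical values
y = model_matrix(temp, ph) @ true_coef

coef = fit_quadratic(temp, ph, y)
print(np.round(coef, 3))
```

Because the placeholder yields contain no noise, the fit should reproduce the made-up coefficients; with real data, the replicated runs are what let you judge how much of the fit is signal.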
In this visualization of the response, the shape is different from the one indicated by the OFAT data. Instead of decreasing from the center of the experimental region, you can see that the surface rises and twists as Temperature and pH increase. The twisting suggests an interaction between the two factors – something that was not, and could not be, detected by the OFAT experiment.
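The interaction can also be read directly from the model. Differentiating the predicted Yield with respect to Temperature, term by term, gives the slope along the Temperature axis:

$$ \frac{\partial\,\text{Predicted Yield}}{\partial\,\text{Temp}} = \beta_1 + \beta_{12}\,\text{pH} + 2\beta_{11}\,\text{Temp} $$

When $\beta_{12}$ is nonzero, this slope depends on the level of pH, which is what the twist in the surface reflects. When $\beta_{12} = 0$, the effect of Temperature is the same at every pH.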
You can then use this model to make predictions about future values of Yield, and in particular, to find settings of Temperature and pH within the experimental region that are predicted to maximize Yield. In this example, the model predicts that Yield will be maximized at 92% when Temperature is set to 45°C and pH is set to 7 – a combination that you did not directly test.
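A simple way to search a fitted model for its best predicted settings is to evaluate it on a fine grid over the experimental region. The sketch below assumes hypothetical coefficient values, which are not the coefficients estimated in this example. The function names are illustrative, and a grid search is only one of several ways to do this.

```python
def predict_yield(coef, temp, ph):
    """Evaluate the quadratic model at one (Temp, pH) setting."""
    b0, b1, b2, b12, b11, b22 = coef
    return b0 + b1 * temp + b2 * ph + b12 * temp * ph + b11 * temp**2 + b22 * ph**2

def best_settings(coef, temp_range=(15.0, 45.0), ph_range=(5.0, 8.0), steps=61):
    """Grid-search the experimental region; return (temp, ph, predicted yield) at the maximum."""
    best = None
    for i in range(steps):
        temp = temp_range[0] + (temp_range[1] - temp_range[0]) * i / (steps - 1)
        for j in range(steps):
            ph = ph_range[0] + (ph_range[1] - ph_range[0]) * j / (steps - 1)
            pred = predict_yield(coef, temp, ph)
            if best is None or pred > best[2]:
                best = (temp, ph, pred)
    return best

# Hypothetical coefficients (b0, b1, b2, b12, b11, b22), for illustration only.
coef = (40.0, 0.5, 6.0, 0.05, -0.01, -0.5)
best_temp, best_ph, best_pred = best_settings(coef)
print(best_temp, best_ph, round(best_pred, 2))
```

The search stays inside the ranges that were actually tested, which matches the interpolating nature of the model: predictions outside the experimental region are not supported by the data.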
Of course, you will want to confirm this prediction by running a few more tests at the factor settings that predict the maximum Yield!
Summary
In this example, the OFAT experiment starting with the current settings of the process did not find the best settings for maximizing Yield and did not provide the data you would need to evaluate whether there is an interaction between Temperature and pH. In fact, the OFAT method clearly missed the true behavior of the system.
Without a model, the only way to be sure of finding those settings would have been to test every possible treatment combination within the experimental region. In this scenario that would mean 49 separate tests, and there were only two factors. Imagine how many tests this approach would require in a more realistic situation where you have five factors, 10 factors, or more!
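The growth is easy to quantify. As a rough sketch, assume every factor is tested at seven levels, as in the example (real factors need not all have the same number of levels). The number of combinations is then seven raised to the number of factors. The function name `full_grid_runs` is illustrative.

```python
def full_grid_runs(n_factors, levels_per_factor=7):
    """Number of runs needed to test every combination of factor levels."""
    return levels_per_factor ** n_factors

for n_factors in (2, 5, 10):
    print(n_factors, full_grid_runs(n_factors))
# prints:
# 2 49
# 5 16807
# 10 282475249
```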
In our example above, the designed experiment enabled you to fit a model that included the possible interaction between Temperature and pH and make predictions throughout the experimental region without having to test every possible combination of the factors. By doing so, you were able to find the best settings with far fewer runs than if you had to test all the combinations (12 runs vs. 49, or about 25% as many). This was a simple example with only two factors. The benefits of using DOE grow as the number of factors increases.