Screening Designs
What are screening designs?
Screening designs are a type of designed experiment that is often conducted as an initial step in identifying the most influential factors among many potential variables affecting a process or outcome. They are an efficient and rigorous way to systematically determine which factors should be included in subsequent experiments, using a relatively small number of experimental runs. Screening is about separating “the vital few from the trivial many.”
When should you use screening designs?
Screening designs can be helpful when any of the following are true:
- There are many potential factors to study.
- The important factors are unknown.
- The effects of the factors are unknown.
You might not need to screen factors in every situation. For example, screening methods might not be necessary in situations where you have a small number of factors, or you can afford all the design runs that are required to fit the most complex model that you can imagine.
Why and how do you use screening designs?
The process of experimenting typically involves a sequence of experiments. Often in the beginning stages of experimentation, you can think of many factors that might influence your process. The first task is to narrow down the long list of potentially important effects (main effects and, in some cases, interaction effects) to the few, most important ones. When there are a lot of factors under consideration, full factorial designs might be too time-consuming or expensive to perform. Full factorial designs can be wasteful, too, since you generally aren’t interested in fitting a model with three-factor and greater interactions (or at least not all of them!). Screening designs, however, can help you identify the largest effects in fewer runs.
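To make the cost argument concrete, here is a minimal sketch that counts the runs in a two-level full factorial design and the number of effects of each order it can estimate. The function names are illustrative, and the sketch assumes every factor has exactly two levels.

```python
from math import comb


def full_factorial_runs(k, levels=2):
    """Number of runs in a full factorial design with k factors."""
    return levels ** k


def effects_by_order(k):
    """Count of effects of each order: 1 = main effects, 2 = two-factor interactions, ..."""
    return {order: comb(k, order) for order in range(1, k + 1)}


for k in (3, 6, 9):
    counts = effects_by_order(k)
    low_order = counts[1] + counts[2]                # main effects + two-factor interactions
    high_order = sum(counts.values()) - low_order    # three-factor interactions and above
    print(f"k={k}: runs={full_factorial_runs(k)}, "
          f"low-order effects={low_order}, higher-order effects={high_order}")
```

With nine two-level factors, the full factorial needs 512 runs, yet only 45 of its 511 estimable effects are main effects or two-factor interactions. The remaining 466 are the higher-order interactions that you generally don't need.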
The effectiveness of screening designs and analysis methods depends on four key principles. While these principles don’t hold in every situation, they’ve been found to be common enough in practice to be quite useful.
Sparsity
The principle of sparsity of effects says that while you might have many candidate factors with many more potential effects, only a small portion of them will actually be important to any one response.
Hierarchy
The hierarchy principle says that the likelihood that an effect is important decreases as the order of the model term increases. In other words, a higher-order model term, like a three-factor interaction, is much less likely to be important than a lower-order model term, like a two-factor interaction, and a two-factor interaction is less likely to be important than a main effect.
Heredity
The principle of heredity says the presence of higher-order terms is usually associated with the presence of lower-order effects of the same factors. So, for example, if the interaction of X1 and X3 is important, then it is more likely that the X1 or X3 main effect is also important.
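The heredity principle can be used to shorten the list of interactions worth considering. The sketch below is one possible way to do that, not a prescribed method. It uses the common terms "weak heredity" (at least one parent main effect is active) and "strong heredity" (both parents are active), which are not used in the text above; the function name and factor labels are illustrative.

```python
from itertools import combinations


def candidate_interactions(factors, active_main_effects, rule="weak"):
    """List the two-factor interactions allowed by a heredity rule.

    rule="weak":   at least one parent main effect is active.
    rule="strong": both parent main effects are active.
    """
    active = set(active_main_effects)
    needed = 2 if rule == "strong" else 1
    return [
        (a, b)
        for a, b in combinations(factors, 2)
        if (a in active) + (b in active) >= needed
    ]


factors = ["X1", "X2", "X3", "X4"]
weak = candidate_interactions(factors, ["X1", "X3"], rule="weak")
strong = candidate_interactions(factors, ["X1", "X3"], rule="strong")
print("weak heredity:  ", weak)
print("strong heredity:", strong)
```

With X1 and X3 active, weak heredity keeps five of the six possible two-factor interactions (everything except X2 with X4), while strong heredity keeps only the X1 and X3 interaction.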
Projection
The projection property refers to how well a design retains desirable statistical properties (such as estimability of effects and independence of estimates) when unimportant factors are removed from the model, and the design is “projected” into a lower-dimensional design that contains only the important factors. A design with good projection properties will produce reliable results when you analyze this subset of factors.
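A small, standard illustration of projection is the eight-run half fraction of a four-factor, two-level design, built by setting D = A*B*C. The sketch below checks that dropping any one factor leaves a full factorial in the remaining three. The helper names are illustrative.

```python
from itertools import combinations, product

# Half fraction of a 2^4 design: full factorial in A, B, C, with D set to A*B*C.
design = [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]
names = "ABCD"


def projection(design, cols):
    """Keep only the selected factor columns of each run."""
    return [tuple(run[i] for i in cols) for run in design]


def covers_full_factorial(runs):
    """True if every combination of the two levels appears at least once."""
    k = len(runs[0])
    return len(set(runs)) == 2 ** k


for cols in combinations(range(4), 3):
    kept = "".join(names[i] for i in cols)
    print(kept, covers_full_factorial(projection(design, cols)))
```

Every three-factor projection contains all eight level combinations, so if only three of the four factors turn out to matter, their main effects and interactions can all be estimated from the original eight runs.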
The first step in planning a screening experiment is to identify all the factors associated with your process. The resulting list will include your factors of interest – the factors that you can change during the experiment and that you expect to affect the response – as well as any factors that might introduce random variability, or noise, to the process. Ideally, you can control these noise factors during the experiment, or account for their effects in your statistical model. You also want to consider the possibility of higher-order effects, such as interactions.
In practice, however, you might not be able to include all the potential factors and their interactions in your experiment. Design decisions will depend on several considerations, including:
- The number of factors to be screened in relation to your experimental budget.
- The time required to run an experiment.
- The complexity of the model you want to fit (i.e., do you only want to estimate main effects, or do you also want to estimate some or all two-factor interactions?).
- The nature and amount of prior knowledge or subject matter expertise.
- The cost of missing important information should you fail to detect an effect that’s actually important.
In some cases, you might need to perform more than one experiment to identify the important effects. For example, maybe the initial experiment didn’t allow you to estimate two-factor interactions, so you must perform additional experiments to test those.
There are many methods that you can use to design a screening experiment. “Classic” designs, such as fractional factorial designs and Plackett-Burman designs, were developed in the mid-20th century, and while they are widely familiar, they have limitations. Modern methods, like custom designs and definitive screening designs, use an algorithmic approach and offer many advantages. Regardless of the method you use, screening designs are a first step toward determining how to improve or optimize a process.
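As an illustration of a classic design, the sketch below builds the 12-run Plackett-Burman design by cyclically shifting a generator row and appending a row of all low levels. The generator shown is the commonly published one for 12 runs. The function names are illustrative, and the code checks the design's key properties rather than assuming them.

```python
# Commonly published generator row for the 12-run Plackett-Burman design.
GENERATOR_12 = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]


def plackett_burman_12():
    """Return a 12-run, 11-column two-level design coded as -1/+1."""
    n = len(GENERATOR_12)
    # Eleven cyclic shifts of the generator, then one row with every factor low.
    rows = [GENERATOR_12[-shift:] + GENERATOR_12[:-shift] for shift in range(n)]
    rows.append([-1] * n)
    return rows


def columns_are_orthogonal(design):
    """True if every pair of columns has a zero dot product."""
    n_cols = len(design[0])
    return all(
        sum(row[i] * row[j] for row in design) == 0
        for i in range(n_cols)
        for j in range(i + 1, n_cols)
    )


design = plackett_burman_12()
balanced = all(sum(row[j] for row in design) == 0 for j in range(11))
print(len(design), "runs;", "balanced:", balanced,
      "orthogonal:", columns_are_orthogonal(design))
```

Twelve runs are enough to estimate up to eleven main effects independently of one another, which is why such designs are economical. The trade-off is that two-factor interactions are partially confounded with the main effects.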
Screening designs: An example
Suppose you are developing a manufacturing process where the responses of interest are Yield and Impurity. You’ve been tasked with finding the settings in your process that will maximize Yield and minimize Impurity. First, you need to understand what affects the responses and in what way.
You and your team come up with nine factors that you believe might be important in affecting Yield and Impurity. Seven of the factors are continuous, and two are categorical. Based on prior experience, you and your team choose factor ranges and levels that should, if the factor is truly important, produce a large enough change in the response to be detected by your experiment.
The factors and their ranges or levels are:
- Blend Time: 10-30 minutes
- Pressure: 60-80 kPa
- pH: 5-8
- Stir Rate: 100-120 rpm
- Catalyst: 1-2%
- Temperature: 15-45 degrees C
- Feed Rate: 10-15 L/min
- Vendor: Cheap, Fast, Good
- Particle Size: Small, Large
You don’t expect all nine factors to be important (sparsity of effects principle), but at this point, you don’t know which ones will be. You suspect that there might be at least one two-factor interaction and possibly quadratic effects, but you expect that they will be less important than the main effects (hierarchy principle). You also assume that any interactions that are present will involve the important main effects that you identify from your experiment (heredity principle). Lastly, you know that if you can remove unimportant effects from your model, you might be able to estimate interaction effects involving the important main effects, even if the original design did not allow their estimation (projection property).
There are many possible screening strategies: small experiments that allow you to estimate main effects only; medium-sized experiments that allow you to estimate main effects and some two-factor interactions; or large experiments that allow you to estimate main effects and all possible two-factor interactions. Which strategy you use is largely based on the considerations described above. (Quadratic effects are typically tested in optimization experiments once you have identified the important factors.)
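To compare these strategies for the nine factors listed above, the sketch below counts the parameters each model would need and the size of the full factorial. It assumes that each continuous factor is run at two levels (low and high) and that a categorical factor with m levels uses m - 1 parameters; the variable names are illustrative.

```python
from itertools import combinations
from math import prod

# Number of levels per factor; continuous factors are assumed to use two levels.
levels = {
    "Blend Time": 2, "Pressure": 2, "pH": 2, "Stir Rate": 2,
    "Catalyst": 2, "Temperature": 2, "Feed Rate": 2,
    "Vendor": 3, "Particle Size": 2,
}

# Parameters (degrees of freedom) needed for each main effect.
df = {name: n - 1 for name, n in levels.items()}

full_factorial_runs = prod(levels.values())
main_effects_params = 1 + sum(df.values())        # intercept + main effects
two_factor_params = sum(df[a] * df[b] for a, b in combinations(df, 2))
all_2fi_params = main_effects_params + two_factor_params

print("Full factorial runs:              ", full_factorial_runs)
print("Main-effects model parameters:    ", main_effects_params)
print("Two-factor interaction parameters:", two_factor_params)
print("Main effects + all 2FI parameters:", all_2fi_params)
```

Under these assumptions, a full factorial would take 768 runs. A main-effects model has 11 parameters, and adding every two-factor interaction raises the total to 55, so each strategy needs at least that many runs to fit its model.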
For this example, suppose you have a fairly small budget for your screening experiment, so you decide to start with a main-effects-only design to screen for important factors. While you realize this could be a risky strategy, particularly if there are interaction effects present that are stronger than the main effects, you feel comfortable relying on the screening principles and have budgeted for additional experiments if you need to clarify the results of this one.
With the strategy and planning done, you design an experiment with 22 runs. Four of these are center points – runs where all of the continuous factors are at their middle levels.
There are several reasons to consider including center points in a design. Center points can provide replication in an otherwise unreplicated design, allowing you to estimate the pure error and get statistical tests for the terms in your model. You might choose to spread center point runs throughout the design to monitor whether there are any unexpected changes occurring in your process during the experiment; you expect the responses from those runs to be similar to one another because they are replicates.
In the context of screening designs, center points can be used to detect the presence of curvature in the response through a lack of fit test. A statistically significant lack of fit test indicates that the model might be missing one or more quadratic terms. Because the experiment was not designed to allow estimation of quadratic terms – just their detection – a significant lack of fit test suggests the need for additional experimentation to understand the curvature in the response.
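The idea that center points can detect curvature without estimating it can be sketched with a simplified, noise-free simulation. This is not the lack of fit test itself. The response function and its coefficients are hypothetical. In a two-level design, the difference between the average of the factorial runs and the average of the center runs reflects the sum of the quadratic coefficients, not each one separately.

```python
from itertools import product
from statistics import mean


def response(x1, x2, b11, b22):
    """Hypothetical noise-free response with quadratic coefficients b11 and b22."""
    return 50 + 3 * x1 + 2 * x2 + b11 * x1 ** 2 + b22 * x2 ** 2


def curvature_estimate(b11, b22, n_center=4):
    """Mean of the factorial runs minus mean of the center runs."""
    factorial_runs = list(product((-1, 1), repeat=2))
    center_runs = [(0, 0)] * n_center
    y_factorial = [response(x1, x2, b11, b22) for x1, x2 in factorial_runs]
    y_center = [response(x1, x2, b11, b22) for x1, x2 in center_runs]
    return mean(y_factorial) - mean(y_center)


# All of the curvature in x1, versus curvature split between x1 and x2.
print(curvature_estimate(b11=-4, b22=0))
print(curvature_estimate(b11=-1, b22=-3))
# No curvature at all.
print(curvature_estimate(b11=0, b22=0))
```

The first two cases give the same difference even though the curvature comes from different factors, which is why further experimentation is needed to attribute it. In a real experiment, this difference would be judged against the pure error estimated from the replicated center points.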
You run the experiment and record the responses, Yield and Impurity, shown in the table below.
You use multiple linear regression to fit a model for each response. The factors are shown in order of importance (based on a measure called logworth) for each response in the graph below.
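Logworth is commonly defined as the negative base-10 logarithm of a term's p-value, so larger values indicate stronger evidence of an effect. The sketch below assumes that definition. The factor labels and p-values are made up for illustration and are not the results of this experiment.

```python
from math import log10


def logworth(p_value):
    """Negative base-10 log of a p-value; a p-value of 0.01 gives a logworth of 2."""
    return -log10(p_value)


# Made-up p-values for three hypothetical model terms.
p_values = {"Factor A": 0.0004, "Factor B": 0.03, "Factor C": 0.41}

ranked = sorted(p_values, key=lambda name: logworth(p_values[name]), reverse=True)
for name in ranked:
    print(f"{name}: p={p_values[name]}, logworth={logworth(p_values[name]):.2f}")
```

Because the scale is logarithmic, a logworth of 2 corresponds to a p-value of 0.01, and each additional unit corresponds to a p-value ten times smaller.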
Based on your screening experiment, you determine that for Yield, the largest effects are Temperature and pH. The largest effects for Impurity are Temperature, pH, and Vendor.
Based on these results, your next steps might include reducing the model (removing unimportant terms), fitting a new model with the important terms and their interactions (if possible), and examining the lack of fit test to see whether there is any evidence for curvature in either response. These results can then guide your decisions about subsequent experiments as you try to understand and eventually optimize your process.