Response Surface Methodology
What is response surface methodology?
Response surface methodology (RSM) is used to develop, improve, or optimize a product or process. It is a collection of statistical, graphical, and mathematical techniques for exploring and modeling the shape of a response across the experimental region.
When should response surface methodology be used?
Use screening designs to identify the factors that are important in determining the qualities of your product or the results of your process. After identifying the factors that are important to you, RSM helps you determine the factor settings to optimize your response or responses. Specifically, RSM can help you if you have:
- A general objective in mind for your response (e.g., higher is better, lower is better, or within a range is better).
- A specific target for your response (e.g., the response must be at least 90%, less than 5 mg/L, or at 20 ± 0.1 mm).
- More than one response, with a unique goal for each.
In addition, RSM is most useful when you are confident that an optimum exists within the experimental region.
Why use response surface methodology?
Let’s suppose you used a screening design to narrow down the list of possible factors that are important in your process. Now, you want to find the factor settings that optimize one or more responses, but you need more detailed information than you can obtain with a typical screening experiment.
Or maybe you already know which factors are important, but you suspect the process could be improved to better meet your goals. In either case, understanding the shape of the response – the response surface – will help you identify the factor settings that are predicted to produce the optimum response, however that is defined (maximum, minimum, or target).
Let’s assume for the moment that all the factors under consideration are continuous.
An experimental factor that’s tested at two levels provides the data required to estimate a linear effect. For a single factor experiment, the response surface is modeled as a line; for two factors, the surface is modeled as a plane; and for three or more factors, the surface is modeled as a hyperplane.
But often the effect of a factor on the response is not linear. There might be a peak (maximum) or a valley (minimum) in the response between the low and high levels of a factor. In other words, you might see curvature in the response surface. In those cases, optimum conditions are difficult or even impossible to find using designs that assume a linear response. To estimate curvature in the response, you need to add new effects to your model, which requires more data points. For example, to estimate a quadratic effect (i.e., X²), a continuous factor must be tested at three levels, not just two. RSM designs include a third level (or sometimes more) for each continuous factor in your experiment.
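The need for a third level can be checked numerically. The following is a minimal sketch, assuming a single continuous factor in coded units (-1 for the low level, +1 for the high level); the function name `quadratic_model_matrix` is made up for this illustration. It compares the rank of the model matrix for a two-level and a three-level experiment.

```python
import numpy as np

def quadratic_model_matrix(levels):
    """Return model-matrix columns [intercept, X, X^2] for one continuous factor."""
    x = np.asarray(levels, dtype=float)
    return np.column_stack([np.ones_like(x), x, x**2])

# Two levels (coded -1 and +1), each replicated once.
# X^2 equals 1 on every run, so it cannot be separated from the intercept.
two_level = quadratic_model_matrix([-1, 1, -1, 1])

# Three levels (coded -1, 0, +1): the center run breaks that confounding.
three_level = quadratic_model_matrix([-1, 0, 1])

rank_two = np.linalg.matrix_rank(two_level)
rank_three = np.linalg.matrix_rank(three_level)

print("two levels:   rank", rank_two, "of 3 columns")
print("three levels: rank", rank_three, "of 3 columns")
```

With two levels the matrix has rank 2, so only the intercept and the linear effect can be estimated, no matter how many replicates are run. With three levels the rank is 3, and the quadratic effect becomes estimable.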
These designs can be performed as stand-alone experiments, but they can also build on your existing data to estimate quadratic effects by specifying factor levels and treatment combinations that weren’t included in previous designs. Suppose you’ve performed a screening experiment to find out which factors are important. You’ve already collected valuable data about your system or process, but you typically won’t have the data you need to estimate curvature. Augmenting your existing data with a subsequent experiment that includes a third level for the continuous factors can be an efficient way to model the response surface.
Finding the factor settings that optimize one or more responses can be challenging. In most situations, a response depends on multiple factors, making the shape of the response surface multidimensional and potentially complex if the factors interact with one another. Response surface methodology enables you to explore that surface to find the best factor settings within the experimental region to meet your response goals.
Response surface methodology: An example
Let’s assume we are interested in optimizing a process by identifying which factor settings produce the highest Yield and lowest Impurity. We’ve been told that the goals for the two responses are equally important. Based on previous knowledge, we know three factors are important, but we want to improve the process by finding the operating conditions that best meet our goals. In this example, we are not augmenting a previous design as described above but creating a new RSM design.
The responses and factors are:
- Yield: The response goal is to maximize (higher is better)
- Impurity: The response goal is to minimize (lower is better)
- pH: The factor range is 5 to 8
- Temperature: The factor range is 15° to 45° Celsius
- Vendor: There are three vendors (factor levels): Good, Fast, and Cheap
For both continuous factors, pH and Temperature, we want to determine if there is a peak or a valley in either response between the high and low levels of the factors. The range of interest for pH is 5 to 8. To understand if the effect of pH on Yield and Impurity is not linear, we test a third, middle level between 5 and 8 (6.5). We also do this for Temperature with factor levels of 15°, 30°, and 45°. The three levels allow us to estimate the quadratic effect (curvature) for the continuous factors. Quadratic effects cannot be estimated for categorical factors. We are also interested in whether there are any interactions between the three factors. An RSM design with 18 runs will allow us to estimate the main effects, two-factor interactions, and quadratic effects.
After running the experiment, the responses are recorded in the data table.
Use multiple linear regression to model the responses. The full model for both responses includes the following terms:
- Intercept
- Three main effects (pH, Temperature, and Vendor)
- Three two-factor interactions
- Two quadratic effects (for the continuous factors)
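To see how many parameters the terms listed above require, the sketch below builds a model matrix by hand. Several choices here are assumptions made for illustration, not details from the example: the continuous factors are rescaled to coded units, Vendor is represented by two indicator columns with "Good" as an arbitrary reference level, and the rows come from the full 3 × 3 × 3 grid of factor levels rather than from the actual 18-run design, whose runs are not listed here.

```python
import itertools
import numpy as np

PH_LEVELS = [5.0, 6.5, 8.0]
TEMP_LEVELS = [15.0, 30.0, 45.0]
VENDORS = ["Good", "Fast", "Cheap"]

def code(value, low, high):
    """Rescale a continuous factor from [low, high] to the coded range [-1, +1]."""
    return (value - (low + high) / 2) / ((high - low) / 2)

def model_row(ph, temp, vendor):
    """One row of the full model matrix for a single run."""
    x1 = code(ph, 5.0, 8.0)
    x2 = code(temp, 15.0, 45.0)
    # A three-level categorical factor needs two columns (reference level: "Good").
    v_fast = 1.0 if vendor == "Fast" else 0.0
    v_cheap = 1.0 if vendor == "Cheap" else 0.0
    return [
        1.0,                         # intercept
        x1, x2, v_fast, v_cheap,     # main effects
        x1 * x2,                     # pH x Temperature
        x1 * v_fast, x1 * v_cheap,   # pH x Vendor
        x2 * v_fast, x2 * v_cheap,   # Temperature x Vendor
        x1**2, x2**2,                # quadratic effects
    ]

# Candidate grid of all level combinations (NOT the 18-run design itself).
candidates = list(itertools.product(PH_LEVELS, TEMP_LEVELS, VENDORS))
X = np.array([model_row(*run) for run in candidates])

n_params = X.shape[1]
rank = np.linalg.matrix_rank(X)
print("candidate runs:", X.shape[0], "parameters:", n_params, "rank:", rank)
```

Under this coding the full model has 12 parameters, because Vendor and each of its interactions contribute two columns rather than one. An 18-run design therefore leaves 6 degrees of freedom for estimating error, provided its runs are chosen so that all 12 columns are linearly independent.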
After you fit separate models for Yield and Impurity, you can use variable selection to remove nonsignificant terms from the models. The terms that remain in the reduced models for each response are shown in the table below.
Note: Model terms in italics were not significant but remain in the models because they are involved in higher-order terms.
We found that both pH and Temperature, along with one or more of their higher-order effects, were important for both responses. However, Vendor (and its interaction with pH) was important only for Impurity.
You can visualize the shape of the response surface for the continuous factors with a three-dimensional surface plot. You can see in the plots where in the experimental region you could obtain higher values of Yield and lower values of Impurity.
[Figure: surface plots of Yield and Impurity over pH and Temperature]
You can also visualize the response surface by looking at cross sections of the surface, or profiles. Here you can see how changing the level of an individual factor affects the predicted values of the responses (shown on the left). And you can see how the profiles of factors that are involved in interactions change depending on the level of the other factor.
The response goals are to maximize Yield and minimize Impurity. It’s possible to find a combination of factor settings within the experimental region that balances the tradeoffs between both goals. In this example, setting the pH to 6.85, the Temperature to 34.25°, and using the vendor Fast gives the best balance: a predicted Yield of 94.12% and a predicted Impurity of 0.89%.
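One common way to balance two response goals like these is the desirability-function approach: each predicted response is mapped to a score between 0 and 1, and the scores are combined with a geometric mean. The sketch below illustrates that idea only. It is not necessarily the method used to obtain the settings above, and every coefficient, desirability limit, and function name in it is a hypothetical placeholder, not a value from the fitted models in this example.

```python
import itertools
import numpy as np

# HYPOTHETICAL models in coded units (x1 = pH, x2 = Temperature, both in [-1, +1]).
# The coefficients are invented for illustration and are NOT the fitted models.
def predict_yield(x1, x2):
    return 90.0 + 2.0 * x1 + 1.5 * x2 - 3.0 * x1**2 - 2.5 * x2**2 + 0.5 * x1 * x2

VENDOR_SHIFT = {"Good": 0.3, "Fast": 0.0, "Cheap": 0.6}  # hypothetical vendor effects

def predict_impurity(x1, x2, vendor):
    return 1.5 - 0.4 * x1 - 0.2 * x2 + 0.6 * x1**2 + 0.4 * x2**2 + VENDOR_SHIFT[vendor]

def desirability_max(y, low, high):
    """0 at or below `low`, 1 at or above `high`, linear in between."""
    return float(np.clip((y - low) / (high - low), 0.0, 1.0))

def desirability_min(y, low, high):
    """1 at or below `low`, 0 at or above `high`, linear in between."""
    return float(np.clip((high - y) / (high - low), 0.0, 1.0))

def overall_desirability(x1, x2, vendor):
    # Hypothetical limits; equal weights reflect equally important goals.
    d_yield = desirability_max(predict_yield(x1, x2), 80.0, 95.0)
    d_impurity = desirability_min(predict_impurity(x1, x2, vendor), 0.5, 3.0)
    return float(np.sqrt(d_yield * d_impurity))  # geometric mean of the two scores

# Coarse grid search over the coded experimental region and all vendors.
grid = np.linspace(-1.0, 1.0, 41)
best = max(itertools.product(grid, grid, VENDOR_SHIFT),
           key=lambda s: overall_desirability(*s))

best_x1, best_x2, best_vendor = best
best_ph = 6.5 + 1.5 * best_x1       # decode to the pH range 5 to 8
best_temp = 30.0 + 15.0 * best_x2   # decode to the Temperature range 15 to 45
print(best_ph, best_temp, best_vendor, overall_desirability(*best))
```

A grid search is used here only because it is easy to follow; a numerical optimizer would normally replace it. Changing the weights in the geometric mean is one way to express that one response matters more than the other.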
There might be other factor combinations that would produce similar results. You could also find settings that would produce higher Yield, for example, but at the cost of higher Impurity, or vice versa. You might be willing to accept that tradeoff if optimizing one of the responses were more important than the other.