Box-Behnken Designs

What is a Box-Behnken design?

Box-Behnken designs are classic response surface designs constructed from a subset of the design points of a factorial design in which each factor is set at three levels. Apart from the center points, the design points lie on the edges of a hypercube (at the edge midpoints in the three-factor case) and are equidistant from the center. Box-Behnken designs avoid extreme points, and they cannot be built up from factorial designs for sequential experimentation.

When should I use a Box-Behnken design?

Box-Behnken designs are used for response surface experiments where the primary goals are prediction and optimization. A Box-Behnken design can be advantageous when you need to avoid combined factor extremes, such as running every continuous factor at its high level in the same run. Sometimes runs involving factor extremes are dangerous, physically impossible, or too expensive. Box-Behnken designs include continuous factors only.

An introduction to Box-Behnken designs

Box-Behnken designs are a type of classical response surface methodology (RSM) design and are known for avoiding extreme points. They include only continuous factors, each tested at three levels: low, middle, and high, or -1, 0, and +1 in coded units.

Box-Behnken designs have two distinct characteristics: they are rotatable (or nearly rotatable), and they are spherical. In a rotatable design, the prediction variance depends only on the distance from the design center, so it keeps the same value when the design is rotated about its center. In a spherical design, all non-center design points are an equal distance from the center; this contrasts with cuboidal designs, which place design points at the extremes (vertices) of the cube. Box-Behnken designs can be advantageous when you need to avoid extreme points because of cost or undesirable side effects. However, the lack of design points at the vertices of the cube leads to higher prediction variance near the vertices, where there is no data. Below is a 3D scatter plot of the design points for three factors. Notice that there are no design points in the corners of the design space.
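The spherical property can be checked numerically. The following is a minimal sketch in Python, not JMP output; the function name `box_behnken_edge_points` is a hypothetical helper, and the construction covers the three-factor case only. It builds the twelve non-center points in coded units and compares their distance from the center with that of a cube vertex.

```python
from itertools import combinations, product
import math

def box_behnken_edge_points():
    """Non-center Box-Behnken points for three factors, in coded units.

    Hypothetical helper: for each pair of factors, take all four
    low/high (-1/+1) combinations and hold the third factor at 0.
    """
    points = []
    for i, j in combinations(range(3), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0, 0, 0]
            run[i], run[j] = a, b
            points.append(tuple(run))
    return points

center = (0, 0, 0)
edge_points = box_behnken_edge_points()

# Distinct distances from the center, rounded to absorb floating-point noise.
distances = {round(math.dist(p, center), 6) for p in edge_points}

# Distance from the center to a cube vertex such as (+1, +1, +1).
corner_distance = math.dist((1, 1, 1), center)

print(len(edge_points), distances, corner_distance)
```

Every non-center point sits at distance sqrt(2) from the center, while the vertices sit at sqrt(3), which is why predictions near the vertices are extrapolations beyond the sphere of design points.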

Learn how to construct a Box-Behnken design in JMP

https://www.youtube.com/watch?v=QcKTOfYhGZw

When you have factor types other than continuous (such as categorical factors), restrictions on the design space, non-standard models, or a custom run budget, consider algorithmic designs for their added flexibility.

An example of a Box-Behnken design

Suppose we want to optimize tennis ball bounciness, which is measured as Stretch, to a standardized target value. Tennis ball bounce can vary depending on the amounts of Silica, Sulfur, and Silane used during manufacturing. The goal of the experiment is to improve the manufacturing process by finding factor settings that produce a Stretch of 450. To be acceptable, product must have a Stretch between 350 and 550. We believe a second-order model is required and choose a Box-Behnken design so that the factors do not take extreme values at the same time. To create a Box-Behnken design, we first choose the response goal and factor ranges.

The response and factors are:

Each continuous factor is tested at a low, middle, and high value in a Box-Behnken design, which supports a second-order model with quadratic and interaction terms. As an example, Silica is tested at 0.7, 1.2, and 1.7. With three continuous factors, we choose a 15-run Box-Behnken design that includes three center points. Twelve runs cover all four combinations of low and high for each of the three pairs of factors while the remaining factor is held at its middle level; the other three runs are the center points. Below is the randomized design table with a column that shows the pattern of the design points.
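The run structure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the way JMP builds the table: `box_behnken_3` and `decode` are hypothetical helper names, the random seed is arbitrary, and only the Silica range (0.7 to 1.7) is taken from the text, so the other two factors are left in coded units.

```python
import random
from itertools import combinations, product

def box_behnken_3(n_center=3):
    """Three-factor Box-Behnken design in coded units (hypothetical helper)."""
    runs = []
    # 3 factor pairs x 4 low/high combinations = 12 runs,
    # each with the remaining factor at its middle level (0).
    for i, j in combinations(range(3), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0, 0, 0]
            run[i], run[j] = a, b
            runs.append(tuple(run))
    # Replicated center points.
    runs.extend([(0, 0, 0)] * n_center)
    return runs

def decode(coded, low, high):
    """Map a coded level (-1, 0, +1) to actual units for a given factor range."""
    center = (low + high) / 2
    half_range = (high - low) / 2
    return center + coded * half_range

design = box_behnken_3()
random.Random(1).shuffle(design)  # arbitrary seed, used only to randomize run order

# Silica is the first factor; its range comes from the example in the text.
silica_levels = sorted({round(decode(run[0], 0.7, 1.7), 2) for run in design})

for silica, sulfur, silane in design:
    print(round(decode(silica, 0.7, 1.7), 2), sulfur, silane)
```

The same `decode` call would convert the Sulfur and Silane columns once their low and high settings are supplied.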

For educational purposes, below is the same table sorted by the pattern column. Notice how each non-center run has one factor at its middle value, while the other two factors are at either a high or low value. None of the runs are at the corners of the design space, where there are combined factor extremes such as +++, ---, or +--. This is a characteristic of a Box-Behnken design.
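A pattern column of this kind can be derived from the coded levels. The sketch below is a hedged illustration: the `pattern` function and the plain ASCII symbols `-`, `0`, and `+` are assumptions chosen to match the notation in the text, not a reproduction of the JMP table. It also confirms that no run falls on a corner of the cube.

```python
from itertools import combinations, product

SYMBOL = {-1: "-", 0: "0", 1: "+"}

def pattern(run):
    """Encode a coded run such as (-1, 0, 1) as a pattern string such as '-0+'."""
    return "".join(SYMBOL[level] for level in run)

# Rebuild the 15 coded runs: 12 edge runs plus 3 center points.
runs = []
for i, j in combinations(range(3), 2):
    for a, b in product((-1, 1), repeat=2):
        run = [0, 0, 0]
        run[i], run[j] = a, b
        runs.append(tuple(run))
runs.extend([(0, 0, 0)] * 3)

patterns = sorted(pattern(r) for r in runs)

# The eight corners of the cube: every factor at an extreme.
corners = {"".join(c) for c in product("+-", repeat=3)}
corner_runs = [p for p in patterns if p in corners]

print(patterns)
print("runs at corners:", corner_runs)
```

Each non-center pattern contains exactly one `0`, and the list of corner runs is empty, which matches the description of the sorted table.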

The values for Stretch are recorded in the data table after the experiment is executed following the original randomized design run order.

To analyze the experimental data, we use multiple linear regression to fit the initial full model for Stretch. For a second-order model in three factors, the terms are the intercept, the three main effects, the three two-factor interactions, and the three quadratic terms:
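As a rough sketch, the terms of a full second-order model in the three factors can be enumerated programmatically. The `A*B` naming convention for interactions and quadratics is an assumption made for illustration and may not match the labels in the original effect list.

```python
from itertools import combinations

factors = ["Silica", "Sulfur", "Silane"]

main_effects = list(factors)
# Two-factor interactions: one term per pair of distinct factors.
interactions = [f"{a}*{b}" for a, b in combinations(factors, 2)]
# Quadratic (curvature) terms: each factor crossed with itself.
quadratics = [f"{f}*{f}" for f in factors]

terms = ["Intercept"] + main_effects + interactions + quadratics

for term in terms:
    print(term)
print("number of parameters:", len(terms))
```

The full model has ten parameters, so the 15-run design leaves five degrees of freedom beyond those needed to estimate them.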

Inactive effects are removed from the model using variable selection. Active effects are terms that are statistically significant and influence the response. The effect summary for the model with only the active effects shows curvature in Sulfur and Silica, through their quadratic terms. There are also active interactions between Silica and Sulfur, as well as between Sulfur and Silane.

Let’s examine the reduced model for Stretch below through cross-sections of the response surface, or profiles. The curved profiles for Sulfur and Silica show their active quadratic effects. Silane, which has no active quadratic effect, has a linear profile. With the factors set at their respective midpoints, the predicted Stretch is 396.15. We want to find a combination of factor settings that produces a Stretch of 450.

In this example, setting Silica to 1.06, Sulfur to 1.91, and Silane to 43.82 is predicted to match the target of 450 for Stretch. There might be other factor combinations that would produce similar results.