### Statistics Knowledge Portal

A free online introduction to statistics

# Design of experiments

## What is design of experiments?

Design of experiments (DOE) is a systematic, efficient method that enables scientists and engineers to study the relationship between multiple input variables (aka factors) and key output variables (aka responses). It is a structured approach for collecting data and making discoveries.

## When to use DOE?

• To determine whether a factor, or a collection of factors, has an effect on the response.
• To determine whether factors interact in their effect on the response.
• To model the behavior of the response as a function of the factors.
• To optimize the response.

Ronald Fisher first introduced four enduring principles of DOE in 1926: the factorial principle, randomization, replication and blocking. In the past, generating and analyzing these designs relied primarily on hand calculation; more recently, practitioners have started using computer-generated designs for more effective and efficient DOE.

## Why use DOE?

DOE is useful:

• In driving knowledge of cause and effect between factors.
• To experiment with all factors at the same time.
• To run trials that span the potential experimental region for our factors.
• In enabling us to understand the combined effect of the factors.

To illustrate the importance of DOE, let's look at what would happen if DOE did not exist.

Experiments would likely be carried out via trial and error or the one-factor-at-a-time (OFAT) method.

## Trial-and-error method

Test different settings of two factors and see what the resulting yield is.

Say we want to determine the optimal temperature and time settings that will maximize yield through experiments.

What the experiment looks like using the trial-and-error method:

1. Conduct a trial at starting values for the two variables and record the yield.

2. Adjust one or both values based on our results.

3. Repeat Step 2 until we think we've found the best set of values.

As you can tell, the cons of trial-and-error are:

• Inefficient, unstructured and ad hoc (worse if carried out without subject matter knowledge).
• Unlikely to find the optimum set of conditions across two or more factors.

## One factor at a time (OFAT) method

Change the value of one factor and measure the response, then repeat the process with another factor.

In the same experiment, searching for the temperature and time that maximize yield, this is how the experiment looks using the OFAT method:

1. Start with temperature: Find the temperature resulting in the highest yield, between 50 and 120 degrees.

1a. Run a total of eight trials. Each trial increases temperature by 10 degrees (i.e., 50, 60, 70 ... all the way to 120 degrees).

1b. Hold time fixed at 20 hours as a controlled variable.

1c. Measure yield for each batch.

2. Run the second experiment by varying time, to find the optimal value of time (between 4 and 24 hours).

2a. Run a total of six trials. Each trial increases time by 4 hours (i.e., 4, 8, 12 ... up to 24 hours).

2b. Hold temperature fixed at 90 degrees as a controlled variable.

2c. Measure yield for each batch.

3. After a total of 14 trials, we've identified that the maximum yield (86.7%) occurs when:

• Temperature is at 90 degrees; Time is at 12 hours.
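The OFAT schedule described in the steps above can be written out as a short Python sketch. The variable names are illustrative; the levels and the fixed values are the ones given in the steps.

```python
# Levels explored in each sweep, as listed in the steps above.
temperatures = list(range(50, 121, 10))  # 50, 60, ..., 120 degrees
times = list(range(4, 25, 4))            # 4, 8, ..., 24 hours

FIXED_TIME = 20         # hours, held constant during the temperature sweep
FIXED_TEMPERATURE = 90  # degrees, held constant during the time sweep

# Each trial is a (temperature, time) pair.
temperature_sweep = [(temp, FIXED_TIME) for temp in temperatures]
time_sweep = [(FIXED_TEMPERATURE, hours) for hours in times]
ofat_trials = temperature_sweep + time_sweep

# Size of the full temperature-by-time grid at the same levels.
full_grid_size = len(temperatures) * len(times)

print(len(ofat_trials))       # 14 trials in total
print(len(set(ofat_trials)))  # 13 distinct settings: (90, 20) is run in both sweeps
print(full_grid_size)         # 48 possible combinations
```

Every trial sits on one of two straight lines through the region (time = 20 or temperature = 90), so most combinations of the two factors are never tried.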

As you can already tell, OFAT is a more structured approach compared to trial and error.

But there’s one major problem with OFAT: What if the optimal temperature and time settings look more like this?

We would have missed the optimal temperature and time settings if we had relied on our previous OFAT experiments.

Therefore, OFAT's drawback is:

• We’re unlikely to find the optimum set of conditions across two or more factors.

### How our trial and error and OFAT experiments look:

Notice that neither set of experiments includes trials at low temperature and time AND near the optimum conditions.

### What went wrong in the experiments?

• We didn't simultaneously change the settings of both factors.
• We didn't conduct trials throughout the potential experimental region.

The result was a lack of understanding of the combined effect of the two variables on the response. The two factors did interact in their effect on the response!
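To make the missed optimum concrete, the sketch below replays the OFAT procedure on a made-up yield surface with a strong temperature-by-time interaction. The function `toy_yield` and its coefficients are invented purely for illustration and are not the data behind this example. The sketch also assumes that the second sweep holds temperature at whichever value won the first sweep.

```python
def toy_yield(temperature, hours):
    """Hypothetical yield surface, invented for illustration only."""
    x = (temperature - 85) / 35  # temperature coded to [-1, 1] over 50..120
    y = (hours - 14) / 10        # time coded to [-1, 1] over 4..24
    # The last term is the interaction: the effect of x depends on y.
    return 60 + 5 * x - 5 * y - 20 * x * y


temperatures = range(50, 121, 10)
times = range(4, 25, 4)

# OFAT step 1: sweep temperature with time held at 20 hours.
best_temperature = max(temperatures, key=lambda t: toy_yield(t, 20))
# OFAT step 2: sweep time with temperature held at the step 1 winner.
best_time = max(times, key=lambda h: toy_yield(best_temperature, h))
ofat_best = (best_temperature, best_time)

# Reference: the best setting over the whole temperature-by-time grid.
grid_best = max(
    ((t, h) for t in temperatures for h in times),
    key=lambda setting: toy_yield(*setting),
)

print("OFAT settles on", ofat_best, "with yield", toy_yield(*ofat_best))
print("Grid optimum is", grid_best, "with yield", toy_yield(*grid_best))
```

On this invented surface, OFAT stops at 50 degrees and 24 hours, while the best grid point is 120 degrees and 4 hours, a corner that neither sweep visits.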

A more effective and efficient approach to experimentation is to use statistically designed experiments (DOE).

### Apply full factorial DOE to the same example

1. Experiment with two factors, each factor with two values.

These four trials form the corners of the design space:

2. Run all possible combinations of factor levels in random order, to average out the effects of lurking variables.

3. (Optional) Replicate the entire design by running each treatment twice to estimate experimental error:

4. Analyzing the results enables us to build a statistical model that estimates the individual effects (temperature and time) and their interaction.
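Steps 1 to 3 can be sketched in a few lines of Python. The helper `full_factorial` is a hypothetical name, and the low and high levels used here (50 and 120 degrees, 4 and 24 hours) are assumed from the ranges discussed earlier, since the exact design levels are not stated in the text.

```python
import itertools
import random


def full_factorial(levels, replicates=1, seed=None):
    """Return every combination of factor levels, replicated, in random run order."""
    names = list(levels)
    treatments = [
        dict(zip(names, combo))
        for combo in itertools.product(*(levels[name] for name in names))
    ]
    # Replicate each treatment, then shuffle so the run order is random.
    runs = [dict(treatment) for treatment in treatments for _ in range(replicates)]
    random.Random(seed).shuffle(runs)
    return runs


# Assumed low/high levels for illustration.
levels = {"temperature": [50, 120], "time": [4, 24]}
design = full_factorial(levels, replicates=2, seed=1)

for run_number, run in enumerate(design, start=1):
    print(run_number, run)
```

With two factors at two levels each, this produces four treatments; replicating twice gives eight runs. The seed is fixed only to make the run order reproducible.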

The model enables us to visualize and explore the interaction between the factors. Here is an illustration of what their interaction looks like at temperature = 120 and time = 4:
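For a two-level design, the individual effects and the interaction can be estimated directly from contrasts of average yields. The sketch below assumes the factor levels are coded as -1 (low) and +1 (high); the function name `factorial_effects` is hypothetical, and the four yields are made-up numbers for illustration, not results from this example.

```python
def factorial_effects(runs):
    """Estimate effects from runs given as (temperature, time, yield),
    with both factors coded as -1 (low) or +1 (high)."""

    def contrast(sign_of):
        # Average yield where the contrast is +1 minus average where it is -1.
        high = [y for a, b, y in runs if sign_of(a, b) > 0]
        low = [y for a, b, y in runs if sign_of(a, b) < 0]
        return sum(high) / len(high) - sum(low) / len(low)

    return {
        "temperature": contrast(lambda a, b: a),
        "time": contrast(lambda a, b: b),
        "temperature*time": contrast(lambda a, b: a * b),
    }


# Made-up yields at the four corners of the design space.
runs = [
    (-1, -1, 40.0),
    (+1, -1, 90.0),
    (-1, +1, 70.0),
    (+1, +1, 40.0),
]

effects = factorial_effects(runs)
print(effects)
# {'temperature': 10.0, 'time': -10.0, 'temperature*time': -40.0}
```

In these made-up numbers the interaction is much larger than either individual effect: raising temperature increases yield at the short time (40 to 90) but decreases it at the long time (70 to 40). That pattern is what an interaction means, and it is invisible when only one factor is changed at a time.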

You can visualize and explore your model, and find the most desirable settings for your factors, using the JMP Prediction Profiler.

### Summary: DOE vs. OFAT/Trial-and-Error

• DOE requires fewer trials: the full factorial design above uses four (eight if replicated), compared with 14 for OFAT.
• DOE is more effective in finding the best settings to maximize yield.
• DOE enables us to derive a statistical model to predict results as a function of the two factors and their combined effect.
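For two factors, a statistical model of this kind is commonly written as a first-order model with an interaction term. The notation below is introduced here for illustration ($x_1$ for coded temperature, $x_2$ for coded time, $y$ for yield) and is not taken from the original analysis:

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2 + \varepsilon
```

Here $\beta_1$ and $\beta_2$ capture the individual effects of the two factors, $\beta_{12}$ captures their combined effect, and $\varepsilon$ is the experimental error.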