Key Principles of Experimental Design
What are the three basic principles of experimental design?
- Randomization of runs, or experimental trials, prevents systematic biases from being introduced into the experiment. Randomization refers not only to performing the runs in a random order, but also to resetting the conditions after each run.
- Blocking is a design technique that is used to reduce or control variability from nuisance factors.
- Replication enables the experimenter to obtain an estimate of experimental error.
These three basic principles or techniques are fundamental to experimental design. Randomization is used to reduce bias, while the other two principles help increase precision in the experiment.
Randomization
For each trial in the experiment, or experimental run, you apply a treatment and record the response. When you randomize the experiment, you run the treatments in random order. Randomization averages out the effects of uncontrolled (or lurking) variables.
Let’s look at an example. Suppose you are studying a cleaning process for titanium parts, with two factors, Bath Time and Solution Type. The design has six treatments (each combination of the two Bath Time settings, 10 and 30 minutes, with the Solution Type settings), and it takes one day to conduct all six trials. You conduct the experiment without randomization: you run all of the treatments with Bath Time at 10 minutes in the morning, and all of the treatments with Bath Time at 30 minutes in the afternoon. Meanwhile, the ambient temperature and humidity both increase throughout the day.
In your subsequent analysis, you might conclude that Bath Time is significant. However, because you didn't randomize the treatments, you can't separate the effect of Bath Time from the effects of ambient temperature and humidity. These effects are confused, or confounded. Randomizing the treatments can prevent this confusion.
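A randomized run order for the cleaning experiment could be generated along these lines. This is a minimal sketch, not a prescribed procedure: the Bath Time settings come from the example, while the Solution Type labels 2 and 3 and the function name `randomized_run_order` are assumptions made for illustration.

```python
import itertools
import random

# Bath Time settings are taken from the example (minutes).
bath_times = [10, 30]
# Solution Type labels are assumed; only "1" appears in the text.
solution_types = [1, 2, 3]


def randomized_run_order(seed=None):
    """Return every treatment combination once, in a random run order."""
    rng = random.Random(seed)  # a seed makes the plan reproducible
    treatments = list(itertools.product(bath_times, solution_types))
    rng.shuffle(treatments)
    return treatments


# Print one possible run sheet.
for run, (bath_time, solution) in enumerate(randomized_run_order(seed=1), start=1):
    print(f"Run {run}: Bath Time = {bath_time} min, Solution Type = {solution}")
```

Because the order is shuffled, the 10-minute and 30-minute runs are spread across the day, so a steady rise in temperature or humidity is far less likely to line up with one Bath Time setting.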
Blocking
Blocking for nuisance variables
You can use a technique called blocking to minimize the impact of the variation caused by nuisance variables. For example, if you conduct the experiment across different batches, the individual batches might be different from one another.
You can divide your experiment into blocks to balance the potential variation from the different batches across the runs in your experiment.
Additionally, if you are conducting your experiment across more than one day, uncontrolled day-to-day variation can add a lot of unexplained variation to your results. If you include Day as a blocking variable in your experiment, you can account for the day-to-day variation in your analysis so that you can better detect your important effects.
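One common way to block on Day is to run every treatment once within each day and to randomize the run order separately inside each day. The sketch below assumes that layout (a randomized complete block design); the treatment labels, day names, and the function name `randomized_block_design` are hypothetical.

```python
import random


def randomized_block_design(treatments, blocks, seed=None):
    """Return a run plan as (block, treatment) pairs.

    Every treatment appears once per block, and the order is
    shuffled independently within each block.
    """
    rng = random.Random(seed)
    plan = []
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)  # fresh random order for this block only
        plan.extend((block, treatment) for treatment in order)
    return plan


# Hypothetical labels for illustration.
plan = randomized_block_design(["A", "B", "C"], ["Day 1", "Day 2"], seed=0)
for block, treatment in plan:
    print(block, treatment)
```

Since each day contains the full set of treatments, differences between days can be estimated and removed in the analysis rather than inflating the unexplained variation.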
Restricted randomization for hard-to-change factors
We’ve learned that randomization averages out the effects of uncontrolled or lurking variables. However, there are times when you cannot completely randomize all of your treatments. You might have factors that are difficult to change, or factors that can only be changed in a certain order.
For example, suppose that an experimental factor is oven temperature.
Because of the time it takes for the oven to reach a set temperature, it wouldn't be practical to change the oven temperature between each run.
When you have hard-to-change factors like this, you can run a split-plot or strip-plot experiment instead of a fully randomized experiment. For more information about how to randomize experiments using split-plot and strip-plot designs, see Lesson 3 of JMP’s Custom Design of Experiments free e-course.
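The run order for a simple split-plot layout might be sketched as follows. The idea is to randomize the order of the hard-to-change settings, hold each setting fixed for a group of runs, and randomize the easy-to-change factor within each group. The oven temperatures, the second factor, and the function name `split_plot_run_order` are all hypothetical, and the sketch covers only the run order: a real split-plot experiment typically also replicates the hard-to-change settings and needs an analysis that accounts for the restricted randomization.

```python
import random


def split_plot_run_order(hard_levels, easy_levels, seed=None):
    """Return (hard, easy) runs with restricted randomization.

    The hard-to-change factor is set once per group of runs;
    the easy-to-change factor is shuffled within each group.
    """
    rng = random.Random(seed)
    whole_plots = list(hard_levels)
    rng.shuffle(whole_plots)  # random order of oven settings
    runs = []
    for hard in whole_plots:
        subplot_order = list(easy_levels)
        rng.shuffle(subplot_order)  # random order within this oven setting
        runs.extend((hard, easy) for easy in subplot_order)
    return runs


# Hypothetical settings: two oven temperatures, three levels of a second factor.
for temperature, level in split_plot_run_order([350, 400], ["low", "mid", "high"], seed=3):
    print(temperature, level)
```

With two temperatures, the oven is changed only once during the sequence, instead of potentially before every run as in a fully randomized order.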
Replication
Replication is the idea that you repeat the same experimental conditions one or more times and take new measurement(s) for these repeated settings. You can repeat, or replicate, a single treatment, a subset of the treatments, or all of the treatments. In a fully replicated experiment, each treatment is replicated at least once. Replication enables you to estimate the experimental error, which is the unexplained variation in your experiment (that is, the variation in your response that is not explained by changing your factors). An estimate of the experimental error is necessary for testing statistical significance.
For example, in the cleaning experiment above, the experiment requires, at an absolute minimum, a single run for each of the six treatments. Suppose we replicate the treatment at the settings of Bath Time = 10 and Solution Type = 1. When we randomize the order of the runs, this treatment combination might occur as the second run and also as the seventh run. Having replication for one or more design points allows for an estimate of the variation in the response when the factor settings are the same – in other words, the experimental error. Replicating more than one design point will improve the estimation of the error.
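One standard way to turn replicated runs into an error estimate is to pool the within-treatment variation: sum the squared deviations of each response from its own treatment mean, then divide by the total number of extra (replicate) runs. The sketch below assumes that approach; the function name `pure_error_variance` and the response values are made up purely for illustration and are not results from the cleaning experiment.

```python
from collections import defaultdict
from statistics import mean


def pure_error_variance(runs):
    """Estimate experimental error from replicated treatments.

    runs: iterable of (treatment, response) pairs.
    Returns (pooled variance, degrees of freedom).
    """
    groups = defaultdict(list)
    for treatment, response in runs:
        groups[treatment].append(response)

    sum_of_squares = 0.0
    degrees_of_freedom = 0
    for responses in groups.values():
        group_mean = mean(responses)
        sum_of_squares += sum((y - group_mean) ** 2 for y in responses)
        # Unreplicated treatments contribute nothing here.
        degrees_of_freedom += len(responses) - 1

    if degrees_of_freedom == 0:
        raise ValueError("no replicated treatments: error cannot be estimated")
    return sum_of_squares / degrees_of_freedom, degrees_of_freedom


# Made-up responses: treatment (10, 1) is run twice, treatment (30, 1) once.
example_runs = [((10, 1), 4.0), ((30, 1), 7.0), ((10, 1), 6.0)]
variance, df = pure_error_variance(example_runs)
print(variance, df)
```

Each additional replicated design point adds degrees of freedom to the pooled estimate, which is why replicating more than one point improves the estimate of the error.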
Let’s consider an experiment where you want to compare the hardness of four different types of drill bits by measuring the depth of indentations they make on metal sheets. The first few steps for this experiment might include the following tasks:
- Select all experimental units (metal sheets) from the same lot.
- Randomly assign treatments (drill bits) to the experimental units.
- Press a drill bit into the metal sheet.
- Measure and record the indentation depth.
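The random assignment step in the list above can be sketched as follows. This is an illustrative sketch that assumes each metal sheet receives exactly one drill bit and that the number of sheets is a multiple of the number of drill bit types; the sheet and bit labels and the function name `assign_treatments` are hypothetical.

```python
import random


def assign_treatments(units, treatments, seed=None):
    """Randomly assign one treatment to each experimental unit.

    Each treatment is used equally often, so the number of units
    must be a multiple of the number of treatments.
    """
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)
    copies = len(units) // len(treatments)
    labels = list(treatments) * copies
    rng.shuffle(labels)  # random pairing of treatments with units
    return dict(zip(units, labels))


# Hypothetical labels: eight sheets, four drill bit types.
sheets = [f"sheet {i}" for i in range(1, 9)]
assignment = assign_treatments(sheets, ["bit A", "bit B", "bit C", "bit D"], seed=5)
for sheet, bit in assignment.items():
    print(sheet, "->", bit)
```

With eight sheets and four drill bit types, each type is assigned to two different sheets, so every sheet still carries a single treatment.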
Replication occurs when you assign identical treatments to more than one experimental unit. A common mistake is to treat repeated measurements on the same experimental unit as replication.
Suppose you are limited to four experimental units for this experiment, thus allowing only a sample size of one for each of the four types of drill bit (the treatment). If you are concerned that this is insufficient, you might consider applying two treatments to each metal sheet and measuring the indentations. You reason that this gives you a sample size of two observations (or one replicate) for each treatment.
What's wrong with this approach?
True replication means applying the same treatment to more than one experimental unit. Applying different treatments (drill bits) to an individual metal sheet does not create additional experimental units, because each metal sheet is defined as a single experimental unit. By using each sheet twice, you perform pseudo-replication, not true replication. It's not appropriate to treat the two observations from the same metal sheet as independent, because they share that sheet's particular characteristics.