Factors in Designed Experiments

How do I identify the factors that are important to include in my designed experiment?

You should choose experimental factors by considering all the variables in the process or system under study. It is good practice to list all potential sources of variability, including those that you cannot control. A cause-and-effect diagram or a process map can be useful for brainstorming and organizing potential variables.

In the design of the experiment, you specify which sources of variability you will control, and you use randomization to guard against those that you cannot control.

What variable types can I use for factors?

Factors can be continuous, discrete numeric, or categorical. They can be used as experimental treatments, or as blocks, covariates, and even mixture components in your experiment.

How do I select the factor levels for my experiment?

When selecting factor levels, be bold, but not foolish. Choose values in the range of interest that are far enough apart to produce a detectable change in the response, but not so far apart that you miss relationships within a realistic operating range.

How can I address additional sources of variation and nuisance variables?

You can minimize extra variation by holding unimportant variables constant, if possible, or by collecting data on variables that you can measure but can’t control and including those data in the analysis. You should use blocking to control extraneous sources of variation and randomize the runs to protect your experiment against unknown sources of variation.

Identifying sources of variation

When you conduct an experiment, you systematically manipulate your experimental factors while attempting to minimize or control other sources of variation. There might be many sources of variation in the response.

As a first step, your team might need to generate a list of all of the potential factors. You can use a cause-and-effect diagram or a process map to help with this step.

A cause-and-effect diagram can be used during brainstorming sessions to help identify potential sources of variability in a process. Experimenters begin with the process characteristic that they want to study, then work backward to create a structured list of process inputs that might cause variation in the process output characteristic under investigation. It is not unusual to study multiple process output characteristics with one designed experiment. In those instances, multiple cause-and-effect diagrams can help identify process inputs that relate to multiple process output characteristics.

Here are some guidelines for identifying the important factors for your experiment:

Experimental factors, controlled variables, and uncontrolled variables

After developing your list of potential factors, you'll want to classify these factors into one of three categories: variables you're interested in studying (the experimental factors), controlled variables that are held constant, and uncontrolled (noise or nuisance) variables. The experimental factors will be manipulated in the experiment. You will choose specific values for the levels of these factors and the purpose of the experiment will be to see how changing the values of these experimental factors changes the outcome or response.

The controlled variables might have an effect on the response, but these are variables that you're not interested in studying in this particular experiment. You should hold those variables constant at the same value for the entire experiment. But remember, this choice means that the interpretation of your experiment is valid only when those variables are fixed at that value. For example, if the lab from which you source your experimental substrate has a potential effect on the response, but you do not have interest or resources to study this factor in your current experiment, then you should choose one vendor lab to source the substrate for the entire experiment. The results of the experiment are now only known to be valid for the chosen vendor lab.

If you have the resources to source from multiple labs, then you can incorporate this variable into the experiment by blocking for this lab variable. You can read more about blocking at Key Principles of Experimental Design.
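As a rough sketch of what blocking by lab could look like in practice, the Python snippet below builds a run sheet in which every treatment combination appears once in each block, with the run order randomized separately within each block. The lab names, the two placeholder factors, and the function name `blocked_run_order` are illustrative assumptions, not part of the example above.

```python
import itertools
import random


def blocked_run_order(treatments, blocks, seed=None):
    """Return (block, treatment) pairs with run order shuffled inside each block."""
    rng = random.Random(seed)  # a fixed seed makes the run sheet reproducible
    plan = []
    for block in blocks:
        runs = list(treatments)   # every treatment appears once per block
        rng.shuffle(runs)         # randomize only within the block
        plan.extend((block, treatment) for treatment in runs)
    return plan


# Hypothetical 2 x 2 set of treatment combinations (placeholder factors A and B).
treatments = list(itertools.product(["A low", "A high"], ["B low", "B high"]))
labs = ["Lab 1", "Lab 2"]  # assumed vendor labs used as blocks

plan = blocked_run_order(treatments, labs, seed=1)
for run_number, (lab, (a, b)) in enumerate(plan, start=1):
    print(f"run {run_number}: {lab}, {a}, {b}")
```

Because each lab sees the same set of treatment combinations, lab-to-lab differences can be separated from the treatment effects in the analysis.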

Noise, or nuisance, variables can't be controlled or held constant. However, noise variables might be important, and it might be possible to measure some of these variables. You should record the values of these variables for each experimental trial and include them in your analysis as uncontrolled variables.
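To illustrate why a recorded noise variable is worth including in the analysis, here is a minimal sketch using only the Python standard library. It assumes a single two-level factor (coded -1 and +1) and a measured but uncontrolled variable, here called humidity. The data are made up and noise-free, and the helpers `solve_linear_system` and `least_squares` are hypothetical names written for this example.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def least_squares(rows, y):
    """Fit y = X b through the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Made-up run sheet: coded factor setting and the humidity recorded at each run.
factor = [-1, 1, -1, 1, -1, 1]
humidity = [40, 55, 60, 45, 50, 65]
# Assumed true relationship, used only to generate illustrative data.
response = [10 + 2 * x + 0.1 * h for x, h in zip(factor, humidity)]

# Estimate that ignores humidity: half the difference of the level means.
high = [y for x, y in zip(factor, response) if x > 0]
low = [y for x, y in zip(factor, response) if x < 0]
naive_slope = (sum(high) / len(high) - sum(low) / len(low)) / 2

# Estimate that includes humidity in the model.
rows = [[1.0, x, h] for x, h in zip(factor, humidity)]
intercept, factor_slope, humidity_slope = least_squares(rows, response)

print(f"ignoring humidity: {naive_slope:.3f}")
print(f"adjusting for humidity: {factor_slope:.3f}")
```

In this constructed data set, humidity happens to be higher on average in the high-level runs, so the estimate that ignores it overstates the factor's effect. Including the recorded variable recovers the slope used to generate the data.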

Factor types

Factors can be continuous, discrete numeric, or categorical. They can be used as experimental treatments, or as blocks, covariates, and even mixture components in your experiment.

Choosing factor levels

After you've identified the experimental factors, you need to determine the operating ranges for each factor and identify factor levels to use in the experiment. George Box, a DOE pioneer, famously said, "If you really want to know what effect a variable has, you actually have to change that variable!"

If the high and low levels of your factor are too close together, you might not see any effect of changing the factor on the response. As a result, you might miss an important effect.

On the other hand, if the levels are too far apart, the behavior of the response over the large range of the factor might be difficult to model. Or, if you set the factor levels wider than you have used in production, you might even cause your process or equipment to fail. So, it's important to try to think in advance about how the response or responses might vary in relation to the changes you make in a factor.

Ideally you want to bracket the optimal factor settings within the factor ranges of your experiment. This might mean making the ranges wider than you think you should – but not too wide!

If you have any concern that some combinations of factor settings might not work, you can run trials to determine whether the settings are feasible. For example, before you conduct an experiment on production equipment, you might want to run smaller-scale trials on test equipment to make sure that all of the combinations of factor settings actually work.
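One way to prepare for such feasibility trials is to list every combination of factor settings and flag the ones you suspect might fail. The sketch below does this in Python. The factor names, levels, and the suspected equipment limit are all hypothetical and serve only to show the bookkeeping.

```python
import itertools

# Hypothetical factors and levels; replace with those from your own process.
levels = {
    "temperature_c": [150, 200],
    "pressure_bar": [1, 5],
    "time_min": [10, 30],
}


def is_suspect(run):
    """Assumed concern: high temperature combined with high pressure may not work."""
    return run["temperature_c"] == 200 and run["pressure_bar"] == 5


names = list(levels)
all_runs = [dict(zip(names, combo)) for combo in itertools.product(*levels.values())]
pilot_runs = [run for run in all_runs if is_suspect(run)]

print(f"{len(all_runs)} combinations in total")
print(f"{len(pilot_runs)} combinations to try on test equipment first:")
for run in pilot_runs:
    print("  ", run)
```

The flagged combinations are candidates for small-scale trials. Whether they are actually infeasible is something only those trials can show.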

To illustrate the importance of choosing your factor levels, let's consider a simple example. In this scenario, notice that the operating range for the factor is 0 to 3.

The true, but unknown, relationship between the factor and the response is shown by the green curve. You don't know much about this factor, so you design an experiment. You decide to use only two levels, or design points. The question is, where do you put these design points? If you put them as shown in the figure below, you conclude that the factor has a strong positive effect.

But if you put them as shown in the next figure, you conclude that the factor has a weak positive effect.

At another pair of values, shown in the figure below, you conclude that there is a strong negative effect.

But if you use the values shown in the next two figures, you conclude that there is no effect.

Obviously, this is an extreme example, and the green curve is what you are trying to estimate with your data. This true relationship is not known. The point is that where you set your factor levels can lead to very different conclusions regarding the relationship between the factor and the response.
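The same point can be reproduced numerically. The sketch below uses a made-up cubic curve on the range 0 to 3 as a stand-in for the unknown relationship. It is not the curve from the example above, and the design-point placements are likewise assumptions chosen to mimic the five situations described. For each placement, the estimated effect is the response at the high level minus the response at the low level.

```python
def assumed_true_response(x):
    """Hypothetical curve on [0, 3]; equals x * (x - 1.5) * (x - 3), expanded."""
    return x**3 - 4.5 * x**2 + 4.5 * x


def two_level_effect(low, high):
    """Estimated effect from a two-point experiment: response change from low to high."""
    return assumed_true_response(high) - assumed_true_response(low)


# Assumed placements of the two design points (low, high).
placements = {
    "strong positive": (0.0, 0.5),
    "weak positive": (0.0, 1.4),
    "strong negative": (0.5, 2.5),
    "no effect (narrow)": (0.0, 1.5),
    "no effect (full range)": (0.0, 3.0),
}

effects = {name: two_level_effect(low, high) for name, (low, high) in placements.items()}
for name, effect in effects.items():
    low, high = placements[name]
    print(f"levels {low:.1f} and {high:.1f}: effect {effect:+.3f}  ({name})")
```

The underlying curve never changes; only the choice of levels does, yet the sign and size of the estimated effect differ from one placement to the next.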

This example also reminds us that, with a two-level experiment, you can fit only a first-order model. If you think there is curvature in the relationship, you need at least three levels of the factor to model this curvature, as shown below.
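A small worked sketch makes the two-level limitation concrete. With levels coded as -1 and +1, two observations determine only an intercept and a slope. Adding a center point at 0 gives three observations, enough to determine the quadratic term as well. The response function below is made up for illustration, and the function names are hypothetical.

```python
def two_level_fit(y_low, y_high):
    """Levels coded -1 and +1: only an intercept and a slope can be estimated."""
    intercept = (y_high + y_low) / 2
    slope = (y_high - y_low) / 2
    return intercept, slope


def three_level_fit(y_low, y_mid, y_high):
    """Levels coded -1, 0, +1: y = b0 + b1*x + b11*x**2 through all three points."""
    b0 = y_mid
    b1 = (y_high - y_low) / 2
    b11 = (y_high + y_low) / 2 - y_mid
    return b0, b1, b11


def assumed_response(x):
    # Made-up curved relationship used only to generate example values.
    return 5 + 2 * x + 3 * x**2


y_low, y_mid, y_high = (assumed_response(x) for x in (-1, 0, 1))

line = two_level_fit(y_low, y_high)
curve = three_level_fit(y_low, y_mid, y_high)

print("two levels   (intercept, slope):", line)
print("three levels (b0, b1, b11):     ", curve)
```

With two levels, the curvature is not lost but hidden: it is absorbed into the intercept, and nothing in the data signals that the straight line is wrong. The center point is what exposes it.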

Drill bit example: Identifying factors, addressing nuisance factors, and choosing factor levels

In this experiment, you want to compare the Hardness measurements of indentations made by different drill bits. To begin, you list the potential sources of variability. Your primary interest in this experiment is to find the effect that changing the Bit Type and the drill bit Diameter has on the Hardness. Therefore, your experimental factors are Bit Type and Diameter.

In addition to the factors of interest (the type and size of drill bit), some variability in this experiment might be caused by nuisance factors: the lot of metal sheets, the machine, and the instrument.

You can control the potential variability from all three of these nuisance factors (lot, machine, and instrument) by holding them constant. Addressing these sources of variability in this way essentially neutralizes the nuisance factors.

Your experiment is now quite simple: you will use one lot of metal sheets, one machine, and one instrument for the entire experiment. You only need to determine the factor levels for Bit Type and Diameter.

Your machine shop has four bit types available, and you will test all four. These four types (Purple, Green, Orange, and Blue) are your factor levels for Bit Type in this experiment. Bit Type is a categorical factor with four levels.

You also want to study the effect of the diameter of the drill bit on Hardness. You are interested in any diameters in the range of 1/16” to 1/8”, but you are limited to using only the sizes that are manufactured by your supply company. There are five diameters in your range of interest: 1/16”, 5/64”, 3/32”, 7/64”, and 1/8”. These are the potential factor levels for your experiment.

However, since you are interested in modeling Diameter as a continuous factor, you do not need to select all five available sizes. Instead, you can choose only one low and one high value in order to fit a linear relationship between Diameter and Hardness. Or, to fit a curved polynomial relationship, you can add a middle point for the Diameter. Choosing 1/16”, 3/32”, and 1/8”, you can model both the linear and the quadratic effect for the Diameter factor.
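As a sketch of what the resulting run sheet might look like, the Python snippet below crosses the four bit types with the three chosen diameters and randomizes the run order. The text does not specify the design, so the full factorial layout, the single replicate, and the random seed are assumptions. The coded Diameter column (-1, 0, +1) is a common convention added here for illustration.

```python
import itertools
import random
from fractions import Fraction

bit_types = ["Purple", "Green", "Orange", "Blue"]
diameters = [Fraction(1, 16), Fraction(3, 32), Fraction(1, 8)]  # inches

# Code the continuous factor so that low = -1, middle = 0, high = +1.
center = (diameters[0] + diameters[-1]) / 2
half_range = (diameters[-1] - diameters[0]) / 2


def coded(diameter):
    return (diameter - center) / half_range


# Assumed design: every bit type at every chosen diameter, one run each.
runs = [
    {"bit_type": bit, "diameter_in": d, "diameter_coded": coded(d)}
    for bit, d in itertools.product(bit_types, diameters)
]

rng = random.Random(2024)  # arbitrary seed so the run order is reproducible
rng.shuffle(runs)          # randomize the order in which runs are performed

for run_number, run in enumerate(runs, start=1):
    print(
        f"{run_number:2d}  {run['bit_type']:<6}  "
        f"{str(run['diameter_in']):>4} in  coded {int(run['diameter_coded']):+d}"
    )
```

Because 3/32” lies exactly halfway between 1/16” and 1/8”, the three diameters code to evenly spaced levels, which keeps the linear and quadratic effects of Diameter easy to separate.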