Control Charts to Monitor More Than One Source of Variation
When is it possible to have more than one source of variation to monitor?
You might have more than one source of random variation in your process under the following conditions:
- Your process is not a continuous process.
- Your process is run in batches.
- Your data come from a nested hierarchy.
In these cases, the total random variation in the data is made up of specific variance components. Instead of controlling total variation with control charts, consider using the methods described here to control the variance components.
In many fields, multiple sources of variation can arise simultaneously, each contributing to the overall variability in a process. In manufacturing, if lots of parts are processed together, the part-to-part variation within a lot might be different than lot-to-lot variability. In biotechnology, batch-to-batch variability might be different than within-batch variability. In semiconductor fabrication, variability within a wafer might be different than variability between wafers. In all these cases, there is more than one identified source of variation in the data.
Why do you need control charts to monitor more than one source of variation?
One of the assumptions of Shewhart control charts is that there is only one source of variation in your data, that is, subgroup-to-subgroup variability. Suppose you are using an Xbar-S chart to control the process. Each subgroup’s observations are assumed to be from a stable distribution with constant mean and standard deviation.
The underlying model is $y = \mu + \varepsilon,\ \varepsilon \sim \mathcal{N}(0, \sigma^2)$ where $\mu$ is the mean of the process and $\sigma$ is the standard deviation of the process.
So if you collect observations for each subgroup, each individual observation is assumed to come from the same distribution. Continuous processes typically have one source of variability. An Xbar-S control chart on data from this model behaves as expected: almost all subgroup means fall within the limits.
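To make the single-source case concrete, here is a minimal Python sketch that simulates data from the model above and computes Xbar chart limits from $\overline{S}$. The process values (`mu`, `sigma`), the subgroup size, the number of subgroups, and the random seed are all hypothetical choices for illustration; the limit formula is the standard textbook one (center $\pm\ 3\overline{S}/(c_4\sqrt{n})$), not something taken from this article.

```python
import math
import random
import statistics


def c4(n):
    """Bias-correction constant for the sample standard deviation."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)


def xbar_s_limits(subgroups):
    """Return (center, lcl, ucl) for the Xbar chart, estimating sigma from S-bar."""
    n = len(subgroups[0])
    means = [statistics.mean(g) for g in subgroups]
    s_bar = statistics.mean(statistics.stdev(g) for g in subgroups)
    center = statistics.mean(means)
    half_width = 3 * s_bar / (c4(n) * math.sqrt(n))  # equivalent to A3 * S-bar
    return center, center - half_width, center + half_width


# Hypothetical single-source process: y = mu + eps, eps ~ N(0, sigma^2)
rng = random.Random(1)
mu, sigma, n, k = 10.0, 1.0, 5, 200
subgroups = [[rng.gauss(mu, sigma) for _ in range(n)] for _ in range(k)]

center, lcl, ucl = xbar_s_limits(subgroups)
outside = sum(not (lcl <= statistics.mean(g) <= ucl) for g in subgroups)
print(f"Xbar limits: [{lcl:.3f}, {ucl:.3f}], subgroup means outside: {outside} of {k}")
```

With only one source of variation, the within-subgroup estimate describes the subgroup means well, so very few of them should land outside the limits.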
Some processes change settings so much between subgroups that you can’t make this assumption. Consider a batch process that has both batch-to-batch and within-batch variability.
The underlying model for this process is:
$y = \mu + \textit{batch} + \varepsilon,\ \textit{batch} \sim \mathcal{N}(0, \sigma^2_{\textit{batch}}),\ \varepsilon \sim \mathcal{N}(0, \sigma^2)$
Let’s take a look at this control chart.
The traditional Xbar control chart limit calculation uses the within-subgroup variability, measured by $\overline{S}$. The traditional calculations work well for continuous processing with just one source of variation. However, this process has two sources of variation, between-subgroup and within-subgroup. By using only within-subgroup variation to calculate the Xbar chart limits, you are ignoring an important source of variability.
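The size of the problem can be seen from the model: if each subgroup is one batch of $n$ observations, the variance of a subgroup mean is $\sigma^2_{\textit{batch}} + \sigma^2/n$, while the traditional limits assume it is only $\sigma^2/n$. The following Python sketch simulates such a process and applies the traditional limits. The parameter values, seed, and variable names are hypothetical and chosen only to illustrate the effect.

```python
import math
import random
import statistics


def c4(n):
    """Bias-correction constant for the sample standard deviation."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)


# Hypothetical batch process: y = mu + batch + eps
rng = random.Random(2)
mu, sigma_batch, sigma, n, k = 10.0, 2.0, 1.0, 5, 200
batches = []
for _ in range(k):
    batch_effect = rng.gauss(0.0, sigma_batch)  # shared by every unit in the batch
    batches.append([mu + batch_effect + rng.gauss(0.0, sigma) for _ in range(n)])

means = [statistics.mean(b) for b in batches]
s_bar = statistics.mean(statistics.stdev(b) for b in batches)
center = statistics.mean(means)

# Traditional Xbar limits: within-subgroup variation only
sd_assumed = s_bar / (c4(n) * math.sqrt(n))
half_width = 3 * sd_assumed
frac_outside = sum(abs(m - center) > half_width for m in means) / k

# What the subgroup means actually do
sd_observed = statistics.stdev(means)

print(f"assumed sd of subgroup means:  {sd_assumed:.3f}")
print(f"observed sd of subgroup means: {sd_observed:.3f}")
print(f"fraction of means outside traditional limits: {frac_outside:.2f}")
```

Because the batch effect is common to all observations in a subgroup, it does not show up in $\overline{S}$, so the traditional limits come out far too narrow and a large share of stable batches are flagged.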
Notice that the S chart is using appropriate limits; the within-subgroup variation is being used to judge the within-subgroup standard deviations.
What to do?
An appropriate estimate of between-subgroup variation comes from the variation of the subgroup means themselves, that is, the moving range of subgroup means. Note that this method is the same as using an individuals control chart on the subgroup means. An appropriate chart to control the batch process is the three-way chart (see below), which consists of three charts:
- The first chart plots the subgroup averages with limits calculated from the moving range of the subgroup averages.
- The second chart plots the moving range of the subgroup averages on an MR chart.
- The third chart plots the subgroup standard deviation (or range) on an S (or R) chart.
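The limits for the three charts described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not JMP's implementation: it uses the usual textbook constants for moving ranges of span 2 (2.66 for the individuals limits and 3.267 for the MR upper limit) and the standard $B_3$/$B_4$ formulas for the S chart, and it assumes equal subgroup sizes. The function name `three_way_limits` and the sample data are hypothetical.

```python
import math
import statistics


def c4(n):
    """Bias-correction constant for the sample standard deviation."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)


def three_way_limits(subgroups):
    """Return (lcl, center, ucl) for each chart of a three-way chart."""
    n = len(subgroups[0])
    means = [statistics.mean(g) for g in subgroups]
    sds = [statistics.stdev(g) for g in subgroups]
    moving_ranges = [abs(b - a) for a, b in zip(means, means[1:])]

    center = statistics.mean(means)
    mr_bar = statistics.mean(moving_ranges)
    s_bar = statistics.mean(sds)
    spread = 3 * math.sqrt(1 - c4(n) ** 2) / c4(n)  # B4 = 1 + spread, B3 = max(0, 1 - spread)

    return {
        # Chart 1: subgroup means, limits from their moving range (between-subgroup)
        "means": (center - 2.66 * mr_bar, center, center + 2.66 * mr_bar),
        # Chart 2: moving range of the subgroup means
        "moving_range": (0.0, mr_bar, 3.267 * mr_bar),
        # Chart 3: within-subgroup standard deviations
        "std_dev": (max(0.0, 1 - spread) * s_bar, s_bar, (1 + spread) * s_bar),
    }


# Hypothetical data: four batches of three measurements each
data = [[9.8, 10.1, 10.0], [12.1, 11.9, 12.3], [8.0, 8.2, 7.9], [10.9, 11.2, 11.0]]
limits = three_way_limits(data)
for chart, (lcl, cl, ucl) in limits.items():
    print(f"{chart:>12}: LCL={lcl:.3f}  CL={cl:.3f}  UCL={ucl:.3f}")
```

The first two charts respond to batch-to-batch behavior and the third to within-batch behavior, so each variance component is judged against its own yardstick.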
When there are more than two sources of variation in your process, you can still use control charts to monitor the process.
For more information
See A Deeper Dive into Determining Components of Variation in a Process or Lesson 5, Section 2 of Statistical Process Control Course - JMP User Community for more details.
Multiple sources of variability with counting data
Multiple sources of variability can happen with attribute (counting) data as well. The assumption for a binomial P chart is that the probability of success p stays the same for each subgroup. Sometimes when you have large subgroup sizes that vary from subgroup to subgroup, this assumption does not hold, and the P chart limits will not be appropriate.
Let’s look at an example. Suppose you count the number of printer jams that occur each month. You use a P chart to control the proportion of pages jammed out of the total number of pages printed each month.
The P' (pronounced P-prime) control chart plots the standardized proportions on an individuals control chart, using their average moving range to estimate $\sigma$.
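The calculation can be sketched in Python as follows. This is a hedged illustration that follows the commonly published Laney formulation of the P' chart, which may differ in detail from what any particular software does: each proportion is standardized using the binomial standard error, $\sigma_z$ is estimated as the average moving range of the standardized values divided by 1.128, and the binomial limits are then inflated by $\sigma_z$. The function name `p_prime_chart` and the monthly jam counts are hypothetical.

```python
import math
import statistics


def p_prime_chart(counts, sizes):
    """Standardized proportions, sigma_z, and per-subgroup limits for a P' chart."""
    p_bar = sum(counts) / sum(sizes)
    # Binomial standard error of each subgroup's proportion
    se = [math.sqrt(p_bar * (1 - p_bar) / n) for n in sizes]
    # Standardized proportions (z-scores)
    z = [(c / n - p_bar) / s for c, n, s in zip(counts, sizes, se)]
    # Sigma of the z-scores from their average moving range (span 2)
    mr_bar = statistics.mean(abs(b - a) for a, b in zip(z, z[1:]))
    sigma_z = mr_bar / 1.128
    # Limits on the proportion scale, clipped to [0, 1]
    limits = [
        (max(0.0, p_bar - 3 * sigma_z * s), min(1.0, p_bar + 3 * sigma_z * s))
        for s in se
    ]
    return {"p_bar": p_bar, "z": z, "sigma_z": sigma_z, "limits": limits}


# Hypothetical monthly data: jammed pages and total pages printed
jams = [52, 61, 47, 80, 66, 58]
pages = [10000, 12500, 9000, 15000, 11000, 12000]
chart = p_prime_chart(jams, pages)
print(f"p-bar = {chart['p_bar']:.5f}, sigma_z = {chart['sigma_z']:.3f}")
for month, (lcl, ucl) in enumerate(chart["limits"], start=1):
    print(f"month {month}: LCL={lcl:.5f}  UCL={ucl:.5f}")
```

When $\sigma_z$ is close to 1, these limits are close to the ordinary P chart limits; a value well above 1 indicates variation between subgroups beyond what the binomial model explains.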