Probability Limits on Control Charts

When do you need to calculate control limits based on a fitted distribution?

Shewhart showed that 3$\sigma$ limits on control charts work regardless of the distribution of the data. Shewhart’s observations are backed up by the empirical rule, which says that roughly 98-100% of the values from a stable process fall within three standard deviations of the process mean. The table below shows this for several different distributions.

| Distribution | % in $\mu \pm 1\sigma$ | % in $\mu \pm 2\sigma$ | % in $\mu \pm 3\sigma$ |
|---|---|---|---|
| Normal | 68 | 95 | 99.7 |
| Lognormal | 91 | 96 | 98.2 |
| Uniform | 58 | 100 | 100 |
| Beta(2, 5) | 58 | 93 | 98.8 |

Percent of different distributions falling within one, two, and three standard deviations of the mean.
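The coverage percentages for the first three rows can be checked with a short calculation. The sketch below is only illustrative: it assumes a standard normal, a lognormal with log-mean 0 and log-standard deviation 1, and a uniform on [0, 1], because the table does not state the parameters that were used. The Beta row is left out to keep the example within the Python standard library.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def normal_cdf(x):
    return _STD_NORMAL.cdf(x)


def lognormal_cdf(x):
    # Assumed parameters: log-mean 0, log-sd 1.
    return _STD_NORMAL.cdf(math.log(x)) if x > 0 else 0.0


def uniform_cdf(x):
    # Uniform on [0, 1].
    return min(max(x, 0.0), 1.0)


# name -> (cdf, mean, standard deviation)
DISTRIBUTIONS = {
    "Normal(0, 1)": (normal_cdf, 0.0, 1.0),
    "Lognormal(0, 1)": (
        lognormal_cdf,
        math.exp(0.5),
        math.sqrt((math.e - 1.0) * math.e),
    ),
    "Uniform(0, 1)": (uniform_cdf, 0.5, math.sqrt(1.0 / 12.0)),
}


def coverage(name, k):
    """Probability that a value falls within mean +/- k standard deviations."""
    cdf, mean, sd = DISTRIBUTIONS[name]
    return cdf(mean + k * sd) - cdf(mean - k * sd)


for name in DISTRIBUTIONS:
    percents = [round(100 * coverage(name, k), 1) for k in (1, 2, 3)]
    print(name, percents)
```

Under these assumed parameters, the computed values round to the Normal, Lognormal, and Uniform rows of the table.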

Because of the empirical rule, you don’t need to calculate control limits based on a fitted distribution.

However, despite the empirical rule, if the probability distribution of the in-control process is known, it can be useful to use that distribution to calculate probability limits. This practice can be especially useful when your process data come from a known distribution that exhibits skewness. You can also use it when your known distribution might change to another known distribution, for example, a Weibull distribution of life times changing to an exponential distribution as the failure modes change.

As an example, consider the following individuals control chart.

Control chart on skewed data

Suppose you collect data on the weight of impurities in a chemical sample, measured in micrograms. In theory, the impurity weight can range from zero up to the weight of the sample, which should be about 1 gram. Consider the following control chart on the individual impurity values.

Figure 1: Individuals control chart on skewed data.

You’ll notice that the 3$\sigma$ limits are working. Most of the data from the stable process fall within the control limits. However, the lower control limit does not seem appropriate for these data. The smallest value is slightly bigger than zero, but the lower control limit is negative. Also, there are several signals on the moving range chart.
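To see why the lower limit can go negative, it helps to compute the limits by hand. The sketch below uses the usual individuals and moving range chart constants (2.66 = 3/1.128 for the X chart and 3.267 for the MR chart upper limit). The data are simulated from an exponential distribution with an assumed mean of 20 micrograms; they are not the data shown in Figure 1.

```python
import random
from statistics import mean


def individuals_chart_limits(values):
    """3-sigma limits for an individuals (X) and moving range (MR) chart."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    center = mean(values)
    return {
        "x_lcl": center - 2.66 * mr_bar,
        "x_center": center,
        "x_ucl": center + 2.66 * mr_bar,
        "mr_center": mr_bar,
        "mr_ucl": 3.267 * mr_bar,
    }


# Hypothetical skewed data: exponential with mean 20 (rate = 1/20).
rng = random.Random(1)
impurities = [rng.expovariate(1 / 20.0) for _ in range(100)]

limits = individuals_chart_limits(impurities)
for key, value in limits.items():
    print(f"{key}: {value:.2f}")
print(f"smallest observation: {min(impurities):.2f}")
```

For strongly right-skewed data like this, the average moving range is about as large as the mean, so subtracting 2.66 times the average moving range pushes the lower limit below zero even though no observation can be negative.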

The power of subgroups

If you can subgroup your data, the subgroup averages of skewed data can be more normally distributed due to the central limit theorem. (See “Inferential Statistics” lesson, part of the free JMP course Statistical Decisions Using ANOVA and Regression.)

For example, suppose the impurity values can be grouped into rational subgroups. The result is the following Xbar-S chart.

Figure 2: Xbar-S chart on subgrouped impurities data.
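The effect of subgrouping on skewness can be illustrated with a small simulation. This is a sketch under stated assumptions: the data are simulated exponential values, the subgroup size of 5 is arbitrary, and `skewness` is a simple moment-based helper written for this example.

```python
import random
from statistics import mean, pstdev


def skewness(values):
    """Moment-based sample skewness (0 for a symmetric distribution)."""
    m = mean(values)
    s = pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])


rng = random.Random(7)
individuals = [rng.expovariate(1 / 20.0) for _ in range(5000)]

# Average consecutive values in subgroups of 5 (assumed subgroup size).
subgroup_size = 5
subgroup_means = [
    mean(individuals[i:i + subgroup_size])
    for i in range(0, len(individuals), subgroup_size)
]

print(f"skewness of individual values: {skewness(individuals):.2f}")
print(f"skewness of subgroup means:    {skewness(subgroup_means):.2f}")
```

The subgroup means are still somewhat skewed, but noticeably less than the individual values, which is why 3$\sigma$ limits tend to behave better on an Xbar chart.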

But what do you do if you can’t form rational subgroups, that is, if the opportunity for special cause variation exists between every data value? If subgrouping is not an option, you might consider fitting a distribution and using quantiles of that distribution as control limits.

Finding probability limits

To find appropriate control limits for a chart on data from a skewed distribution:

  1. Fit the distribution to the data.
  2. Examine the histogram with an overlaid probability curve and a quantile plot.
  3. Find the 0.005, 0.5, and 0.995 quantiles of the fitted model.
  4. Add these limits to the individuals control chart.

To determine an appropriate distribution for the data, use your subject matter knowledge. For example, time between failures often follows a Weibull, exponential, or lognormal distribution. High count data often follow an exponential or gamma distribution. If you do not have the necessary subject matter knowledge, use an objective criterion like AICc to guide you. The best model minimizes AICc. Also examine quantile plots. For more information on model comparison criteria like AICc, see A Deeper Dive into Likelihood.
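As a rough illustration of comparing candidate distributions by AICc, the sketch below fits exponential, normal, and lognormal models by maximum likelihood using closed-form estimates and computes AICc = 2k - 2 ln L + 2k(k + 1)/(n - k - 1), where k is the number of fitted parameters and n is the sample size. The data are simulated and the set of candidates is an assumption made for the example; statistical software typically offers more distributions.

```python
import math
import random


def aicc(log_likelihood, k, n):
    """Corrected Akaike information criterion (smaller is better)."""
    return 2 * k - 2 * log_likelihood + (2 * k * (k + 1)) / (n - k - 1)


def normal_loglik(values):
    n = len(values)
    mu = sum(values) / n
    var = sum((v - mu) ** 2 for v in values) / n  # maximum likelihood variance
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def exponential_loglik(values):
    n = len(values)
    fitted_mean = sum(values) / n  # maximum likelihood estimate of the mean
    return -n * (math.log(fitted_mean) + 1)


def lognormal_loglik(values):
    # Normal log-likelihood of log(x), plus the Jacobian term -sum(log x).
    logs = [math.log(v) for v in values]
    return normal_loglik(logs) - sum(logs)


rng = random.Random(3)
data = [rng.expovariate(1 / 20.0) for _ in range(200)]  # hypothetical data
n = len(data)

scores = {
    "exponential": aicc(exponential_loglik(data), 1, n),
    "normal": aicc(normal_loglik(data), 2, n),
    "lognormal": aicc(lognormal_loglik(data), 2, n),
}
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: AICc = {score:.1f}")
```

The ranking is only a guide. As the text notes, the chosen model should also make sense from a subject matter perspective and look reasonable on a quantile plot.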

Let’s see these steps in action for the impurities data. First, let’s find a distribution that fits well and makes sense from a subject matter perspective. The exponential distribution provides the best fit (by AICc) and the theoretical curve matches the histogram.

Figure 3: The best fit distribution for the impurities data is the exponential distribution.

Next, we can find the 0.5% and 99.5% quantiles of the fitted exponential distribution. These values can be used as the control limits.

Figure 4: The 0.5% and 99.5% quantiles of the fitted exponential distribution can be used for control limits.
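For an exponential distribution, these quantiles have a closed form: the maximum likelihood estimate of the mean is the sample average, and the quantile at probability p is -mean × ln(1 - p). The sketch below applies this to a few made-up values; the function names and the data are hypothetical and stand in for what statistical software reports.

```python
import math


def exponential_quantile(p, fitted_mean):
    """Quantile of an exponential distribution with the given mean."""
    return -fitted_mean * math.log(1 - p)


def exponential_probability_limits(values, lower_p=0.005, upper_p=0.995):
    """Probability limits and median center line from an exponential fit."""
    fitted_mean = sum(values) / len(values)
    return {
        "lcl": exponential_quantile(lower_p, fitted_mean),
        "center": exponential_quantile(0.5, fitted_mean),
        "ucl": exponential_quantile(upper_p, fitted_mean),
    }


# Hypothetical impurity weights in micrograms.
example_values = [10.0, 20.0, 30.0]
example_limits = exponential_probability_limits(example_values)
for key, value in example_limits.items():
    print(f"{key}: {value:.3f}")
```

Unlike the 3$\sigma$ lower limit, the 0.5% quantile of an exponential distribution is always positive, and the limits are asymmetric around the median, matching the skewness of the data.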

Finally, we create an X-MR chart on the impurities, then change the limits to the quantiles of the fitted exponential distribution. There is one out-of-control point on the X chart, with two corresponding out-of-control points on the MR chart. This point warrants investigation.

Figure 5: Control chart on the impurities data with limits based on the exponential distribution.
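Once the probability limits are in place, flagging out-of-control points is a simple comparison against the limits. The sketch below uses made-up observations and made-up limits (0.1 and 106.0) purely for illustration; they are not the values from Figure 5.

```python
def out_of_limit_points(values, lcl, ucl):
    """Return (index, value) pairs that fall outside the control limits."""
    return [(i, v) for i, v in enumerate(values) if v < lcl or v > ucl]


# Hypothetical observations and probability limits, in micrograms.
observations = [12.0, 3.5, 140.2, 0.05, 25.0]
signals = out_of_limit_points(observations, lcl=0.1, ucl=106.0)
print(signals)
```

With these hypothetical numbers, the third and fourth observations are flagged, one above the upper limit and one below the lower limit.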