Process Capability
What is process capability?
Process capability is a very useful statistical measure of how well a process stays within its specification limits. For our purposes, the word “process” is meant in a very generalized way, and it can be any measurement that is important to monitor and control. For example, it could be a physical attribute (such as length or width), a chemical property (such as pH or reactivity), a performance measure (such as rate or strength), just to name a few. Most processes have a list of important characteristics that are monitored, and that list can be a few critical measurements – or thousands of them. Measures of process capability, known as capability indices, have been developed that provide a convenient summary of how well the process is performing. These indices help us answer important questions, such as:
- Is the process performing well?
- If not, what’s the nature of the problem?
- Which process parameters need attention?
There are two important concepts to consider when thinking about capability: natural variation and specification limits.
What is natural variation?
Natural variation is unavoidable. Every variable being measured will have a range of values. Values will also tend to exhibit a characteristic shape that describes their distribution. There are many different shapes that can occur, and some are more common than others. The normal distribution is the most common and is at the heart of capability indices, although the indices can be adapted to other distributions.
Understanding natural variation is a fundamental component to any exercise involving capability. Everything we measure will have variation and a characteristic shape. Often this shape is determined by performing an exercise of characterization that includes sources of variation. Sources of variation can be described as either common cause or special cause. Common cause variation is natural, inherent, and unavoidable. Special cause variation is assignable to a specific source. It is often unexpected and considered some kind of error or malfunction. A comprehensive characterization should include all the sources of common cause variation, including time (day-to-day, week-to-week variation), equipment (variations across maintenance and calibration or perhaps multiple machines can be substituted), personnel changes (such as production shifts), and raw materials (differences that occur batch to batch). Once a process is running, the full extent of common cause variation will become evident. Control charts are an excellent tool to monitor common cause variation. The data can be collected and charted as a distribution, as seen in the example below.
In this example, you’ll notice that the distribution has the characteristic shape of a normal distribution. A hypothesis test called the Shapiro-Wilk test can be used to determine if the distribution of the data is different from a normal distribution. The overlaid green bell-shaped curve is the normal distribution function with the same mean (80.26) and standard deviation (4.848) as the data. This graph is a visual indication that the data follow a normal distribution. Notice that specification limits are also shown on the graph (LSL for lower specification limit and USL for upper specification limit) along with a target. This process appears to be centered on the target and capable of meeting specifications. Capability indices provide a numerical way of summarizing the capability of the process.
What are specification limits?
Specification limits define the acceptable limits for a given process. Processes are expected to stay within these limits, but that isn’t always the case. When a characteristic goes outside its specification limits, there are a variety of implications, some more urgent than others. Noncompliant results can lead to:
- Scrappage and the associated cost.
- Maintenance to return the process to the acceptable range.
- Loss of productivity.
- Production shortfalls in meeting commitments and the resulting ripple effect it can have in other areas, including customer needs and expectations.
Since the consequences of noncompliance can be high, it’s important to carefully choose specification limits that are “right sized,” neither too tight nor too wide. When limits are too tight, the process is described as “not capable,” and a lot of resources are consumed trying to produce compliant results. Limits that are too wide can hide quality issues.
To illustrate capability, let’s consider a glass vase wrapped in bubble wrap and packed in a cardboard box for shipment. The dimensions of the box are analogous to the specification limits. The size of the vase wrapped in bubble wrap is analogous to a process characteristic being monitored. The process has natural variation: it’s not possible to produce exactly the same dimensions on every vase. If the box is too small, the specification limits are too tight, meaning many vases must be rewrapped to fit in the box. If the box is too big, the vase is not secure, which can lead to quality issues, perhaps damage during shipment. Since neither outcome is desirable, it’s important to understand the range of natural variation that can occur and to choose a right-sized box that accommodates the natural range of values, but not much more than that.
What if limits can't be changed?
In the above example, we assume that the specification limits can be changed, but this is not always true. Specification limits are often dictated by the requirements of the process coming from an internal or external customer. For example, a process parameter like “impurity level” is likely to have an upper specification limit that’s fixed and can’t be changed. When limits are fixed, the only option to improve capability is to optimize the process, generally by centering the process and/or reducing process variation. Usually, this is done using control plans and designed experimentation.
What are process capability indices?
Process capability indices are statistical tools we use to determine how capable a process is at meeting specifications.
Below are several possibilities for capability.
(Table columns: process condition, distribution within the limits, relevant capability indices.)
While visualizing your data is highly encouraged, it is convenient to have numerical indices that describe each variation of capability. We will see that one indicator summarizes whether a process is capable or not. Then four additional indicators summarize why a process is not capable, distinguishing each of the graphs shown above. The numerical method is especially important if there are many process characteristics, so that resources can be focused on the ones that need the most attention. These five indicators are known as Cpk, Cpl, Cpu, Cp, and Cpm.
Five more indicators, known as Ppk, Ppl, Ppu, Pp and Ppm, are identical except for the choice of the estimator of $\sigma$ used in the calculation. The Cpk family uses a short-term estimator of $\sigma$, while the Ppk family uses a long-term estimator of $\sigma$. The formulas for the Cpk family are most common, so we’ll describe those first. Next, we’ll look at an example of Ppk, and discuss why it’s important. It’s worth noting that some organizations may be calculating Ppk and referring to it as Cpk. That might sound confusing, but conceptually Cpk and Ppk are very similar and interpreted in a similar way. In fact, if the process is stable, Cpk and Ppk should be practically equal.
How to calculate capability indices
Calculating Cpk
Capability indices are calculated as the ratio of the width of the specification limits to the width of the natural process variation. If the spec limits are wider than the data range, capability will be greater than 1. If the limits are narrower than the data range, capability will be less than 1. Since data aren’t always centered within spec limits, the ratio is calculated separately for the upper and lower halves of the distribution, and each half is compared to the corresponding upper or lower specification interval (see below).
Learn how to calculate Cpk in JMP
https://www.youtube.com/watch?v=gFMdYxYiBRU
- To see more quality and reliability JMP tutorials, visit JMP's Quality and Reliability playlist on YouTube.
- To follow along using the sample data included with the software, download a free trial of JMP.
For more details on the formulas for the capability indices, see capability indices formulas.
The 3$ \sigma $ principle
The 3$\sigma$ principle is at the heart of the capability indices. The normal distribution has 99.7% of the data falling between $\mu$ ±3$\sigma$, which is close enough to 100% that the indices use an estimate of 3$\sigma$ in the denominator.
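The coverage figure quoted above can be checked numerically. The following is a minimal sketch using Python's standard library; the helper name `coverage_within` is made up for this illustration.

```python
from statistics import NormalDist


def coverage_within(k: float) -> float:
    """Fraction of a normal distribution lying within mu +/- k*sigma."""
    std_normal = NormalDist()  # standard normal: mean 0, sigma 1
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    # Print the coverage as a percentage for each multiple of sigma.
    print(f"within +/-{k} sigma: {coverage_within(k):.2%}")
```

Because the result depends only on the multiple of $\sigma$, the same coverage applies to any normal distribution, whatever its mean and standard deviation.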
Interpretation of Cpk, Cpl, and Cpu
Cpk is a convenient summary of the capability of the process. If Cpk > 1, the process is described as “being capable.” If Cpk < 1, the process is described as “not capable” and nonconformant results are expected, which means it’s likely that results will occur outside the compliance limits. It’s worth noting that many organizations use a critical threshold number that’s greater than 1 (1.33 is a typical value). Cpl and Cpu provide a next-level detail of capability. These indices are used to help determine the reason for low Cpk. Sometimes both Cpl and Cpu are less than the critical threshold (e.g. 1 or 1.33).
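As an illustration of this interpretation, here is a minimal sketch that computes Cpl, Cpu, and Cpk from estimates of the mean and $\sigma$. It uses the mean (80.26) and standard deviation (4.848) from the earlier example, but the specification limits of 60 and 100 are assumed for illustration only, and the function names are hypothetical.

```python
def cpk_family(mean: float, sigma: float, lsl: float, usl: float):
    """Return (Cpl, Cpu, Cpk) from estimates of the process mean and sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    cpl = (mean - lsl) / (3 * sigma)  # lower half: distance to LSL over 3 sigma
    cpu = (usl - mean) / (3 * sigma)  # upper half: distance to USL over 3 sigma
    return cpl, cpu, min(cpl, cpu)    # Cpk is the worse of the two halves


def verdict(cpk: float, threshold: float = 1.33) -> str:
    """Compare Cpk with an organization-specific critical threshold."""
    return "capable" if cpk >= threshold else "not capable"


# Mean and sigma come from the example above; LSL and USL are hypothetical.
cpl, cpu, cpk = cpk_family(mean=80.26, sigma=4.848, lsl=60.0, usl=100.0)
print(f"Cpl={cpl:.3f}  Cpu={cpu:.3f}  Cpk={cpk:.3f}  ->  {verdict(cpk)}")
```

Comparing Cpl with Cpu shows which side of the specification is limiting: the smaller of the two determines Cpk.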
The other capability indices: Cp and Cpm
It’s also possible that the limits are wide enough to accommodate process variation but that the process is not centered within the limits. This means the process would be capable if it were centered between the limits. The Cp index compares the width of the limits with the width of the process variation without regard to centering, so it helps us answer this question.
The Cpm index was developed for processes where it’s very important to be centered on the target, and this index is particularly sensitive to conditions where the process is not centered on the target.
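The difference between these indices can be seen with a small numerical sketch. It assumes the widely used textbook definitions of Cp, Cpk, and Cpm (the exact Cpm form varies between references), and all numbers below are hypothetical.

```python
import math


def cp(sigma: float, lsl: float, usl: float) -> float:
    """Potential capability: spec width over 6 sigma, ignoring centering."""
    return (usl - lsl) / (6 * sigma)


def cpk(mean: float, sigma: float, lsl: float, usl: float) -> float:
    """Capability of the worse half of the distribution."""
    return min(mean - lsl, usl - mean) / (3 * sigma)


def cpm(mean: float, sigma: float, lsl: float, usl: float, target: float) -> float:
    """Target-sensitive index: the denominator grows as the mean leaves the target."""
    spread = math.sqrt(sigma**2 + (mean - target) ** 2)
    return min(target - lsl, usl - target) / (3 * spread)


# Hypothetical process: wide limits, small sigma, target at the center.
lsl, usl, target, sigma = 0.0, 12.0, 6.0, 1.0
for mean in (6.0, 10.0):  # centered, then shifted high
    print(
        f"mean={mean}: Cp={cp(sigma, lsl, usl):.2f} "
        f"Cpk={cpk(mean, sigma, lsl, usl):.2f} "
        f"Cpm={cpm(mean, sigma, lsl, usl, target):.2f}"
    )
```

In this sketch Cp is unchanged by the shift, while Cpk and Cpm both drop, which is the pattern that suggests a centering problem rather than excessive variation.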
The Ppk family of capability indices
As mentioned above, for each Cpk index there is a corresponding Ppk index; the only difference is the value used for sigma. Cpk uses a short-term, or within-subgroup, estimator of $\sigma$, while Ppk uses a long-term, or overall, estimator of $\sigma$.
It’s important to look at both Cpk and Ppk, particularly early in the learning phase of a process. The root cause for large differences between Cpk and Ppk should be understood and may indicate a problem that needs to be addressed. The underlying condition is referred to as “stability” and there are stability indices, the ratio of short- and long-term sigma, that can be used to quantify it. Control charts are an effective tool to visualize and monitor this condition. Often, particularly for mature processes, there is little difference between Cpk and Ppk, so one of them is chosen as the best indicator to monitor.
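To make the short-term versus long-term distinction concrete, here is a minimal sketch with made-up subgrouped data whose mean drifts between subgroups. It assumes a pooled within-subgroup standard deviation as the short-term estimator; other short-term estimators, such as ones based on subgroup ranges, are also in common use. The direction of the stability ratio (overall divided by within) is likewise an assumption of this sketch.

```python
import math
from statistics import mean, stdev, variance


def within_sigma(subgroups):
    """Pooled within-subgroup standard deviation (a short-term estimator)."""
    num = sum((len(g) - 1) * variance(g) for g in subgroups)
    den = sum(len(g) - 1 for g in subgroups)
    return math.sqrt(num / den)


def overall_sigma(subgroups):
    """Sample standard deviation of all values together (long-term estimator)."""
    return stdev([x for g in subgroups for x in g])


def min_index(mu, sigma, lsl, usl):
    """Shared Cpk/Ppk formula; only the sigma estimate differs."""
    return min(mu - lsl, usl - mu) / (3 * sigma)


# Hypothetical data: tight subgroups, but the mean drifts between them.
subgroups = [[9.9, 10.1, 10.0], [10.9, 11.1, 11.0], [8.9, 9.1, 9.0]]
lsl, usl = 7.0, 13.0  # hypothetical specification limits

mu = mean([x for g in subgroups for x in g])
s_within, s_overall = within_sigma(subgroups), overall_sigma(subgroups)
print(f"Cpk={min_index(mu, s_within, lsl, usl):.2f}")
print(f"Ppk={min_index(mu, s_overall, lsl, usl):.2f}")
print(f"stability ratio (overall / within)={s_overall / s_within:.2f}")
```

A ratio well above 1, as in this contrived data, indicates that variation between subgroups is inflating the long-term estimate, which is the situation a control chart would flag.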
Capability index formulas
Cpk, Cpl, and Cpu
Cp
Cpm
The formulas above use the following notation:
LSL = Lower specification limit
USL = Upper specification limit
T = Process target
$\hat{\mu}$ = Estimate for the mean of the process
$\hat{\sigma}$ = Estimate for the sigma of the process
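For reference, the widely used textbook forms of these indices can be written in the notation above as follows. These are assumed standard definitions; the Cpm form in particular varies between references and software packages. The version shown here reduces to the common form $(\mathrm{USL} - \mathrm{LSL}) / (6\sqrt{\hat{\sigma}^{2} + (\hat{\mu} - T)^{2}})$ when T lies midway between the limits.

$$
\begin{aligned}
C_{pl} &= \frac{\hat{\mu} - \mathrm{LSL}}{3\hat{\sigma}}, \qquad
C_{pu} = \frac{\mathrm{USL} - \hat{\mu}}{3\hat{\sigma}}, \qquad
C_{pk} = \min\left(C_{pl},\ C_{pu}\right) \\
C_{p} &= \frac{\mathrm{USL} - \mathrm{LSL}}{6\hat{\sigma}} \\
C_{pm} &= \frac{\min\left(T - \mathrm{LSL},\ \mathrm{USL} - T\right)}{3\sqrt{\hat{\sigma}^{2} + \left(\hat{\mu} - T\right)^{2}}}
\end{aligned}
$$

The Ppk family has the same form, with the long-term (overall) estimate of $\sigma$ substituted for the short-term estimate.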
Estimation of $\sigma$ for process capability
All the capability indices rely heavily on the estimation of $\sigma$, but there are issues with underestimating and overestimating $\sigma$.
Underestimating $\sigma$ can occur when:
- Insufficient data is collected.
- Not all sources of natural variation are included in the estimate.
Overestimating $\sigma$ can occur when:
- Special cause sources of variation are included in the estimate.
- The process is not stable and in control.
Properly estimating $\sigma$ usually requires subject matter expertise, often applied by brainstorming all potential sources of variation. While not an exhaustive list, the considerations can include:
- Data collected across sufficient time.
- Sufficient amount of data collected (some industry specifications have recommendations as discussed in Lesson 3, Section 3 of JMP’s online statistical course, Statistical Process Control).
- Variations in equipment and their maintenance and calibrations.
- Variations in personnel.
- Variations in raw materials and inputs into the process.
- Environmental variation (e.g., temperature, humidity, etc.).
- Aggregating results (such as batches) or not, which can lead to different estimates.
For more details, please see the control charts module of JMP’s online statistical course.
Achieving desired capability
Large-scale capability analysis
It’s common for organizations to monitor many process parameters. It’s convenient to summarize all of them to highlight the ones needing the most attention so you can focus resources on them.
Capability analysis summarized in a single graph
The figure below shows an example of a single graph summary using JMP’s goal plot. For this analysis, we’ve normalized 200 process variables using their specification limits, mean, and standard deviation. This standardization allows us to plot all of these variables on a single graph:
- The area marked with a red triangle shows the limits of capability using a 1.33 criterion.
- All the points inside the triangle are process variables that have capability indices higher than 1.33.
- All the points outside the red triangle area are process parameters that have capability indices less than 1.33.
- The points to the left of the red triangle have Ppl less than 1.33 (process is centered too low).
- The points to the right of the triangle have Ppu less than 1.33 (process is centered too high).
- The points above the triangle are failing from a Pp perspective (process distribution is too wide to fit within the limits).
- The points near the red lines of the triangle are the marginal cases.
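The triangular shape of the capable region follows from the index formulas. The sketch below assumes one possible normalization, with the mean shift and standard deviation each divided by the specification width and the target taken as the midpoint of the limits. The software's exact scaling may differ, and the variable names and values are hypothetical.

```python
def goal_plot_point(mean, sigma, lsl, usl):
    """Spec-normalized coordinates: x = mean shift / width, y = sigma / width."""
    width = usl - lsl
    midpoint = (lsl + usl) / 2  # assumption: target sits midway between limits
    return (mean - midpoint) / width, sigma / width


def inside_triangle(x, y, threshold=1.33):
    """Ppk >= threshold is equivalent to y <= (0.5 - |x|) / (3 * threshold)."""
    return y <= (0.5 - abs(x)) / (3 * threshold)


def ppk(mean, sigma, lsl, usl):
    """Direct calculation, used here only to cross-check the triangle test."""
    return min(mean - lsl, usl - mean) / (3 * sigma)


# Hypothetical process variables: (mean, sigma, LSL, USL)
variables = {
    "centered": (50.0, 2.0, 40.0, 60.0),
    "shifted high": (57.0, 2.0, 40.0, 60.0),
    "too wide": (50.0, 4.0, 40.0, 60.0),
}
for name, params in variables.items():
    x, y = goal_plot_point(*params)
    print(f"{name}: x={x:+.2f} y={y:.2f} Ppk={ppk(*params):.2f} "
          f"inside={inside_triangle(x, y)}")
```

Under this normalization the two sloping edges of the triangle correspond to Ppl and Ppu equal to the threshold, and the apex at x = 0 is the largest relative $\sigma$ a perfectly centered process can have while still meeting it.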
Standardized box plots
Standardized box plots are another effective tool for visualizing the capability of many variables. In the graph below, data were scaled to limits of +/- 0.5. With this kind of visualization, it’s quick and easy to see that some variables are centered and capable (Parameter 1), some are shifted high (Parameter 2), some are too wide for the limits (Parameter 3), and some are shifted low (Parameter 4).
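One way to obtain such a scaling, shown here as a sketch with a hypothetical helper name and made-up values, is to subtract the midpoint of the specification limits and divide by their width, so that LSL maps to -0.5 and USL maps to +0.5.

```python
def scale_to_spec(values, lsl, usl):
    """Rescale values so that LSL maps to -0.5 and USL maps to +0.5."""
    midpoint = (lsl + usl) / 2
    width = usl - lsl
    return [(v - midpoint) / width for v in values]


# Hypothetical measurements with LSL = 40 and USL = 60.
scaled = scale_to_spec([40.0, 50.0, 60.0, 65.0], lsl=40.0, usl=60.0)
print(scaled)  # any value beyond +/-0.5 lies outside the specification limits
```

After this scaling, variables with different units and limits share a common axis, so their box plots can be compared side by side.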
Process performance graph
JMP’s process performance graph is another way to visualize the capability of many parameters with one graph. It also includes an assessment of stability. Points that are low on the graph indicate a process with poor capability. Points that are on the right-hand side have poor stability (these processes would show issues on a control chart). The best quadrant, the upper left, contains parameters that are both capable and stable.