Control Charts for Attribute Data
What are attribute control charts?
Attribute control charts are used to monitor count data instead of measurement data. Some examples are the number of road accidents per year on a given stretch of highway, the number of mold colonies in a standard petri dish after an hour of growth, or the number of nonconforming items in a manufacturing lot.
When should you consider using attribute control charts?
Shewhart control charts are not limited to measurement data; count data can be charted with Shewhart charts as well. Consider an attribute control chart whenever the quantity you monitor is a count. Count data are modeled by discrete distributions, and these distributions can be used to find 3$\sigma$ limits.
What is the area of opportunity?
Often, counts are nonconformances, defects, nonconforming items, or defective items. There is a difference between a nonconformance or defect on an item and a nonconforming or defective item. A nonconforming item might contain one or more nonconformances. It’s important to understand what is being counted when looking at attribute control charts, just as it is important to define the measurement for variables control charts.
The area of opportunity of an attribute chart is an important concept. The context of the data needs to be taken into account before analysis. If two counts have different areas of opportunity, they must be converted to rates in order to be compared. Areas of opportunity can be either finite or infinite.
For example, say you are counting the number of flaws on the cover of a book that has just been bound. To compare two raw counts, the books need to be the same size. It wouldn’t be fair to compare one flaw on a small book with three flaws on a large book. Rates, however, are comparable: for example, one flaw per square foot vs. three flaws per square foot.
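As a minimal sketch of the conversion from counts to rates, the snippet below divides each flaw count by its area of opportunity. The function name and the cover areas are hypothetical values chosen for illustration; they do not come from a real data set.

```python
def flaws_per_unit_area(flaw_count, area):
    """Convert a raw count to a rate by dividing by its area of opportunity."""
    if area <= 0:
        raise ValueError("area of opportunity must be positive")
    return flaw_count / area


# Hypothetical books: the cover areas (in square feet) are made up.
small_book = flaws_per_unit_area(flaw_count=1, area=0.5)
large_book = flaws_per_unit_area(flaw_count=3, area=1.5)

print(f"small book: {small_book} flaws per square foot")
print(f"large book: {large_book} flaws per square foot")
```

With these assumed areas, the two books have the same flaw rate even though the raw counts differ, which is why the raw counts alone are not comparable.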
When counting events to plot on control charts, two distributions are commonly applied:
- Binomial, used when counting events and non-events.
- Poisson, used when counting events only.
Two other distributions are more rarely applied:
- Negative binomial, used when counting the number of events between rare incidents.
- Weibull, used when counting the number of time intervals between rare incidents.
Binomial distribution, P and NP charts
Binomial probabilities are used to find limits for P and NP charts. The binomial distribution is used to find the number of “successes” in a fixed number of Bernoulli trials. A Bernoulli trial is an experiment with two outcomes, “success” and “failure,” and the probability of “success” stays the same between trials.
Suppose we ship batches of books after an inspection step. The inspection examines each book for defects. In a batch of size n books, there can be 0, 1, 2, …, n books with defects. If we assume the probability of a defective book is the same from book to book, the number of defective books in a batch should follow a binomial distribution. When a batch of books has been inspected, we can plot the number of defective books on a control chart.
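To make the binomial model concrete, here is a small sketch that computes the probability of each possible number of defective books in one batch. The batch size of 24 and the defect probability of 0.05 are assumed values for illustration only.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k defective books in a batch of n books."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical values: 24 books per batch, each defective with probability 0.05.
n, p = 24, 0.05
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

# The mean of the distribution should agree with n * p.
expected_defectives = sum(k * prob for k, prob in enumerate(probs))

print(f"P(no defective books) = {probs[0]:.3f}")
print(f"expected defective books per batch = {expected_defectives:.2f}")
```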
NP charts display the number of nonconforming or defective items in a subgroup sample. The formulas use the short-term estimates of the binomial mean and standard deviation to find 3$\sigma$ control limits. If the batch sizes are not constant, the centerline and control limits are not constant.
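One common textbook form of the NP chart limits can be sketched as follows. This is an illustration under stated assumptions, not necessarily the exact calculation used by any particular software: the overall proportion is estimated by pooling all subgroups, the limits are clipped to the range of possible counts, and the data are hypothetical.

```python
from math import sqrt


def np_chart_limits(defectives, sizes):
    """Return (lcl, center, ucl) per subgroup for an NP chart.

    Assumes the pooled estimate p_bar = total defectives / total inspected.
    """
    p_bar = sum(defectives) / sum(sizes)
    limits = []
    for n in sizes:
        center = n * p_bar                      # binomial mean for this subgroup
        sigma = sqrt(n * p_bar * (1 - p_bar))   # binomial standard deviation
        lcl = max(0.0, center - 3 * sigma)      # a count cannot be negative
        ucl = min(float(n), center + 3 * sigma) # a count cannot exceed n
        limits.append((lcl, center, ucl))
    return limits


# Hypothetical inspection results with unequal batch sizes.
defectives = [2, 1, 3, 0, 2, 4]
sizes = [50, 50, 50, 100, 50, 100]

np_limits = np_chart_limits(defectives, sizes)
for d, (lcl, center, ucl) in zip(defectives, np_limits):
    print(f"count={d}  LCL={lcl:.2f}  center={center:.2f}  UCL={ucl:.2f}")
```

Because the centerline is the subgroup size times the pooled proportion, it shifts whenever the batch size changes.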
P charts display the proportion of nonconforming or defective items in a subgroup sample. A P chart contains exactly the same information as an NP chart on the same data. However, it can be easier to read when your data have unequal subgroup sizes, since the centerline is always constant and only the control limits change with the subgroup size.
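A comparable sketch for the P chart is shown below, again assuming the pooled proportion as the centerline and the usual binomial standard error for each subgroup. The function name and data are hypothetical.

```python
from math import sqrt


def p_chart_limits(defectives, sizes):
    """Return (proportion, lcl, center, ucl) per subgroup for a P chart."""
    p_bar = sum(defectives) / sum(sizes)  # constant centerline
    rows = []
    for d, n in zip(defectives, sizes):
        sigma = sqrt(p_bar * (1 - p_bar) / n)  # standard error shrinks as n grows
        lcl = max(0.0, p_bar - 3 * sigma)      # a proportion cannot be negative
        ucl = min(1.0, p_bar + 3 * sigma)      # a proportion cannot exceed 1
        rows.append((d / n, lcl, p_bar, ucl))
    return rows


# Hypothetical inspection results with unequal batch sizes.
defectives = [2, 1, 3, 0, 2, 4]
sizes = [50, 50, 50, 100, 50, 100]

p_rows = p_chart_limits(defectives, sizes)
for prop, lcl, center, ucl in p_rows:
    print(f"p={prop:.3f}  LCL={lcl:.3f}  center={center:.3f}  UCL={ucl:.3f}")
```

Larger subgroups get tighter limits, while the centerline stays the same for every subgroup.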
Poisson distribution, C and U charts
Poisson probabilities are used to find limits for C and U charts. The Poisson distribution models the number of arrivals of a Poisson process in a unit of time or space. A Poisson process is an arrival process in which the number of arrivals in an interval (or space) depends only on the length of the interval (or the size of the space). The arrival rate of the process, the number of arrivals in a unit of time, is denoted by $\lambda$. There is no upper bound on the number of arrivals.
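The short sketch below evaluates Poisson probabilities for an assumed arrival rate of $\lambda = 2$ (a made-up value). It also shows that large counts remain possible, only increasingly unlikely, which reflects the lack of an upper bound.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Probability of exactly k arrivals when the arrival rate is lam."""
    return exp(-lam) * lam**k / factorial(k)


lam = 2.0  # hypothetical arrival rate per unit of time or space

for k in range(5):
    print(f"P(X = {k}) = {poisson_pmf(k, lam):.4f}")

# Probability of 10 or more arrivals: small, but not zero.
tail = 1 - sum(poisson_pmf(k, lam) for k in range(10))
print(f"P(X >= 10) = {tail:.6f}")
```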
C charts display the number of nonconformances in a sample and use the short-term estimates of the standard deviation of the Poisson distribution to estimate 3$\sigma$ control limits. If the area of opportunity varies between subgroups, so do the centerline and limits.
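For the simplest case, where every subgroup has the same area of opportunity, the C chart limits can be sketched as below. The Poisson mean and variance are equal, so the standard deviation is estimated by the square root of the average count. The counts are hypothetical, and the constant-area assumption is made only to keep the example short.

```python
from math import sqrt


def c_chart_limits(counts):
    """Return (lcl, center, ucl) for a C chart with a constant area of opportunity."""
    c_bar = sum(counts) / len(counts)  # average count is the centerline
    sigma = sqrt(c_bar)                # Poisson: variance equals the mean
    lcl = max(0.0, c_bar - 3 * sigma)  # a count cannot be negative
    ucl = c_bar + 3 * sigma
    return lcl, c_bar, ucl


# Hypothetical scratch counts on six book covers of the same size.
counts = [4, 6, 3, 5, 7, 5]
c_lcl, c_center, c_ucl = c_chart_limits(counts)
print(f"LCL={c_lcl:.2f}  center={c_center:.2f}  UCL={c_ucl:.2f}")
```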
U charts display the rate of nonconformances in a subgroup, that is, the number of nonconformances per inspection unit, where the subgroup might have a varying number of inspection units. The plotting statistic is the number of nonconformances divided by the area of opportunity for the subgroup. If the area of opportunity varies between subgroups, the control limits also vary, but the centerline remains constant.
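A sketch of the U chart calculation under the usual Poisson assumption follows. The centerline is the pooled rate (total nonconformances divided by total area of opportunity), and each subgroup's limits depend on its own area. The function name and data are hypothetical.

```python
from math import sqrt


def u_chart_limits(counts, areas):
    """Return (rate, lcl, center, ucl) per subgroup for a U chart."""
    u_bar = sum(counts) / sum(areas)  # pooled rate, constant centerline
    rows = []
    for c, a in zip(counts, areas):
        sigma = sqrt(u_bar / a)            # larger areas give tighter limits
        lcl = max(0.0, u_bar - 3 * sigma)  # a rate cannot be negative
        ucl = u_bar + 3 * sigma
        rows.append((c / a, lcl, u_bar, ucl))
    return rows


# Hypothetical data: nonconformances and inspection units per subgroup.
counts = [4, 9, 5, 12]
areas = [1.0, 2.0, 1.0, 2.0]

u_rows = u_chart_limits(counts, areas)
for rate, lcl, center, ucl in u_rows:
    print(f"u={rate:.2f}  LCL={lcl:.2f}  center={center:.2f}  UCL={ucl:.2f}")
```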
Should I use binomial or Poisson distribution?
If you can count both conforming and nonconforming items, for example, the number of books with cover scratches in a box of 24 books, use binomial-based charts. The total number of books that you are counting is known and finite. If you can only count nonconformances, for example, the number of scratches on the cover of one book, then use Poisson-based charts. The total number of possible scratches has no theoretical upper bound.
Control chart for rare events
Sometimes you’ll want to count events that don’t happen very often. As an example, suppose you have daily data on interrupts to your inventory system, which is an electronic system subject to outages due to weather, worn-out parts, wildlife, power surges, and other causes. You plot a C chart of the interrupts by day.
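Because interrupts are so rare, the average count per day is close to zero and the upper control limit falls below one. As a result, every instance of an interrupt is also an out-of-control point, and the C chart is not helpful for monitoring the process. There are two control charts designed to monitor rare events.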
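The following sketch illustrates why a C chart breaks down for rare events. It uses made-up data (two interrupts in 50 days) and the textbook C chart upper limit of the average count plus three times its square root. With that formula, the upper limit drops below one whenever the average count is below roughly 0.09 per day, so any day with a single event is flagged.

```python
from math import sqrt

# Hypothetical data: 50 days with interrupts on two of them.
daily_interrupts = [0] * 50
daily_interrupts[12] = 1
daily_interrupts[37] = 1

c_bar = sum(daily_interrupts) / len(daily_interrupts)  # average interrupts per day
ucl = c_bar + 3 * sqrt(c_bar)                          # textbook C chart upper limit

# Days that the C chart would flag, and days on which any interrupt occurred.
flagged_days = [day for day, c in enumerate(daily_interrupts) if c > ucl]
event_days = [day for day, c in enumerate(daily_interrupts) if c > 0]

print(f"center = {c_bar:.2f}, UCL = {ucl:.2f}")
print("flagged days:", flagged_days)
print("days with an interrupt:", event_days)
```

In this example the flagged days and the event days are identical, so the chart signals on every interrupt and gives no information about whether the interrupt rate has changed.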
The G control chart uses 3$\sigma$ limits based on a negative binomial distribution instead of a Poisson distribution. The negative binomial distribution is similar to the Poisson, but includes an extra parameter to allow for overdispersion, that is, a variance larger than the mean. The G chart on the interrupts data shows no signals.
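As a rough illustration only, the sketch below uses the simplest textbook form of a G chart, in which the plotted value is the number of event-free days between consecutive interrupts and is modeled with a geometric distribution (a special case of the negative binomial). Under that assumption, a mean gap of $\bar{g}$ has variance $\bar{g}(\bar{g}+1)$. The gap data are hypothetical, and a particular software package may estimate its limits differently.

```python
from math import sqrt


def g_chart_limits(gaps):
    """Return (lcl, center, ucl) for a geometric-based G chart sketch.

    gaps: number of event-free days between consecutive events (0, 1, 2, ...).
    """
    g_bar = sum(gaps) / len(gaps)      # average gap is the centerline
    sigma = sqrt(g_bar * (g_bar + 1))  # geometric standard deviation given the mean
    lcl = max(0.0, g_bar - 3 * sigma)  # a gap cannot be negative
    ucl = g_bar + 3 * sigma
    return lcl, g_bar, ucl


# Hypothetical numbers of event-free days between interrupts.
gaps = [12, 30, 7, 21, 15, 5]
g_lcl, g_center, g_ucl = g_chart_limits(gaps)
signals = [g for g in gaps if g > g_ucl or g < g_lcl]

print(f"LCL={g_lcl:.2f}  center={g_center:.2f}  UCL={g_ucl:.2f}")
print("points outside the limits:", signals)
```

Because the standard deviation is about as large as the mean, the limits are wide enough that ordinary gaps between rare events do not trigger signals.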
Another common method for monitoring rare events is to measure the time between events instead of the count of events. The Weibull distribution is used to find 3$\sigma$ control limits on the T chart. Note that a longer time between events is better, so points above the upper control limit indicate an unusually long stretch with no interrupts. A run of points below the centerline or a decreasing trend in the data would indicate an increase in the system failure rate.
For more information
For details on control chart limit calculations for all charts on this page, see Lesson 5, Section 1 of the Statistical Process Control Course in the JMP User Community.