Binomial Distribution
What is the binomial distribution?
The binomial distribution is a discrete probability distribution that describes the probability of a given number of successes in n independent Bernoulli trials, where the probability of success in each trial, p, is constant. It is the underlying statistical distribution for the NP and P attribute control charts.
What are some examples of the binomial distribution?
There are a variety of real-world examples of outcomes that arise from a binomial distribution, including:
- The number of heads in five flips of a coin, either a fair coin (p = 0.5) or a biased coin (p ≠ 0.5).
- Successful free throws made in 10 attempts in the game of basketball.
- Respondents who reply favorably to a yes/no question on a survey.
- Defective items in a lot of 100 items made from the same process, where the defect rate is known.
When should I consider using the binomial distribution?
The binomial distribution is useful for modeling the number of successes in n independent binary trials, where every trial has the same probability of success. You might collect data from n trials and estimate the probability of success as the ratio of the number of successes to the number of trials n. You might also model the number or proportion of successes as a function of independent variables, as in logistic regression, penalized regression, or a generalized linear model.
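As a minimal sketch of the estimate described above, the snippet below computes the observed proportion of successes. The function names and the data (13 successes in 20 trials) are hypothetical and chosen only for illustration. The standard error shown is not in the text above: it follows from the binomial variance np(1 - p), since the variance of the proportion X/n is p(1 - p)/n.

```python
import math


def estimate_success_probability(successes: int, trials: int) -> float:
    """Point estimate of p: the observed proportion of successes."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    return successes / trials


def proportion_standard_error(p_hat: float, trials: int) -> float:
    """Estimated standard error of the proportion, sqrt(p_hat * (1 - p_hat) / n)."""
    return math.sqrt(p_hat * (1 - p_hat) / trials)


# Hypothetical data: 13 successes observed in 20 independent trials.
p_hat = estimate_success_probability(13, 20)
se = proportion_standard_error(p_hat, 20)
print(p_hat)         # 0.65
print(round(se, 3))  # 0.107
```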
The binomial distribution gives probabilities associated with outcomes in multiple Bernoulli trials. For one trial only, the binomial distribution is equivalent to the Bernoulli distribution. The number of trials in the binomial is fixed and finite, and you can count both successes and failures.
If the number of trials is random, consider using the negative binomial distribution instead. The negative binomial models the number of Bernoulli trials until the kth success. Here k (number of successes) is fixed and n (number of trials) is random.
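To make the contrast concrete, here is a small sketch of the negative binomial mass function in the parameterization used above, where the random quantity is the number of trials needed to reach the kth success. The function name is hypothetical, and some references instead count the number of failures before the kth success, which shifts the support.

```python
from math import comb


def trials_until_kth_success_pmf(n: int, k: int, p: float) -> float:
    """P(N = n): the k-th success occurs exactly on trial n."""
    if n < k:
        return 0.0  # at least k trials are needed for k successes
    # The first n - 1 trials contain exactly k - 1 successes,
    # and trial n itself is a success.
    return comb(n - 1, k - 1) * p**k * (1 - p) ** (n - k)


# Probability that the 3rd success arrives on the 5th trial when p = 0.65.
print(round(trials_until_kth_success_pmf(5, 3, 0.65), 4))  # 0.2018
```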
If you can count only the successes, consider using a Poisson distribution, which models the number of successes (or events) in a time interval of fixed length or a space of fixed size. As an example, consider the results of inspecting a lot of 100 items made from the same process, where it is reasonable to assume that defects on defective items are independent. The binomial distribution models the number of defective (and non-defective) items in the lot. The Poisson distribution models the number of defects in the lot. With the binomial, you can count both defective and non-defective items. With the Poisson, you can count only defects, not “non-defects.”
Characteristics of a binomial random variable
| Model parameters | n, the number of trials; p, the probability of success on each trial |
| Mass function | $ P(X = x) = \binom{n}{x} p^{x} (1 - p)^{\,n - x}, \quad x = 0, 1, \ldots, n $ |
| Mean | np |
| Variance | np(1 – p) |
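The characteristics in the table can be checked numerically. The sketch below, which uses only the Python standard library and a hypothetical helper named `binomial_pmf`, evaluates the mass function for n = 10 and p = 0.40 and confirms that the probabilities sum to 1, that the mean equals np, and that the variance equals np(1 - p).

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ binomial(n, p)."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.40
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

print(round(total, 6))     # 1.0
print(round(mean, 6))      # 4.0, which is n * p
print(round(variance, 6))  # 2.4, which is n * p * (1 - p)
```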
The graph below shows a binomial distribution when n = 10 for p = 0.40 and p = 0.65.
Using the binomial distribution to calculate probabilities
Suppose an athlete has a 65% chance of making a successful free throw in a game of basketball. What is the probability that she makes at least nine free throws in 10 attempts?
Let X be the number of free throws scored in 10 attempts. Then X ~ binomial(10, 0.65). The probability that she makes at least nine free throws is the probability that she makes either nine or 10 free throws. We can use the probability mass function to find the sum of those probabilities.
$ P(X \ge 9) = P(X = 9) + P(X = 10) = \binom{10}{9} (0.65)^{9} (0.35)^{1} + \binom{10}{10} (0.65)^{10} (0.35)^{0} \approx 0.0725 + 0.0135 \approx 0.086 $
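The same calculation can be reproduced in a few lines of code. This is a minimal sketch using only the Python standard library. The helper names `binomial_pmf` and `binomial_upper_tail` are hypothetical, not part of any particular statistics package.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ binomial(n, p)."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k), found by summing the mass function from k through n."""
    return sum(binomial_pmf(x, n, p) for x in range(k, n + 1))


# At least nine successful free throws in 10 attempts, with p = 0.65.
prob = binomial_upper_tail(9, 10, 0.65)
print(round(prob, 3))  # 0.086
```

Summing the mass function directly works well for small n. For large n, a library routine for the cumulative distribution is usually a better choice.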