Probability Distributions
What is a probability distribution?
A probability distribution explains how likely different outcomes or values of a random variable are. A random variable is a measured variable whose outcome is uncertain. Random variables can be continuous (e.g., the yield of a chemical reaction or the time it takes to drive to work each day) or discrete (e.g., the number of heads in five flips of a fair coin or the number of defective items found during 100% inspection of a lot). Probability distributions describe the probabilities assigned to each possible value of a discrete random variable, or for a continuous random variable, probabilities over a range of values. In statistics, probability distributions can be used to evaluate the likelihood of a calculated test statistic, like t or F (the outcome), in the context of hypothesis testing.
What are the parameters of a distribution?
Probability distributions have parameters, fixed values which often represent physical characteristics of the distribution. They are numbers that are used in the formula for the distribution function. Examples of parameters are the mean and standard deviation of a normal distribution or the arrival rate of a Poisson distribution.
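As a minimal sketch of how parameters enter a distribution's formula, the Python snippet below builds a normal distribution from a mean and standard deviation and evaluates a Poisson probability at two arrival rates. The parameter values and the helper name `poisson_pmf` are assumptions made only for illustration.

```python
import math
from statistics import NormalDist

# Hypothetical parameter values, chosen only for illustration.
commute = NormalDist(mu=20.0, sigma=2.0)  # normal: mean and standard deviation

def poisson_pmf(k: int, rate: float) -> float:
    """P(X = k) for a Poisson distribution with the given arrival rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)

# Changing a parameter changes the probabilities the formula produces.
density_at_mean = commute.pdf(20.0)
p_zero_arrivals_slow = poisson_pmf(0, rate=1.0)
p_zero_arrivals_busy = poisson_pmf(0, rate=4.0)
print(density_at_mean, p_zero_arrivals_slow, p_zero_arrivals_busy)
```

With a higher arrival rate, seeing zero arrivals becomes less likely, which is the kind of physical interpretation a parameter often carries.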
What is a probability density function?
The probability density function (PDF) of a continuous random variable X describes how probability is distributed across all possible values of X. The probability that X falls in any interval [a, b], or $P(a \leq X \leq b)$, is the area under the PDF over that interval. For example, what is the probability that your commute to the office will take between 18 and 20 minutes? The PDF is non-negative everywhere, and its integral over all values of the random variable is one.
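The commute question can be sketched numerically. The snippet below assumes, purely for illustration, that commute time follows a normal distribution with a mean of 20 minutes and a standard deviation of 2 minutes; the `integrate` helper is a hypothetical midpoint-rule approximation, not a library function.

```python
from statistics import NormalDist

# Assumed model (not from the text): commute time ~ Normal(mean=20, sd=2) minutes.
commute = NormalDist(mu=20.0, sigma=2.0)

def integrate(f, a: float, b: float, steps: int = 10_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

# Probability of a commute between 18 and 20 minutes: area under the PDF.
p_18_to_20 = integrate(commute.pdf, 18.0, 20.0)

# Integrating over +/- 10 standard deviations covers essentially all the mass.
total_area = integrate(commute.pdf, 0.0, 40.0)
print(round(p_18_to_20, 4), round(total_area, 4))
```

Under this assumed model the interval probability comes out near 0.34, and the total area is approximately one.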
What is a cumulative distribution function?
The cumulative distribution function (CDF) of a continuous random variable X gives the probability that X is less than or equal to a given value, or $F(x) = P(X \leq x)$, for all x. For example, the CDF of the time it takes to commute to work each day, evaluated at 20 minutes, gives the probability that it will take 20 minutes or less to reach the office. The CDF at x is the integral of the PDF over all values up to x.
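A short sketch of the CDF in use, again assuming (for illustration only) a normal commute time with a mean of 20 minutes and a standard deviation of 2 minutes. It relies on `statistics.NormalDist.cdf` from the Python standard library.

```python
from statistics import NormalDist

# Assumed model (not from the text): commute time ~ Normal(mean=20, sd=2) minutes.
commute = NormalDist(mu=20.0, sigma=2.0)

# F(20) = P(X <= 20): probability the commute takes 20 minutes or less.
p_at_most_20 = commute.cdf(20.0)

# An interval probability is a difference of two CDF values: F(b) - F(a).
p_18_to_20 = commute.cdf(20.0) - commute.cdf(18.0)
print(p_at_most_20, p_18_to_20)
```

Because the assumed mean is 20 minutes and the normal distribution is symmetric, `p_at_most_20` is one half.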
What is a probability mass function?
The probability mass function (PMF) gives the probability of a discrete random variable X taking a specific value, or $p(x_j) = P(X = x_j)$, for all values $x_j$ that X can take on. For example, the PMF of the number of heads in five flips of a fair coin gives the probability of getting 0, 1, 2, 3, 4, or 5 heads. Every value of the PMF is non-negative, and the sum over all values of the random variable is one.
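The coin-flip example can be worked out directly. The sketch below uses the binomial formula with n = 5 flips and success probability p = 0.5; the function name `binomial_pmf` is a hypothetical helper written for this illustration.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) successes in n independent trials, each with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# PMF of the number of heads in five flips of a fair coin.
pmf = {k: binomial_pmf(k, n=5, p=0.5) for k in range(6)}
for heads, prob in pmf.items():
    print(heads, prob)

total = sum(pmf.values())  # probabilities over all possible values sum to one
```

The six probabilities are 1/32, 5/32, 10/32, 10/32, 5/32, and 1/32.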
What is a cumulative mass function?
The cumulative distribution function for a discrete random variable is sometimes referred to as the cumulative mass function. It describes how the probability of a discrete random variable X adds up as the values increase, and tells you the total probability of the outcome being a given value or smaller, $P(X \leq x)$. For example, it can tell you the probability of getting three or fewer heads in five flips of a fair coin.
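As a minimal sketch, the cumulative probabilities for the five-flip example can be obtained as a running sum of the individual probabilities. The variable names are illustrative only.

```python
from itertools import accumulate
from math import comb

# Probability of k heads in five flips of a fair coin, for k = 0..5.
pmf = [comb(5, k) * 0.5**5 for k in range(6)]

# Running sum: cmf[k] = P(X <= k).
cmf = list(accumulate(pmf))

p_three_or_fewer = cmf[3]  # (1 + 5 + 10 + 10) / 32
print(p_three_or_fewer)
```

Here the probability of three or fewer heads is 26/32, and the last cumulative value is one.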
What are some common probability models?
| Discrete probability distributions | Continuous probability distributions |
| --- | --- |
| Bernoulli | Normal |
| Binomial | t |
| Beta binomial | Chi-square |
| Multinomial | F |
| Geometric | Weibull |
| Hypergeometric | Exponential |
| Poisson | Lognormal |
| Negative binomial | Gamma |
| Discrete uniform | Beta |
| | SHASH |
| | Johnson |
| | Uniform |
| | Cauchy |