Confidence Intervals

What is a confidence interval?

A confidence interval is an interval of plausible values for a population parameter, such as the population mean. It is typically centered at an estimate of the parameter, and the range of the interval represents the uncertainty in the estimate due to the fact that you are sampling from the population.

What is a confidence level?

The confidence level of a confidence interval expresses how confident you are that the interval captures the population parameter. For example, suppose you measure the grams of protein in a sample of energy bars and calculate a 95% confidence interval for the mean. You are 95% confident that the true mean is contained in your interval. More precisely, if you were to collect many samples of the energy bars and calculate a 95% confidence interval from each one, about 95% of these intervals would contain the unknown true mean grams of protein (the population parameter). Of course, you want to be as confident as possible, but for a given sample, increasing the confidence level increases the width, or range, of the confidence interval.

The animation below shows a thought experiment in which 100 samples are collected from a population where the mean is zero. For each sample, the 95% confidence interval is plotted. In this particular set of samples, 95 of the 100 confidence intervals contain the true population mean and five do not; a different set of 100 samples could give a slightly different count. In real life, you would not collect all these samples. This is simply a visual illustration of the interpretation of a confidence interval.
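The thought experiment above can be imitated with a short simulation. This is a minimal sketch, not the procedure used to produce the animation: the sample size of 31, the standard normal population, the seed, and the function name `count_covering_intervals` are all assumptions chosen for illustration, and the t quantile is hard-coded rather than computed.

```python
import random
import statistics

N = 31              # assumed sample size, so that n - 1 = 30 degrees of freedom
T_CRIT_30 = 2.0423  # 0.975 quantile of a t distribution with 30 df (2.04 when rounded)

def count_covering_intervals(n_samples=100, true_mean=0.0, sigma=1.0, seed=0):
    """Count how many 95% t intervals contain the true mean."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(n_samples):
        # Draw one sample from a normal population (an assumption of this sketch).
        sample = [rng.gauss(true_mean, sigma) for _ in range(N)]
        center = statistics.mean(sample)
        half_width = T_CRIT_30 * statistics.stdev(sample) / N ** 0.5
        if center - half_width <= true_mean <= center + half_width:
            covered += 1
    return covered

covered = count_covering_intervals()
print(f"{covered} of 100 intervals contain the true mean")
```

The count changes with the seed. It tends to be close to 95 but is not guaranteed to equal it in any single run of 100 samples.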

What is sampling uncertainty?

In any data set, there are two sources of variation. One is from the population. Individuals in a population vary; therefore, individuals in a sample vary as well. The other source of variation is from the sampling itself. In theory, you can take multiple representative samples from your population, and these samples will differ from one another because each is made up of a different subset of individuals. This sample-to-sample variation is called sampling uncertainty.
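A small simulation can make the sample-to-sample variation concrete. The sketch below draws several samples from one hypothetical population and reports the mean of each. The population values (mean 21.4, standard deviation 2.5), the sample size, and the name `sample_means` are illustrative assumptions.

```python
import random
import statistics

def sample_means(n_samples=5, n=31, mu=21.4, sigma=2.5, seed=0):
    """Return the mean of each of several samples drawn from one population."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        # Each sample is a different set of individuals from the same population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.mean(sample))
    return means

means = sample_means()
for i, m in enumerate(means, start=1):
    print(f"sample {i}: mean = {m:.2f}")
```

Every sample comes from the same population, yet each sample mean is slightly different. That spread is the sampling uncertainty a confidence interval is meant to quantify.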

How do you calculate a confidence interval?

You can use statistical software, like JMP, to calculate confidence intervals on population parameters. Formulas for confidence intervals exist for common parameters such as:

The general formula for a confidence interval is:

estimate $\pm$ (distribution quantile at $1 - \alpha/2$) $\times$ (standard error of estimate)

where

As an example, suppose you have measurements of the amount of protein in 31 energy bars (the data for this example can be found here). You believe the measurements come from one population. You want to estimate the mean of that population, so you calculate the sample mean and get 21.40 grams. You know the mean of the population isn’t exactly 21.40, so you calculate a 95% confidence interval for the mean. You find the standard deviation, s, of the sample to be 2.54. The standard error of the mean is:

$\frac{s}{\sqrt{n}} = \frac{2.54}{\sqrt{31}} = 0.46$

For independent, normally distributed data, the standardized sample mean $(\bar{y} - \mu)/(s/\sqrt{n})$ follows a t distribution with n – 1 degrees of freedom. For a 95% confidence interval ($\alpha = 0.05$), you therefore find the 0.975 quantile of a t distribution with n – 1 = 30 degrees of freedom, which is 2.04. Using the formula, you calculate the confidence interval as

$\bar{y} \pm t_{n-1,\,1-\alpha/2} \frac{s}{\sqrt{n}} = 21.40 \pm 2.04 \cdot \frac{2.54}{\sqrt{31}} = (20.47,\;22.33).$

Statistical software allows you to plot the data and verify the calculation.
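The arithmetic in the worked example above can also be checked with a few lines of code. This sketch uses only the summary values quoted in the text (mean 21.40, standard deviation 2.54, n = 31, t quantile 2.04). The function name `t_confidence_interval` is an illustrative choice, and the quantile is passed in rather than computed.

```python
import math

def t_confidence_interval(mean, sd, n, t_quantile):
    """Return (lower, upper) for: mean +/- t_quantile * sd / sqrt(n)."""
    standard_error = sd / math.sqrt(n)
    margin = t_quantile * standard_error
    return mean - margin, mean + margin

lower, upper = t_confidence_interval(mean=21.40, sd=2.54, n=31, t_quantile=2.04)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")  # 95% CI: (20.47, 22.33)
```

For a different confidence level or sample size, the t quantile would need to be looked up again in a t table or computed with statistical software.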

What are the assumptions for a confidence interval?

The assumptions of the confidence interval calculation are:

For example, for the confidence interval on the mean to be a valid 95% confidence interval, you must collect your data sample in a way that is representative of all sources of variation in the process. If you have done so, then even if the population is not normally distributed, the central limit theorem implies that the sample mean is approximately normally distributed when the sample is sufficiently large, so the interval based on the t distribution is approximately valid.
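One way to get a feel for how well the t-based interval holds up for non-normal data is to simulate it. The sketch below estimates the coverage of the nominal 95% interval when samples of size 31 come from a skewed population. The exponential population, the sample size, the number of replications, and the name `t_interval_coverage` are assumptions made for illustration.

```python
import random
import statistics

N = 31              # assumed sample size, so that n - 1 = 30 degrees of freedom
T_CRIT_30 = 2.0423  # 0.975 quantile of a t distribution with 30 df

def t_interval_coverage(draw, true_mean, n_reps=2000, seed=0):
    """Estimate the fraction of nominal 95% t intervals that contain true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        sample = [draw(rng) for _ in range(N)]
        center = statistics.mean(sample)
        half_width = T_CRIT_30 * statistics.stdev(sample) / N ** 0.5
        if center - half_width <= true_mean <= center + half_width:
            hits += 1
    return hits / n_reps

# Exponential population with rate 1 (mean 1): strongly right-skewed.
skewed_coverage = t_interval_coverage(lambda rng: rng.expovariate(1.0), true_mean=1.0)
print(f"Estimated coverage for a skewed population: {skewed_coverage:.3f}")
```

For a population this skewed, the estimated coverage is typically somewhat below 95% at this sample size. That is why the t interval is described as approximately valid rather than exact.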

When should you use bootstrap confidence intervals?

In some situations, the sampling distribution of the estimator is unknown, so the confidence interval formula cannot be used. An alternative is to resample from the data and construct a bootstrap confidence interval for the parameter of interest.

Bootstrapping repeatedly resamples the observations in your data with replacement to produce a bootstrap sample of size n. Because you are resampling with replacement, some observations might not appear in a given bootstrap sample, while others might appear multiple times. Many bootstrap samples are created, and the estimator is calculated for each one to form a bootstrap sampling distribution of the estimator. The $\alpha/2$ and $1 - \alpha/2$ quantiles of this bootstrap sampling distribution (for example, the 0.025 and 0.975 quantiles for a 95% interval) then give the endpoints of a confidence interval on the parameter of interest.
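The resampling procedure described above can be sketched in a few lines. This is a minimal illustration of a percentile-style bootstrap interval, not the implementation used by any particular statistical package. The data are randomly generated stand-ins, and the names `percentile` and `bootstrap_percentile_ci`, the interpolation rule, and the seeds are assumptions.

```python
import random
import statistics

def percentile(sorted_values, p):
    """Linearly interpolated quantile (0 <= p <= 1) of already-sorted values."""
    position = p * (len(sorted_values) - 1)
    low = int(position)
    high = min(low + 1, len(sorted_values) - 1)
    fraction = position - low
    return sorted_values[low] + fraction * (sorted_values[high] - sorted_values[low])

def bootstrap_percentile_ci(data, statistic=statistics.mean,
                            n_boot=2500, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for `statistic`."""
    rng = random.Random(seed)
    n = len(data)
    # Resample n observations with replacement and recompute the statistic each time.
    estimates = sorted(statistic(rng.choices(data, k=n)) for _ in range(n_boot))
    alpha = 1 - level
    return percentile(estimates, alpha / 2), percentile(estimates, 1 - alpha / 2)

# Hypothetical measurements, generated only so the sketch runs (not real data).
data_rng = random.Random(1)
data = [data_rng.gauss(21.4, 2.5) for _ in range(31)]

low, high = bootstrap_percentile_ci(data)
print(f"95% bootstrap CI for the mean: ({low:.2f}, {high:.2f})")
```

Passing a different `statistic`, such as `statistics.median`, gives an interval for that quantity without needing a formula for its sampling distribution.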

For example, let’s find a bootstrap confidence interval on the mean of our energy bar data. (Note that it is not necessary to use bootstrapping in this case. The theory is well understood, and the data do not violate assumptions, but we will continue with this example for convenience.) You can use statistical software to resample from the data, each time calculating the mean of the resampled data.

The image below shows the histogram of the means of 2,500 bootstrap samples from the energy bar data, summary statistics for the bootstrap samples, and bootstrap confidence interval calculations for different levels of confidence.

The original estimate of the sample mean is 21.40. You can see in the Summary Statistics table above that the mean of the 2,500 bootstrap sample means is also 21.40. The 95% confidence interval on the mean of the population using the resampled data is (20.53, 22.30).