Descriptive Statistics


What are descriptive statistics?

There are two main branches of statistics: descriptive statistics and inferential statistics. The branch known as descriptive statistics is concerned with summarizing data in a meaningful way and describing its main features. Descriptive statistics include measures of central tendency (such as the mean or median) and measures of variability (such as the standard deviation or quantiles). Histograms and box plots are examples of graphs that can be used to display these statistics.

What’s the difference between descriptive statistics and inferential statistics?

Descriptive statistics are often calculated for a sample of data drawn from a population, and are used to describe key features of the sample. The population (illustrated on the left) includes all of the individuals or measurement values of interest. A sample (illustrated on the right) is a subset of the population.

Inferential statistical methods use probability and statistical models to draw conclusions about a larger population based on data from a sample. Descriptive statistics, in contrast, do not attempt to generalize beyond the data in the sample: they summarize the information contained within the sample itself. That summary can then serve as the basis for making inferences about the characteristics of the population.
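
The relationship between a population, a sample, and the sample's descriptive statistics can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population is simulated rather than real, and the names `population` and `sample`, the seed, and the sizes are all hypothetical choices made for this example.

```python
import random
import statistics

# Hypothetical population: 10,000 simulated measurements (not real data).
rng = random.Random(42)
population = [rng.gauss(170.0, 10.0) for _ in range(10_000)]

# A sample is a subset of the population, here 100 values drawn
# without replacement.
sample = rng.sample(population, 100)

# Descriptive statistics summarize the sample itself. They make no
# claim about the population on their own.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)

print(f"sample mean: {sample_mean:.2f}")
print(f"sample standard deviation: {sample_sd:.2f}")
```

Using these two numbers to estimate the population mean, for example with a confidence interval, would be a step into inferential statistics.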

What are examples of descriptive statistics?

The table below lists common descriptive statistics and brief definitions. Measures of central tendency are statistics that describe a “typical” value around which the data points tend to cluster, such as the mean, median, mode, and geometric mean. Measures of variability are statistics that describe the extent to which the data points tend to deviate from, or spread around, the central tendency. The standard deviation, variance, quantiles, and the interquartile range (IQR) are examples of measures of variability. Moments are measures that describe key characteristics of a data distribution, like the center, spread, and shape. They include the mean, variance, skewness, and kurtosis.

Term	Definition
Mean	The arithmetic average of the data
Median	The middle value in the data when the data are ordered
Mode	The value occurring most frequently in the data
Geometric mean	The nth root of the product of n positive data values
Standard deviation	The spread of the data values around the mean, expressed on the same scale as the data
Variance	The spread of the data values around the mean, expressed on a squared scale
Quantiles	Values at or below which a given proportion of the observations in a data set fall
Percentiles	Specialized quantiles that divide the data into hundredths
Quartiles	Specialized quantiles that divide the data into quarters
Interquartile range (IQR)	The difference between the third and first quartiles (75th and 25th percentiles); the range spanned by the middle 50% of the data
Skewness	The departure from symmetry of a data distribution
Kurtosis	How “heavy” or “light” the tails of a data distribution are
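
As a rough illustration of the terms in the table, the sketch below computes each one with Python's standard `statistics` module. The data values are made up for this example. Two conventions are assumptions here, since software packages differ: the quartiles use the module's "inclusive" interpolation method, and skewness and kurtosis are computed as plain standardized moments with no small-sample adjustment.

```python
import statistics

# Hypothetical sample, made up for illustration only.
data = [2, 4, 4, 5, 7, 9, 11]

# Measures of central tendency
mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
geo_mean = statistics.geometric_mean(data)  # requires positive values

# Measures of variability (sample versions, n - 1 denominator)
sd = statistics.stdev(data)
var = statistics.variance(data)

# Quartiles: the three cut points that divide the ordered data into quarters
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Shape: third and fourth standardized moments (population-style, unadjusted).
# With this definition a normal distribution has kurtosis 3.
n = len(data)
pop_sd = statistics.pstdev(data)
skewness = sum(((x - mean) / pop_sd) ** 3 for x in data) / n
kurtosis = sum(((x - mean) / pop_sd) ** 4 for x in data) / n

for name, value in [
    ("mean", mean), ("median", median), ("mode", mode),
    ("geometric mean", geo_mean), ("standard deviation", sd),
    ("variance", var), ("Q1", q1), ("Q3", q3), ("IQR", iqr),
    ("skewness", skewness), ("kurtosis", kurtosis),
]:
    print(f"{name}: {value:.3f}")
```

The second quartile `q2` coincides with the median, and the positive skewness reflects the longer right tail of this small data set.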
