Mode

What is the mode?

The mode of a set of data values is the most frequently occurring data value. It is often used to describe categorical data, whether ordinal or nominal. If two or more categories tie for the highest count, there is more than one mode. You can visualize the mode by examining a histogram of the data: modes occur at the local maxima (the peaks) of the histogram.

Mode is also used to describe how many peaks there are in the distribution of continuous data. Multiple peaks typically correspond to multiple sources of variability in the process that provided the data. Examination of modes can give insight into those sources of variation.

Data can have no modes if all values occur only once. Data can have one mode if there is one peak in the histogram. Data can also have more than one mode if there are multiple peaks in the histogram.

How do you calculate the mode?

For categorical data, the mode is the most frequently occurring value. To find it, count how many data values are in each level of the categorical variable. The level with the most data values is the mode. If there is a tie, there is more than one mode. If all levels have the same amount of data, there is no mode.

As an example, consider the table below. We asked 100 people which flavor of ice cream they preferred: vanilla, chocolate, or strawberry. Here are the results:

Flavor        Number of people
Vanilla       53
Chocolate     31
Strawberry    16

The histogram of these data is given in the following figure. The mode is the most frequently occurring flavor, vanilla.
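The counting procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original analysis: the function name find_modes is hypothetical, and the list of individual responses is rebuilt from the counts in the table. It follows the conventions stated earlier, returning every level tied for the highest count and returning no mode when all levels have the same count.

```python
from collections import Counter

def find_modes(values):
    """Return every level tied for the highest count.

    Returns an empty list when there are no values, or when all levels
    occur equally often (the "no mode" convention described above).
    """
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if len(counts) > 1 and all(c == top for c in counts.values()):
        return []
    return [level for level, c in counts.items() if c == top]

# Individual responses rebuilt from the table: 53 + 31 + 16 = 100 people.
responses = ["Vanilla"] * 53 + ["Chocolate"] * 31 + ["Strawberry"] * 16

print(find_modes(responses))  # ['Vanilla']
```

If two flavors had tied for first place, both would be returned, which is the bimodal case.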

Data can have more than one mode. Data with two modes are called bimodal; data with more than two modes are called multimodal.

With continuous data, it is not useful to count how many times each data value occurs, since no two values (measured on a fine enough scale) will be exactly equal. Instead, the data are binned as they are for a histogram. The mode is the midpoint of the bin that has the highest frequency. The bin width is an important parameter: changing it can change both the location and the number of modes.
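As a rough sketch of the binning approach, the Python below bins a small data set and reports the midpoint of each bin with the highest frequency. The function name binned_modes is hypothetical, and starting the first bin at the smallest data value is an assumption made here for simplicity; other bin origins are equally valid and can give different answers. The data are the thirteen values used in the kernel density example later in this article.

```python
from collections import Counter

data = [5, 7, 8, 9.5, 10, 10.5, 11, 11.2, 12, 13, 15, 17, 18]

def binned_modes(values, bin_width):
    """Midpoints of the bins with the highest frequency.

    Bins are [start, start + width), [start + width, start + 2 * width), ...
    where start is the smallest value (an assumption of this sketch).
    """
    start = min(values)
    counts = Counter(int((v - start) // bin_width) for v in values)
    top = max(counts.values())
    return [start + (i + 0.5) * bin_width
            for i in sorted(counts) if counts[i] == top]

print(binned_modes(data, 2.0))  # [10.0, 12.0]
print(binned_modes(data, 4.0))  # [11.0]
```

With a bin width of 2, two bins tie with three values each, so the sketch reports midpoints 10 and 12. With a bin width of 4, a single bin holds six values and the only mode is at 11, which shows how strongly the result depends on the bin width.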

How can I tell how many modes there are for continuous data?

For continuous data, the notion of a mode as the single most frequent value doesn’t make sense, since all data values will be different if measured to enough decimal places. Binning the data and replacing the continuous data values with the midpoints of the bins leads to the histogram method of finding modes.

Another method, kernel density estimation, estimates the probability density function from the data themselves and then determines the modes of the estimated density. Essentially, kernel density estimation replaces each data value with a normal distribution centered at that value with a specified standard deviation. These normal distributions are then summed to give the overall curve. The mode is at the maximum value of this curve. Multiple modes can be estimated using the local maxima of the curve.

As an example comparing the histogram and the kernel density estimate, consider the data values 5, 7, 8, 9.5, 10, 10.5, 11, 11.2, 12, 13, 15, 17, and 18. In the figure below, the data are plotted as blue circles, the normal distribution kernel functions for each observation are drawn in black, and the overall estimator for the data distribution is in red. When the standard deviation of each kernel function decreases, more modes appear. When the standard deviation of each kernel function increases, modes disappear. Kernel density estimation involves choosing the standard deviation of the normal distribution kernel functions (often called the bandwidth) that works best for your application.
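The effect of the kernel standard deviation on the number of modes can be explored with a short Python sketch. This is an illustration under stated assumptions rather than the method used to produce the figure: the function names are hypothetical, the density is evaluated on an evenly spaced grid of 2000 points, the standard deviations 0.3, 1.0, and 3.0 are arbitrary values chosen for comparison, and the summed kernels are divided by the number of observations so that the curve integrates to one (this rescaling does not move the peaks).

```python
import math

data = [5, 7, 8, 9.5, 10, 10.5, 11, 11.2, 12, 13, 15, 17, 18]

def kde(x, values, sd):
    """Sum of normal densities centered at each data value, divided by n."""
    norm = 1.0 / (len(values) * sd * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - v) / sd) ** 2) for v in values)

def estimated_modes(values, sd, points=2000):
    """Grid locations where the estimated density has a local maximum."""
    lo = min(values) - 3.0 * sd
    hi = max(values) + 3.0 * sd
    step = (hi - lo) / (points - 1)
    grid = [lo + i * step for i in range(points)]
    dens = [kde(x, values, sd) for x in grid]
    # A grid point is a peak if the curve rises into it and does not rise after it.
    return [grid[i] for i in range(1, points - 1)
            if dens[i - 1] < dens[i] >= dens[i + 1]]

# Arbitrary standard deviations, chosen only to compare narrow and wide kernels.
for sd in (0.3, 1.0, 3.0):
    modes = estimated_modes(data, sd)
    print(f"sd={sd}: {len(modes)} mode(s) near {[round(m, 1) for m in modes]}")
```

Running the loop with different standard deviations shows the pattern described above: narrow kernels produce many peaks, and wide kernels merge them.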

Can I use the mode for categorical variables?

Yes! In fact, the mode is the one measure of central tendency that is always applicable to categorical data, whether nominal or ordinal.

Examples of the mode

If you have JMP on your computer, you can download the JMP data set Univariate Statistics Data.jmp for your own analysis. (If you don't have access to JMP, download a free trial here.)

Examine the modes for the data in the Univariate Statistics Data. Are there no modes, one mode, or more than one mode?