Publication date: 09/27/2021

This section contains statistical details for the Explore Patterns utility.

To calculate the rarity for longest runs, first define the following variables:

n = the number of rows in the column

k = the number of times a specific value occurs in the column

p = k/n = the probability of observing the specific value in the column

m = the length of the run

N = the number of unique runs

Then, the rarity for longest runs is calculated as follows:

$\text{Rarity} = -\log_2\left(1 - \left(1 - p^{\,m-1}\right)^{N}\right)$
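As an illustration, the longest-run calculation can be sketched in Python. This is a minimal sketch and not JMP's implementation: the function name `longest_run_rarity` and its argument names are hypothetical, and N is taken as an input because this section does not say how the runs are counted. Under an independence assumption, the quantity inside the logarithm reads as the probability that at least one of N runs reaches length m, so a larger rarity corresponds to a less likely run.

```python
import math


def longest_run_rarity(n: int, k: int, m: int, N: int) -> float:
    """Rarity of a longest run, following the formula in this section.

    n: number of rows in the column
    k: number of times the specific value occurs in the column
    m: length of the run
    N: number of unique runs (supplied by the caller; an assumption here)
    """
    if n <= 0 or not 0 < k <= n:
        raise ValueError("require n > 0 and 0 < k <= n")
    if m < 1 or N < 1:
        raise ValueError("require m >= 1 and N >= 1")

    p = k / n  # probability of observing the specific value
    # Probability term inside the logarithm: 1 - (1 - p^(m-1))^N
    prob = 1.0 - (1.0 - p ** (m - 1)) ** N
    if prob <= 0.0:
        # The term underflowed to zero, so the run is effectively impossible.
        return math.inf
    return -math.log2(prob)


# Hypothetical inputs: a value that fills half of 100 rows,
# with runs of length 4 and length 8 and N = 10 in both cases.
short_run = longest_run_rarity(n=100, k=50, m=4, N=10)
long_run = longest_run_rarity(n=100, k=50, m=8, N=10)
print(short_run, long_run)
```

With the other inputs held fixed, the longer run yields the larger rarity value.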

To calculate the rarity for longest sequences, first define the following variables:

p = the probability of observing the specific sequence one time in the column

k = the number of times the starting value of the sequence occurs in the column

Then, the rarity for longest sequences is calculated as follows:

$\text{Rarity} = -\log_2\left(1 - (1 - p)^{k}\right)$
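The longest-sequence calculation can be sketched the same way. Again, this is a hedged illustration and not JMP's code: the function name `longest_sequence_rarity` is hypothetical, and p is passed in directly because this section does not describe how the probability of the sequence is estimated. Under an independence assumption, the term inside the logarithm reads as the probability that the sequence appears at least once among the k occurrences of its starting value.

```python
import math


def longest_sequence_rarity(p: float, k: int) -> float:
    """Rarity of a longest sequence, following the formula in this section.

    p: probability of observing the specific sequence one time in the column
    k: number of times the starting value of the sequence occurs in the column
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("require 0 < p <= 1")
    if k < 1:
        raise ValueError("require k >= 1")

    # Probability term inside the logarithm: 1 - (1 - p)^k
    prob = 1.0 - (1.0 - p) ** k
    if prob <= 0.0:
        # p is so small that the term underflowed to zero.
        return math.inf
    return -math.log2(prob)


# Hypothetical inputs: the same sequence probability with few
# and with many occurrences of the starting value.
few_starts = longest_sequence_rarity(p=0.01, k=5)
many_starts = longest_sequence_rarity(p=0.01, k=200)
print(few_starts, many_starts)
```

For a fixed p, more occurrences of the starting value give the sequence more chances to appear, so the rarity decreases as k grows.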
