To compute the pth quantile of n non-missing values in a column, arrange the n values in ascending order and call these column values y1, y2, ..., yn. Compute the rank number for the pth quantile as (p/100)(n + 1).
 • If the result is an integer, the pth quantile is that rank’s corresponding value.
 • If the result is not an integer, the pth quantile is found by interpolation. The pth quantile, denoted qp, is computed as qp = (1 − f)yi + f·yi+1, where:
 ‒ n is the number of non-missing values for a variable
 ‒ y1, y2, ..., yn represents the ordered values of the variable
 ‒ yn+1 is taken to be yn
 ‒ i is the integer part and f is the fractional part of (n + 1)p, so that (n + 1)p = i + f
For example, suppose the column contains n = 15 values. The value y12 is the 75th quantile, because (p/100)(n + 1) = 0.75 × 16 = 12 is an integer. The 90th quantile is interpolated: 0.90 × 16 = 14.4, so i = 14 and f = 0.4, and q90 = 0.6y14 + 0.4y15, a weighted average of the 14th and 15th ranked values.
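The quantile rule above can be sketched in Python (an illustrative helper, not JMP code; the function name is hypothetical):

```python
def quantile(values, p):
    """pth quantile (p in percent) via the (p/100)(n + 1) rank rule."""
    y = sorted(values)                    # y[0] is y1, ..., y[n-1] is yn
    n = len(y)
    rank = (p / 100) * (n + 1)
    i = int(rank)                         # integer part
    f = rank - i                          # fractional part
    if i < 1:
        return y[0]
    if i >= n:
        return y[-1]                      # y(n+1) is taken to be y(n)
    if f == 0:
        return y[i - 1]                   # exact rank: use that value
    return (1 - f) * y[i - 1] + f * y[i]  # interpolate between y(i) and y(i+1)
```

With the 15 ordered values 1, 2, ..., 15 this reproduces the example above: the 75th quantile is y12 = 12 and the 90th is 0.6·y14 + 0.4·y15 = 14.4.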
 1 Each column value is multiplied by its corresponding weight or frequency.
 2 These values are added and divided by the sum of the weights or frequencies.
The standard error of the mean is computed by dividing the sample standard deviation, s, by the square root of N. In the launch window, if you specified a column for Weight or Freq, then the denominator is the square root of the sum of the weights or frequencies.
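A minimal sketch of this computation (illustrative Python, assuming an n − 1 denominator for the weighted sample variance):

```python
import math

def std_error_of_mean(values, weights=None):
    """s / sqrt(N); with a Weight or Freq column, s / sqrt(sum of weights)."""
    n = len(values)
    if weights is None:
        weights = [1.0] * n
    wsum = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / wsum
    # weighted sample variance with n - 1 in the denominator
    s2 = sum(w * (x - mean) ** 2 for w, x in zip(weights, values)) / (n - 1)
    return math.sqrt(s2 / wsum)
```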
and wi is a weight term (= 1 for equally weighted items).
where wi is a weight term (= 1 for equally weighted items). Using this formula, the Normal distribution has a kurtosis of 0.
where ri is the rank of the ith observation, and N is the number of non-missing (and nonexcluded) observations.
where Φ is the cumulative probability distribution function for the normal distribution.
 • There are N observations:
X1, X2, ..., XN
 • The null hypothesis is:
H0: median = m
 • The differences between observations and the hypothesized value m are calculated as follows:
Dj = Xj - m
 • There are N pairs of observations from two populations:
X1, X2, ..., XN and Y1, Y2, ..., YN
 • The null hypothesis is:
H0: medianX - Y = 0
 • The differences between pairs of observations are calculated as follows:
Dj = Xj -Yj
 • The absolute values of the differences, |Dj|, are ranked from smallest to largest.
 • The ranks start with the value 1, even if there are differences of zero.
 • When there are tied absolute differences, they are assigned the average, or midrank, of the ranks of the observations.
Denote the rank or midrank for a difference by Rj. Define the signed rank for Dj as follows:
 • If the difference is positive, the signed rank is Rj.
 • If the difference is zero, the signed rank is 0.
 • If the difference is negative, the signed rank is -Rj.
d0 is the number of signed ranks that equal zero
R+ is the sum of the positive signed ranks
For N ≤ 20, exact p-values are calculated.
For N > 20, a Student’s t approximation to the statistic defined below is used. Note that a correction for ties is applied. See Iman (1974) and Lehmann (1998).
Under the null hypothesis, the mean of W is zero. The variance of W is given by the following:
The last summation in the expression for Var(W) is a correction for ties. The notation di for i > 0 represents the number of values in the ith group of non-zero signed ranks. (If there are no ties for a given signed rank, then di = 1 and the summand is 0.)
The statistic t given by the following has an approximate t distribution with N - 1 degrees of freedom:
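The signed-rank construction in the bullets above can be sketched as follows (illustrative Python; R+ is then the sum of the positive signed ranks):

```python
def signed_ranks(x, m):
    """Signed ranks of the differences Dj = Xj - m, with midranks for ties."""
    d = [xj - m for xj in x]
    order = sorted(range(len(d)), key=lambda j: abs(d[j]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(d):
        j = i
        while j < len(d) and abs(d[order[j]]) == abs(d[order[i]]):
            j += 1
        midrank = (i + 1 + j) / 2          # average of the tied ranks i+1 .. j
        for k in range(i, j):
            ranks[order[k]] = midrank
        i = j
    # a zero difference gets signed rank 0; otherwise the rank carries the sign of Dj
    return [0.0 if dj == 0 else (r if dj > 0 else -r) for dj, r in zip(d, ranks)]

sr = signed_ranks([3, 5, 8, 1], m=4)       # differences: -1, 1, 4, -3
r_plus = sum(r for r in sr if r > 0)       # R+ = 1.5 + 4 = 5.5
```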
The Test Statistic is distributed as a Chi-square variable with n - 1 degrees of freedom when the population is normal.
The Min PValue is the p-value of the two-tailed test, calculated as 2 × min(p1, p2), where p1 is the lower one-tail p-value and p2 is the upper one-tail p-value.
 • Φ is the cumulative probability distribution function for the normal distribution
 • ri is the rank of the ith observation
 • N is the number of non-missing observations
 • X is the original column
 • x̄ is the mean of column X
 • sX is the standard deviation of column X
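The standardize computation reduces to the usual z-score (an illustrative Python sketch):

```python
import statistics

def standardize(column):
    """(X - mean of X) / standard deviation of X, value by value."""
    xbar = statistics.mean(column)
    s = statistics.stdev(column)        # sample standard deviation
    return [(x - xbar) / s for x in column]
```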
 • For m future observations:
 • For the mean of m future observations:
 • For the standard deviation of m future observations:
where m = number of future observations, and n = number of points in current analysis sample.
 • The one-sided intervals are formed by using 1-α in the quantile functions.
t is the quantile from the non-central t-distribution, and Φ-1 is the standard normal quantile function.
s is the standard deviation, and g is a constant that can be found in Table 4 of Odeh and Owen (1980).
To determine g, consider the fraction of the population captured by the tolerance interval. Tamhane and Dunlop (2000) give this fraction as follows:
where Φ denotes the standard normal c.d.f. (cumulative distribution function). Therefore, g solves the following equation:
where 1-γ is the fraction of all future observations contained in the tolerance interval.
 • Long-term uses the overall sigma. This option is used for Ppk statistics, and computes sigma as follows:
Note: There is a preference for Distribution called Ppk Capability Labeling that labels the long-term capability output with Ppk labels. This option is found using File > Preferences, then select Platforms > Distribution.
 • Specified Sigma enables you to type a specific, known sigma used for computing capability analyses. Sigma is user-specified, and is therefore not computed.
 • Moving Range enables you to enter a range span, which computes sigma as follows:
d2(n) is the expected value of the range of n independent normally distributed variables with unit standard deviation.
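As a sketch, with the default range span of 2, the estimate is the average moving range divided by d2(2) ≈ 1.128 (the constant below is the standard tabulated value, not taken from this document):

```python
def moving_range_sigma(x, span=2, d2=1.128):
    """sigma = (average range of `span` consecutive points) / d2(span)."""
    ranges = [max(x[i:i + span]) - min(x[i:i + span])
              for i in range(len(x) - span + 1)]
    return sum(ranges) / len(ranges) / d2
```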
 • Short Term Sigma, Group by Fixed Subgroup Size: if r is the number of subgroups, nj is the size of the jth subgroup, and each subgroup is defined by the order of the data, sigma is computed as follows:
 • This formula is commonly referred to as the Root Mean Square Error, or RMSE.
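The pooled short-term estimate amounts to the following sketch (illustrative Python):

```python
import math

def short_term_sigma(subgroups):
    """Root mean square error pooled over subgroups: sqrt(SSE / sum(nj - 1))."""
    sse = 0.0
    df = 0
    for grp in subgroups:
        mean = sum(grp) / len(grp)
        sse += sum((x - mean) ** 2 for x in grp)   # within-subgroup squared error
        df += len(grp) - 1
    return math.sqrt(sse / df)
```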
 • USL is the upper spec limit
 • LSL is the lower spec limit
where γ is as defined above.
 • A capability index of 1.33 is considered to be the minimum acceptable. For a normal distribution, this gives an expected number of nonconforming units of about 6 per 100,000.
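For reference, the conventional 3-sigma capability indices behind the 1.33 guideline can be computed as follows (a sketch of the standard textbook formulas, not of JMP internals):

```python
def capability(mean, sigma, lsl, usl):
    """Cp, CPL, CPU, and Cpk from the spec limits and process mean/sigma."""
    cp = (usl - lsl) / (6 * sigma)
    cpl = (mean - lsl) / (3 * sigma)
    cpu = (usl - mean) / (3 * sigma)
    return cp, cpl, cpu, min(cpl, cpu)   # Cpk is the smaller one-sided index
```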
 • Exact 100(1 - α)% lower and upper confidence limits for CPL are computed using a generalization of the method of Chou et al. (1990), who point out that the 100(1 - α) lower confidence limit for CPL (denoted by CPLLCL) satisfies the following equation:
where Tn-1(δ) has a non-central t-distribution with n - 1 degrees of freedom and noncentrality parameter δ.
 • Exact 100(1 - α)% lower and upper confidence limits for CPU are also computed using a generalization of the method of Chou et al. (1990), who point out that the 100(1 - α) lower confidence limit for CPU (denoted CPULCL) satisfies the following equation:
where Tn-1(δ) has a non-central t-distribution with n - 1 degrees of freedom and noncentrality parameter δ.
 • Sigma Quality is defined as follows:
For example, if there are 3 defects in n=1,000,000 observations, the formula yields 6.03, or a 6.03 sigma process. The results of the computations of the Sigma Quality Above USL and Sigma Quality Below LSL column values do not sum to the Sigma Quality Total Outside column value because calculating Sigma Quality involves finding normal distribution quantiles, and is therefore not additive.
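The 6.03 figure can be reproduced under the conventional definition Sigma Quality = Φ-1(1 - defects/n) + 1.5 (a sketch assuming the standard 1.5-sigma shift, which matches the example above):

```python
from statistics import NormalDist

def sigma_quality(defects, n):
    """Standard normal quantile of the conforming fraction, plus the 1.5 shift."""
    return NormalDist().inv_cdf(1 - defects / n) + 1.5
```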
 • Here are the Benchmark Z formulas:
The Normal fitting option estimates the parameters of the normal distribution. The normal distribution is often used to model measures that are symmetric, with most of the values falling in the middle of the curve. Select the Normal fitting option for any set of data to test how well a normal distribution fits your data.
 • μ (the mean) defines the location of the distribution on the x-axis
 • σ (standard deviation) defines the dispersion or spread of the distribution
E(x) = μ
Var(x) = σ²
The LogNormal fitting option estimates the parameters μ (scale) and σ (shape) for the two-parameter lognormal distribution. A variable Y is lognormal if and only if ln(Y) is normal. The data must be greater than zero.
E(x) = exp(μ + σ²/2)
Var(x) = exp(2μ + σ²)[exp(σ²) − 1]
The Weibull distribution has different shapes depending on the values of α (scale) and β (shape). It often provides a good model for estimating the length of life, especially for mechanical devices and in biology. The Weibull option is the same as the Weibull with threshold option, with a threshold (θ) parameter of zero. For the Weibull with threshold option, JMP estimates the threshold as the minimum value. If you know what the threshold should be, set it by using the Fix Parameters option. See Fit Distribution Options.
E(x) = θ + αΓ(1 + 1/β)
Var(x) = α²[Γ(1 + 2/β) − Γ²(1 + 1/β)]
The Extreme Value distribution is a two-parameter Weibull (α, β) distribution with the transformed parameters δ = 1 / β and λ = ln(α).
The Exponential distribution is a special case of the two-parameter Weibull when β = 1 and α = σ, and also a special case of the Gamma distribution when α = 1.
E(x) = σ
Var(x) = σ²
Devore (1995) notes that an exponential distribution is memoryless. Memoryless means that if you check a component after t hours and it is still working, the distribution of additional lifetime (the conditional probability of additional life given that the component has lived until t) is the same as the original distribution.
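Memorylessness can be checked directly from the exponential survival function P(X > t) = exp(-t/σ) (an illustrative Python sketch):

```python
import math

def survival(t, sigma):
    """P(X > t) for an Exponential distribution with scale sigma."""
    return math.exp(-t / sigma)

sigma, s, t = 2.0, 5.0, 3.0
# P(X > s + t | X > s) equals P(X > t): the component does not "age"
conditional = survival(s + t, sigma) / survival(s, sigma)
```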
The Gamma fitting option estimates the gamma distribution parameters, α > 0 and σ > 0. The parameter α, called alpha in the fitted gamma report, describes shape or curvature. The parameter σ, called sigma, is the scale parameter of the distribution. A third parameter, θ, called the Threshold, is the lower endpoint parameter. It is set to zero by default, unless there are negative values. You can also set its value by using the Fix Parameters option. See Fit Distribution Options.
E(x) = ασ + θ
Var(x) = ασ²
 • The standard gamma distribution has σ = 1. Sigma is called the scale parameter because values other than 1 stretch or compress the distribution along the x-axis.
 • The Chi-square distribution occurs when σ = 2, α = ν/2, and θ = 0.
 • The exponential distribution is the special case where α = 1.
The standard beta distribution is useful for modeling the behavior of random variables that are constrained to fall in the interval [0, 1]. For example, proportions always fall between 0 and 1. The Beta fitting option estimates two shape parameters, α > 0 and β > 0. There are also θ and σ, which are used to define the lower threshold as θ and the upper threshold as θ + σ. The beta distribution has values only for the interval θ ≤ x ≤ θ + σ. The θ is estimated as the minimum value, and σ is estimated as the range. The standard beta distribution occurs when θ = 0 and σ = 1.
Set parameters to fixed values by using the Fix Parameters option. The upper threshold must be greater than or equal to the maximum data value, and the lower threshold must be less than or equal to the minimum data value. For details about the Fix Parameters option, see Fit Distribution Options.
E(x) = θ + σα/(α + β)
Var(x) = σ²αβ/[(α + β)²(α + β + 1)]
The Normal Mixtures option fits a mixture of normal distributions. This flexible distribution is capable of fitting multi-modal data.
Fit a mixture of two or three normal distributions by selecting the Normal 2 Mixture or Normal 3 Mixture options. Alternatively, you can fit a mixture of k normal distributions by selecting the Other option. A separate mean, standard deviation, and proportion of the whole is estimated for each group.
E(x) = Σi πiμi
Var(x) = Σi πi(μi² + σi²) − (Σi πiμi)²
where μi, σi, and πi are the respective mean, standard deviation, and proportion for the ith group, and φ(·) is the standard normal pdf.
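The mixture moments follow directly from the group parameters via the standard mixture identities (an illustrative Python sketch using the notation above):

```python
def mixture_moments(mus, sigmas, pis):
    """Mean and variance of a normal mixture with per-group mu_i, sigma_i, pi_i."""
    mean = sum(p * m for p, m in zip(pis, mus))
    # E[X^2] for each group is sigma_i^2 + mu_i^2; subtract the squared mean
    var = sum(p * (s ** 2 + m ** 2) for p, s, m in zip(pis, sigmas, mus)) - mean ** 2
    return mean, var
```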
The Smooth Curve option fits a smooth curve using nonparametric density estimation (kernel density estimation). The smooth curve is overlaid on the histogram and a slider appears beneath the plot. Control the amount of smoothing by changing the kernel standard deviation with the slider. The initial Kernel Std estimate is calculated from the standard deviation of the data.
 • Johnson Su, which is unbounded.
 • Johnson Sb, which has bounds on both tails defined by parameters that can be estimated.
 • Johnson Sl, which is bounded in one tail by a parameter that can be estimated. The Johnson Sl family contains the family of lognormal distributions.
pdf:   defined for   θ < x < θ + σ;   0 < σ
pdf:   defined for   θ < x if σ = 1;   θ > x if σ = -1
If z = [ln((x + √(x² + λ²))/2) − μ]/σ ~ N(0,1), then x ~ Glog(μ,σ,λ).
When λ = 0, the Glog reduces to the LogNormal (μ,σ).
 ‒ logL is the logLikelihood
 ‒ n is the sample size
 ‒ ν is the number of parameters
E(x) = λ
Var(x) = λ
This distribution is useful when the data is a combination of several Poisson(μ) distributions, each with a different μ. One example is the overall number of accidents combined from multiple intersections, when the mean number of accidents (μ) varies between the intersections.
The Gamma Poisson distribution results from assuming that x|μ follows a Poisson distribution and μ follows a Gamma(α,τ). The Gamma Poisson has parameters λ = ατ and σ = τ+1. The parameter σ is a dispersion parameter. If σ > 1, there is overdispersion, meaning there is more variation in x than explained by the Poisson alone. If σ = 1, x reduces to Poisson(λ).
E(x) = λ
Var(x) = λσ
Remember that x|μ ~ Poisson(μ), while μ ~ Gamma(α,τ). The platform estimates λ = ατ and σ = τ+1. To obtain estimates for α and τ, invert these relationships: τ = σ − 1 and α = λ/τ.
If the estimate of σ is 1, the formulas do not work. In that case, the Gamma Poisson has reduced to the Poisson(λ), and λ̂ is the estimate of λ.
If the estimate for α is an integer, the Gamma Poisson is equivalent to a Negative Binomial with the following pmf:
with r = α and (1-p)/p = τ.
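Inverting λ = ατ and σ = τ + 1 gives the estimates directly (an illustrative Python sketch; requires σ > 1):

```python
def gamma_poisson_alpha_tau(lam, sigma):
    """Recover alpha and tau from the platform's lambda and sigma estimates."""
    tau = sigma - 1          # from sigma = tau + 1
    alpha = lam / tau        # from lambda = alpha * tau
    return alpha, tau

alpha, tau = gamma_poisson_alpha_tau(lam=6.0, sigma=4.0)
```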
Run demoGammaPoisson.jsl in the JMP Samples/Scripts folder to compare a Gamma Poisson distribution with parameters λ and σ to a Poisson distribution with parameter λ.
The Binomial option accepts data in two formats: a constant sample size, or a column containing sample sizes.
E(x) = np
Var(x) = np(1-p)
The Beta Binomial distribution results from assuming that x|π follows a Binomial(n,π) distribution and π follows a Beta(α,β). The Beta Binomial has parameters p = α/(α+β) and δ = 1/(α+β+1). The parameter δ is a dispersion parameter. When δ > 0, there is overdispersion, meaning there is more variation in x than explained by the Binomial alone. When δ < 0, there is underdispersion. When δ = 0, x is distributed as Binomial(n,p). The Beta Binomial exists only when n ≥ 2.
E(x) = np
Var(x) = np(1-p)[1+(n-1)δ]
Remember that x|π ~ Binomial(n,π), while π ~ Beta(α,β). The parameters p = α/(α+β) and δ = 1/(α+β+1) are estimated by the platform. To obtain estimates of α and β, invert these relationships: α = p(1 − δ)/δ and β = (1 − p)(1 − δ)/δ.
If the estimate of δ is 0, the formulas do not work. In that case, the Beta Binomial has reduced to the Binomial(n,p), and p̂ is the estimate of p.
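Similarly, inverting p = α/(α+β) and δ = 1/(α+β+1) gives the estimates (an illustrative Python sketch; requires δ > 0):

```python
def beta_binomial_alpha_beta(p, delta):
    """Recover alpha and beta from the platform's p and delta estimates."""
    total = (1 - delta) / delta          # alpha + beta, from delta = 1/(alpha+beta+1)
    return p * total, (1 - p) * total    # alpha = p*(alpha+beta), beta is the rest

alpha, beta = beta_binomial_alpha_beta(p=0.25, delta=0.2)
```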
Run demoBetaBinomial.jsl in the JMP Samples/Scripts folder to compare a Beta Binomial distribution with dispersion parameter δ to a Binomial distribution with parameters p and n = 20.
The fitted quantiles in the Diagnostic Plot and the fitted quantiles saved with the Save Fitted Quantiles command are formed using the following method:
 1 The data are sorted and ranked. Ties are assigned different ranks.
 2 Compute p[i] = rank[i]/(n + 1).
 3 Compute the quantile[i] = Quantiled(p[i]) where Quantiled is the quantile function for the specific fitted distribution, and i = 1,2,...,n.
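A sketch of this procedure (illustrative Python; `quantile_fn` stands in for the fitted distribution's quantile function):

```python
from statistics import NormalDist

def fitted_quantiles(data, quantile_fn):
    """Map rank[i]/(n + 1) through the fitted distribution's quantile function."""
    n = len(data)
    order = sorted(range(n), key=lambda i: data[i])   # ties get distinct ranks
    q = [0.0] * n
    for rank, i in enumerate(order, start=1):
        q[i] = quantile_fn(rank / (n + 1))
    return q

# Example with a fitted standard Normal: p = 1/4, 2/4, 3/4 for n = 3 values
q = fitted_quantiles([5.0, 1.0, 3.0], NormalDist().inv_cdf)
```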
Distribution            Parameters                            Goodness-of-Fit Test
Normal                  μ and σ are unknown                   Shapiro-Wilk (for n ≤ 2000); Kolmogorov-Smirnov-Lilliefors (for n > 2000)
                        μ and σ are both known                Kolmogorov-Smirnov-Lilliefors
                        either μ or σ is known                (none)
LogNormal               μ and σ known or unknown
Weibull                 α and β known or unknown
Weibull with threshold  α, β, and θ known or unknown
Extreme Value           α and β known or unknown
Exponential             σ known or unknown
Gamma                   α and σ are known
                        either α or σ is unknown              (none)
Beta                    α and β are known
                        either α or β is unknown              (none)
Binomial                ρ known or unknown, and n known       Kolmogorov's D (for n ≤ 30); Pearson χ² (for n > 30)
Beta Binomial           ρ and δ known or unknown              Kolmogorov's D (for n ≤ 30); Pearson χ² (for n > 30)
Poisson                 λ known or unknown                    Kolmogorov's D (for n ≤ 30); Pearson χ² (for n > 30)
Gamma Poisson           λ or σ known or unknown               Kolmogorov's D (for n ≤ 30); Pearson χ² (for n > 30)

Writing T for the target, LSL and USL for the lower and upper specification limits, and Pα for the 100αth percentile, the generalized capability indices are as follows:
For example, for a Normal distribution with K = 3, the points 3 standard deviations below and above the mean correspond to the 0.00135 and 0.99865 quantiles, respectively. The lower specification limit is set at the 0.00135 quantile and the upper specification limit at the 0.99865 quantile of the fitted distribution, and a capability analysis is returned based on those specification limits.
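As a sketch, the generalized Cp reduces to the classical index for a fitted Normal (illustrative Python; `quantile_fn` is the fitted distribution's quantile function):

```python
from statistics import NormalDist

def generalized_cp(lsl, usl, quantile_fn):
    """(USL - LSL) divided by the fitted 0.00135-to-0.99865 quantile range."""
    return (usl - lsl) / (quantile_fn(0.99865) - quantile_fn(0.00135))

# For Normal(10, 1) the quantile range is about 6 sigma, so Cp is about (16-4)/6 = 2
cp = generalized_cp(4.0, 16.0, NormalDist(mu=10.0, sigma=1.0).inv_cdf)
```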