An l2 penalty is applied to the regression coefficients during ridge regression. Ridge regression coefficient estimates are given by the following:

$$\hat{\beta}_{\text{ridge}} = \arg\min_{\beta}\left\{-\sum_{i=1}^{N}\log L(y_i;\beta) + \lambda\sum_{j=1}^{p}\beta_j^{2}\right\}$$

where $\sum_{j=1}^{p}\beta_j^{2}$ is the l2 penalty, λ is the tuning parameter, N is the number of rows, and p is the number of variables.
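To illustrate how the l2 penalty shrinks an estimate, here is a minimal sketch for the special case of a single centered predictor with no intercept, where the ridge solution has the closed form sum(x*y) / (sum(x^2) + λ). The function name and data are illustrative, not taken from the documentation.

```python
# Minimal sketch: ridge estimate for a single centered predictor,
# using the closed form beta_hat = sum(x*y) / (sum(x^2) + lam).

def ridge_single(x, y, lam):
    """Ridge coefficient for one predictor (no intercept)."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)

x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.1, -2.2, 0.1, 1.9, 4.3]

# lam = 0 gives the unpenalized (maximum likelihood) estimate;
# larger lam shrinks the coefficient toward zero.
for lam in (0.0, 1.0, 10.0):
    print(lam, ridge_single(x, y, lam))
```

Note that the coefficient shrinks toward zero as λ grows but never reaches it exactly, which is characteristic of the l2 penalty.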
An l1 penalty is applied to the regression coefficients during Lasso. Coefficient estimates for the Lasso are given by the following:

$$\hat{\beta}_{\text{lasso}} = \arg\min_{\beta}\left\{-\sum_{i=1}^{N}\log L(y_i;\beta) + \lambda\sum_{j=1}^{p}|\beta_j|\right\}$$

where $\sum_{j=1}^{p}|\beta_j|$ is the l1 penalty, λ is the tuning parameter, N is the number of rows, and p is the number of variables.
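Unlike the l2 penalty, the l1 penalty can set coefficients exactly to zero. A minimal sketch of this behavior: for a single predictor scaled so that sum(x^2) = 1, the Lasso solution is the soft-thresholded least squares estimate. The names and values below are illustrative.

```python
# Minimal sketch: with one predictor scaled so sum(x^2) = 1, the lasso
# solution is beta_hat = sign(b) * max(|b| - lam, 0), where b is the
# unpenalized least squares estimate.

def soft_threshold(b, lam):
    """Soft-thresholding operator induced by the l1 (lasso) penalty."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0

b = 2.5  # unpenalized estimate for the standardized predictor
# lam = 0 leaves the estimate unchanged; a large enough lam
# sets the coefficient exactly to zero.
for lam in (0.0, 1.0, 3.0):
    print(lam, soft_threshold(b, lam))
```

This exact-zeroing is why the Lasso performs variable selection while ridge regression does not.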
The Elastic Net combines both l1 and l2 penalties. Coefficient estimates for the Elastic Net are given by the following:

$$\hat{\beta}_{\text{EN}} = \arg\min_{\beta}\left\{-\sum_{i=1}^{N}\log L(y_i;\beta) + \lambda\left[\alpha\sum_{j=1}^{p}|\beta_j| + (1-\alpha)\sum_{j=1}^{p}\beta_j^{2}\right]\right\}$$

where $\sum_{j=1}^{p}|\beta_j|$ is the l1 penalty, $\sum_{j=1}^{p}\beta_j^{2}$ is the l2 penalty, and λ is the tuning parameter. The α tuning parameter determines the mix of the l1 and l2 penalties, N is the number of rows, and p is the number of variables.
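A small sketch of how α mixes the two penalties, using the common parameterization λ(α·l1 + (1−α)·l2); the exact parameterization in a given implementation may differ, and the function name and data are illustrative.

```python
# Illustrative sketch of the elastic net penalty under the common
# parameterization lam * (alpha * l1 + (1 - alpha) * l2).

def elastic_net_penalty(beta, lam, alpha):
    l1 = sum(abs(b) for b in beta)   # l1 penalty: sum of |beta_j|
    l2 = sum(b * b for b in beta)    # l2 penalty: sum of beta_j^2
    return lam * (alpha * l1 + (1 - alpha) * l2)

beta = [0.5, -1.0, 2.0]
# alpha = 1 recovers the pure lasso penalty; alpha = 0 recovers ridge.
for alpha in (0.0, 0.5, 1.0):
    print(alpha, elastic_net_penalty(beta, lam=1.0, alpha=alpha))
```

Intermediate values of α retain some of the Lasso's variable selection while keeping the ridge penalty's stability for correlated predictors.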
The adaptive Lasso method uses weighted penalties to provide consistent estimates of coefficients. The weighted form of the l1 penalty is

$$\sum_{j=1}^{p} w_j\,|\beta_j|$$

where the weights $w_j$ are based on initial estimates of the coefficients. For the adaptive Lasso, this weighted form of the l1 penalty is used in determining the coefficients.
The adaptive Elastic Net uses this weighted form of the l1 penalty and also imposes a weighted form of the l2 penalty. The weighted form of the l2 penalty for the adaptive Elastic Net is

$$\sum_{j=1}^{p} w_j\,\beta_j^{2}$$
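A brief sketch of the weighting idea, assuming the common choice w_j = 1/|β̂_j| built from initial (for example, maximum likelihood) estimates; the specific weight formula is an assumption here, and all names and data are illustrative.

```python
# Illustrative sketch: adaptive weights built from initial coefficient
# estimates, assuming the common choice w_j = 1 / |beta_hat_j|.
# Large initial coefficients get small weights (penalized lightly);
# small initial coefficients get large weights (penalized heavily).

def adaptive_weights(initial_beta):
    return [1.0 / abs(b) for b in initial_beta]

def weighted_l1(beta, w):
    # weighted l1 penalty used by the adaptive Lasso
    return sum(wj * abs(bj) for wj, bj in zip(w, beta))

def weighted_l2(beta, w):
    # weighted l2 penalty used by the adaptive Elastic Net
    return sum(wj * bj * bj for wj, bj in zip(w, beta))

initial = [2.0, 0.1, -1.0]      # initial (unpenalized) estimates
w = adaptive_weights(initial)
beta = [1.8, 0.05, -0.9]        # candidate penalized coefficients
print(weighted_l1(beta, w), weighted_l2(beta, w))
```

The differential weighting is what allows the adaptive methods to shrink small, likely-spurious coefficients aggressively while leaving large coefficients nearly unbiased.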
The tuning parameters for ridge regression and the Lasso are found by searching a grid of values and selecting the tuning parameter that minimizes the penalized likelihood. This grid lies between the minimum and maximum tuning parameters. When the minimum tuning parameter is zero, the solution is unpenalized and the coefficients are the MLEs.
Values between the minimum and maximum tuning parameters (by default, between 0 and 1) are iteratively searched to determine the best tuning parameter. The grid of possible tuning parameters can be set up in three different scales: linear, log, and square root.
In some cases, there is a large gap between the unpenalized estimates and the previous step. This large gap can distort the solution path. The log scale focuses its search on small tuning parameter values with few large values, whereas the linear scale evenly disperses the search from the minimum to the maximum value. The square root scale is a compromise between the other two scales. Options for Tuning Parameter Grid Scale shows the different grid scales.
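The three scales can be sketched as follows: each one spaces grid points evenly in a transformed space and maps them back. This is a minimal illustration under assumed endpoints, not the software's actual grid routine.

```python
# Illustrative sketch of the three grid scales for the tuning parameter
# search: points are evenly spaced in a transformed space, then mapped back.
import math

def tuning_grid(lo, hi, n, scale="linear"):
    """Return n candidate tuning parameter values from lo to hi."""
    if scale == "linear":
        f, finv = (lambda t: t), (lambda t: t)
    elif scale == "sqrt":
        f, finv = math.sqrt, (lambda t: t * t)
    elif scale == "log":
        f, finv = math.log, math.exp   # requires lo > 0
    else:
        raise ValueError(scale)
    a, b = f(lo), f(hi)
    return [finv(a + (b - a) * i / (n - 1)) for i in range(n)]

# The log scale concentrates points near the small end of the range;
# the square root scale sits between log and linear.
for scale in ("linear", "sqrt", "log"):
    print(scale, [round(v, 4) for v in tuning_grid(0.01, 1.0, 5, scale)])
```

Comparing the printed grids shows why the log scale helps when the interesting behavior happens at small tuning parameter values: most of its points fall there.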
The distributions fit by the Generalized Regression personality are given below in terms of the parameters used in model fitting. Although not stated explicitly in their descriptions, the Generalized Regression personality enables you to enter non-integer values for the discrete distributions.