The Generalized Regression personality features regularized, or penalized, regression techniques. Such techniques fit models by shrinking the coefficient estimates toward zero. The resulting estimates are biased, but the accompanying decrease in prediction variance can lower overall prediction error. Two of these techniques, the Lasso and the Elastic Net, include variable selection as part of the modeling procedure.
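The shrinkage idea can be sketched in a few lines of Python using only NumPy. This is an illustrative example, not JMP's implementation; the penalty weight `lam` and the simulated data are arbitrary. It uses the closed-form ridge estimator, the simplest penalized fit, to show that penalization pulls the coefficient vector toward zero:

```python
import numpy as np

# Simulated data: 100 observations, 5 predictors, 2 of them inactive.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
beta_true = np.array([2.0, -1.5, 1.0, 0.0, 0.0])
y = X @ beta_true + rng.normal(scale=0.5, size=100)

# Ordinary least squares: minimize ||y - Xb||^2.
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

# Ridge (L2-penalized) regression: minimize ||y - Xb||^2 + lam * ||b||^2.
# The closed-form solution shrinks the coefficient vector toward zero.
lam = 10.0
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ y)

# The penalized estimates are smaller in norm than the OLS estimates:
# biased, but potentially lower-variance.
print(np.linalg.norm(beta_ridge) < np.linalg.norm(beta_ols))  # True
```

Larger values of `lam` produce more shrinkage; `lam = 0` recovers ordinary least squares.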
Modeling techniques such as the Elastic Net and the Lasso are particularly promising for large data sets, where collinearity is typically a problem. In fact, modern data sets often include more variables than observations. This situation is sometimes referred to as the p > n problem, where n is the number of observations and p is the number of predictors. Such data sets require variable selection if traditional modeling techniques are to be used.
The Elastic Net and Lasso are also useful for small data sets with little correlation, including designed experiments. They can be used to obtain better predictive models or to select variables for model reduction or for future study.
The Lasso and Elastic Net are relatively recent techniques (Tibshirani 1996; Zou and Hastie 2005). Both techniques penalize the size of the model coefficients, resulting in continuous shrinkage. An optimal level of shrinkage is determined by one of several validation methods. Both techniques have the ability to shrink coefficients to zero. In this way, variable selection is built into the modeling procedure. The Elastic Net model subsumes both the Lasso and ridge regression as special cases.
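The special-case relationship can be checked directly in Python with scikit-learn, whose `ElasticNet` estimator mixes the two penalties through its `l1_ratio` parameter. This is an illustrative sketch, not JMP's implementation, and the `alpha` value is arbitrary. With `l1_ratio=1` the Elastic Net objective reduces exactly to the Lasso; as `l1_ratio` approaches 0 it approaches ridge regression:

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(2)
X = rng.normal(size=(80, 4))
y = X @ np.array([1.5, 0.0, -2.0, 0.0]) + rng.normal(scale=0.3, size=80)

# Elastic Net penalty: alpha * (l1_ratio * ||b||_1 + (1 - l1_ratio)/2 * ||b||^2).
# With l1_ratio=1, only the L1 term remains -- exactly the Lasso.
enet_as_lasso = ElasticNet(alpha=0.1, l1_ratio=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

print(np.allclose(enet_as_lasso.coef_, lasso.coef_))  # True
```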
The Lasso has two shortcomings. When several variables are highly correlated, it tends to select only one variable from that group. When the number of variables, p, exceeds the number of observations, n, the Lasso selects at most n predictors.
The Elastic Net, on the other hand, tends to select all variables from a correlated group, fitting appropriate coefficients. It can also select more than n predictors when p > n. The Elastic Net fit generally takes more processing time than the Lasso.
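The contrasting behavior on a correlated group can be illustrated with two perfectly correlated predictors. The following Python sketch uses scikit-learn with arbitrary penalty settings; it is not JMP's implementation:

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(3)
x = rng.normal(size=200)
X = np.column_stack([x, x])            # two identical predictors
y = 3.0 * x + rng.normal(scale=0.1, size=200)

# The Lasso concentrates the signal in a single member of the group...
lasso = Lasso(alpha=0.1).fit(X, y)
# ...while the Elastic Net's L2 term spreads it across the whole group.
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)

print(np.sum(np.abs(lasso.coef_) > 0.1))  # 1 -- one variable selected
print(np.sum(np.abs(enet.coef_) > 0.1))   # 2 -- both variables kept
```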
Ridge regression was among the first of the penalized regression methods proposed (Hoerl 1962; Hoerl and Kennard 1970). Ridge regression does not shrink coefficients to zero, so it does not perform variable selection.
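The difference in selection behavior is easy to demonstrate in Python. In this hedged sketch (again using scikit-learn with arbitrary penalty values, not JMP's implementation), three of five predictors are truly inactive:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 5))
y = X @ np.array([2.0, -1.5, 0.0, 0.0, 0.0]) + rng.normal(scale=0.5, size=200)

# Ridge shrinks the coefficients of the three inactive predictors toward
# zero but never exactly to zero...
ridge = Ridge(alpha=10.0).fit(X, y)
# ...whereas the Lasso's L1 penalty sets them exactly to zero.
lasso = Lasso(alpha=0.2).fit(X, y)

print(np.sum(ridge.coef_ == 0))   # 0 -- no coefficient is exactly zero
print(np.sum(lasso.coef_ == 0))   # 3 -- the inactive predictors are dropped
```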
The Maximum Likelihood method is a classical approach. It provides a baseline to which you can compare the other techniques.
The Generalized Regression personality also fits an adaptive version of the Lasso and the Elastic Net. These adaptive versions attempt to penalize active variables less than inactive variables. They were developed to ensure that the oracle property holds, which guarantees the following: asymptotically, your estimates are what they would have been had you known in advance which predictors were active contributors to the model.
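One common recipe for the adaptive Lasso, sketched below in Python, is to rescale each predictor by the magnitude of an initial coefficient estimate, so that a uniform L1 penalty on the rescaled problem penalizes variables with small initial estimates more heavily. This is a sketch of the general technique, not JMP's internal algorithm; the initial estimator, penalty value, and data are illustrative choices:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 5))
y = X @ np.array([2.0, -1.5, 0.0, 0.0, 0.0]) + rng.normal(scale=0.5, size=200)

# Step 1: initial estimates (OLS here; a ridge fit is another common choice).
beta_init = np.linalg.solve(X.T @ X, X.T @ y)

# Step 2: adaptive weights w_j = 1 / |beta_init_j|, implemented by scaling
# column j of X by |beta_init_j|. A uniform penalty on the scaled problem
# then penalizes variables with small initial estimates more heavily.
scale = np.abs(beta_init)
lasso = Lasso(alpha=0.2).fit(X * scale, y)

# Step 3: transform the coefficients back to the original scale.
beta_adaptive = lasso.coef_ * scale

print(np.sum(beta_adaptive == 0))  # 3 -- the inactive predictors are dropped
```

Scaling the columns rather than dividing by the weights avoids division by near-zero initial estimates: a column whose initial estimate is tiny is shrunk toward zero and is very unlikely to survive the penalty.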
The Generalized Regression personality enables you to specify a variety of distributions for your response variable. Available distributions include the normal, binomial, Poisson, zero-inflated Poisson, negative binomial, zero-inflated negative binomial, and gamma distributions. This flexibility enables you to fit categorical and count responses, as well as continuous responses, including right-skewed continuous responses. The personality also provides a variety of validation criteria for model selection.
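As one illustration of combining a non-normal response distribution with a penalty, scikit-learn's `PoissonRegressor` fits an L2-penalized Poisson log-linear model. This Python sketch is illustrative only; JMP's personality supports more distributions and penalty types than this example shows, and the simulated coefficients are arbitrary:

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor

rng = np.random.default_rng(6)
x = rng.normal(size=(500, 1))
# Count response: y ~ Poisson(exp(0.2 + 0.7 * x)).
y = rng.poisson(np.exp(0.2 + 0.7 * x[:, 0]))

# L2-penalized Poisson regression with a log link; alpha is the penalty
# weight (kept small here so the fit is close to maximum likelihood).
model = PoissonRegressor(alpha=1e-3).fit(x, y)

print(model.coef_)                   # roughly [0.7]
print(np.all(model.predict(x) > 0))  # True -- predicted rates are positive
```

The log link guarantees positive predicted rates, which is what makes the Poisson family appropriate for count responses.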