Mixed models let you analyze data with correlated observations – for example, measurements repeated over time or clustered in space. You might use mixed models in a study design where multiple subjects are measured at multiple times during the course of a drug trial, or in crossover designs in the pharmaceutical, manufacturing or chemical industries.
In JMP Pro 11, you can now fit mixed effect models to your data by using the new Mixed Model personality in Fit Model. You can specify fixed, random and repeated effects; correlate groups of variables; and set up subject and continuous effects – all with an intuitive drag-and-drop interface.
In addition, you can now calculate the covariance parameters for a wider variety of correlation structures:
- When the experimental units on which the data is measured can be grouped into clusters, and the data from a common cluster is correlated.
- When repeated measurements are taken on the same experimental unit, and these repeated measurements are correlated or exhibit variability that changes over time.
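JMP Pro fits these models interactively, but the core idea – a shared random effect makes observations from the same cluster correlated – can be sketched outside JMP. The following Python sketch (illustrative only, not JMP code; all subjects, variances and values are invented) simulates a random-intercept model and recovers the intraclass correlation with simple method-of-moments estimates:

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_times = 50, 4

# hypothetical drug trial: each subject is measured at four times;
# a shared random intercept makes a subject's measurements correlated
subject_effect = rng.normal(0.0, 2.0, n_subjects)    # between-subject sd = 2
noise = rng.normal(0.0, 1.0, (n_subjects, n_times))  # within-subject sd = 1
y = 5.0 + subject_effect[:, None] + noise

# method-of-moments variance components
subject_means = y.mean(axis=1)
var_within = ((y - subject_means[:, None]) ** 2).sum() / (n_subjects * (n_times - 1))
var_between = subject_means.var(ddof=1) - var_within / n_times

# intraclass correlation: share of total variance due to clustering
icc = var_between / (var_between + var_within)
print(f"estimated intraclass correlation: {icc:.2f}")
```

By construction the true intraclass correlation here is 4/(4+1) = 0.8, and ignoring it – treating all 200 measurements as independent – is exactly the mistake a mixed model avoids.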
Generalized regression is a new class of modeling techniques well suited to building better predictive models, even with challenging data. It fits generalized linear models using regularized or penalized regression techniques.
Standard estimation techniques break down when you have predictors that are strongly correlated or more predictors than observations. And when there are many correlated predictors (often the case in observational data), stepwise regression or other standard techniques can yield unsatisfactory results. Such models are often over-fit and generalize poorly to new data. But how do you decide which variables to cull before modeling or, worse, how much time do you lose manually preprocessing data sets in preparation for modeling?
Generalized regression in JMP Pro is the answer to building predictive models of large, messy data sets. It is an important addition to your analytics toolbox for performing variable selection or building data mining models over a very large number of predictors. Generalized regression helps handle multicollinearity in your explanatory variables in a way that is very natural – avoiding over-fitting by penalizing large fluctuations in the estimated parameters.
The regularization techniques available within the generalized regression personality include Ridge, Lasso, adaptive Lasso, Elastic Net and the adaptive Elastic Net to help better identify X’s that may have explanatory power. Harnessing these techniques is as easy as any other modeling personality in Fit Model – simply identify your response, construct model effects and pick the desired estimation and validation method. JMP automatically fits your data, performs variable selection when appropriate, and builds a predictive model that can be generalized to new data.
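JMP Pro exposes these estimators through the Fit Model dialog rather than code, but the effect of an L1 penalty is easy to demonstrate with a short sketch. This hypothetical Python example (using scikit-learn's LassoCV as a stand-in, not JMP; the data are simulated) has more predictors than observations, yet the penalty zeroes out most coefficients:

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)
n, p = 80, 200                    # more predictors than observations
X = rng.normal(size=(n, p))
true_coef = np.zeros(p)
true_coef[:5] = [4.0, -3.0, 2.5, 2.0, -2.0]   # only the first 5 predictors matter
y = X @ true_coef + rng.normal(scale=0.5, size=n)

# the L1 penalty shrinks most coefficients to exactly zero,
# performing variable selection as part of the fit
model = LassoCV(cv=5).fit(X, y)
kept = np.flatnonzero(model.coef_)
print(f"{len(kept)} of {p} predictors kept; first few: {kept[:8]}")
```

Ordinary least squares cannot even be computed here (the design matrix is rank-deficient), while the penalized fit both estimates a model and identifies the handful of X’s with real explanatory power.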
The Generalized Regression personality lets you choose the appropriate distribution for the response you are modeling. In addition to the standard normal or binomial distributions, you also have the option of choosing:
- Poisson – when you need to model the distribution of counts.
- Zero Inflated Poisson – when your count data has more zeros than you would expect.
- Negative Binomial – for over-dispersed count data, where the variance exceeds the mean.
- Zero Inflated Negative Binomial – when you have two sources of over-dispersion, one inherent in the variability in the non-zero counts and one for count data with more zeros than you would expect.
- Gamma – when you have positively valued continuous outcomes and need to estimate the degree of skew present.
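As a conceptual illustration of the count-data case (not JMP code; the data and coefficients are simulated), the sketch below fits a Poisson log-linear model by iteratively reweighted least squares, the standard estimation scheme behind generalized linear models:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([0.5, 0.8])
y = rng.poisson(np.exp(X @ beta_true))   # simulated counts with a log link

# iteratively reweighted least squares for the Poisson log-linear model
beta = np.zeros(2)
for _ in range(25):
    mu = np.exp(X @ beta)                # fitted mean of the Poisson response
    z = X @ beta + (y - mu) / mu         # working response
    W = mu                               # Poisson variance equals its mean
    beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))

print(f"true coefficients [0.5, 0.8], estimated {np.round(beta, 2)}")
```

Swapping the variance function and link in this loop is what distinguishes the other response distributions listed above; JMP Pro handles that choice for you from the personality's menu.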
By using Ridge in Generalized Regression, you can easily build ensemble models of your model predictions. Simply use the prediction columns as inputs and your response as the Y. This powerful strategy can create an aggregate model with improved prediction power while minimizing the effect of multicollinearity in the predictors.
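The idea can be sketched in a few lines of hypothetical Python (scikit-learn's Ridge as a stand-in for the JMP personality; the three "model predictions" are simulated): the prediction columns are strongly collinear, which would destabilize ordinary least squares, yet the ridge penalty keeps the ensemble weights stable:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
n = 200
y = rng.normal(size=n)

# predictions from three imaginary models; the third is an exact
# linear combination of the others, so the columns are collinear
pred1 = y + rng.normal(scale=0.5, size=n)
pred2 = y + rng.normal(scale=0.7, size=n)
pred3 = 0.5 * (pred1 + pred2)
P = np.column_stack([pred1, pred2, pred3])

# the ridge penalty keeps the ensemble weights finite despite collinearity
ensemble = Ridge(alpha=1.0).fit(P, y)
mse_single = np.mean((pred1 - y) ** 2)
mse_ensemble = np.mean((ensemble.predict(P) - y) ** 2)
print(f"first model MSE {mse_single:.3f}, ensemble MSE {mse_ensemble:.3f}")
```

Because the models' errors are not identical, the weighted aggregate predicts better than any single prediction column on its own.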
You may want to maximize the impact of your limited marketing budget by sending offers only to individuals who are likely to respond favorably. But that task may seem daunting, especially when you have large data sets and many possible behavioral or demographic predictors. However, with JMP Pro, you can use uplift models to make this prediction. Also known as incremental modeling, true lift modeling or net modeling, this method has been developed to help optimize marketing decisions, define personalized medicine protocols, or, more generally, to identify characteristics of individuals who are likely to respond to some action.
Uplift modeling in JMP Pro:
- Fits partition models that find splits to maximize a treatment difference.
- Helps identify groups of individuals who are most likely to respond favorably to an action.
- Leads to efficient, targeted decisions that optimize resource allocation and impact on the individual.
- Uses training, validation and test set methodology like other data mining methods in JMP Pro. This prevents overfitting, which helps your model generalize better to new data.
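A minimal sketch of the underlying idea, using a simple treated-versus-control difference rather than JMP’s partition-based search, and entirely simulated data (segment labels, response rates and lift are all invented): the offer raises response only in one segment, and comparing response rates within each segment exposes that uplift:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4000
segment = rng.integers(0, 2, n)   # 1 = persuadable customers, 0 = unaffected
treated = rng.integers(0, 2, n)   # offer assigned at random

# the offer raises the response rate only in the persuadable segment
p = 0.10 + 0.20 * treated * segment
response = rng.random(n) < p

def uplift(seg):
    """Treated minus control response rate within one segment."""
    m = segment == seg
    t, c = m & (treated == 1), m & (treated == 0)
    return response[t].mean() - response[c].mean()

print(f"uplift in persuadable segment: {uplift(1):.2f}")   # near 0.20
print(f"uplift in unaffected segment:  {uplift(0):.2f}")   # near 0.00
```

An uplift partition model automates exactly this comparison, searching over many candidate splits for the ones that maximize the treatment difference.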
Reliability Block Diagram
Often, you are faced with analyzing the reliability of a more complex system – a RAID storage array with multiple hard drives, or an airplane with four engines, for example. With JMP, you have many tools to analyze the reliability of single components within those systems. And now with JMP Pro, you can take the reliability of single components, build a complex system of multiple components and analyze the reliability of an entire system with the Reliability Block Diagram platform.
This platform allows you to better predict the reliability of the whole system and determine the expected performance based on the current performance of individual components. Easily perform what-if analyses by looking at different designs and comparing plots across multiple system designs. Determine the best places to add redundancy and decrease the probability of a system failure. Using the Reliability Block Diagram, you can easily design and fix weak spots in your system – and be better informed to prevent future system failures.
The Reliability Block Diagram platform allows you to:
- Create a flow diagram and save designs into a library.
- Use simple, series, parallel, knot, or k out of n design elements.
- Build nested designs by connecting multiple library elements.
- Utilize CDF and remaining life overlays of all models, and determine which system design is more reliable over a given period of time.
- Investigate hazard plots of different system architectures.
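The arithmetic behind a block diagram is straightforward: series blocks multiply reliabilities, parallel blocks multiply failure probabilities, and k-out-of-n blocks follow the binomial distribution. A small Python sketch of that arithmetic (the component reliabilities are invented for illustration and are not from JMP):

```python
import math

def series(*rs):
    """All components must work: reliabilities multiply."""
    return math.prod(rs)

def parallel(*rs):
    """The block works if any redundant component works."""
    return 1 - math.prod(1 - r for r in rs)

def k_out_of_n(k, n, r):
    """The block works if at least k of n identical components work."""
    return sum(math.comb(n, i) * r**i * (1 - r)**(n - i) for i in range(k, n + 1))

# hypothetical four-engine aircraft that needs at least two working engines
engine = 0.95
print(f"2-out-of-4 engines: {k_out_of_n(2, 4, engine):.6f}")

# hypothetical RAID mirror: two drives in parallel, one controller in series
drive, controller = 0.90, 0.99
print(f"mirrored-drive system: {series(parallel(drive, drive), controller):.4f}")
```

Nesting these calls mirrors nesting library elements in the platform, and recomputing with different redundancy choices is the what-if analysis described above.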
Partial Least Squares (PLS)
Are you trying to model data that is wider than it is tall? Traditional techniques won’t work, but Partial Least Squares does. PLS is a powerful modeling technique to have in your toolbox, especially when you have more X variables than observations, highly correlated X variables, a large number of X variables, or several Y variables and many X variables. All of these are situations where ordinary least squares would produce unsatisfactory results.
PLS modeling fits linear models based on factors – linear combinations of the explanatory variables (the X’s) – obtained in a way that attempts to maximize the covariance between the X’s and the response or responses (the Y’s). Advancements to PLS modeling in JMP Pro make it a truly universal technique – a multitool for statistical modeling, variable selection and predictive analytics.
Advances to PLS modeling in JMP Pro include:
- NIPALS-style missing value handling.
- Support for categorical input variables.
- Variable selection by using the variable importance plot (VIP). You can highlight the significant predictors and load them into a partition or neural model. Let the PLS platform automatically perform variable selection and then build more parsimonious models in your other favorite modeling platforms.
- Automatic centering and scaling of data.
- A Standardize X option, which centers and scales individual variables that are included in a polynomial effect prior to applying the centering and scaling options.
- Support for fitting a Response Surface model with a Partial Least Squares personality in the Fit Model platform.
When presented with a large number of variables to predict an outcome, you may want to reduce the number of variables in some way to make the prediction problem easier to tackle. One possible dimension reduction technique is the well-known method of principal components analysis (PCA). The components resulting from PCA, however, can be hard to interpret. An alternative strategy is to use variable clustering in JMP Pro to understand whether your variables are grouped into clusters and to determine how many clusters they contain. In JMP Pro 11, selecting any cluster within the report automatically selects the cluster’s most representative column in the data table, making it much faster and easier to specify model terms or perform dimension reduction.
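A toy sketch of the idea in plain Python (not JMP’s algorithm – the greedy correlation threshold here is a deliberate simplification, and all data are simulated): six variables measure two hypothetical latent drivers, and the sketch groups them and picks each cluster’s most representative column:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 300

# two hypothetical latent drivers, each measured by three noisy variables
f1, f2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([f1 + 0.3 * rng.normal(size=n) for _ in range(3)] +
                    [f2 + 0.3 * rng.normal(size=n) for _ in range(3)])

corr = np.corrcoef(X, rowvar=False)

# greedy grouping: a variable joins a cluster when highly correlated with its seed
clusters, seen = [], set()
for i in range(X.shape[1]):
    if i in seen:
        continue
    members = [j for j in range(X.shape[1]) if abs(corr[i, j]) > 0.6 and j not in seen]
    seen.update(members)
    clusters.append(members)

# the most representative column has the highest mean |correlation| in its cluster
reps = [max(c, key=lambda j: np.mean([abs(corr[j, k]) for k in c])) for c in clusters]
print("clusters:", clusters)
print("representative columns:", reps)
```

Keeping only the representative columns reduces six predictors to two while, unlike PCA components, each kept predictor remains an original, interpretable variable.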
Download a PDF of the new features in JMP and JMP Pro or view our online searchable documentation.