JMP Pro is the advanced analytics version of JMP, created for power users who need sophisticated modeling techniques to better anticipate the future and plan well for tomorrow. Built with scientists and engineers in mind, JMP Pro statistical analysis software from SAS provides all the superior capabilities for interactive data visualization, exploration, analysis and communication that are the hallmarks of JMP.
In addition, JMP Pro offers a multitude of sophisticated techniques: predictive modeling with cross-validation using a number of different methods, modern modeling techniques, model comparison and averaging features, advanced multivariate techniques, reliability block diagrams, covering arrays, mixed models, uplift models and advanced computational statistics methods.
Having access to all the rich advanced analytics in JMP Pro removes roadblocks to statistical discovery and enhances your ability to uncover more clues in your data. Therefore, you make breakthroughs more quickly, enabling you to become more proactive and take greater control of the future.
JMP Pro continues to amaze me. There are always new modeling techniques and tools to make my life as a data analyst easier. The Generalized Regression platform is easy to use and fast. And the ability to create test/validation variables on the fly is great.
C. Carlisle and Margaret Tippit Professor of Statistics, Williams College
At the heart of data mining are the advanced tools to fit large models that generalize well with new data. JMP Pro includes a rich set of algorithms for building better models of your data. Two of the most useful techniques for predictive modeling are decision trees and neural networks.
The Partition platform in JMP Pro automates the tree-building process with modern methods. The bootstrap forest, a random-forest technique, grows dozens of decision trees using random subsets of the data and candidate variables, and then averages these trees. The boosted tree technique builds many simple trees, repeatedly fitting the residuals from the previous tree. The Partition platform in JMP Pro also fits K nearest neighbors (K-NN) models. Using these methods lets you build models that often predict better than simple decision tree models.
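The boosted tree idea described above, many simple trees that each fit the residuals left by the previous ones, can be sketched in a few lines. This is only an illustration under stated assumptions, not the algorithm JMP Pro runs: the one-split "stump", the `learning_rate` shrinkage factor, and the small data set are all hypothetical.

```python
from statistics import mean


def fit_stump(x, residuals):
    """Find the single split on x that best fits the residuals (least squares)."""
    best = None
    values = sorted(set(x))
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean, right_mean = mean(left), mean(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def boost(x, y, n_trees=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = mean(y)
    predictions = [base] * len(y)
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        # Add a shrunken copy of the new stump to the running prediction.
        predictions = [
            pi + learning_rate * (left_mean if xi <= threshold else right_mean)
            for xi, pi in zip(x, predictions)
        ]
    return base, predictions


def sum_squared_error(y, predictions):
    return sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))


# Hypothetical data with a jump between x = 4 and x = 5.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
base, boosted = boost(x, y)
print(sum_squared_error(y, [base] * len(y)))  # error of the mean-only model
print(sum_squared_error(y, boosted))          # error after boosting
```

A bootstrap forest differs mainly in that each tree is grown independently on a random subset of rows and candidate variables, and the trees are averaged rather than added in sequence.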
The advanced Neural platform in JMP Pro lets you build one- or two-layer neural networks with your choice of three activation functions and also provides automatic model construction using gradient boosting. This platform automatically handles missing values and transformation of continuous X’s, saving time and effort. In addition, it includes robust fitting options.
Both the Partition and Neural platforms in JMP Pro take advantage of cross-validation. The purpose of validation is described in the next section. In addition, stepwise regression, logistic regression (both nominal and ordinal) and discriminant analysis in JMP Pro support the use of a validation column.
For effective predictive modeling, you need sound ways to validate your model, and with a large model, you can easily get into trouble over-fitting. Large models should always be cross-validated, and JMP Pro does this through data partitioning, or holdback. Dividing the data into training, validation and test data sets has long been used to avoid over-fitting, ensuring that the models you build are not reliant on the properties of the specific sample used to build them.
The general approach to cross-validation in JMP Pro is to use a validation column. You can easily split your data into different sets for different purposes using the validation column utility (either with a purely random sample or stratified random). The training set is used to build the model(s). The validation set is used in the model-building process to help choose how complex the model should be. Finally, the test set is held out completely from the model-building process and used to assess the quality of the model(s). For smaller data sets, k-fold cross-validation also can be used in some platforms.
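To make the idea of a validation column concrete, here is a minimal sketch of a purely random split, plus a fold column for k-fold cross-validation. It is not the JMP Pro utility itself: the function names, the 60/20/20 proportions and the seed are assumptions chosen for illustration.

```python
import random
from collections import Counter


def make_validation_column(n_rows, train=0.6, validation=0.2, seed=1):
    """Randomly label each row Training, Validation, or Test."""
    n_train = round(n_rows * train)
    n_validation = round(n_rows * validation)
    n_test = n_rows - n_train - n_validation  # whatever remains is held out
    column = (["Training"] * n_train
              + ["Validation"] * n_validation
              + ["Test"] * n_test)
    random.Random(seed).shuffle(column)
    return column


def make_fold_column(n_rows, k=5, seed=1):
    """Assign each row to one of k folds of nearly equal size."""
    folds = [i % k for i in range(n_rows)]
    random.Random(seed).shuffle(folds)
    return folds


column = make_validation_column(100)
print(Counter(column))
```

A stratified split would apply the same shuffle separately within each level of a grouping variable so that every set keeps similar proportions.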
Cross-validation helps you build models that generalize well to tomorrow’s data – about new customers, new processes or new risks – so you can make data-driven inferences.
In the real world, some kinds of models fit well in certain situations but fit poorly in others. With JMP Pro, there are many ways to fit, and you need to find out which one is most appropriate in a given situation. A typical approach to model building is that you will try many different models: models with more or less complexity, models with or without certain factors/predictors, models built using different kinds of modeling methods or even averages of multiple models (ensemble models).
Each of these models will have common quality measures that can be used to assess the model: R², misclassification rate, ROC curves, AUC, lift curves, etc.
Using model comparison in JMP Pro, you can compare all the saved prediction columns from various fits and pick the best combination of goodness of fit, parsimony and cross-validation. JMP Pro makes this comparison automatically. At the same time, you can interact with visual model profilers to see which important factors each model is picking up. Model comparison in JMP Pro makes it easy to compare multiple models at the same time, and also to do simple model averaging, if desired.
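As a rough picture of what comparing saved prediction columns involves, the sketch below scores two hypothetical prediction columns and their simple average against held-out responses using R². The numbers and model names are invented for illustration, and JMP Pro reports more measures than this.

```python
def r_squared(actual, predicted):
    """Proportion of variation in the actual values explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_residual / ss_total


# Hypothetical held-out responses and two saved prediction columns.
actual = [3.0, 5.0, 7.0, 9.0]
saved_predictions = {
    "Model A": [2.0, 6.0, 6.0, 10.0],
    "Model B": [3.5, 4.5, 7.5, 8.0],
}

# Simple model averaging: the row-by-row mean of the two prediction columns.
saved_predictions["Average"] = [
    (a + b) / 2
    for a, b in zip(saved_predictions["Model A"], saved_predictions["Model B"])
]

scores = {name: r_squared(actual, pred) for name, pred in saved_predictions.items()}
best = max(scores, key=scores.get)
for name, score in scores.items():
    print(name, round(score, 4))
print("Best on held-out data:", best)
```

In this made-up case the two models err in opposite directions, so their average scores higher than either one. That will not always happen, which is why the comparison is made on data held out from model building.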
Generalized regression is a class of new modeling techniques well suited to building better models, even with challenging data. It fits generalized linear models using regularized or penalized regression methods.
Standard estimation techniques break down when you have predictors that are strongly correlated or more predictors than observations. And when there are many correlated predictors (as is often the case in observational data), stepwise regression or other standard techniques can yield unsatisfactory results. Such models are often over-fit and generalize poorly to new data. But how do you decide which variables to cull before modeling – or, worse, how much time do you lose manually preprocessing data sets in preparation for modeling?
Generalized regression is a complete modeling framework. It takes you from variable selection through model diagnostics to LS means comparisons, inverse prediction and profiling. And it’s only in JMP Pro.
The regularization techniques available within the Generalized Regression personality include Ridge, Lasso, adaptive Lasso, Elastic Net and the adaptive Elastic Net to help better identify X’s that may have explanatory power. Harnessing these techniques is as easy as using any other modeling personality in Fit Model – simply identify your response, construct model effects and pick the desired estimation and validation method. JMP automatically fits your data, performs variable selection when appropriate, and builds a predictive model that can be generalized to new data. You can also use a forward stepwise technique, perform quantile regression or simply fit using maximum likelihood.
Finally, Generalized Regression gives options to choose the appropriate distribution for the response you are modeling, letting you model more diverse responses such as counts, data with many outliers, or skewed data.
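To see why a penalty such as the Lasso can perform variable selection while Ridge only shrinks, consider the simplest possible case: one centered predictor, no intercept, and a normal response. The sketch below uses the textbook closed-form solutions for that case. The penalty scaling (half the residual sum of squares plus the penalty term) and the data are assumptions, and this is not how JMP Pro computes its estimates.

```python
def soft_threshold(value, penalty):
    """Move value toward zero by penalty, stopping at exactly zero."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def one_predictor_fits(x, y, penalty):
    """Slope estimates for y = b * x under three estimation methods.

    Ridge minimizes 0.5 * RSS + 0.5 * penalty * b**2.
    Lasso minimizes 0.5 * RSS + penalty * abs(b).
    """
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return {
        "least squares": sxy / sxx,
        "ridge": sxy / (sxx + penalty),
        "lasso": soft_threshold(sxy, penalty) / sxx,
    }


x = [-1.0, 0.0, 1.0]
strong = one_predictor_fits(x, [-2.0, 0.0, 2.0], penalty=1.0)  # clear signal
weak = one_predictor_fits(x, [-0.2, 0.1, 0.3], penalty=1.0)    # faint signal
print(strong)
print(weak)
```

With the strong signal, both penalties shrink the slope but keep it in the model. With the faint signal, Ridge leaves a small nonzero slope while the Lasso sets it to exactly zero, which is the behavior that removes weak X’s from a model.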
JMP Pro includes several advanced techniques to build better models when faced with data problems that require multivariate fitting methods.
Partial least squares. Are you trying to model data that is wider than it is tall? Traditional techniques don’t work, but partial least squares (PLS) does. PLS is a powerful modeling technique to have in your toolbox, especially when you have more X variables than observations, highly correlated X variables, a large number of X variables, or several Y variables and many X variables. All of these are situations where ordinary least squares would produce unsatisfactory results.
PLS modeling fits linear models based on factors, namely, linear combinations of explanatory variables (the X’s). The factors are obtained in a way that attempts to maximize the covariance between the X’s and the response or responses (the Y’s). In JMP Pro, you can build PLS models with either continuous or categorical responses (PLS-DA), specify curvature terms or interaction effects and perform missing value imputation.
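For a single response, the first PLS factor has a simple form: after centering, the weight on each X is proportional to its covariance with Y, and the factor is the resulting linear combination. The sketch below computes only that first factor for a hypothetical table that is wider than it is tall (three rows, four X columns). It illustrates the covariance-maximizing idea and is not the full PLS fit in JMP Pro.

```python
import math


def center(values):
    """Subtract the mean so the values sum to zero."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def first_pls_factor(columns, y):
    """Return unit-length weights and row scores for the first PLS factor."""
    xc = [center(col) for col in columns]
    yc = center(y)
    # Each raw weight is the cross-product of one centered X column with centered y.
    raw = [sum(a * b for a, b in zip(col, yc)) for col in xc]
    norm = math.sqrt(sum(w * w for w in raw))
    weights = [w / norm for w in raw]
    # Each row's score is the weighted combination of its centered X values.
    scores = [
        sum(w * col[i] for w, col in zip(weights, xc))
        for i in range(len(y))
    ]
    return weights, scores


# Hypothetical data: 3 observations, 4 explanatory variables.
columns = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.5],
    [5.0, 3.0, 1.0],
    [0.0, 1.0, 0.0],
]
y = [1.0, 2.0, 3.2]
weights, scores = first_pls_factor(columns, y)
print([round(w, 3) for w in weights])
print([round(s, 3) for s in scores])
```

Ordinary least squares has no unique solution here because there are more X columns than rows, yet the first factor is still well defined.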
Variable clustering. When presented with a large number of variables to predict an outcome, you may want to reduce the number of variables in some way to make the prediction problem easier to tackle. One possible dimension reduction technique is the well-known method of principal components analysis (PCA). The variables resulting from PCA, however, can be hard to interpret.
An alternative strategy is to use variable clustering in JMP Pro to help you to understand whether your responses are grouped into clusters and to determine how many clusters the responses contain. Selecting any cluster within the report automatically selects the cluster’s most representative column in the data table, making it much faster and easier to specify model terms or perform dimension reduction.
Often, you are faced with analyzing the reliability of a more complex analytical system – a RAID storage array with multiple hard drives, or an airplane with four engines, for example. With JMP, you have many tools to analyze the reliability of single components within those systems. But with JMP Pro, you can take the reliability of single components, build a complex system of multiple components and analyze the reliability of an entire system with the Reliability Block Diagram platform.
This platform allows you to better predict the reliability of the whole system and determine the expected performance based on the current performance of individual components. You can easily perform what-if analyses by looking at different designs and comparing plots across multiple system designs. You can also determine the best places to add redundancy and decrease the probability of a system failure. Using the Reliability Block Diagram, you can easily design and fix weak spots in your system – and be better informed to prevent future system failures.
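The arithmetic behind a simple block diagram can be sketched as follows, assuming components fail independently and each has a fixed reliability over the period of interest. The component values are hypothetical, and the platform itself works with fitted lifetime distributions rather than single numbers.

```python
from math import prod


def series(*reliabilities):
    """A series block works only if every component works."""
    return prod(reliabilities)


def parallel(*reliabilities):
    """A parallel (redundant) block fails only if every component fails."""
    return 1 - prod(1 - r for r in reliabilities)


# Hypothetical storage system: one controller feeding two mirrored drives.
drive = 0.90
controller = 0.95

baseline = series(controller, parallel(drive, drive))

# What-if comparison: where does extra redundancy help most?
second_controller = series(parallel(controller, controller), parallel(drive, drive))
third_drive = series(controller, parallel(drive, drive, drive))

print(round(baseline, 6))
print(round(second_controller, 6))
print(round(third_drive, 6))
```

In this made-up design, duplicating the controller improves system reliability far more than adding a third drive, because the single controller is the weak spot.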
Covering arrays are used in testing applications where factor interactions may lead to failures. Each experimental run may be costly. As a result, you need to design an experiment to maximize the probability of finding defects while also minimizing cost and time. Covering arrays let you do just that. JMP Pro lets you design an experiment to test deterministic systems and cover all possible combinations of factors up to a certain order.
And when there are combinations of factors that create implausible conditions, you can use the interactive Disallowed Combinations filter to automatically exclude these combinations of factor settings from the design.
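To illustrate what covering all combinations up to a certain order means, the sketch below checks a four-run design for three two-level factors and confirms that every pair of factors sees every pair of settings (order 2). The design and the helper name `uncovered_pairs` are hypothetical, and this checks an existing design rather than constructing one as JMP Pro does.

```python
from itertools import combinations, product


def uncovered_pairs(runs, levels):
    """Return every two-factor level combination that no run exercises."""
    missing = []
    for f1, f2 in combinations(range(len(levels)), 2):
        seen = {(run[f1], run[f2]) for run in runs}
        for pair in product(levels[f1], levels[f2]):
            if pair not in seen:
                missing.append((f1, f2, pair))
    return missing


# Three factors, each with settings 0 and 1.
levels = [(0, 1), (0, 1), (0, 1)]

# Four runs instead of the eight needed to test every combination.
runs = [
    (0, 0, 0),
    (0, 1, 1),
    (1, 0, 1),
    (1, 1, 0),
]

print(len(list(product(*levels))))    # size of the full factorial
print(uncovered_pairs(runs, levels))  # empty when all pairs are covered
```

Dropping any one of the four runs leaves some pair of settings untested, so a failure triggered by that two-factor interaction could go unnoticed.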
One of the huge advantages of covering arrays in JMP Pro is that JMP Pro is a statistical analysis tool, not just a covering arrays design tool. You can do all sorts of statistical analyses in JMP Pro. For example, there is currently no other software for covering arrays design that also lets you analyze your data using generalized regression.
JMP Pro also allows you to import any covering array design – generated by any software – and further optimize it and analyze the results. You can design the arrays yourself without having to rely on others to build experiments for you. Test smarter with covering arrays in JMP Pro.
Mixed models let you analyze data that involves both time and space. For example, you might use mixed models in a study design where multiple subjects are measured at multiple times during the course of a drug trial, or in crossover designs in the pharmaceutical, manufacturing or chemical industries.
JMP Pro lets you fit mixed models to your data, letting you specify fixed, random and repeated effects; correlate groups of variables; and set up subject and continuous effects – all with an intuitive drag-and-drop interface.
In addition, you can now calculate the covariance parameters for a wide variety of correlation structures. One example is when the experimental units on which the data is measured can be grouped into clusters, and the data from a common cluster is correlated. Another is when repeated measurements are taken on the same experimental unit, and these repeated measurements are correlated or exhibit variability that changes.
It is also easy to visually determine which, if any, spatial covariance structure is appropriate to utilize in your model specification when building mixed models in JMP Pro.
You may want to maximize the impact of your limited marketing budget by sending offers only to individuals who are likely to respond favorably. But that task may seem daunting, especially when you have large data sets and many possible behavioral or demographic predictors. However, with JMP Pro, you can use uplift models to make this prediction. Also known as incremental modeling, true-lift modeling or net modeling, this method has been developed to help optimize marketing decisions, define personalized medicine protocols or, more generally, to identify characteristics of individuals who are likely to respond to some action.
Uplift modeling in JMP Pro fits partition models that find splits to maximize a treatment difference. The models help identify groups of individuals who are most likely to respond favorably to an action, leading to efficient, targeted decisions that optimize resource allocation and impact on the individual.
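The treatment difference at the heart of an uplift split can be sketched with simple response rates. In the example below, each row records whether a person received the offer and whether they responded. The counts and the age-based split are hypothetical, and scoring a split by the gap in uplift between its two sides is an assumption made for illustration rather than the exact criterion JMP Pro uses.

```python
def response_rate(rows):
    """Fraction of rows with a response (1 = responded, 0 = did not)."""
    return sum(responded for _, responded in rows) / len(rows)


def uplift(rows):
    """Response rate of treated individuals minus that of the control group."""
    treated = [row for row in rows if row[0]]
    control = [row for row in rows if not row[0]]
    return response_rate(treated) - response_rate(control)


def make_rows(treated_yes, treated_no, control_yes, control_no):
    """Build (treated, responded) rows from four cell counts."""
    return ([(True, 1)] * treated_yes + [(True, 0)] * treated_no
            + [(False, 1)] * control_yes + [(False, 0)] * control_no)


# Hypothetical candidate split on age.
under_40 = make_rows(treated_yes=6, treated_no=4, control_yes=2, control_no=8)
over_40 = make_rows(treated_yes=3, treated_no=7, control_yes=3, control_no=7)

split_value = abs(uplift(under_40) - uplift(over_40))
print(round(uplift(under_40), 3))
print(round(uplift(over_40), 3))
print(round(split_value, 3))
```

Here the offer lifts response among the younger group but makes no difference in the older group, so the split separates people worth targeting from people who would behave the same either way.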
JMP Pro includes exact statistical tests for contingency tables and exact non-parametric statistical tests for one-way ANOVA. Also, JMP Pro includes a general method for bootstrapping statistics in most JMP reports.
Bootstrapping approximates the sampling distribution of a statistic. JMP Pro is the only statistical software package that lets you bootstrap a statistic without writing a single line of code. One-click bootstrapping means you are only a click away from being able to bootstrap any quantity in a JMP report.
This technique is useful when textbook assumptions are in question or don’t exist. For example, try applying bootstrapping techniques to nonlinear model results that are being used to make predictions, or to determine coverage intervals around quantiles. Also, you can use bootstrapping as an alternative way to gauge the uncertainty in predictive models. Bootstrapping lets you assess the confidence in your estimates with fewer assumptions – and one-click bootstrapping in JMP Pro makes it easy.
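For readers curious about what happens behind a bootstrap, the sketch below resamples a small data set with replacement, recomputes the median each time, and reads a percentile interval from the sorted results. The data, the 2,000 resamples, the seed and the percentile method are all assumptions for illustration, and JMP Pro may compute its intervals differently.

```python
import random
from statistics import median


def bootstrap_interval(data, statistic, n_resamples=2000, confidence=0.95, seed=1):
    """Percentile bootstrap interval for any statistic of a single sample."""
    rng = random.Random(seed)
    # Each resample draws len(data) values from the data with replacement.
    estimates = sorted(
        statistic(rng.choices(data, k=len(data))) for _ in range(n_resamples)
    )
    tail = (1 - confidence) / 2
    lower = estimates[round(tail * n_resamples)]
    upper = estimates[round((1 - tail) * n_resamples) - 1]
    return lower, upper


# Hypothetical measurements.
data = [12.1, 9.8, 14.3, 11.0, 10.5, 13.7, 9.1, 12.9,
        11.6, 10.2, 15.0, 11.9, 10.8, 13.1, 12.4]

lower, upper = bootstrap_interval(data, median)
print(median(data), lower, upper)
```

The same function works for any statistic passed in, which mirrors the appeal of bootstrapping: no formula for the sampling distribution is needed.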
As one of the SAS offerings for predictive analytics and data mining, JMP Pro easily connects to SAS, expanding options and giving access to the unparalleled depth of SAS Analytics and data integration. With or without an active SAS connection, JMP Pro can output SAS code to score new data quickly and easily with models built in JMP.
JMP Pro includes all of the features in JMP, plus the additional capabilities for advanced analytics listed below.
*Generates SAS code ready for use with SAS Model Manager
JMP runs on Microsoft Windows and Mac OS. It includes support for both 32- and 64-bit systems.