Created: Apr 3, 2023 04:11 PM | Last Modified: Oct 24, 2023 01:43 PM
Mastering March 2023.jrn
Multivariate Analysis.jmpprj
Multicollinearity is the existence of such a high degree of correlation between supposedly independent variables being used to estimate a dependent variable that the contribution of each independent variable to variation in the dependent variable cannot be determined. High multicollinearity in your data set means any model you build will almost certainly be overfit unless you use techniques to mitigate this issue. There are several modeling techniques that can be used to improve the likelihood of building a good predictive model by limiting multicollinearity. These include Principal Components Analysis and Partial Least Squares, which are both available in JMP, and Generalized Regression, which is available in JMP Pro as a Fit Model option.
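One common way to quantify the multicollinearity described above is the variance inflation factor (VIF). The webinar summary does not mention VIFs, so treat the following as an illustrative sketch only: the data are simulated, and the function name `variance_inflation_factors` is made up for this example. It uses the identity that the VIFs are the diagonal elements of the inverse of the predictors' correlation matrix.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X (rows are observations).

    VIF_j = 1 / (1 - R2_j), where R2_j comes from regressing column j on
    the remaining columns. For a nonsingular correlation matrix this
    equals the j-th diagonal element of its inverse.
    """
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Simulated data (an assumption for illustration): x2 is nearly a copy
# of x1, while x3 is unrelated to both.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

vif = variance_inflation_factors(X)
print(np.round(vif, 1))
```

In this simulated set the first two VIFs come out very large and the third stays near 1, which is the pattern you would expect when two predictors carry almost the same information.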
See how to:
Perform Principal Components Analysis (PCA) to use a small number of components to model a set of data by limiting the variability and mitigating multicollinearity
PCA works best on data sets that are tall, wide, or both
Does not interpret variables as inputs or outputs - deals in a single matrix
Extract linear combinations of variables that explain as much variability as possible
This starts with the first principal component (PC), and the algorithm continues until 100% of the variation is explained. Hopefully the number of components needed is much smaller than your original number of variables
Each successive PC explains as much of the remaining variation as possible and is orthogonal to the loading vectors of the previously extracted PCs
Examine and interpret Summary Plots and Eigenvalues
Save Principal Components to the data table and perform Principal Component regression using Graph > Profiler (Reminder: check the Expand Intermediate Formulas box)
Use Partial Least Squares to fit linear models based on factors, namely, linear combinations of the explanatory variables (Xs)
These factors are obtained in a way that attempts to maximize the covariance between the Xs and the response or responses (Ys).
PLS exploits the correlations between the Xs and the Ys to reveal underlying latent structures
PLS performs well in situations where the use of ordinary least squares (OLS) does not produce satisfactory results, including when there are more X variables than observations; highly correlated X variables; a large number of X variables; and several Y variables and many X variables.
PLS can be used when the predictors outnumber the observations
PLS is used widely in modeling high-dimensional data in areas such as spectroscopy, chemometrics, genomics, psychology, education, economics, political science, and environmental science.
See how to use JMP PLS, in which two model fitting algorithms are available: nonlinear iterative partial least squares (NIPALS) and a “statistically inspired modification of PLS” (SIMPLS)
For a single response, both methods give the same model; for multiple responses, there are slight differences
PLS uses the van der Voet T² test and cross validation to help you choose the optimal number of factors to extract.
Interpret the Root Mean PRESS (predicted residual sum of squares) plot, which shows the number of factors with the lowest Root Mean PRESS value; that model is among the better models you'll get for the data
Use the Variable Importance Plot (VIP) to locate variables to remove
Interpret Percent Variation Plots
The van der Voet T² statistic tests whether a model with a different number of factors differs significantly from the model with the minimum PRESS value
A common practice is to extract the smallest number of factors for which the van der Voet significance level exceeds 0.10
Use JMP Pro Generalized Regression, which offers advanced penalized regression techniques that are especially good for variable selection and for highly correlated and/or non-normally distributed data
Build a LASSO model, interpret the results, and understand some of its limitations
Build an Elastic Net model, interpret it for highly correlated data, and understand some of its benefits over LASSO
Use JMP Pro Generalized Regression for high-dimensional spectral data
Examine and interpret PCA Model Driven Multivariate Control Charts, PCA on Correlations Summary Plots and Eigenvalues
Examine and interpret PLS results
Interpret Elastic Net Generalized Regression including Generalized R-squared
Use JMP Pro Generalized Regression for very wide data and interpret results
Note: the attached .jmpprj file includes scripts to run the analyses from the demo.
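The LASSO and Elastic Net items in the list above can be illustrated outside JMP with a small coordinate descent routine. This is a sketch under stated assumptions, not JMP Pro's Generalized Regression implementation: the penalty parameterization (`lam`, `alpha`), the function names, and the simulated data are all chosen for this example. To make the behavior easy to see, the second predictor is an exact copy of the first, which is an extreme case of correlation.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, clipping at zero."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def elastic_net(X, y, lam, alpha, n_sweeps=500):
    """Coordinate descent for the objective
    (1/2n)*||y - Xb||^2 + lam*(alpha*||b||_1 + (1 - alpha)/2*||b||_2^2).

    X columns and y are assumed to be centered. alpha=1 gives the LASSO.
    """
    n, p = X.shape
    b = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0) / n
    resid = y.copy()                      # residual for b = 0
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of column j with the partial residual.
            rho = X[:, j] @ resid / n + col_ss[j] * b[j]
            new_bj = soft_threshold(rho, lam * alpha) / (col_ss[j] + lam * (1 - alpha))
            resid += X[:, j] * (b[j] - new_bj)
            b[j] = new_bj
    return b

# Simulated data (hypothetical): column 1 duplicates column 0, column 2 is noise.
rng = np.random.default_rng(4)
n = 200
x1 = rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x1, x3])
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = 2.0 * x1 + 0.1 * rng.normal(size=n)
y = y - y.mean()

b_lasso = elastic_net(X, y, lam=0.1, alpha=1.0)
b_enet = elastic_net(X, y, lam=0.1, alpha=0.5)
print("LASSO:      ", np.round(b_lasso, 3))
print("Elastic Net:", np.round(b_enet, 3))
```

In this setup the LASSO fit puts essentially all of the weight on one of the two identical columns, while the Elastic Net fit splits it evenly between them. That contrast is one commonly cited reason to prefer Elastic Net when predictors are highly correlated.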
Questions answered by Bill Worley @Bill_Worley and Scott Allen @scott_allen at the live webinar:
Q: Is Pred Formula Y1 from PCA regression?
A: Yes, the prediction formulas for Y1 and Y2 were generated by regression analysis of the principal components. When you build the profiler with these prediction formulas, you can check a box that allows you to see the individual effects instead of the principal components.
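The idea in this answer, a regression on principal components that can be expanded back into the original variables, can be sketched in Python. This is only an illustration under assumptions: the data are simulated, the variable names are hypothetical, and it is not the formula JMP saves to the data table.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
# Simulated predictors (hypothetical): two latent drivers behind five
# correlated columns, plus a response that depends on the same drivers.
latent = rng.normal(size=(n, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.1 * rng.normal(size=(n, 5))
y = 3.0 * latent[:, 0] - 2.0 * latent[:, 1] + 0.2 * rng.normal(size=n)

# Standardize the predictors, which corresponds to PCA on correlations.
mu, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
Z = (X - mu) / sd

# Loadings are the right singular vectors of the standardized matrix.
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
k = 2
P = Vt[:k].T            # (5, k) loading matrix
T = Z @ P               # (n, k) principal component scores

# Regress y on the scores, with an intercept.
A = np.column_stack([np.ones(n), T])
gamma, *_ = np.linalg.lstsq(A, y, rcond=None)
pred_from_scores = A @ gamma

# Expand the score-based formula into the original variables:
# y_hat = gamma0 + ((x - mu) / sd) @ P @ gamma[1:]
beta = (P @ gamma[1:]) / sd
intercept = gamma[0] - mu @ beta
pred_from_x = intercept + X @ beta

print(np.allclose(pred_from_scores, pred_from_x))
```

Both prediction vectors are the same model written two ways: once in terms of the component scores and once in terms of the individual predictors.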
Q: What would be your main criteria for eliminating variables in the PCA example?
Q: In PLS, is it necessary to specify interactions or are the relationships among variables accommodated in the underlying model (e.g., NIPALS)?
A: For PLS, you must specify the model effects, such as interactions or other terms, that you want to test. In standard JMP you can run PLS analysis with main effects only. In JMP Pro you can run PLS with other model effects, like interactions or polynomials, by using the PLS personality in Fit Model.
Q: Is the gluten dataset also associated with a lot of multi-collinearity?
A: Yes. Each wavelength is dependent on the next, so there is lots of collinearity from wavelength to wavelength.
Q: Is it possible to say if PLS or Generalized Regression is generally superior with highly auto-correlated data or is it simply dataset-dependent? I was kind of thinking PLS was better, but it sounds like not necessarily.
A: It depends on the situation, which can be somewhat data dependent. Both give good models that are very close to each other.
Q: Which of these modelling options would be best for inverse prediction?
A: With wide data, you might have trouble getting the right settings. All options would be ok for inverse prediction.
Q: Could you go over how you saved the PCA formulas to the table. I only know how to save the coordinates themselves.
A: From red triangle, Save Formula to Data Table.
Q: If I use the PLS model for optimization with Prediction Profiler, will the optimization respect the correlation among the variables, or, if extrapolation control is turned on, will the correlation among the variables be taken care of?
A: In PLS, you can do Variable Selection and then use the results to build a model with the important variables. Then, use Make Model with VIP and rerun. If you save these models to the data table, they are big and have every variable in them.
Q: Would Model Screening be useful?
A: Yes, but it will take time. Select Additional Methods. Be aware that selecting it also runs Elastic Net and Ridge Regression, which will take time if you have K-fold cross validation. Also, for Generalized Regression and PLS in Model Screening, you can add 2-way interactions, and it will fit with and without the 2-way interactions.
JMP documentation and example sequence for using Principal Components to reduce the dimensionality of your data. The purpose is to derive a small number of independent linear combinations (principal components) of a set of measured variables that capture as much of the variability in the original variables as possible. Principal component analysis is a dimension-reduction technique, as well as an exploratory data analysis tool.
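As a rough illustration of the description above, the following sketch extracts principal components from a correlation matrix with NumPy and reports how much variation each one explains. The data are simulated and the variable names are hypothetical; this is not JMP's implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 150
# Simulated measurements (hypothetical): six columns driven by two hidden sources.
sources = rng.normal(size=(n, 2))
X = sources @ rng.normal(size=(2, 6)) + 0.2 * rng.normal(size=(n, 6))

# PCA on correlations: eigendecomposition of the correlation matrix.
R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)          # eigh returns ascending order
order = np.argsort(eigvals)[::-1]             # reorder largest first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Percent of total variation explained by each component, and the running total.
percent = 100 * eigvals / eigvals.sum()
cumulative = np.cumsum(percent)

# Scores are the standardized data projected onto the eigenvectors.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
scores = Z @ eigvecs

print(np.round(eigvals, 3))
print(np.round(cumulative, 1))
```

The eigenvalues sum to the number of variables, and the score columns are mutually uncorrelated with variances equal to the eigenvalues. That is the sense in which the components are independent linear combinations that capture as much variability as possible.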
JMP documentation and example sequence for using Partial Least Squares to develop models using correlations between Ys and Xs. PLS fits linear models based on factors, namely, linear combinations of the explanatory variables (Xs). These factors are obtained in a way that attempts to maximize the covariance between the Xs and the response or responses (Ys). PLS exploits the correlations between the Xs and the Ys to reveal underlying latent structures.
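The factor extraction described above can be sketched for a single response with a NIPALS-style loop. This is a minimal illustration, assuming centered but unscaled data; the function name `pls1_nipals` and the simulated wide data set (more predictors than observations) are made up for this example, and the code is not JMP's implementation.

```python
import numpy as np

def pls1_nipals(X, y, n_factors):
    """Fit single-response PLS with a NIPALS-style deflation loop.

    Returns (intercept, coefficients) on the original scale of X and y.
    Only centering is applied; scaling is omitted to keep the sketch short.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    weights, loadings, y_loadings = [], [], []
    for _ in range(n_factors):
        w = Xk.T @ yk                  # unit direction maximizing covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                     # factor scores
        tt = t @ t
        p = Xk.T @ t / tt              # X loadings
        q = yk @ t / tt                # y loading
        Xk = Xk - np.outer(t, p)       # deflate X
        yk = yk - q * t                # deflate y
        weights.append(w)
        loadings.append(p)
        y_loadings.append(q)
    W = np.array(weights).T
    P = np.array(loadings).T
    q_vec = np.array(y_loadings)
    coef = W @ np.linalg.solve(P.T @ W, q_vec)
    return y_mean - x_mean @ coef, coef

# Simulated wide data (hypothetical): 15 observations, 40 correlated predictors.
rng = np.random.default_rng(3)
n, p = 15, 40
latent = rng.normal(size=(n, 2))
X = latent @ rng.normal(size=(2, p)) + 0.05 * rng.normal(size=(n, p))
y = latent[:, 0] - 0.5 * latent[:, 1] + 0.05 * rng.normal(size=n)

intercept, coef = pls1_nipals(X, y, n_factors=2)
fitted = intercept + X @ coef
r2 = 1 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
print(round(r2, 3))
```

Ordinary least squares has no unique solution here because the predictors outnumber the observations, yet two PLS factors still give a usable linear model. The R-squared printed is for the training data only; choosing the number of factors would normally rely on cross validation.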