Publication date: 10/01/2019

Platforms That Support Validation

This appendix lists the types of cross validation available in each platform. The types of cross validation are defined as follows:

Use Excluded Rows as Validation Holdback

Uses the excluded rows in the data table as a validation holdback set.

Note: For platforms that support using excluded rows as a validation holdback set, the excluded rows are used only when there is no validation column or validation proportion specified in the launch window.

Random Validation Holdback

Randomly divides the original data into the training and validation sets. A test set can also be included. You can specify the proportions of the original data to use in each set.
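The random split described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any platform implements it: the function name `random_holdback`, its parameters, and the fixed seed are hypothetical, and the rounding rule for set sizes is an assumption.

```python
import random


def random_holdback(n_rows, train=0.6, validation=0.2, test=0.2, seed=0):
    """Randomly assign row indices to training, validation, and test sets.

    Hypothetical helper: the proportions must sum to 1, and set sizes are
    rounded to whole rows (an assumption made for this sketch).
    """
    if abs(train + validation + test - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")

    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # random order, reproducible via seed

    n_train = round(n_rows * train)
    n_valid = round(n_rows * validation)
    return {
        "training": sorted(indices[:n_train]),
        "validation": sorted(indices[n_train:n_train + n_valid]),
        # Whatever remains becomes the (optional) test set.
        "test": sorted(indices[n_train + n_valid:]),
    }


sets = random_holdback(10, train=0.6, validation=0.2, test=0.2)
```

Setting `test=0.0` and letting `train` and `validation` sum to 1 gives the two-set case, in which no test set is held out.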

K-Fold Cross-Validation

Divides the original data into K subsets. In turn, each of the K sets is used to validate the model fit on the rest of the data, fitting a total of K models. The model giving the best validation statistic is chosen as the final model.

Note: How you request K-fold cross-validation depends on the platform. Some platforms provide it in the model launch control panel, others in the platform launch window, and still others through a validation column.
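The K-fold procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not a platform's implementation: the model is a simple least-squares line, the validation statistic is mean squared error on the held-out fold (lower is better), and all function names are hypothetical.

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle the row indices and deal them into k disjoint folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(model, xs, ys):
    """Mean squared prediction error of a fitted line (assumed statistic)."""
    slope, intercept = model
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def k_fold_select(xs, ys, k=5, seed=0):
    """Fit k models, each validated on one fold, and keep the best one."""
    results = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        score = mse(model, [xs[i] for i in held_out], [ys[i] for i in held_out])
        results.append((score, model))
    best_score, best_model = min(results, key=lambda r: r[0])
    return best_model, best_score, results


# Small synthetic data set: a line with alternating +/-0.5 noise.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.5 if i % 2 else -0.5) for i, x in enumerate(xs)]
best_model, best_score, results = k_fold_select(xs, ys, k=5)
```

Each row appears in exactly one validation fold, so K models are fit in total, and `best_model` is the one with the smallest held-out error.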

Validation Role Column

Uses the column’s values to divide the data into parts. The column is assigned using the Validation role on the platform’s launch window.

Note: Platforms differ in how they treat a validation column with more than three levels. See the footnotes to the following table.

| Platform | Use Excluded Rows as Validation Holdback | Random Validation Holdback | K-Fold Cross-Validation | Validation Role Column |
|---|---|---|---|---|
| Fit Model > Fit Least Squares | No | No | No | Yes (for model evaluation only) [1] |
| Fit Model > Forward Stepwise Regression | No | No | Yes (for continuous response models only) | Yes |
| Fit Model > Logistic Regression | No | No | No | Yes (for model evaluation only) [1] |
| Fit Model > Generalized Regression | No | Yes | Yes | Yes |
| Fit Model > Partial Least Squares | No | Yes | Yes | Yes |
| Partition | Yes | Yes | Yes | Yes [2] |
| Bootstrap Forest | Yes | Yes | No | Yes [2] |
| Boosted Tree | Yes | Yes | No | Yes [2] |
| K Nearest Neighbors | Yes | Yes | No | Yes [2] |
| Naive Bayes | Yes | Yes | No | Yes [2] |
| Neural | Yes | Yes | Yes (through model launch or validation column with more than 3 levels) | Yes |
| Support Vector Machines | No | Yes | Yes (through model launch) | Yes |
| Functional Data Explorer | No | No | No | Yes (must be structured as a “Grouped Random” validation column) [3] |
| Discriminant | Optional | No | No | Yes [2] |
| Partial Least Squares | No | Yes | Yes (through model launch or validation column with more than 3 levels) | Yes |
| Uplift | No | Yes | No | Yes [2] |


1 If there are more than three levels, the validation column is ignored.


2 If there are more than three levels, the platform uses only rows with the three smallest values.


3 If there are more than two levels, the smallest value defines the training set and all other values define the validation set.
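
The three footnote rules can be compared side by side with a short Python sketch. It is illustrative only: the function `assign_roles` and its rule names are hypothetical, and the mapping of the smallest, middle, and largest values to training, validation, and test is an assumption not stated in this appendix.

```python
ROLE_NAMES = ["training", "validation", "test"]  # assumed order: smallest value first


def assign_roles(values, rule):
    """Map validation-column values to roles under one of the footnote rules.

    Returns a list of roles (None for rows that are not used), or None when
    the whole column is ignored. Rule names are hypothetical labels.
    """
    levels = sorted(set(values))

    if rule == "ignore_if_more_than_three":      # footnote 1
        if len(levels) > 3:
            return None                          # column is ignored entirely
    elif rule == "three_smallest":               # footnote 2
        levels = levels[:3]                      # rows with larger values are unused
    elif rule == "smallest_is_training":         # footnote 3
        return ["training" if v == levels[0] else "validation" for v in values]
    else:
        raise ValueError(f"unknown rule: {rule}")

    role_of = dict(zip(levels, ROLE_NAMES))
    return [role_of.get(v) for v in values]


column = [0, 1, 2, 3, 1]  # a validation column with four levels
ignored = assign_roles(column, "ignore_if_more_than_three")
trimmed = assign_roles(column, "three_smallest")
grouped = assign_roles(column, "smallest_is_training")
```

With this four-level column, the first rule discards the column, the second leaves the row with value 3 unused, and the third places every row with a value other than 0 in the validation set.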

