The training set is the portion of the data used to estimate model parameters.
The validation set is the portion used to assess or validate the predictive ability of the model.
The test set provides a final, independent assessment of the model’s predictive ability. A test set is available only when you use a validation column (see Descriptions of Launch Window).
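A validation column assigns each row to one of the three sets. The sketch below illustrates the idea with a simple random labeling; the function name, proportions, and labels are assumptions for illustration, not JMP's implementation.

```python
import random

def assign_validation_column(n_rows, p_train=0.6, p_valid=0.2, seed=0):
    """Label each row Training, Validation, or Test, mimicking the role
    of a validation column. Illustrative sketch only."""
    rng = random.Random(seed)
    labels = []
    for _ in range(n_rows):
        r = rng.random()
        if r < p_train:
            labels.append("Training")
        elif r < p_train + p_valid:
            labels.append("Validation")
        else:
            labels.append("Test")
    return labels
```

Each row receives exactly one label, so the three sets partition the data.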
Randomly divides the original data into training and validation sets. Use the Validation Portion option (see Descriptions of Launch Window) on the platform launch window to specify the proportion of the original data to hold back as the validation set.
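A random holdback split can be sketched as follows; the function and parameter names are assumptions chosen to mirror the Validation Portion option, not JMP's code.

```python
import random

def holdback_split(rows, validation_portion=0.3, seed=0):
    """Randomly partition rows into (training, validation) sets,
    holding back the given proportion. Illustrative sketch only."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_valid = round(len(shuffled) * validation_portion)
    return shuffled[n_valid:], shuffled[:n_valid]
```

With a portion of 0.3, roughly 30% of the rows are held back for validation and the rest are used for training.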
Randomly divides the original data into K subsets. Each of the K subsets, in turn, is used to validate the model fit to the remaining data, so a total of K models are fit. The final model is selected based on the cross-validation RSquare, with a constraint imposed to avoid overfitting. This method is useful for small data sets because it makes efficient use of limited data. See KFold Crossvalidation.
Note: KFold validation is available only with the Decision Tree method. To use KFold, select K Fold Crossvalidation from the platform red-triangle menu (see Platform Options).
In KFold crossvalidation, the entire set of observations is partitioned into K subsets, called folds. Each fold, in turn, is treated as a holdback sample, and the remaining observations serve as the training set.
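The fold-by-fold partitioning described above can be sketched as a small generator; the function name and fold-assignment scheme are illustrative assumptions, not JMP's implementation.

```python
def kfold_splits(rows, k=5):
    """Partition rows into k folds and yield (training, holdback) pairs,
    one per fold: each fold serves once as the holdback sample while
    the remaining folds form the training set. Illustrative sketch."""
    folds = [rows[i::k] for i in range(k)]
    for i, holdback in enumerate(folds):
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield training, holdback
```

Every observation appears in exactly one holdback sample, so the k pairs together use each row once for validation and k - 1 times for training.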