Publication date: 10/01/2019

K-Fold Crossvalidation

In k-fold cross validation, the entire set of observations is partitioned into k subsets, called folds. Each fold in turn is treated as a holdback sample, and the remaining observations serve as the training set.
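The partitioning step above can be sketched as follows. This is a minimal illustration in Python, not JMP's implementation; the function name and the use of a random permutation are assumptions.

```python
import numpy as np

def k_fold_indices(n_obs, k, seed=0):
    """Partition observation indices into k roughly equal folds.

    A sketch of the partitioning described above: each fold serves
    in turn as the holdback sample, the rest as the training set.
    (Illustrative only; not JMP's implementation.)
    """
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_obs)   # shuffle so folds are random
    return np.array_split(indices, k)  # k roughly equal pieces

folds = k_fold_indices(n_obs=10, k=5)
for i, holdback in enumerate(folds):
    # `holdback` is the validation set for this fold;
    # the remaining folds together form the training set.
    train = np.concatenate([f for j, f in enumerate(folds) if j != i])
```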

Unconstrained optimization of the cross validation RSquare value tends to overfit models. To address this tendency, the k-fold cross validation stopping rule terminates stepping when improvement in the cross validation RSquare is minimal. Specifically, the stopping rule selects the model for which none of the next ten models shows an improvement in cross validation RSquare of more than 0.005 units.
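The stopping rule can be expressed as a short lookahead over the sequence of cross validation RSquare values. The sketch below is an assumed reading of the rule as stated, not JMP's code; the function name and parameters are illustrative.

```python
def stop_index(cv_rsquares, lookahead=10, tol=0.005):
    """Select a model under the stopping rule sketched above:
    return the index of the first model for which none of the next
    `lookahead` models improves cross validation RSquare by more
    than `tol`. (Illustrative; not JMP's implementation.)
    """
    for i, r in enumerate(cv_rsquares):
        window = cv_rsquares[i + 1 : i + 1 + lookahead]
        if all(r2 - r <= tol for r2 in window):
            return i
    return len(cv_rsquares) - 1
```

For example, if RSquare jumps from 0.10 to 0.30 to 0.42 and then plateaus near 0.421 for the next ten splits, the rule stops at the 0.42 model, since none of the ten subsequent models gains more than 0.005.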

When you select the K Fold Crossvalidation option, a Crossvalidation report appears. The results in this report update as you split the decision tree. If you click Go, the report shows the results for the final model.

Crossvalidation Report

The Crossvalidation report shows the following:

k-fold

The number of folds.

-2LogLike or SSE

Gives twice the negative log-likelihood (-2LogLikelihood) when the response is categorical, or the sum of squared errors (SSE) when the response is continuous. The first row gives results averaged over the folds. The second row gives results for the single model fit to all observations.

RSquare

The first row gives the RSquare value averaged over the folds. The second row gives the RSquare value for the single model fit to all observations.
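The fold-averaged row of the report can be sketched as follows. This is an assumed computation for a continuous response (SSE and RSquare on each holdback sample, then averaged over folds); the function name and signature are illustrative, and the report's second row would instead come from one model fit to all observations.

```python
import numpy as np

def fold_metrics(y, y_pred_by_fold, fold_indices):
    """Per-fold SSE and RSquare on the holdback observations,
    averaged over folds (a sketch of the report's first row;
    illustrative, not JMP's implementation)."""
    sses, rsqs = [], []
    for holdback, y_pred in zip(fold_indices, y_pred_by_fold):
        resid = y[holdback] - y_pred
        sse = np.sum(resid ** 2)                            # sum of squared errors
        sst = np.sum((y[holdback] - y[holdback].mean()) ** 2)
        sses.append(sse)
        rsqs.append(1.0 - sse / sst)                        # RSquare on this fold
    return np.mean(sses), np.mean(rsqs)
```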
