Model Tuning

What is model tuning?

Model tuning is finding the values of model hyperparameters that give you the best model performance. Examples of hyperparameters include the number of nodes in a neural network model or the learning rate in a gradient boosted model.
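If you fit models in code rather than in a point-and-click tool, hyperparameters appear as the arguments you choose before fitting. A minimal sketch, assuming scikit-learn in Python (the examples in this lesson are built in JMP, not Python):

```python
# Illustration of hyperparameters as pre-fit choices, using scikit-learn
# (an assumption; this lesson's examples use JMP, not Python).
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.neural_network import MLPRegressor

# Learning rate and number of boosting stages in a gradient boosted model.
boosted = GradientBoostingRegressor(learning_rate=0.1, n_estimators=100)

# Number of nodes in the hidden layer of a neural network model.
neural = MLPRegressor(hidden_layer_sizes=(10,), activation="tanh")
```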

Software such as JMP will often provide good starting values for hyperparameters. For some problems, though, finding optimal values can lead to greatly improved predictions.

Design of experiments for model tuning

To find the best values of the hyperparameters, you can run the model using different values of the hyperparameters, selected using a designed experiment. It is common to use a space filling design for the hyperparameters rather than an optimal design or a classic factorial-type design. Space filling designs are often used in simulation or computer experiments, where either no random noise is expected or the purpose is to explore the factor space. Common space filling designs used to tune predictive models include the Latin hypercube design and the fast flexible filling design.
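For readers who want to generate such a design in code, here is a minimal sketch using scipy's quasi-Monte Carlo module; the number of factors and their ranges are placeholders, not the settings used later in this lesson.

```python
# Sketch: a 30-run Latin hypercube design for three hyperparameters, using
# scipy.stats.qmc (available in scipy 1.7+). Factor ranges are placeholders.
from scipy.stats import qmc

sampler = qmc.LatinHypercube(d=3, seed=1)    # three hyperparameters
unit_runs = sampler.random(n=30)             # 30 runs on the unit cube [0, 1]^3

# Scale each column from [0, 1] to that hyperparameter's range.
design = qmc.scale(unit_runs, l_bounds=[1, 0.0, 0.01], u_bounds=[100, 1.0, 0.50])
```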

How do I tune a boosted tree model?

Tuning a boosted tree model involves finding the best values of the parameters used for building the trees. The best values will be problem-specific. Possible parameters include the number of layers, the number of splits per tree, and the learning rate.

Let’s create a tuning table for a boosted tree model using the Percent Recovered response in the Recovery data (the file used in our introduction to predictive modeling lesson). In this simple example, we’ll vary the number of layers, splits per tree, and learning rate. We choose to make 30 runs.

Factor specification for space filling design.

Latin hypercube tuning design for a boosted tree. A boosted tree was fit for each row in the table. The fit statistics on the validation set are displayed. The table has been sorted by decreasing R-Square.

The R-Square and RASE on the validation data are displayed for each run. The best-fitting model has four splits per tree, a learning rate of 0.048, and 402 layers; its validation R-Square of 0.827 is the highest of the 30 runs.
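The same step can be sketched outside of JMP. Assuming scikit-learn, with GradientBoostingRegressor standing in for JMP's Boosted Tree platform, and treating the placeholder data, the factor ranges, and the mapping of "layers" to n_estimators and "splits per tree" to max_leaf_nodes as assumptions:

```python
# Sketch: fit a boosted tree at each row of a Latin hypercube tuning table
# and record validation R-Square. Data, ranges, and parameter mappings are
# placeholders, not the Recovery data or the JMP settings in this lesson.
import numpy as np
from scipy.stats import qmc
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))                       # placeholder predictors
y = X @ rng.normal(size=5) + rng.normal(size=300)   # placeholder response
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# 30-run Latin hypercube over layers, splits per tree, and learning rate.
design = qmc.scale(qmc.LatinHypercube(d=3, seed=1).random(n=30),
                   [50, 2, 0.01], [500, 16, 0.30])

results = []
for layers, splits, rate in design:
    model = GradientBoostingRegressor(
        n_estimators=int(layers),          # ~ number of layers
        max_leaf_nodes=int(splits) + 1,    # ~ splits per tree
        learning_rate=rate,
    )
    model.fit(X_train, y_train)
    results.append([layers, splits, rate,
                    r2_score(y_val, model.predict(X_val))])

# Sort runs by decreasing validation R-Square, as in the tuning table above.
results.sort(key=lambda row: row[-1], reverse=True)
```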

The beauty of designing an experiment is that we can fit a model to these 30 runs and use that model to predict the hyperparameter settings that give the best performance. Let’s fit a Gaussian process model to the experimental data.
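Continuing the scikit-learn sketch above, a Gaussian process surrogate can be fit to the 30 runs and evaluated over a grid of candidate settings to find the hyperparameters that maximize the predicted validation R-Square. The kernel choice and candidate grid are assumptions; in JMP this role is played by the Gaussian Process platform and the Prediction Profiler.

```python
# Sketch: fit a Gaussian process to the tuning runs, then search a grid of
# candidate hyperparameter settings for the maximum predicted R-Square.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

runs = np.array(results)                 # columns: layers, splits, rate, R-Square
X_doe, r2 = runs[:, :3], runs[:, 3]

# Anisotropic RBF kernel: one length scale per hyperparameter.
kernel = ConstantKernel() * RBF(length_scale=[100.0, 5.0, 0.1])
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(X_doe, r2)

# Dense grid of candidate settings over the same ranges as the design.
grid = np.array([[l, s, r]
                 for l in np.linspace(50, 500, 25)
                 for s in np.arange(2, 17)
                 for r in np.linspace(0.01, 0.30, 25)])

pred = gp.predict(grid)
best = grid[np.argmax(pred)]
print("Predicted best settings:", best, "predicted R-Square:", pred.max())
```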

The Prediction Profiler in JMP for the Gaussian process model fit to the Latin hypercube space filling design data. The response is R-Square on the validation set. The settings shown are those predicted to maximize validation R-Square, with a predicted value of 0.8553.

Next, let’s run a boosted tree using the predicted best values of the hyperparameters.

Fit statistics for the tuned boosted tree model. R-Square on the validation set (0.833) is slightly higher than that of the best run from the space filling design (0.827).
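In the sketch, this final refit is short: train a boosted tree at the predicted best settings and score it on the validation holdout, continuing the placeholder data and scikit-learn assumptions from above.

```python
# Sketch: refit the boosted tree at the settings predicted by the Gaussian
# process and confirm its performance on the validation holdout.
tuned = GradientBoostingRegressor(
    n_estimators=int(best[0]),
    max_leaf_nodes=int(best[1]) + 1,
    learning_rate=best[2],
).fit(X_train, y_train)

print("Tuned validation R-Square:", r2_score(y_val, tuned.predict(X_val)))
```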

How do I tune a neural network model?

Tuning a neural network model involves finding the values of the model hyperparameters that optimize the fit statistics on the validation set. Possible hyperparameters include the number of hidden nodes of each type (such as TanH nodes), the number of models used in boosting, and the boosting learning rate.

Let’s create a tuning table for a boosted neural model using the Percent Recovered response in the Recovery data. In this simple example, we’ll vary the number of TanH nodes, the number of models in boosting, and the boosting learning rate. We again make 30 runs.
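A rough code analogue of this step is sketched below, with two caveats: scipy has a Latin hypercube but not JMP's fast flexible filling design, and scikit-learn's MLPRegressor has no built-in boosting, so the sketch varies only the number of TanH nodes and the optimizer's learning rate as stand-ins for the lesson's three hyperparameters. It continues the training/validation split used earlier.

```python
# Sketch: evaluate a tanh neural network at each row of a 30-run space filling
# design. A Latin hypercube stands in for JMP's fast flexible filling design,
# and the boosting hyperparameters have no direct scikit-learn equivalent.
from scipy.stats import qmc
from sklearn.metrics import r2_score
from sklearn.neural_network import MLPRegressor

nn_design = qmc.scale(qmc.LatinHypercube(d=2, seed=2).random(n=30),
                      [2, 0.001], [30, 0.10])     # TanH nodes, learning rate

nn_results = []
for nodes, lr in nn_design:
    nn = MLPRegressor(hidden_layer_sizes=(int(nodes),), activation="tanh",
                      learning_rate_init=lr, max_iter=2000, random_state=0)
    nn.fit(X_train, y_train)
    nn_results.append([nodes, lr, r2_score(y_val, nn.predict(X_val))])

# Sort runs by decreasing validation R-Square, as in the table below.
nn_results.sort(key=lambda row: row[-1], reverse=True)
```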

Hyperparameter specification for tuning design for boosted neural network model.

Fast flexible filling design for a boosted neural network. A boosted neural model was fit for each row in the table. The fit statistics on the validation holdout set are displayed. The table has been sorted by decreasing R-Square.

You’ll notice that the best run has a validation R-Square of 0.879. As before, we can fit a Gaussian process model to these data, using Validation R2 as the response, and then use that model to determine the best settings of the boosted neural hyperparameters.

The Prediction Profiler in JMP for the Gaussian process model fit to the fast flexible filling design data. The response is Validation R2. The settings shown are those predicted to maximize R-Square on the validation set, with a predicted value of 0.882.

Next, let’s run a boosted neural network using the predicted best values of the hyperparameters.

Fit statistics for the tuned boosted neural network model. R-Square on the validation set (0.884) is slightly higher than that of the best run from the space filling design (0.879).