Publication date: 07/30/2020

The sample data table Diabetes.jmp was used to create a report for the continuous response Y.

Figure 4.9 Partition Report for a Continuous Response

The partition plot is initialized without any splits. Each point represents an observation in the data table. If validation is used, the plot shows only the training data.

Notice the following:

• The vertical axis represents the response value of the observations.

• Horizontal lines show the mean response value for each node of the decision tree. The initial horizontal line is at the overall mean of the response.

• Vertical divisions in the plot represent splits in the decision tree. A text description of the most recent split appears below the horizontal axis. Observations are reorganized into their respective nodes as splits are created or removed.

Tip: To see tooltips for narrow partitions, place your cursor over the labels on the horizontal axis of the partition plot.

Figure 4.10 Summary Report for a Continuous Response

The Summary Report provides fit statistics for the training data and, if used, for the validation and test data. The fit statistics in the Summary Report update as you add splits or prune the decision tree.

RSquare

The current value of R².

RMSE

The root mean square error.

N

The number of observations.

Number of Splits

The current number of splits in the decision tree.

AICc

The corrected Akaike’s Information Criterion. See Likelihood, AICc, and BIC in Fitting Linear Models.
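
As a rough illustration of how RSquare and RMSE respond to a split, the following Python sketch computes both statistics when every observation is predicted by the mean of its node. The function name `fit_statistics`, the node labels, and the sample values are hypothetical. The sketch assumes that RMSE is the square root of the error sum of squares divided by N; the exact definition that JMP uses might differ.

```python
import math

def fit_statistics(y, node_ids):
    """Return (rsquare, rmse) when each row is predicted by its node mean.

    Assumption: RMSE = sqrt(SSE / N). This is an illustrative definition,
    not necessarily the one used by JMP.
    """
    # Group the responses by node label.
    nodes = {}
    for value, node in zip(y, node_ids):
        nodes.setdefault(node, []).append(value)
    node_mean = {node: sum(vals) / len(vals) for node, vals in nodes.items()}

    overall_mean = sum(y) / len(y)
    # Error sum of squares around the node means.
    sse = sum((v - node_mean[n]) ** 2 for v, n in zip(y, node_ids))
    # Total sum of squares around the overall mean.
    sst = sum((v - overall_mean) ** 2 for v in y)

    rsquare = 1 - sse / sst
    rmse = math.sqrt(sse / len(y))
    return rsquare, rmse

# Hypothetical response values.
y = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

# No splits: every row is in the root node, so RSquare is 0.
no_split = fit_statistics(y, ["root"] * 6)

# One split: the first three rows go left, the last three go right.
one_split = fit_statistics(y, ["L", "L", "L", "R", "R", "R"])
```

With no splits, every prediction is the overall mean, so RSquare is zero. Adding a split that separates the low and high responses raises RSquare and lowers RMSE.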

Each node in the tree has a report and a red triangle menu with additional options. Terminal nodes also have a Candidates report.

Figure 4.11 Terminal Node Report for a Continuous Response

Count

The number of observations (rows) in the branch.

Mean

The average response for all observations in that branch.

Std Dev

The standard deviation of the response for all observations in that branch.
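
The three node statistics can be reproduced for any set of rows with a few lines of Python. This is a minimal sketch with hypothetical response values. It assumes that Std Dev is the sample standard deviation (divisor N - 1), which is not stated in this section.

```python
import math
import statistics

# Hypothetical response values for the rows that fall in one node.
node_y = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

count = len(node_y)                  # Count: number of rows in the node
mean = statistics.mean(node_y)       # Mean: average response in the node
# Assumption: sample standard deviation (divisor N - 1).
std_dev = statistics.stdev(node_y)
```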

Candidates

For each column, the Candidates report provides details about the optimal split for that column. The optimal split over all columns is marked with an asterisk.

Term

Shows the candidate columns.

Candidate SS

Sum of squares for the best split.

LogWorth

The LogWorth statistic, defined as -log10(p-value). The optimal split is the one that maximizes the LogWorth. See Statistical Details for the Partition Platform.

Cut Point

The value of the predictor that determines the split. For a categorical term, the levels in the left-most split are listed.

The optimal split is marked with an asterisk. However, in some cases the Candidate SS is higher for one variable while the LogWorth is higher for a different variable. In such cases, the symbols > and < point in the best direction for each variable, and the asterisk marks the split on which the two criteria agree. See Statistical Details for the Partition Platform.
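
To make the Candidate SS and LogWorth columns more concrete, the following Python sketch searches the cut points of one continuous predictor for the split with the largest between-group sum of squares, and converts a p-value to a LogWorth. The function names `best_split` and `logworth` and the data are hypothetical. The sketch assumes that the cut point is reported as the smallest predictor value in the right-hand group, and it takes the p-value as given. The p-value calculation that JMP uses is covered in Statistical Details for the Partition Platform and is not reproduced here.

```python
import math

def best_split(x, y):
    """Return (cut_point, candidate_ss) for one continuous predictor.

    candidate_ss is the between-group sum of squares of the best split.
    Assumption: rows with x < cut_point go left, all others go right.
    """
    pairs = sorted(zip(x, y))
    n = len(pairs)
    total = sum(v for _, v in pairs)
    grand_mean = total / n

    best_cut, best_ss = None, 0.0
    left_sum = 0.0
    for i in range(1, n):
        left_sum += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot cut between equal predictor values
        left_mean = left_sum / i
        right_mean = (total - left_sum) / (n - i)
        ss = (i * (left_mean - grand_mean) ** 2
              + (n - i) * (right_mean - grand_mean) ** 2)
        if ss > best_ss:
            best_cut, best_ss = pairs[i][0], ss
    return best_cut, best_ss

def logworth(p_value):
    """LogWorth as defined above: -log10(p-value)."""
    return -math.log10(p_value)

# Hypothetical predictor and response values.
x = [1, 2, 3, 4, 5, 6]
y = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
cut_point, candidate_ss = best_split(x, y)

# A smaller p-value gives a larger LogWorth.
lw = logworth(0.001)
```

Because LogWorth grows as the p-value shrinks, ranking candidates by LogWorth favors the split with the strongest statistical evidence, which is not always the split with the largest Candidate SS.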
