The examples in this section use the Boston Housing.jmp data table. Your goal is to fit a model that predicts the median home value as a function of several demographic characteristics. You build a model with each of the three partitioning methods (Decision Tree, Bootstrap Forest, and Boosted Tree) and compare the results.
In Example of Partition, you fit a Decision Tree to the Boston Housing.jmp data. The prediction formula that you saved was placed in the column mvalue Predictor. For the Validation Portion used, Automatic Splitting resulted in five splits and a Validation RSquare of 0.569.
1. Open the Boston Housing.jmp data table or select it as your active table.
2. Select Analyze > Modeling > Partition.
3. Assign mvalue to the Y, Response role.
4. Assign the other variables (crim through lstat) to the X, Factor role.
5. Select Bootstrap Forest from the Method menu.
6. Enter 0.2 for the Validation Portion.
7. Click OK.
8. Click OK to accept the defaults in the Bootstrap Forest options window.
Bootstrap Forest Overall Statistics
9. From the report’s red triangle menu, select Save Columns > Save Prediction Formula.
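The idea behind a bootstrap forest can be sketched in a few lines of Python: fit many trees, each on rows sampled with replacement, and average their predictions. This is a minimal illustration under stated assumptions, not the JMP implementation. The helper names (`fit_stump`, `fit_bootstrap_forest`), the single predictor, the one-split trees, and the toy data are all hypothetical; the platform itself grows larger trees and also samples predictors.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree on a single predictor (illustrative helper)."""
    best = None
    for cut in sorted(set(xs))[1:]:  # candidate cuts at each distinct x value
        left = [y for x, y in zip(xs, ys) if x < cut]
        right = [y for x, y in zip(xs, ys) if x >= cut]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, cut, left_mean, right_mean)
    if best is None:  # only one distinct x value: predict the overall mean
        overall = mean(ys)
        return lambda x: overall
    _, cut, left_mean, right_mean = best
    return lambda x: left_mean if x < cut else right_mean


def fit_bootstrap_forest(xs, ys, n_trees=25, seed=1):
    """Average stumps that are each fit on a bootstrap sample of the rows."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        trees.append(fit_stump([xs[i] for i in rows], [ys[i] for i in rows]))
    return lambda x: mean(tree(x) for tree in trees)


# Toy data (made up for illustration): the response jumps between x = 4 and x = 5.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1, 2, 3, 4, 11, 12, 13, 14]
forest = fit_bootstrap_forest(xs, ys)
print(forest(2), forest(7))
```

Averaging over resampled trees smooths out the instability of any single tree, which is why the forest prediction for a low x value stays well below the prediction for a high one even though each tree sees a different sample.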
1. Open the Boston Housing.jmp data table or select it as your active table.
2. Select Analyze > Modeling > Partition.
3. Assign mvalue to the Y, Response role.
4. Assign the other variables (crim through lstat) to the X, Factor role.
5. Select Boosted Tree from the Method menu.
6. Enter 0.2 for the Validation Portion.
7. Click OK.
8. Click OK to accept the defaults in the Boosted Tree options window.
9. From the report’s red triangle menu, select Plot Actual by Predicted.
Boosted Tree Overall Statistics
10. From the report’s red triangle menu, select Save Columns > Save Prediction Formula.
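As a rough illustration of what boosting does, the sketch below fits a sequence of small trees, each to the residuals left by the previous layers, and scales every layer by a learning rate. It is a simplified sketch, not the JMP algorithm. The names `fit_stump` and `fit_boosted`, the values of `n_layers` and `learning_rate`, the single predictor, and the toy data are assumptions made for the example.

```python
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree on a single predictor (illustrative helper)."""
    best = None
    for cut in sorted(set(xs))[1:]:  # candidate cuts at each distinct x value
        left = [y for x, y in zip(xs, ys) if x < cut]
        right = [y for x, y in zip(xs, ys) if x >= cut]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, cut, left_mean, right_mean)
    if best is None:  # only one distinct x value: predict the overall mean
        overall = mean(ys)
        return lambda x: overall
    _, cut, left_mean, right_mean = best
    return lambda x: left_mean if x < cut else right_mean


def fit_boosted(xs, ys, n_layers=50, learning_rate=0.1):
    """Start from the mean response, then add scaled stumps fit to the residuals."""
    base = mean(ys)
    residuals = [y - base for y in ys]
    stumps = []
    for _ in range(n_layers):
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Remove the scaled contribution of this layer from the residuals.
        residuals = [r - learning_rate * stump(x) for x, r in zip(xs, residuals)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


def sum_squared_error(model, xs, ys):
    """Sum of squared differences between actual and predicted values."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys))


# Toy data (made up for illustration).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1, 2, 3, 4, 11, 12, 13, 14]
boosted = fit_boosted(xs, ys)
overall_mean = mean(ys)
baseline_sse = sum_squared_error(lambda x: overall_mean, xs, ys)
boosted_sse = sum_squared_error(boosted, xs, ys)
print(baseline_sse, boosted_sse)
```

A small learning rate means each layer corrects only part of the remaining error, so more layers are needed but no single tree dominates the final prediction.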
The figure Column Contributions summarizes the Column Contributions report from each method. For Decision Tree and Boosted Tree, rooms and lstat are the major contributors.
Column Contributions
2. Select Analyze > Modeling > Model Comparison.
4.
The report, entitled Measures of Fit for mvalue, is shown in Measures of Fit for Three Partition Methods. The Boosted Tree method has the highest RSquare. It also has the lowest root average squared error (RASE) and the lowest average absolute error (AAE).
Measures of Fit for Three Partition Methods
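To make the three measures concrete, the sketch below computes them from a list of actual values and a list of predicted values. It assumes the usual definitions: RSquare as one minus the ratio of the error sum of squares to the total sum of squares, RASE as the square root of the mean squared error, and AAE as the mean absolute error. The function name `fit_measures` and the numbers are made up for illustration and are not taken from the Boston Housing.jmp results.

```python
from math import sqrt


def fit_measures(actual, predicted):
    """Return RSquare, RASE, and AAE for paired actual and predicted values."""
    n = len(actual)
    mean_actual = sum(actual) / n
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # error sum of squares
    sst = sum((a - mean_actual) ** 2 for a in actual)           # total sum of squares
    return {
        "RSquare": 1 - sse / sst,
        "RASE": sqrt(sse / n),
        "AAE": sum(abs(a - p) for a, p in zip(actual, predicted)) / n,
    }


# Hypothetical values, not from the example data table.
actual = [20, 30, 40, 50]
predicted = [22, 28, 41, 49]
measures = fit_measures(actual, predicted)
print(measures)
```

Because RASE squares the errors before averaging, it penalizes a few large misses more heavily than AAE does, so comparing both gives a fuller picture than either measure alone.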