The Test Set Average AUC Range tab contains the following elements:
The plot shows the average area under the curve (AUC) learning curve for each model as a solid line, with a shaded region around it indicating the range of the individual curves. The width of the shading provides a measure of variability for the average learning curve. Curves with narrow bands are more reliable than those with wide bands.
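As a rough illustration of how an average curve and its shaded band relate to the individual curves, the following Python sketch averages several curves point by point and takes the minimum and maximum as the band. The function name `summarize_curves`, the AUC values, and the use of min/max for the "range" are assumptions made for this example; the software may compute the band differently.

```python
from statistics import mean


def summarize_curves(curves):
    """Summarize individual learning curves point by point.

    curves: list of equal-length lists, one AUC curve per individual run.
    Returns (average, lower, upper, width), each a list with one value
    per sample size.
    """
    columns = list(zip(*curves))          # group values by sample size
    average = [mean(col) for col in columns]
    lower = [min(col) for col in columns]
    upper = [max(col) for col in columns]
    width = [hi - lo for hi, lo in zip(upper, lower)]
    return average, lower, upper, width


# Hypothetical AUC values at three sample sizes for three individual curves.
curves = [
    [0.60, 0.70, 0.78],
    [0.64, 0.72, 0.80],
    [0.62, 0.74, 0.79],
]
average, lower, upper, width = summarize_curves(curves)
```

Here `average` would be drawn as the solid line and the region between `lower` and `upper` as the shading; smaller values in `width` correspond to a narrower, more reliable band.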
Learning curves are constructed by using a succession of different-sized subsets of the full data and assessing cross validation performance on each. Sample size is plotted on the x-axis, while the cross validation performance metric is plotted on the y-axis. The primary goal of this process is to determine whether adding more samples will change performance. This is achieved by inspecting the slope of the curves, especially toward the right-hand side. If the curves are still rising at the right-hand side, as in this example, it is likely that adding more samples will improve performance. If the slopes are flat, adding more samples will likely not make much of a difference.
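The visual slope check described above can also be approximated numerically. The Python sketch below fits a least-squares line to the last few points of a curve and reports its slope. The function name `tail_slope`, the choice of three points, and the data are hypothetical and are not part of the software.

```python
def tail_slope(sample_sizes, scores, points=3):
    """Least-squares slope over the last `points` values of a learning curve."""
    xs = sample_sizes[-points:]
    ys = scores[-points:]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical average AUC values at five sample sizes.
sizes = [50, 100, 150, 200, 250]
rising = [0.60, 0.68, 0.74, 0.79, 0.84]    # still climbing at the right
flat = [0.60, 0.75, 0.80, 0.805, 0.806]    # has leveled off

rising_slope = tail_slope(sizes, rising)
flat_slope = tail_slope(sizes, flat)
```

A slope near zero, as for `flat`, suggests that more samples are unlikely to help much, while a clearly positive slope, as for `rising`, suggests that they might. What counts as "near zero" depends on the scale of the x-axis and is left to the analyst.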
The AUC is the area below the Receiver Operating Characteristics (ROC) curve, which plots the true-positive rate versus the false-positive rate for a binary-response variable. The greater the AUC, the better the model is at distinguishing positive responses from negative ones. See Receiver Operating Characteristics (ROC) Curves for additional information.
Refer to AUC for more information about this statistic.
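For readers who want to see the statistic computed by hand, the Python sketch below uses a standard equivalence: the area under the ROC curve equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the data are illustrative only and do not represent how the software calculates AUC internally.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: 1 for a positive response, 0 for a negative response.
    scores: model scores, where higher means "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))


# Small hypothetical example: two negatives and two positives.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)    # 3 of 4 positive/negative pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ordering and 1.0 to perfect separation. The pairwise loop is quadratic in the number of samples, which is fine for illustration but not for large data sets.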