Row Diagnostics

The row diagnostics menu addresses issues specific to rows, or observations.

Description of Row Diagnostics Options
Plot Regression	Shows a Regression Plot report, displaying a scatterplot of the data and regression lines for each level of the categorical effect. Note: This option only appears if there is exactly one continuous effect and no more than one categorical effect in the model. In that case, the Regression Plot report is provided by default.
Plot Actual by Predicted	Shows an Actual by Predicted plot, which plots the observed values of Y against the predicted values of Y. This plot is the leverage plot for the whole model. See Leverage Plots.
Plot Effect Leverage	Shows a Leverage Plot report for each effect in the model. The plot shows how observations influence the test for that effect and gives insight on multicollinearity. See Leverage Plots. Note: Effect Leverage Plots are shown by default when Effect Leverage is selected as the Emphasis in the Fit Model launch window. They appear to the right of the Whole Model report. When another Emphasis is selected, the Effect Leverage Plots appear in the Effect Details report. In all cases, the option Regression Reports > Effect Details must be selected in order for Effect Leverage plots to display.
Plot Residual By Predicted	Shows a Residual by Predicted Plot report. The plot shows the residuals plotted against the predicted values of Y. You typically want to see the residual values scattered randomly about zero.
Plot Residual By Row	Shows a Residual by Row Plot report. The residual values are plotted against the row numbers. This plot can help you detect patterns that result from the row ordering of the observations.
Press	Shows a Press Report giving the Press statistic and its root mean square error (RMSE). The Press statistic is useful when comparing multiple models. Models with lower Press statistics are favored. (For details, see Press.)
Durbin-Watson Test	Shows the Durbin-Watson report, which gives a statistic to test whether the residuals have first-order autocorrelation. The report also displays the autocorrelation of the residuals. This option is appropriate only for time series data and assumes that your observations are in time order. Note: The single report option is Significance P Value. This option computes and displays Prob<DW, the exact probability associated with the statistic. The computation of this exact probability can be memory and time-intensive if there are many observations.
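The Durbin-Watson statistic described in the table can be sketched in a few lines. This is a minimal illustration using the standard textbook definition of the statistic, not JMP's implementation. The function name durbin_watson and the sample residual vectors are hypothetical, and the residuals are assumed to be in time order.

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals in time order.

    Standard definition: the sum of squared successive differences
    divided by the sum of squared residuals.
    """
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Hypothetical residuals that alternate in sign (negative autocorrelation)
dw_alternating = durbin_watson([1, -1, 1, -1])

# Hypothetical residuals that drift slowly (positive autocorrelation)
dw_drifting = durbin_watson([1, 2, 1, -1, -2, -1])

print(dw_alternating, dw_drifting)
```

Values near 2 suggest little first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.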

Leverage Plots

An effect leverage plot for X is useful in the following ways:

•	You can see which points might be exerting influence on the hypothesis test for X.

•	You can spot unusual patterns and violations of the model assumptions.

•	You can spot multicollinearity issues.

Construction

A leverage plot for an effect shows the impact of adding this effect to the model, given the other effects already in the model. For illustration, consider the construction of an effect leverage plot for a single continuous effect X. See X Axis Scaling for information about the scaling of the x-axis in more general situations.

The response Y is regressed on all the predictors except X, and the residuals are obtained. Call these residuals the Y-residuals. Then X is regressed on all the other predictors in the model and the residuals are computed. Call these residuals the X-residuals. The X-residuals represent the part of X that the other predictors do not explain, so they might account for variation that remains in the Y-residuals, which were obtained without X in the model.

The effect leverage plot for X is essentially a scatterplot of the Y-residuals (vertical axis) against the X-residuals (horizontal axis). See Whole Model and Effect Leverage Plots. To help interpretation and comparison with other plots that you might construct, JMP adds the mean of Y to the Y-residuals and the mean of X to the X-residuals. The translated Y-residuals are called the Y Leverage Residuals and the translated X-residuals are called the X Leverage values. The points on the Effect Leverage plots are these X Leverage and Y Leverage Residual pairs.

JMP fits a least squares line to these points and adds confidence bands for the mean. The line of fit is solid red and the confidence bands are shown as dashed red curves. The slope of the least squares line is precisely the estimate of the coefficient on X in the model where Y is regressed on X and the other predictors. The dashed horizontal blue line is set at the mean of the Y Leverage Residuals. This line describes a situation where the X-residuals are not linearly related to the Y-residuals. If the line of fit has nonzero slope, then adding X to the model can be useful in terms of explaining variation.
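The construction described above can be sketched with NumPy. This is a minimal illustration on synthetic data, not JMP's code. The helper names residuals and leverage_pairs and the variables x, z, and y are hypothetical. The sketch checks that the slope of the line through the leverage plot points matches the coefficient on X in the full model.

```python
import numpy as np

def residuals(target, design):
    """Residuals from a least squares regression of target on design."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef

def leverage_pairs(y, x, others):
    """Return (X Leverage, Y Leverage Residual) values for effect x.

    others is the design matrix of the remaining predictors,
    including the intercept column.
    """
    y_res = residuals(y, others)   # Y-residuals: Y on everything except X
    x_res = residuals(x, others)   # X-residuals: X on the other predictors
    return x_res + x.mean(), y_res + y.mean()

# Synthetic data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
n = 50
z = rng.normal(size=n)
x = 0.5 * z + rng.normal(size=n)
y = 2.0 + 1.5 * x - 0.7 * z + rng.normal(size=n)

others = np.column_stack([np.ones(n), z])
x_lev, y_lev = leverage_pairs(y, x, others)

# Slope of the least squares line through the leverage plot points
slope_plot = np.polyfit(x_lev, y_lev, 1)[0]

# Coefficient on x in the full model (Y on intercept, z, and x)
full = np.column_stack([others, x])
coef_full, *_ = np.linalg.lstsq(full, y, rcond=None)
slope_full = coef_full[-1]

print(slope_plot, slope_full)
```

The two printed slopes agree up to floating-point error, which is the property stated in the paragraph above.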

Illustration of a Generic Leverage Plot shows how residuals are depicted in the leverage plot. The distance from a point to the line of fit is the residual for a model that includes the effect. The distance from the point to the horizontal line is what the residual error would be without the effect in the model. In other words, the mean line in the leverage plot represents the model where the hypothesized value of the parameter (effect) is constrained to zero.
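The two distances can be checked numerically. The following is a sketch on synthetic data, with hypothetical variable names, that compares the signed vertical distances in the leverage plot with the residuals from the models fit with and without the effect.

```python
import numpy as np

def fit_residuals(target, design):
    """Residuals from a least squares regression of target on design."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef

# Synthetic data (hypothetical, for illustration only)
rng = np.random.default_rng(1)
n = 40
z = rng.normal(size=n)
x = rng.normal(size=n) + 0.3 * z
y = 1.0 + 0.8 * x + 0.5 * z + rng.normal(size=n)

others = np.column_stack([np.ones(n), z])   # model without the effect
full = np.column_stack([others, x])         # model with the effect

# Leverage plot coordinates
x_lev = fit_residuals(x, others) + x.mean()
y_lev = fit_residuals(y, others) + y.mean()

# Signed vertical distances to the line of fit and to the mean line
slope, intercept = np.polyfit(x_lev, y_lev, 1)
dist_to_line = y_lev - (slope * x_lev + intercept)
dist_to_mean = y_lev - y_lev.mean()

# Residuals from the two ordinary regressions
resid_full = fit_residuals(y, full)
resid_reduced = fit_residuals(y, others)

print(np.allclose(dist_to_line, resid_full))
print(np.allclose(dist_to_mean, resid_reduced))
```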

Illustration of a Generic Leverage Plot

Confidence Curves

Confidence curves for the line of fit are shown on leverage plots. These curves provide a visual indication of whether the test of interest is significant at the 5% level (or at the Set Alpha Level that you specified in the Fit Model launch window). If the confidence region between the curves contains the horizontal line representing the hypothesis, then the effect is not significant. If the curves cross the line, the effect is significant. See the examples in Comparison of Significance Shown in Leverage Plots.

Comparison of Significance Shown in Leverage Plots

X Axis Scaling

If the modeling type of a predictor X is continuous, then the x-axis is scaled in terms of the units of X. The x-axis range mirrors the range of X values. The slope of the line of fit in the leverage plot is the parameter estimate for X. See the Leverage Plot for height in Whole Model and Effect Leverage Plots.

If the effect is nominal or ordinal, or if the effect is a complex effect such as an interaction, then the x-axis cannot represent the values of the effect directly. In this case the x-axis is scaled in units of the response, and the line of fit is a diagonal with a slope of 1. The Whole Model leverage plot, where the hypothesis of interest is that all parameter values are zero, uses this scaling. (See Leverage Plot Details.) For this plot, the x-axis is scaled in terms of predicted response values for the whole model, as illustrated by the Whole Model plot in Whole Model and Effect Leverage Plots.

The leverage plot for the linear effect in a simple regression is the same as the traditional plot of actual response values against the predictor.
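A quick check of this statement, as a sketch on synthetic data with hypothetical variable names: when the intercept is the only other term in the model, regressing on it simply removes the mean, so adding the mean back recovers the original values.

```python
import numpy as np

def fit_residuals(target, design):
    """Residuals from a least squares regression of target on design."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef

# Synthetic simple regression data (hypothetical)
rng = np.random.default_rng(4)
n = 25
x = rng.uniform(50, 70, size=n)
y = 10.0 + 1.2 * x + rng.normal(size=n)

# The only "other predictor" is the intercept
intercept_only = np.ones((n, 1))

x_lev = fit_residuals(x, intercept_only) + x.mean()
y_lev = fit_residuals(y, intercept_only) + y.mean()

print(np.allclose(x_lev, x), np.allclose(y_lev, y))
```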

Leverage

The term leverage is used because these plots help you visualize the influence of points on the test for including the effect in the model. A point that is horizontally distant from the center of the plot exerts more influence on the effect test than does a point that is close to the center. Recall that the test for an effect compares the sum of squared residuals for the full model to the sum of squared residuals for the model with that effect removed. At the extremes, the differences of the residuals before and after being constrained by the hypothesis tend to be comparatively larger. Therefore, these residuals tend to have larger contributions to the sums of squares for that effect’s hypothesis test.
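The argument above can be made concrete for a single continuous effect. The sketch below uses synthetic data and hypothetical variable names. It splits the effect sum of squares into per-point contributions and confirms that the point farthest from the horizontal center contributes the most.

```python
import numpy as np

def fit_residuals(target, design):
    """Residuals from a least squares regression of target on design."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef

# Synthetic data (hypothetical, for illustration only)
rng = np.random.default_rng(2)
n = 30
z = rng.normal(size=n)
x = rng.normal(size=n)
y = 1.0 + 0.6 * x + 0.4 * z + rng.normal(size=n)

others = np.column_stack([np.ones(n), z])
x_res = fit_residuals(x, others)
y_res = fit_residuals(y, others)       # residuals with the effect removed

slope = (x_res @ y_res) / (x_res @ x_res)
resid_full = y_res - slope * x_res     # residuals with the effect included

# Change in each residual when the hypothesis constrains the effect to zero
change = slope * x_res                 # same as y_res - resid_full

# Effect sum of squares: reduced-model SSE minus full-model SSE
ss_effect = np.sum(y_res ** 2) - np.sum(resid_full ** 2)
share = change ** 2 / ss_effect        # each point's share of the effect SS

most_influential = int(np.argmax(share))
farthest_from_center = int(np.argmax(np.abs(x_res)))
print(most_influential, farthest_from_center)
```

Because each change is proportional to the X-residual, a point's share of the effect sum of squares grows with the square of its horizontal distance from the center of the plot.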

Multicollinearity

Multicollinearity is a condition where two or more predictors are highly related, or more technically, involved in a nearly linearly dependent relationship. When multicollinearity is present, standard errors can be inflated and parameter estimates can be unstable. If an effect is collinear with other predictors, the y-axis values are very close to the horizontal line at the mean, because the effect brings no new information. Because of the dependency, the x-axis values also tend to cluster toward the middle of the plot. This situation indicates that the slope of the line of fit is unstable.
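The clustering of x-axis values can be illustrated with a small simulation. This is a sketch with hypothetical names and synthetic data. It compares the spread of the X-residuals for a predictor that is unrelated to the other predictor with the spread for one that is nearly a copy of it. The ratio computed here coincides with the quantity usually called the variance inflation factor. That term does not appear in the text above and is mentioned only for orientation.

```python
import numpy as np

def x_residual_summary(x, others):
    """Spread of the X-residuals and the resulting inflation ratio."""
    coef, *_ = np.linalg.lstsq(others, x, rcond=None)
    x_res = x - others @ coef
    # Ratio of total variation in x to the variation left after
    # removing what the other predictors explain
    inflation = np.sum((x - x.mean()) ** 2) / np.sum(x_res ** 2)
    return x_res.std(), inflation

# Synthetic data (hypothetical, for illustration only)
rng = np.random.default_rng(3)
n = 200
z = rng.normal(size=n)
others = np.column_stack([np.ones(n), z])

x_independent = rng.normal(size=n)            # unrelated to z
x_collinear = z + 0.05 * rng.normal(size=n)   # nearly a copy of z

spread_ind, inflation_ind = x_residual_summary(x_independent, others)
spread_col, inflation_col = x_residual_summary(x_collinear, others)

print(spread_ind, inflation_ind)
print(spread_col, inflation_col)
```

The collinear predictor leaves X-residuals with very little spread, so the leverage plot points bunch near the center and the slope of the line of fit is poorly determined.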

The Whole Model Actual by Predicted Plot

The Plot Effect Leverage option produces a leverage plot for each effect in the model. In addition, the Actual by Predicted plot can be considered to be a leverage plot. This plot lets you visualize the test that all the parameters in the model (except the intercept) are zero. The same test is conducted analytically in the Analysis of Variance report. (See Leverage Plot Details for details about this plot.)

Example of a Leverage Plot for a Linear Effect

1.	Select Help > Sample Data Library and open Big Class.jmp.

2.	Select Analyze > Fit Model.

3.	Select weight and click Y.

4.	Select height, age, and sex, and click Add.

5.	Click Run.

The Whole Model Actual by Predicted Plot and the effect Leverage Plot for height are shown in Whole Model and Effect Leverage Plots. The Whole Model plot, on the left, tests all effects jointly. You can infer that the model is significant because the confidence curves cross the horizontal line at the mean of the response, weight. The Leverage Plot for height, on the right, also shows that height is significant, even with age and sex in the model. Neither plot suggests concerns about influential points or multicollinearity.

Whole Model and Effect Leverage Plots

Press

The Press, or prediction error sum of squares, statistic is an estimate of prediction error computed using leave-one-out cross validation. In leave-one-out cross validation, each observation, in turn, is removed. Consider a specific observation. The model is fit with that observation withheld and then a predicted value is obtained for that observation. The residual for that observation is computed. This procedure is applied to all observations and the residuals are squared and summed to give the Press value.
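The leave-one-out procedure can be sketched directly. This is a minimal illustration on synthetic data, not JMP's implementation, and the function names are hypothetical. The second function uses a standard identity for linear least squares, where each leave-one-out residual equals the ordinary residual divided by one minus the corresponding diagonal element of the hat matrix. The root mean square form at the end assumes that Press RMSE is the square root of Press divided by n.

```python
import numpy as np

def press_leave_one_out(X, y):
    """Press computed by refitting the model n times."""
    n = len(y)
    total = 0.0
    for i in range(n):
        keep = np.arange(n) != i                 # withhold observation i
        coef, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        total += (y[i] - X[i] @ coef) ** 2       # squared prediction residual
    return total

def press_shortcut(X, y):
    """Press from a single fit, using the hat matrix diagonal."""
    Q, _ = np.linalg.qr(X)                       # reduced QR of the design
    h = np.sum(Q ** 2, axis=1)                   # diagonal of the hat matrix
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ coef                             # ordinary residuals
    return np.sum((e / (1.0 - h)) ** 2)

# Synthetic data (hypothetical, for illustration only)
rng = np.random.default_rng(5)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

press_loop = press_leave_one_out(X, y)
press_fast = press_shortcut(X, y)
press_rmse = np.sqrt(press_loop / n)             # assumed definition

print(press_loop, press_fast, press_rmse)
```

The explicit loop follows the description above step by step. The shortcut gives the same value without refitting and assumes the design matrix has full column rank.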

Specifically, the Press statistic is given by

$$\mathrm{Press} = \sum_{i=1}^{n} \left( y_i - \hat{y}_{(i)} \right)^2$$

where $n$ is the number of observations, $y_i$ is the observed response value for the ith observation, and $\hat{y}_{(i)}$ is the predicted response value for the ith observation. These values are based on a model fit without including that observation.

The Press RMSE is defined as

$$\mathrm{Press\ RMSE} = \sqrt{\mathrm{Press} / n}$$