Fisher's Iris data set is the classic example of discriminant analysis. Four measurements are taken from a sample consisting of three different species. The goal is to identify the species accurately using the values of the four measurements. Open Iris.jmp, and select Analyze > Multivariate Methods > Discriminant to launch the Discriminant Analysis platform. The launch dialog in Discriminant Launch Dialog appears.
If you want to find which variables discriminate well, click the checkbox for Stepwise Variable Selection. Otherwise, the platform uses all the variables you specify. In this example, specify the four continuous variables as Y, Covariates and Species as X, Categories.
Regularized Discriminant Analysis is a compromise between the linear and quadratic methods, governed by two parameters. When you choose Regularized Discriminant Analysis, a dialog appears in which you specify these two parameters.
The first parameter (Lambda, Shrinkage to Common Covariance) specifies how to mix the individual and group covariance matrices. For this parameter, 1 corresponds to Linear Discriminant Analysis and 0 corresponds to Quadratic Discriminant Analysis.
The second parameter (Gamma, Shrinkage to Diagonal) specifies how much to deflate the off-diagonal elements, that is, the covariances across variables. If you choose 1, then the covariance matrix is forced to be diagonal.
Therefore, assigning 0 to both parameters is identical to requesting quadratic discriminant analysis. Similarly, assigning Lambda = 1 and Gamma = 0 requests linear discriminant analysis. These cases, along with a Regularized Discriminant Analysis example with Lambda = 0.4 and Gamma = 0.4, are shown in Linear, Quadratic, and Regularized Discriminant Analysis.
Use Regularized Discriminant Analysis to help decide on the regularization.
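The effect of the two parameters can be sketched in a few lines of Python. This is an illustration only: the mixing formulas below are an assumption chosen to match the behavior described above (Lambda = 1 gives the common covariance, Gamma = 1 gives a diagonal matrix) and are not taken from the platform itself. The function name and the 2 x 2 matrices are hypothetical.

```python
def regularized_covariance(group_cov, pooled_cov, lam, gamma):
    """Mix a group covariance with the pooled one, then shrink toward the diagonal.

    Assumed formulas (an illustration, not the platform's documented computation):
      mixed  = (1 - lam) * group_cov + lam * pooled_cov
      result = mixed, with every off-diagonal entry multiplied by (1 - gamma)
    """
    size = len(group_cov)
    result = []
    for i in range(size):
        row = []
        for j in range(size):
            mixed = (1.0 - lam) * group_cov[i][j] + lam * pooled_cov[i][j]
            if i != j:
                # Gamma deflates only the covariances, never the variances.
                mixed *= (1.0 - gamma)
            row.append(mixed)
        result.append(row)
    return result


# Made-up 2 x 2 matrices, for illustration only.
group = [[4.0, 2.0], [2.0, 3.0]]
pooled = [[2.0, 1.0], [1.0, 2.0]]

quadratic = regularized_covariance(group, pooled, lam=0.0, gamma=0.0)  # group matrix kept
linear = regularized_covariance(group, pooled, lam=1.0, gamma=0.0)     # pooled matrix used
diagonal = regularized_covariance(group, pooled, lam=0.0, gamma=1.0)   # covariances zeroed
blended = regularized_covariance(group, pooled, lam=0.4, gamma=0.4)    # in-between case
```

Under these assumptions, the first two calls reproduce the quadratic and linear cases, and intermediate values such as 0.4 and 0.4 give a matrix between the extremes.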
If you choose Stepwise Variable Selection, a dialog appears (Stepwise Control Panel) to select variables. You can review which columns have large F ratios or small p-values and control which columns are entered into the discriminant model. In addition, the dialog displays how many columns are currently in and out of the model, and the largest and smallest p-values to be entered or removed.
Entered checkboxes show which columns are currently in the model. You can manually click columns in or out of the model.
Lock checkboxes are used when you want to force a column to stay in its current state regardless of any stepping by the buttons.
Step Forward adds the most significant column not already in the model.
Step Backward removes the least significant column already in the model.
Enter All enters all the columns into the model.
Remove All removes all the columns from the model.
Apply This Model is used when you have finished deciding which columns to include in the analysis and want to proceed to estimation and scoring.
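The interplay of the Entered and Lock checkboxes with the stepping buttons can be pictured with a small Python sketch. It is a toy model, not the platform's implementation: the class name, the `significance` callback, and the column names are hypothetical, and it assumes that Enter All and Remove All also leave locked columns untouched.

```python
class StepwisePanel:
    """Toy model of the Entered and Lock checkboxes and the stepping buttons."""

    def __init__(self, columns, significance):
        # significance(column, entered_columns) -> p-value (hypothetical callback)
        self.columns = list(columns)
        self.significance = significance
        self.entered = set()
        self.locked = set()

    def _free(self, want_entered):
        # Unlocked columns that are currently in (or out of) the model.
        return [c for c in self.columns
                if c not in self.locked and (c in self.entered) == want_entered]

    def step_forward(self):
        # Add the most significant (smallest p-value) unlocked column not in the model.
        candidates = self._free(want_entered=False)
        if not candidates:
            return None
        best = min(candidates, key=lambda c: self.significance(c, self.entered))
        self.entered.add(best)
        return best

    def step_backward(self):
        # Remove the least significant (largest p-value) unlocked column in the model.
        candidates = self._free(want_entered=True)
        if not candidates:
            return None
        worst = max(candidates, key=lambda c: self.significance(c, self.entered))
        self.entered.discard(worst)
        return worst

    def enter_all(self):
        self.entered.update(self._free(want_entered=False))

    def remove_all(self):
        self.entered.difference_update(self._free(want_entered=True))


# Made-up p-values that do not change between steps, for illustration only.
made_up_p = {"col_a": 0.30, "col_b": 0.02, "col_c": 0.001, "col_d": 0.01}
panel = StepwisePanel(made_up_p, lambda col, entered: made_up_p[col])
panel.locked.add("col_a")            # col_a stays out of the model
first_added = panel.step_forward()   # picks the smallest p-value among unlocked columns
panel.enter_all()                    # enters the remaining unlocked columns
```

In a real stepwise run the significance of each column is recomputed after every step, which is why the sketch takes a callback rather than a fixed table.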
Stepped Model shows three forward steps, which add all the columns to the model except Sepal length.
Click Apply This Model to estimate the model. After estimation and scoring are done, two reports are produced: a Canonical Plot (Canonical Plot), and a Scoring Report.
• Each row in the data set is a point, controlled by the Canonical Options > Show Points option.

• The directions of the variables in the canonical space are shown by labeled rays emanating from the grand mean. This is controlled by the Canonical Options > Show Biplot Rays option. You can drag the center of the biplot rays to other places in the graph.

• The option Show Normal 50% Contours shows areas that contain roughly 50% of the points for that group if the assumptions are correct. Under linear discriminant analysis, they are all the same size and shape.

To color-code the points like the centroid circles, use the Color Points option or button. This is equivalent to Rows > Color or Mark by Column, coloring by the classification column.
The canonical plot can also be referred to as a biplot when both the points and the variable direction rays are shown together, as in Canonical Plot. It is identical to the Centroid plot produced in the Manova personality of the Fit Model platform.
In Show Interesting Rows Only, the Show Interesting Rows Only option is set so that only rows that are misclassified or that have fitted probabilities between 0.05 and 0.95 are shown.
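The filtering rule can be expressed as a short Python sketch. The function name and the sample rows are hypothetical, and two details are assumptions: the probability tested is the fitted probability of the row's actual category, and the 0.05 and 0.95 bounds are treated as inclusive.

```python
def is_interesting(actual, predicted, prob_of_actual, low=0.05, high=0.95):
    """Return True for rows that are misclassified or have an uncertain fit."""
    misclassified = actual != predicted
    uncertain = low <= prob_of_actual <= high
    return misclassified or uncertain


# Made-up rows: (actual species, predicted species, fitted probability of actual).
rows = [
    ("setosa", "setosa", 0.999),         # confident and correct: hidden
    ("versicolor", "virginica", 0.20),   # misclassified: shown
    ("virginica", "virginica", 0.70),    # correct but uncertain: shown
    ("versicolor", "virginica", 0.03),   # misclassified, outside the range: still shown
]

shown = [row for row in rows if is_interesting(*row)]
```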