Launch the Principal Components platform by selecting Analyze > Multivariate Methods > Principal Components. Principal component analysis is also available in the Multivariate and Scatterplot 3D platforms.
The example described in Example of Principal Component Analysis uses all of the continuous variables from the Solubility.jmp sample data table.
Lists different methods for calculating the correlations. Several of these methods address the treatment of missing data. See Estimation Methods.
The Default option uses either the Row-wise, Pairwise, or REML methods:
• Row-wise is used for data tables with no missing values.
• Pairwise is used in the following circumstances:
• REML is used otherwise.
Robust estimation is useful for data tables that might have outliers. For statistical details, see Robust.
The Wide method is useful when your data contain a very large number of columns. It uses a computationally efficient algorithm, based on the singular value decomposition, that avoids calculating the covariance matrix. For additional background, see Wide Linear Methods and the Singular Value Decomposition in Statistical Details.
• n = number of rows
• p = number of variables
• X = n by p matrix of data values
The number of nonzero eigenvalues, and consequently the number of principal components, equals the rank of the correlation matrix of X. The number of nonzero eigenvalues cannot exceed the smaller of n and p.
When you select the Wide method, the data are standardized. To standardize a value, subtract the mean of its column and divide by the standard deviation of its column. Denote the n by p matrix of standardized data values by Xs. Then the covariance matrix of the standardized data is the correlation matrix of X, and it is given as follows:
Xs’Xs / (n − 1)
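The standardization step can be sketched in a few lines of Python. This is only an illustration, not JMP's implementation: the function names `standardize` and `correlation_matrix` and the sample data are hypothetical, and the sketch assumes the sample standard deviation (divisor n − 1), which is what makes Xs’Xs / (n − 1) equal to the correlation matrix.

```python
from statistics import mean, stdev

def standardize(columns):
    """Center each column at its mean and scale by its sample standard deviation."""
    standardized = []
    for col in columns:
        m, s = mean(col), stdev(col)  # stdev uses the n - 1 divisor
        standardized.append([(v - m) / s for v in col])
    return standardized

def correlation_matrix(columns):
    """Return Xs'Xs / (n - 1), where Xs holds the standardized columns."""
    xs = standardize(columns)
    n = len(columns[0])
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in xs]
        for ci in xs
    ]

# Hypothetical data: three variables (stored as columns) measured on five rows.
data = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.1, 3.9, 6.2, 8.1, 9.8],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
R = correlation_matrix(data)
```

Each diagonal entry of `R` is 1 (up to rounding), because every standardized column has unit sample variance.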
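Using the singular value decomposition, Xs is written as U Diag(Λ) V’. This representation is used to obtain the eigenvectors and eigenvalues of Xs’Xs: the columns of V are the eigenvectors, and the squared singular values in Λ are the eigenvalues. The principal components, or scores, are given by XsV.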
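The following NumPy sketch illustrates the idea of obtaining principal components from the singular value decomposition without forming the p by p correlation matrix. It is an assumption-laden illustration, not the algorithm JMP runs: the function name `wide_pca`, the random test data, and the 1e-10 tolerance for "nonzero" are all hypothetical choices.

```python
import numpy as np

def wide_pca(X):
    """PCA on correlations via the SVD of the standardized data matrix."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Standardize each column using the sample standard deviation (ddof=1).
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Thin SVD: Xs = U @ diag(sing) @ Vt, with at most min(n, p) singular values.
    U, sing, Vt = np.linalg.svd(Xs, full_matrices=False)
    # Eigenvalues of the correlation matrix Xs'Xs / (n - 1).
    eigenvalues = sing**2 / (n - 1)
    eigenvectors = Vt.T            # columns are the eigenvectors (V)
    scores = Xs @ eigenvectors     # same as U * sing
    return eigenvalues, eigenvectors, scores

# Hypothetical "wide" data: far more variables (p = 40) than rows (n = 6).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 40))
eigenvalues, eigenvectors, scores = wide_pca(X)

# Count eigenvalues that are nonzero up to an (assumed) numerical tolerance.
n_components = int(np.sum(eigenvalues > 1e-10))
```

Here the largest array that is decomposed is the 6 by 40 data matrix, never the 40 by 40 correlation matrix, and `n_components` cannot exceed the smaller of n and p.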
Note: When you select the Default estimation method and enter more than 500 variables as Y, Columns, a JMP Alert recommends that you switch to the Wide estimation method. This is because computation time can be considerable when you use the other methods with a large number of columns. Click Wide to switch to the Wide method. Click Continue to use the method you originally selected.