Principal Component Scoring

The Principal Component Scoring process projects the data from a secondary data set onto the principal component axes from a primary one. This is useful for aligning data from two different experiments or for preparing data for predictive modeling.

The algorithm works as follows: Row and column principal component scores are computed from the primary data using the singular value decomposition (SVD). The row scores are added to the primary data as output. The projected row scores of the secondary data are then computed as dot products of each secondary row with the column scores, and these are added to the secondary data as output. A combined output data set is also produced for purposes of comparison, for example with plotting and clustering.
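The projection described above can be sketched in a few lines of NumPy. This is an illustration only, not the code the process runs: the function name `pc_scores` is hypothetical, the column scores are assumed to be the right singular vectors of the primary matrix (the exact scaling used by the software is not stated here), and both matrices are assumed to be already centered or scaled.

```python
import numpy as np

def pc_scores(primary, secondary, n_components=3):
    """Return (primary_scores, secondary_scores) on the primary PC axes."""
    # Thin SVD of the primary matrix: primary = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(primary, full_matrices=False)
    # Assumed column scores: the leading right singular vectors (variables x components)
    V = Vt[:n_components].T
    # Row scores of the primary data; algebraically equal to primary @ V
    primary_scores = U[:, :n_components] * s[:n_components]
    # Projected row scores: dot product of each secondary row with each column of V
    secondary_scores = secondary @ V
    return primary_scores, secondary_scores

# Synthetic example data: rows are observations, columns are the shared variables
rng = np.random.default_rng(0)
primary = rng.normal(size=(20, 5))
secondary = rng.normal(size=(8, 5))

p_scores, s_scores = pc_scores(primary, secondary)
# Combined scores, secondary rows appended to primary rows
combined = np.vstack([p_scores, s_scores])
```

Because the secondary scores are expressed on the primary axes, the two sets of rows can be plotted or clustered together in the same coordinate system.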

What do I need?

Two Input Data Sets, a primary input data set and a secondary input data set, are required to run this process. These data sets should be in wide form and centered or scaled as desired. They must have exactly the same set of continuous variables.
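Before running the process, it can help to confirm that the two data sets share the same variables. The following pandas sketch is one way to do that outside the software; the helper name `align_columns` is hypothetical, and numeric columns are treated as the continuous variables, which is an assumption.

```python
import pandas as pd

def align_columns(primary: pd.DataFrame, secondary: pd.DataFrame) -> pd.DataFrame:
    """Check that both frames share the same numeric columns and
    return the secondary frame with columns in the primary's order."""
    p_cols = set(primary.select_dtypes("number").columns)
    s_cols = set(secondary.select_dtypes("number").columns)
    if p_cols != s_cols:
        # Report the variables that appear in only one of the two data sets
        raise ValueError(f"variable mismatch: {sorted(p_cols ^ s_cols)}")
    return secondary[[c for c in primary.columns if c in p_cols]]

# Small example: same variables, different column order
primary_df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0]})
secondary_df = pd.DataFrame({"b": [5.0], "a": [6.0]})
aligned = align_columns(primary_df, secondary_df)
```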

For detailed information about the files and data sets used or created by JMP Life Sciences software, see Files and Data Sets.

Output/Results

Running this process generates a combined output data set: the secondary output data set appended to the primary output data set, with three added columns listing the principal component row scores.

Both a 2-D Scatterplot Matrix and a Three-Dimensional Scatterplot of the PCA row scores are generated using the data from the output data set.