Publication date: 04/12/2021

The Latent Semantic Analysis option in the Text Explorer platform produces two SVD plots and a table of the singular values from the singular value decomposition.

The first plot contains a point for each document. For a given document, the plotted point is defined by the document’s values in the first two singular vectors (the first two columns of the U matrix), multiplied by the diagonal matrix of singular values (S). This plot is equivalent to the Score Plot in the Principal Components platform. Because each point represents a document (a row of the data table), selecting points in this plot selects the corresponding rows in the data table.
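As a rough illustration of the document coordinates described above, the following NumPy sketch computes them for a small, made-up document term matrix. The matrix values and the variable names (dtm, doc_coords) are hypothetical, and the sketch assumes no centering, scaling, or weighting of the matrix before the decomposition, which might differ from the options used in Text Explorer.

```python
import numpy as np

# Hypothetical 4-document x 5-term count matrix (rows = documents, columns = terms).
dtm = np.array([
    [2, 1, 0, 0, 0],
    [1, 2, 0, 1, 0],
    [0, 0, 3, 1, 1],
    [0, 1, 1, 2, 2],
], dtype=float)

# Thin SVD: dtm = U @ diag(s) @ Vt, with singular values in descending order.
U, s, Vt = np.linalg.svd(dtm, full_matrices=False)

# Document coordinates: the first two columns of U, each scaled by its singular value.
# Broadcasting multiplies column j of U[:, :2] by s[j], which equals U[:, :2] @ diag(s[:2]).
doc_coords = U[:, :2] * s[:2]

print(doc_coords)  # one (x, y) pair per document
```

Each row of doc_coords is the point for one document. Under these assumptions, the same coordinates are obtained by projecting the matrix onto the first two right singular vectors (dtm @ Vt[:2].T), which is why the plot behaves like a principal components score plot.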

The second plot contains a point for each term. For a given term, the plotted point is defined by the term’s values in the first two singular vectors (the first two rows of the V′ matrix), multiplied by the diagonal matrix of singular values (S). This plot is equivalent to the Loadings Plot in the Principal Components platform. In this plot, the points correspond to rows in the Term List table.
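The term coordinates can be sketched in the same way. The example below uses a small, made-up document term matrix; the values and the names (dtm, term_coords) are hypothetical, and the matrix is assumed to be decomposed without any centering, scaling, or weighting, which might not match the Text Explorer settings.

```python
import numpy as np

# Hypothetical 4-document x 5-term count matrix (rows = documents, columns = terms).
dtm = np.array([
    [2, 1, 0, 0, 0],
    [1, 2, 0, 1, 0],
    [0, 0, 3, 1, 1],
    [0, 1, 1, 2, 2],
], dtype=float)

# Thin SVD: dtm = U @ diag(s) @ Vt. The rows of Vt are the right singular vectors.
U, s, Vt = np.linalg.svd(dtm, full_matrices=False)

# Term coordinates: the first two rows of Vt, transposed so that each term is a row,
# with each of the two columns scaled by its singular value.
term_coords = Vt[:2].T * s[:2]

print(term_coords)  # one (x, y) pair per term
```

Each row of term_coords is the point for one term, in the same order as the columns of the matrix.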

Above each of the SVD plots, you can click a Show Text button to open a window that contains the text for the points that are selected in the plot.

Below the document and term SVD plots, a table of the singular values appears. These are the diagonal entries of the S matrix in the singular value decomposition of the document term matrix (DTM). The Singular Values table also contains a column of corresponding eigenvalues for the equivalent principal components analysis. As in the Principal Components platform, there are columns for the percent and cumulative percent of variation explained by each eigenvalue (or singular value). You can use the Cum Percent column to decide what percent of the variance in the DTM you want to preserve, and then use the corresponding number of singular vectors.
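The columns of such a table, and the use of the cumulative percent to choose a number of singular vectors, can be sketched as follows. This is only an illustration under stated assumptions: the matrix values, the 80% target, and the variable names are hypothetical, and the eigenvalues are assumed to be the squared singular values divided by n - 1, a scaling that might differ from the one used in the Singular Values table. The percent columns do not depend on that scaling constant.

```python
import numpy as np

# Hypothetical 4-document x 5-term count matrix (rows = documents, columns = terms).
dtm = np.array([
    [2, 1, 0, 0, 0],
    [1, 2, 0, 1, 0],
    [0, 0, 3, 1, 1],
    [0, 1, 1, 2, 2],
], dtype=float)

# Singular values only, in descending order.
s = np.linalg.svd(dtm, compute_uv=False)

# Assumed eigenvalue scaling: s^2 / (n - 1), where n is the number of documents.
n_docs = dtm.shape[0]
eigenvalues = s**2 / (n_docs - 1)

# Percent and cumulative percent of variation explained by each eigenvalue.
percent = 100 * eigenvalues / eigenvalues.sum()
cum_percent = np.cumsum(percent)

# Smallest number of singular vectors whose cumulative percent reaches the target.
target = 80.0  # hypothetical threshold
k = min(int(np.searchsorted(cum_percent, target)) + 1, len(s))

for i in range(len(s)):
    print(f"{i + 1}: s={s[i]:.4f}  eig={eigenvalues[i]:.4f}  "
          f"pct={percent[i]:.2f}  cum={cum_percent[i]:.2f}")
print("singular vectors to keep:", k)
```

Here k is the number of singular vectors to retain so that at least the target percent of the variation is preserved.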
