Using Multivariate Methods to Explore Data 

Presenter: Laura Higgins

Understanding and Using Multivariate Methods (PCA, Clustering and K-Means)

See how to:    

  • Determine common variation and meaningful groups of variables
  • Condense records that have a large number of variables into a smaller number of new variables based on common characteristics
  • Interpret correlations between variable pairs using Scatterplot Matrices, Color Maps on Correlations and Parallel Coordinate Plots
  • Use Local Data Filter to examine correlations for different categories
  • Standardize variables for analysis by putting them on the same scale
  • Find shared and uncorrelated variation among variables
  • Interpret Eigenvalues to determine how many components to examine
  • Interpret Eigenvectors, Partial Contribution of Variables, and Bartlett Tests to determine what new components mean
  • Save Principal Component values to data table for further exploration and analysis
  • Use K-Means Clustering to group observations that share similar values across a number of continuous variables
  • Interpret K-Means Biplots, Cluster Summaries, Cluster Means and Parallel Coordinate Plots
  • Color points by cluster
  • Save cluster formulas to data table so clusters can be updated when new data is added
  • Determine when to use Johnson Transformation to mitigate skewness
  • Use Hierarchical Clustering to group observations that share similar values across a number of categorical or continuous variables
  • Interpret Hierarchical Cluster Dendrograms, Cluster Summaries, Cluster Means and Constellation Plots
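
For readers who want to see the arithmetic behind three of the steps listed above (standardizing variables, reading the first eigenvalue of the correlation matrix, and grouping rows with K-Means), here is a minimal Python sketch. It is not the JMP workflow shown in the session. The function names and the six-row data set are hypothetical and made up for illustration, and the eigenvalue is found by simple power iteration rather than a full decomposition.

```python
import math
import random

def standardize(rows):
    """Put every column on the same scale: mean 0, standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def correlation_matrix(z):
    """Correlation matrix of already-standardized rows."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(corr, iters=200):
    """Largest eigenvalue and its eigenvector via power iteration.

    Assumes the start vector (all equal weights) is not orthogonal to the
    leading eigenvector, which holds for positively correlated variables.
    """
    p = len(corr)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v' C v gives the eigenvalue for the unit vector v.
    value = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p))
                for i in range(p))
    return value, v

def assign(points, centers):
    """Index of the nearest center for each point (Euclidean distance)."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]

def kmeans(points, k, iters=50, seed=0):
    """Basic K-Means: alternate assignment and center updates."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k randomly chosen rows
    for _ in range(iters):
        labels = assign(points, centers)
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(points, centers), centers

# Hypothetical data: six observations on three continuous variables.
data = [
    [1.0, 2.0, 1.5], [1.2, 1.9, 1.4], [0.9, 2.2, 1.6],
    [5.0, 6.1, 5.5], [5.2, 5.9, 5.7], [4.8, 6.0, 5.4],
]

z = standardize(data)
corr = correlation_matrix(z)
eigenvalue, eigenvector = first_component(corr)
labels, centers = kmeans(z, k=2)

# With standardized variables the eigenvalues sum to the number of
# variables, so eigenvalue / len(corr) is the share of variation explained.
print("first eigenvalue:", round(eigenvalue, 3))
print("share of variation:", round(eigenvalue / len(corr), 3))
print("cluster labels:", labels)
```

Because the variables are standardized first, the eigenvalues of the correlation matrix sum to the number of variables, which is why an eigenvalue can be read as "how many variables' worth" of variation a component carries. The K-Means step runs on the same standardized values so that no single variable dominates the distance calculation.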
