Overview of the Distance Matrix Platform

The Distance Matrix process computes various measures of distance or dissimilarity between the observations (rows) of a JMP table. These proximity measures are stored as a square matrix in an output data set, which can then be used as input for downstream processes. The resulting distance matrices represent the dissimilarity between samples and are commonly interpreted as measures of beta diversity.

The input data set should contain only numeric variables, as most proximity measures require numeric input. For binary distance measures, such as Jaccard, Simple Matching, or Hamming, code the variables with 0s and 1s to indicate the absence (0) or presence (1) of each feature. When the table contains count data, these measures can still be applied by treating all nonzero values as 1s and zeros as 0s. Similarly, if the data table contains character data representing features, create a numeric table with 0s and 1s to indicate presence or absence before using these binary distance measures.
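The presence/absence coding described above can be illustrated outside JMP. The following is a minimal Python sketch, not the platform's implementation: the helper names `binarize` and `jaccard_distance` and the sample counts are hypothetical, the Jaccard distance is taken in its common textbook form (1 minus the shared features divided by the features present in either row), and the handling of two all-zero rows is an assumed convention.

```python
def binarize(rows):
    """Map every nonzero value to 1 and every zero to 0."""
    return [[1 if value != 0 else 0 for value in row] for row in rows]


def jaccard_distance(a, b):
    """Jaccard distance between two 0/1 vectors of equal length."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    # Assumed convention: two rows with no features present have distance 0.
    return 0.0 if either == 0 else 1.0 - both / either


# Hypothetical count data: rows are observations, columns are features.
counts = [
    [5, 0, 2, 0],
    [1, 3, 0, 0],
    [0, 0, 7, 4],
]

binary = binarize(counts)
d01 = jaccard_distance(binary[0], binary[1])  # one shared feature out of three
d12 = jaccard_distance(binary[1], binary[2])  # no shared features
```

Rows 1 and 2 share no features after binarization, so their distance is the maximum value of 1, regardless of how large the original counts were.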

The number of rows and columns in the output matrix equals the number of observations in the input data set. If there are By groups, an output matrix is computed for each group, and the size is determined by the maximum number of observations in any By group.
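To make the sizing rule concrete, here is a small Python sketch that builds one square distance matrix per By group. It is an illustration under stated assumptions rather than the platform's code: Euclidean distance stands in for whichever measure is selected, and the `table` layout, the group labels, and the `distance_matrix` helper are hypothetical.

```python
import math
from collections import defaultdict


def distance_matrix(rows):
    """Return the square matrix of pairwise Euclidean distances."""
    return [[math.dist(a, b) for b in rows] for a in rows]


# Hypothetical input: (By group label, numeric values) for each observation.
table = [
    ("A", [0.0, 0.0]),
    ("A", [3.0, 4.0]),
    ("B", [1.0, 1.0]),
    ("B", [1.0, 2.0]),
    ("B", [4.0, 5.0]),
]

# Collect the observations that belong to each By group.
groups = defaultdict(list)
for key, values in table:
    groups[key].append(values)

# One n-by-n matrix per group, where n is that group's observation count.
matrices = {key: distance_matrix(rows) for key, rows in groups.items()}

# The largest group sets the size needed to hold every group's matrix.
max_size = max(len(matrix) for matrix in matrices.values())
```

Group A yields a 2-by-2 matrix and group B a 3-by-3 matrix, so `max_size` is 3. How the platform arranges the smaller group within the shared output is not covered by this sketch.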
