This section provides the formulas used to calculate distances for the Method that you select in the launch window. For a description of the methods, see Method for Distance Calculation.

The formulas use the following notation, where lowercase symbols generally pertain to observations and uppercase symbols to clusters:

n is the number of observations

v is the number of variables

$x_i$ is the $i$th observation

$C_K$ is the $K$th cluster, a subset of $\{1, 2, \ldots, n\}$

$N_K$ is the number of observations in $C_K$

$\bar{x}$ is the sample mean vector

$\bar{x}_K$ is the mean vector for cluster $C_K$

$\lVert x \rVert$ is the square root of the sum of the squares of the elements of $x$ (the Euclidean length of the vector $x$)

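$d(x_i, x_j)$ is $\lVert x_i - x_j \rVert^2$, the squared Euclidean distance between observations $x_i$ and $x_j$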
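The notation above can be made concrete with a small sketch. The data matrix `X`, the helper names `sq_dist` and `cluster_mean`, and the choice of cluster members are all hypothetical, introduced only for illustration. The sketch also assumes that d is the squared Euclidean distance, and it uses 0-based row indices where the notation uses 1 through n.

```python
import numpy as np

# Hypothetical data: n = 4 observations (rows), v = 2 variables (columns).
X = np.array([[0.0, 0.0],
              [2.0, 0.0],
              [0.0, 4.0],
              [6.0, 4.0]])

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors (assumed form of d)."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(diff @ diff)

def cluster_mean(X, members):
    """Mean vector of the observations whose row indices are in `members`."""
    return X[list(members)].mean(axis=0)

n, v = X.shape                   # number of observations, number of variables
x_bar = X.mean(axis=0)           # sample mean vector
C_K = [0, 1]                     # a cluster, stored as 0-based row indices
N_K = len(C_K)                   # number of observations in the cluster
x_bar_K = cluster_mean(X, C_K)   # mean vector for the cluster

print(x_bar, x_bar_K, sq_dist(X[0], X[3]))
```

Storing a cluster as a list of row indices mirrors the definition of a cluster as a subset of the observation indices.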

Average Linkage

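The distance between clusters $C_K$ and $C_L$ for the average linkage cluster method is:

$$D_{KL} = \frac{1}{N_K N_L} \sum_{i \in C_K} \sum_{j \in C_L} d(x_i, x_j)$$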

Centroid Method

The distance between clusters $C_K$ and $C_L$ for the centroid method of clustering is:

$$D_{KL} = \lVert \bar{x}_K - \bar{x}_L \rVert^2$$

Ward’s

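The distance between clusters $C_K$ and $C_L$ for Ward’s method is:

$$D_{KL} = \frac{\lVert \bar{x}_K - \bar{x}_L \rVert^2}{\dfrac{1}{N_K} + \dfrac{1}{N_L}}$$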

Single Linkage

The distance between clusters $C_K$ and $C_L$ for the single linkage cluster method is:

$$D_{KL} = \min_{i \in C_K} \min_{j \in C_L} d(x_i, x_j)$$

Complete Linkage

The distance between clusters $C_K$ and $C_L$ for the complete linkage cluster method is:

$$D_{KL} = \max_{i \in C_K} \max_{j \in C_L} d(x_i, x_j)$$
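As an illustration, the five between-cluster distances can be sketched in a few lines of NumPy. This is a minimal sketch based on the standard textbook definitions of these linkage methods, not the software's own implementation. It assumes d is the squared Euclidean distance. The function names and the two small clusters `A` and `B` are hypothetical.

```python
import numpy as np

def pairwise(A, B):
    """Matrix of squared Euclidean distances between rows of A and rows of B."""
    diff = A[:, None, :] - B[None, :, :]
    return (diff ** 2).sum(axis=2)

def average_linkage(A, B):
    # Mean of d over all pairs (one observation from each cluster).
    return float(pairwise(A, B).mean())

def centroid(A, B):
    # Squared Euclidean distance between the two cluster mean vectors.
    d = A.mean(axis=0) - B.mean(axis=0)
    return float(d @ d)

def ward(A, B):
    # Centroid distance scaled by 1 / (1/N_K + 1/N_L).
    return centroid(A, B) / (1.0 / len(A) + 1.0 / len(B))

def single_linkage(A, B):
    # Smallest d over all between-cluster pairs.
    return float(pairwise(A, B).min())

def complete_linkage(A, B):
    # Largest d over all between-cluster pairs.
    return float(pairwise(A, B).max())

# Hypothetical clusters with two observations each (rows are observations).
A = np.array([[0.0, 0.0], [2.0, 0.0]])
B = np.array([[0.0, 4.0], [6.0, 4.0]])

print(average_linkage(A, B))   # 30.0
print(centroid(A, B))          # 20.0
print(ward(A, B))              # 20.0
print(single_linkage(A, B))    # 16.0
print(complete_linkage(A, B))  # 52.0
```

In this example the centroid and Ward distances coincide only because both clusters have two observations, which makes the denominator 1/2 + 1/2 = 1.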

