For a set of
objects, a distance matrix
can be calculated such that
is a measure of how ‘far apart’ objects
and
are (see
g03eac for example). Principal coordinate analysis or metric scaling starts with a distance matrix and finds points
in Euclidean space such that those points have the same distance matrix. The aim is to find a small number of dimensions,
, that provide an adequate representation of the distances.
The principal coordinates of the points are computed from the eigenvectors of the matrix where with denoting the average of over the suffix etc.. The eigenvectors are then scaled by multiplying by the square root of the value of the corresponding eigenvalue.
Provided that the ordered eigenvalues,
, of the matrix
are all positive,
shows how well the data is represented in
dimensions. The eigenvalues will be non-negative if
is positive semidefinite. This will be true provided
satisfies the inequality:
for all
. If this is not the case the size of the negative eigenvalue reflects the amount of deviation from this condition and the results should be treated cautiously in the presence of large negative eigenvalues. See
Krzanowski (1990) for further discussion.
g03fac provides the option for all eigenvalues to be computed so that the smallest eigenvalues can be checked.
Gower J C (1966) Some distance properties of latent root and vector methods used in multivariate analysis Biometrika 53 325–338
-
1:
– Nag_Eigenvalues
Input
-
On entry: indicates if all the eigenvalues are to be computed or just the
ndim largest.
- All the eigenvalues are computed.
- Only the largest ndim eigenvalues are computed.
Constraint:
or .
-
2:
– Integer
Input
-
On entry: the number of objects in the distance matrix, .
Constraint:
.
-
3:
– const double
Input
-
On entry: the lower triangle of the distance matrix stored packed by rows. That is, must contain , for and .
Constraint:
, for .
-
4:
– Integer
Input
-
On entry: the number of dimensions used to represent the data, .
Constraint:
.
-
5:
– double
Output
-
Note: the th element of the matrix is stored in .
On exit: the th row contains coordinates for the th point, .
-
6:
– Integer
Input
-
On entry: the stride separating matrix column elements in the array
x.
Constraint:
.
-
7:
– double
Output
-
On exit: if
,
eval contains the
scaled eigenvalues of the matrix
. If
,
eval contains the largest
scaled eigenvalues of the matrix
. In both cases the eigenvalues are divided by the sum of the eigenvalues (that is, the trace of
).
-
8:
– NagError *
Input/Output
-
The
NAG error argument (see
Section 7 in the Introduction to the
NAG Library CL Interface).
Alternative, non-metric, methods of scaling are provided by
g03fcc.
The relationship between principal coordinates and principal components (see
g03fcc), is discussed in
Krzanowski (1990) and
Gower (1966).
Background information to multithreading can be found in the
Multithreading documentation.
None.
The data, given by
Krzanowski (1990), are dissimilarities between water vole populations in Europe. The first two principal coordinates are computed.