g03ejc computes a cluster indicator variable from the results of g03ecc.
Given a distance or dissimilarity matrix for n objects, cluster analysis aims to group the n objects into a number of more or less homogeneous groups or clusters. With agglomerative clustering methods (see g03ecc), a hierarchical tree is produced by starting with n clusters, each containing a single object, and then, at each of n - 1 stages, merging two clusters to form a larger cluster until all objects are in a single cluster.
g03ejc takes the information from the tree and produces the clusters that exist at a given distance. This is equivalent to taking the dendrogram (see g03ehc) and drawing a line across it at a given distance to produce clusters.
As an alternative to giving the distance at which clusters are required, you can specify the number of clusters required and g03ejc will compute the corresponding distance. However, it may not be possible to produce exactly the number of clusters requested because of ties in the distance matrix.
- NE_2_INT_ARG_GT
-
On entry, while . These arguments must satisfy .
- NE_CLUSTER
-
The precise number of clusters requested is not possible because of tied clustering distances. The actual number of clusters produced is ⟨value⟩.
- NE_INCOMP_ARRAYS
-
Arrays cd and dord are not compatible.
- NE_INT_ARG_LT
-
On entry, .
Constraint: .
- NE_INTERNAL_ERROR
-
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG for assistance.
- NE_NOT_INCREASING
-
The sequence
cd is not increasing:
,
.
- NE_REAL_INT
-
On entry, , .
Constraint: and .
- NW_2_INT
-
On exit, , .
Trivial solution returned.
- NW_INT
-
On exit, .
Trivial solution returned.
- NW_REAL_REALARR
-
On entry, , .
Trivial solution returned.
The accuracy will depend upon the accuracy of the distances in cd and dord (see g03ecc).
Background information on multithreading can be found in the Multithreading documentation.
A fixed number of clusters can be found using the non-hierarchical method used in g03efc.
Data consisting of three variables on five objects are input. Squared Euclidean distances are computed using g03eac and median clustering is performed using g03ecc. A dendrogram is produced by g03ehc and printed. g03ejc finds two clusters and the results are printed.