# `library.mv` Submodule

## Module Summary

Interfaces for the NAG Mark 30.1 mv Chapter.

`mv` - Multivariate Methods

This module is concerned with methods for studying multivariate data. A multivariate dataset consists of several variables recorded on a number of objects or individuals. Multivariate methods can be classified as those that seek to examine the relationships between the variables (e.g., principal components), known as variable-directed methods, and those that seek to examine the relationships between the objects (e.g., cluster analysis), known as individual-directed methods.
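
The distinction between variable-directed and individual-directed methods can be illustrated with a minimal sketch. It uses only the Python standard library and a made-up data matrix; the helper names (`column`, `pearson`, `euclidean`) are hypothetical and are not part of `naginterfaces`. A variable-directed quantity compares columns of the data matrix, while an individual-directed quantity compares rows.

```python
import math

# Hypothetical data matrix: 4 individuals (rows) by 2 variables (columns).
data = [
    [1.0, 2.0],
    [2.0, 4.0],
    [3.0, 6.0],
    [4.0, 8.0],
]


def column(matrix, j):
    """Extract variable j (a column) from the data matrix."""
    return [row[j] for row in matrix]


def pearson(x, y):
    """Pearson correlation between two variables."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def euclidean(p, q):
    """Euclidean distance between two individuals."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Variable-directed: relationship between the two columns.
variable_corr = pearson(column(data, 0), column(data, 1))

# Individual-directed: dissimilarity between the first two rows.
object_dist = euclidean(data[0], data[1])

print(variable_corr, object_dist)
```

Methods such as principal component analysis build on column-wise summaries like the correlation, whereas cluster analysis typically starts from row-wise dissimilarities like the distance.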

Multiple regression is not included in this module as it involves the relationship of a single variable, known as the response variable, to the other variables in the dataset, the explanatory variables. Routines for multiple regression are provided in submodule `correg`.

`naginterfaces.library.examples.mv`:

This subpackage contains examples for the `mv` module. See also the Examples subsection.

## Functionality Index

- Canonical correlation analysis: `canon_corr()`
- Canonical variate analysis: `canon_var()`
- Cluster Analysis
  - compute distance matrix: `distance_mat()`
  - compute distance matrix for two data sets: `distance_mat_2()`
  - construct clusters following `cluster_hier()`: `cluster_hier_indicator()`
  - construct dendrogram following `cluster_hier()`: `cluster_hier_dendrogram()`
  - Gaussian mixture model: `gaussian_mixture()`
  - Gaussian mixture model, submatrix outputs: `gaussian_mixture_ld()`
  - hierarchical: `cluster_hier()`
  - K-means: `cluster_kmeans()`
- Discriminant Analysis
  - allocation of observations to groups, following `discrim()`: `discrim_group()`
  - Mahalanobis squared distances, following `discrim()`: `discrim_mahal()`
  - test for equality of within-group covariance matrices: `discrim()`
- Factor Analysis
  - factor score coefficients, following `factor()`: `factor_score()`
  - maximum likelihood estimates of parameters: `factor()`
- Principal component analysis: `prin_comp()`
- Rotations
  - orthogonal rotations for loading matrix: `rot_orthomax()`
  - Procrustes rotations: `rot_procrustes()`
  - ProMax rotations: `rot_promax()`
- Scaling Methods
  - multidimensional scaling: `multidimscal_ordinal()`
  - principal coordinate analysis: `multidimscal_metric()`
- Standardize values of a data matrix: `z_scores()`

For full information please refer to the NAG Library document

https://support.nag.com/numeric/nl/nagdoc_30.1/flhtml/g03/g03intro.html

## Examples

`naginterfaces.library.examples.mv.cluster_kmeans_ex.main()`

K-means cluster analysis.

```
>>> main()
naginterfaces.library.mv.cluster_kmeans Python Example Results.
K-means cluster analysis.
The cluster to which each point belongs:
[1, 1, 3, 2, 3, 1, 1, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 3]
The number of points in each cluster:
[6, 3, 11]
The within-cluster sum of weights of each cluster:
[6.00, 3.00, 11.00]
The within-cluster sum of squares of each cluster:
[46.57, 20.38, 468.90]
The final cluster centres:
[
8.1183e+01, 1.1667e+01, 7.1500e+00, 2.0500e+00, 6.6000e+00
4.7867e+01, 3.5800e+01, 1.6333e+01, 2.4000e+00, 6.7333e+00
6.4045e+01, 2.5209e+01, 1.0745e+01, 2.8364e+00, 6.6545e+00
]
```
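
The summary lines in the output above can be related to the cluster labels with a short standard-library sketch. The labels are copied from the output; `within_cluster_ss` is a hypothetical helper, and the toy points are made up because the example's input data are not shown here. This illustrates the definitions only and is not the algorithm used by `cluster_kmeans()`.

```python
from collections import Counter

# Cluster labels copied from the example output above (1-based).
labels = [1, 1, 3, 2, 3, 1, 1, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 3]

# Number of points in each cluster, ordered by cluster label.
counts = Counter(labels)
points_per_cluster = [counts[k] for k in sorted(counts)]
print(points_per_cluster)


def within_cluster_ss(points, point_labels):
    """Sum of squared distances from each point to its cluster centre."""
    groups = {}
    for point, k in zip(points, point_labels):
        groups.setdefault(k, []).append(point)
    result = {}
    for k, members in groups.items():
        dim = len(members[0])
        centre = [sum(m[j] for m in members) / len(members) for j in range(dim)]
        result[k] = sum(
            (m[j] - centre[j]) ** 2 for m in members for j in range(dim)
        )
    return result


# Hypothetical toy data (not the data used by the example above).
toy_points = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0)]
toy_labels = [1, 1, 2]
toy_ss = within_cluster_ss(toy_points, toy_labels)
print(toy_ss)
```

The counts match the "number of points in each cluster" line, and they also match the sums of weights, which is what one would expect if every observation carries unit weight.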

`naginterfaces.library.examples.mv.discrim_group_ex.main()`

Discriminant analysis.

```
>>> main()
naginterfaces.library.mv.discrim_group Python Example Results.
Discriminant analysis for diagnosis of Cushing's syndrome.
Obs        Posterior        Allocated       Atypicality
         probabilities      to group           index
  1    0.094 0.905 0.002        2        0.596 0.254 0.975
  2    0.005 0.168 0.827        3        0.952 0.836 0.018
  3    0.019 0.920 0.062        2        0.954 0.797 0.912
  4    0.697 0.303 0.000        1        0.207 0.860 0.993
  5    0.317 0.013 0.670        3        0.991 1.000 0.984
  6    0.032 0.366 0.601        3        0.981 0.978 0.887
```
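
In the table above, each "Allocated to group" entry coincides with the group that has the largest posterior probability in that row. The sketch below checks this using values copied from the output; `allocate` is a hypothetical helper written for illustration, not a `naginterfaces` function, and the largest-posterior rule is an assumption about how the allocation is made.

```python
# Posterior probabilities copied from the example output above:
# one row per observation, one column per group.
posteriors = [
    [0.094, 0.905, 0.002],
    [0.005, 0.168, 0.827],
    [0.019, 0.920, 0.062],
    [0.697, 0.303, 0.000],
    [0.317, 0.013, 0.670],
    [0.032, 0.366, 0.601],
]


def allocate(row):
    """Return the 1-based index of the largest posterior probability."""
    return max(range(len(row)), key=row.__getitem__) + 1


allocated = [allocate(row) for row in posteriors]
print(allocated)

# Each row sums to one, up to the three-decimal rounding of the printout.
row_sums = [sum(row) for row in posteriors]
print(row_sums)
```

The atypicality indices are a separate diagnostic and are not reproduced by this sketch.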

`naginterfaces.library.examples.mv.gaussian_mixture_ex.main()`

Fits a mixture of Gaussians for a given (co)variance structure.

```
>>> main()
naginterfaces.library.mv.gaussian_mixture Python Example Results.
Fits a Gaussian mixture model with pooled covariance structure to
New Haven schools test data.
The final membership probabilities are:
[
9.50176891e-01, 4.98231095e-02
3.32590884e-06, 9.99996674e-01
9.99613355e-01, 3.86644659e-04
9.99920087e-01, 7.99127116e-05
3.89990173e-02, 9.61000983e-01
9.32704894e-01, 6.72951064e-02
9.88809712e-01, 1.11902877e-02
4.12521422e-03, 9.95874786e-01
9.72521486e-01, 2.74785140e-02
9.99691952e-01, 3.08048285e-04
2.17221867e-01, 7.82778133e-01
7.69380852e-01, 2.30619148e-01
9.99973063e-01, 2.69370297e-05
6.11334389e-03, 9.93886656e-01
4.41893305e-02, 9.55810670e-01
3.50057883e-04, 9.99649942e-01
9.99902971e-01, 9.70286734e-05
4.02698414e-05, 9.99959730e-01
9.73798317e-01, 2.62016830e-02
3.02036785e-04, 9.99697963e-01
6.94705604e-02, 9.30529440e-01
4.16030240e-03, 9.95839698e-01
3.08391490e-02, 9.69160851e-01
9.91157909e-01, 8.84209120e-03
4.15339034e-04, 9.99584661e-01
]
The log-likehood is -29.683060270297.
There were 14 function evaluations required.
```
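
Each row of the membership matrix above gives, for one observation, the probability of belonging to each mixture component, so the entries in a row sum to one. As a rough illustration of how such probabilities arise, the sketch below evaluates them for a univariate two-component mixture. All parameter values and function names are hypothetical; this is not the multivariate model or the fitting procedure used by `gaussian_mixture()`.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def membership_probabilities(x, weights, means, sds):
    """Posterior probability of each component given the observation x."""
    joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical mixture parameters (assumed for illustration only).
weights = [0.5, 0.5]
means = [0.0, 4.0]
sds = [1.0, 1.0]  # equal spread, loosely mirroring a pooled variance

# An observation midway between the two means is equally likely
# to belong to either component.
midpoint_probs = membership_probabilities(2.0, weights, means, sds)

# An observation at the first mean belongs almost surely to component 1.
near_first_probs = membership_probabilities(0.0, weights, means, sds)

print(midpoint_probs, near_first_probs)
```

In a full fit these probabilities would be recomputed at each iteration as the component parameters are updated.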

`naginterfaces.library.examples.mv.prin_comp_ex.main()`

Example for `naginterfaces.library.mv.prin_comp()`.

Perform a principal component analysis on a data matrix; both the principal component loadings and the principal component scores are returned.

```
>>> main()
naginterfaces.library.mv.prin_comp Python Example Results.
Perform an unweighted principal component analysis on a dataset
from Cooley and Lohnes (1971). The statistics of the principal
component analysis are:
Eigenvalues Percentage  Cumulative  Chisq       DF          Sig
            variation   variation
[
8.2739,     0.6515,     0.6515,     8.6127,     5.0000,     0.1255
3.6761,     0.2895,     0.9410,     4.1183,     2.0000,     0.1276
0.7499,     0.0590,     1.0000,     0.0000,     0.0000,     0.0000
]
```
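
The "Percentage variation" and "Cumulative variation" columns above follow directly from the eigenvalues: each eigenvalue is divided by the sum of all eigenvalues, and the results are accumulated. The values are reported as proportions rather than percentages. The sketch below recomputes both columns from the printed eigenvalues; the variable names are chosen for illustration.

```python
# Eigenvalues copied from the example output above.
eigenvalues = [8.2739, 3.6761, 0.7499]

# Proportion of total variation explained by each component.
total = sum(eigenvalues)
percentage = [ev / total for ev in eigenvalues]

# Running total of the proportions.
cumulative = []
running = 0.0
for p in percentage:
    running += p
    cumulative.append(running)

for ev, p, c in zip(eigenvalues, percentage, cumulative):
    print(f"{ev:.4f}  {p:.4f}  {c:.4f}")
```

Rounded to four decimals, the results agree with the second and third columns of the table. The `Chisq`, `DF` and `Sig` columns come from a separate significance test and are not reproduced here.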