# Source code for naginterfaces.library.mv

# -*- coding: utf-8 -*-
r"""
Module Summary
--------------
Interfaces for the NAG Mark 29.2 mv Chapter.

mv - Multivariate Methods

This module is concerned with methods for studying multivariate data.
A multivariate dataset consists of several variables recorded on a number of objects or individuals.
Multivariate methods can be classified as those that seek to examine the relationships between the variables (e.g., principal components), known as variable-directed methods, and those that seek to examine the relationships between the objects (e.g., cluster analysis), known as individual-directed methods.

Multiple regression is not included in this module as it involves the relationship of a single variable, known as the response variable, to the other variables in the dataset, the explanatory variables.
Routines for multiple regression are provided in submodule :mod:~naginterfaces.library.correg.

--------
naginterfaces.library.examples.mv :
This subpackage contains examples for the mv module.
See also the :ref:library_mv_ex subsection.

Functionality Index
-------------------

Canonical correlation analysis: :meth:canon_corr

Canonical variate analysis: :meth:canon_var

**Cluster Analysis**

compute distance matrix: :meth:distance_mat

construct clusters following :meth:cluster_hier: :meth:cluster_hier_indicator

construct dendrogram following :meth:cluster_hier: :meth:cluster_hier_dendrogram

Gaussian mixture model: :meth:gaussian_mixture

hierarchical: :meth:cluster_hier

K-means: :meth:cluster_kmeans

**Discriminant Analysis**

allocation of observations to groups, following :meth:discrim: :meth:discrim_group

Mahalanobis squared distances, following :meth:discrim: :meth:discrim_mahal

test for equality of within-group covariance matrices: :meth:discrim

**Factor Analysis**

factor score coefficients, following :meth:factor: :meth:factor_score

maximum likelihood estimates of parameters: :meth:factor

Principal component analysis: :meth:prin_comp

**Rotations**

orthogonal rotations for loading matrix: :meth:rot_orthomax

Procustes rotations: :meth:rot_procrustes

ProMax rotations: :meth:rot_promax

**Scaling Methods**

multidimensional scaling: :meth:multidimscal_ordinal

principal coordinate analysis: :meth:multidimscal_metric

Standardize values of a data matrix: :meth:z_scores

For full information please refer to the NAG Library document

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03intro.html
"""

[docs]def prin_comp(matrix, std, x, isx, s, nvar, wt=None):
r"""
prin_comp performs a principal component analysis on a data matrix; both the principal component loadings and the principal component scores are returned.

.. _g03aa-py2-py-doc:

For full information please refer to the NAG Library document for g03aa

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03aaf.html

.. _g03aa-py2-py-parameters:

**Parameters**
**matrix** : str, length 1
Indicates for which type of matrix the principal component analysis is to be carried out.

:math:\mathrm{matrix} = \texttt{'C'}

It is for the correlation matrix.

:math:\mathrm{matrix} = \texttt{'S'}

It is for a standardized matrix, with standardizations given by :math:\mathrm{s}.

:math:\mathrm{matrix} = \texttt{'U'}

It is for the sums of squares and cross-products matrix.

:math:\mathrm{matrix} = \texttt{'V'}

It is for the variance-covariance matrix.

**std** : str, length 1
Indicates if the principal component scores are to be standardized.

:math:\mathrm{std} = \texttt{'S'}

The principal component scores are standardized so that :math:F^{\prime }F = I, i.e., :math:F = X_sP\Lambda^{-1} = V.

:math:\mathrm{std} = \texttt{'U'}

The principal component scores are unstandardized, i.e., :math:F = X_sP = V\Lambda.

:math:\mathrm{std} = \texttt{'Z'}

The principal component scores are standardized so that they have unit variance.

:math:\mathrm{std} = \texttt{'E'}

The principal component scores are standardized so that they have variance equal to the corresponding eigenvalue.

**x** : float, array-like, shape :math:\left(n, m\right)
:math:\mathrm{x}[\textit{i}-1,\textit{j}-1] must contain the :math:\textit{i}\ th observation for the :math:\textit{j}\ th variable, for :math:\textit{j} = 1,2,\ldots,m, for :math:\textit{i} = 1,2,\ldots,n.

**isx** : int, array-like, shape :math:\left(m\right)
:math:\mathrm{isx}[j-1] indicates whether or not the :math:j\ th variable is to be included in the analysis.

If :math:\mathrm{isx}[\textit{j}-1] > 0, the variable contained in the :math:\textit{j}\ th column of :math:\mathrm{x} is included in the principal component analysis, for :math:\textit{j} = 1,2,\ldots,m.

**s** : float, array-like, shape :math:\left(m\right)
The standardizations to be used, if any.

If :math:\mathrm{matrix} = \texttt{'S'}, the first :math:m elements of :math:\mathrm{s} must contain the standardization coefficients, the diagonal elements of :math:\sigma.

**nvar** : int
:math:p, the number of variables in the principal component analysis.

**wt** : None or float, array-like, shape :math:\left(n\right), optional
If :math:\mathrm{wt}\text{ is not }\mathbf{None}, the first :math:n elements of :math:\mathrm{wt} must contain the weights to be used in the principal component analysis.

If :math:\mathrm{wt}[i-1] = 0.0, the :math:i\ th observation is not included in the analysis.

The effective number of observations is the sum of the weights.

If :math:\mathrm{wt} is **None** the effective number of observations is :math:n.

**Returns**
**s** : float, ndarray, shape :math:\left(m\right)
If :math:\mathrm{matrix} = \texttt{'S'}, :math:\mathrm{s} is unchanged on exit.

If :math:\mathrm{matrix} = \texttt{'C'}, :math:\mathrm{s} contains the variances of the selected variables. :math:\mathrm{s}[j-1] contains the variance of the variable in the :math:j\ th column of :math:\mathrm{x} if :math:\mathrm{isx}[j-1] > 0.

If :math:\mathrm{matrix} = \texttt{'U'} or :math:\texttt{'V'}, :math:\mathrm{s} is not referenced.

**e** : float, ndarray, shape :math:\left(\mathrm{nvar}, 6\right)
The statistics of the principal component analysis.

:math:\mathrm{e}[i-1,0]

The eigenvalues associated with the :math:\textit{i}\ th principal component, :math:\lambda_{\textit{i}}^2, for :math:\textit{i} = 1,2,\ldots,p.

:math:\mathrm{e}[i-1,1]

The proportion of variation explained by the :math:\textit{i}\ th principal component, for :math:\textit{i} = 1,2,\ldots,p.

:math:\mathrm{e}[\textit{i}-1,2]

The cumulative proportion of variation explained by the first :math:\textit{i}\ th principal components, for :math:\textit{i} = 1,2,\ldots,p.

:math:\mathrm{e}[\textit{i}-1,3]

The :math:\chi^2 statistics, for :math:i = 1,2,\ldots,p.

:math:\mathrm{e}[i-1,4]

The degrees of freedom for the :math:\chi^2 statistics, for :math:i = 1,2,\ldots,p.

If :math:\mathrm{matrix} \neq \texttt{'C'}, :math:\mathrm{e}[\textit{i}-1,5] contains significance level for the :math:\chi^2 statistic, for :math:\textit{i} = 1,2,\ldots,p.

If :math:\mathrm{matrix} = \texttt{'C'}, :math:\mathrm{e}[i-1,5] is returned as zero.

**p** : float, ndarray, shape :math:\left(\mathrm{nvar}, \mathrm{nvar}\right)
The first :math:\mathrm{nvar} columns of :math:\mathrm{p} contain the principal component loadings, :math:a_i. The :math:j\ th column of :math:\mathrm{p} contains the :math:\mathrm{nvar} coefficients for the :math:j\ th principal component.

**v** : float, ndarray, shape :math:\left(n, \mathrm{nvar}\right)
The first :math:\mathrm{nvar} columns of :math:\mathrm{v} contain the principal component scores. The :math:j\ th column of :math:\mathrm{v} contains the :math:\textit{n} scores for the :math:j\ th principal component.

If :math:\mathrm{wt}\text{ is not }\mathbf{None}, any rows for which :math:\mathrm{wt}[i-1] is zero will be set to zero.

.. _g03aa-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:\mathrm{nvar} = \langle\mathit{\boldsymbol{value}}\rangle and :math:m = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{nvar}\leq m.

(errno :math:1)
On entry, :math:\mathrm{nvar} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{nvar}\geq 1.

(errno :math:1)
On entry, :math:n = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{nvar} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:n > \mathrm{nvar}.

(errno :math:1)
On entry, :math:\mathrm{std} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{std} = \texttt{'E'}, :math:\texttt{'S'}, :math:\texttt{'U'} or :math:\texttt{'Z'}.

(errno :math:1)
On entry, :math:\mathrm{matrix} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{matrix} = \texttt{'C'}, :math:\texttt{'S'}, :math:\texttt{'U'} or :math:\texttt{'V'}.

(errno :math:1)
On entry, :math:n = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:n\geq 2.

(errno :math:1)
On entry, :math:m = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:m\geq 1.

(errno :math:2)
On entry, :math:i = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{wt}[i-1] < 0.0.

Constraint: :math:\mathrm{wt}[\textit{i}-1]\geq 0.0.

(errno :math:3)
Number of selected variables :math:\text{}\geq \text{} effective number of observations.

(errno :math:3)
On entry, :math:\mathrm{nvar} = \langle\mathit{\boldsymbol{value}}\rangle and :math:\langle\mathit{\boldsymbol{value}}\rangle values of :math:\mathrm{isx} > 0.

Constraint: exactly :math:\mathrm{nvar} elements of :math:\mathrm{isx} > 0.

(errno :math:4)
On entry, :math:j = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{s}[j-1]\leq 0.0.

Constraint: :math:\mathrm{s}[j-1] > 0.0.

(errno :math:5)
The singular value decomposition has failed to converge.

**Warns**
**NagAlgorithmicWarning**
(errno :math:6)
All eigenvalues/singular values are zero. This will be caused by all the variables being constant.

.. _g03aa-py2-py-notes:

**Notes**
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

Let :math:X be an :math:n\times p data matrix of :math:n observations on :math:p variables :math:x_1,x_2,\ldots,x_p and let the :math:p\times p variance-covariance matrix of :math:x_1,x_2,\ldots,x_p be :math:S.
A vector :math:a_1 of length :math:p is found such that:

.. math::

The variable :math:z_1 = \sum_{{i = 1}}^pa_{{1i}}x_i is known as the first principal component and gives the linear combination of the variables that gives the maximum variation.
A second principal component, :math:z_2 = \sum_{{i = 1}}^pa_{{2i}}x_i, is found such that:

.. math::
a_2^{\mathrm{T}}Sa_2\quad \text{ is maximized subject to }a_2^{\mathrm{T}}a_2 = 1\text{and }a_2^{\mathrm{T}}a_1 = 0\text{.}

This gives the linear combination of variables that is orthogonal to the first principal component that gives the maximum variation.
Further principal components are derived in a similar way.

The vectors :math:a_1,a_2,\ldots,a_p, are the eigenvectors of the matrix :math:S and associated with each eigenvector is the eigenvalue, :math:\lambda_i^2.
The value of :math:\lambda_i^2/\sum \lambda_i^2 gives the proportion of variation explained by the :math:i\ th principal component.
Alternatively, the :math:a_i's can be considered as the right singular vectors in a singular value decomposition with singular values :math:\lambda_i of the data matrix centred about its mean and scaled by :math:1/\sqrt{\left(n-1\right)}, :math:X_s.
This latter approach is used in prin_comp, with

.. math::
X_s = V\Lambda P^{\prime }

where :math:\Lambda is a diagonal matrix with elements :math:\lambda_i, :math:P is the :math:p\times p matrix with columns :math:a_i and :math:V is an :math:n\times p matrix with :math:V^{\prime }V = I, which gives the principal component scores.

Principal component analysis is often used to reduce the dimension of a dataset, replacing a large number of correlated variables with a smaller number of orthogonal variables that still contain most of the information in the original dataset.

The choice of the number of dimensions required is usually based on the amount of variation accounted for by the leading principal components.
If :math:k principal components are selected, then a test of the equality of the remaining :math:p-k eigenvalues is

.. math::
\left(n-\left(2p+5\right)/6\right)\left\{-\sum_{{i = k+1}}^p\log\left(\lambda_i^2\right)+\left(p-k\right)\log\left(\sum_{{i = k+1}}^p\lambda_i^2/\left(p-k\right)\right)\right\}

which has, asymptotically, a :math:\chi^2-distribution with :math:\frac{1}{2}\left(p-k-1\right)\left(p-k+2\right) degrees of freedom.

Equality of the remaining eigenvalues indicates that if any more principal components are to be considered then they all should be considered.

Instead of the variance-covariance matrix the correlation matrix, the sums of squares and cross-products matrix or a standardized sums of squares and cross-products matrix may be used.
In the last case :math:S is replaced by :math:\sigma^{{-\frac{1}{2}}}S\sigma^{{-\frac{1}{2}}} for a diagonal matrix :math:\sigma with positive elements.
If the correlation matrix is used, the :math:\chi^2 approximation for the statistic given above is not valid.

The principal component scores, :math:F, are the values of the principal component variables for the observations.
These can be standardized so that the variance of these scores for each principal component is :math:1.0 or equal to the corresponding eigenvalue.

Weights can be used with the analysis, in which case the matrix :math:X is first centred about the weighted means then each row is scaled by an amount :math:\sqrt{w_i}, where :math:w_i is the weight for the :math:i\ th observation.

.. _g03aa-py2-py-references:

**References**
Chatfield, C and Collins, A J, 1980, Introduction to Multivariate Analysis, Chapman and Hall

Cooley, W C and Lohnes, P R, 1971, Multivariate Data Analysis, Wiley

Hammarling, S, 1985, The singular value decomposition in multivariate statistics, SIGNUM Newsl. (20(3)), 2--25

Kendall, M G and Stuart, A, 1969, The Advanced Theory of Statistics (Volume 1), (3rd Edition), Griffin

Morrison, D F, 1967, Multivariate Statistical Methods, McGraw--Hill

--------
:meth:naginterfaces.library.examples.mv.prin_comp_ex.main
"""
raise NotImplementedError

[docs]def canon_var(weight, x, isx, ing, ng, wt, tol):
r"""
canon_var performs a canonical variate (canonical discrimination) analysis.

.. _g03ac-py2-py-doc:

For full information please refer to the NAG Library document for g03ac

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03acf.html

.. _g03ac-py2-py-parameters:

**Parameters**
**weight** : str, length 1
Indicates if weights are to be used.

:math:\mathrm{weight} = \texttt{'U'}

No weights are used.

:math:\mathrm{weight} = \texttt{'W'} or :math:\texttt{'V'}

Weights are used and must be supplied in :math:\mathrm{wt}.

If :math:\mathrm{weight} = \texttt{'W'}, the weights are treated as frequencies and the effective number of observations is the sum of the weights.

If :math:\mathrm{weight} = \texttt{'V'}, the weights are treated as being inversely proportional to the variance of the observations and the effective number of observations is the number of observations with nonzero weights.

**x** : float, array-like, shape :math:\left(n, m\right)
:math:\mathrm{x}[\textit{i}-1,\textit{j}-1] must contain the :math:\textit{i}\ th observation for the :math:\textit{j}\ th variable, for :math:\textit{j} = 1,2,\ldots,m, for :math:\textit{i} = 1,2,\ldots,n.

**isx** : int, array-like, shape :math:\left(m\right)
:math:\mathrm{isx}[j-1] indicates whether or not the :math:j\ th variable is to be included in the analysis.

If :math:\mathrm{isx}[\textit{j}-1] > 0, the variables contained in the :math:\textit{j}\ th column of :math:\mathrm{x} is included in the canonical variate analysis, for :math:\textit{j} = 1,2,\ldots,m.

**ing** : int, array-like, shape :math:\left(n\right)
:math:\mathrm{ing}[\textit{i}-1] indicates which group the :math:\textit{i}\ th observation is in, for :math:\textit{i} = 1,2,\ldots,n. The effective number of groups is the number of groups with nonzero membership.

**ng** : int
The number of groups, :math:n_g.

**wt** : float, array-like, shape :math:\left(:\right)
Note: the required length for this argument is determined as follows: if :math:\mathrm{weight}\text{ in } (\texttt{'W'}, \texttt{'V'}): :math:n; otherwise: :math:1.

If :math:\mathrm{weight} = \texttt{'W'} or :math:\texttt{'V'}, the first :math:n elements of :math:\mathrm{wt} must contain the weights to be used in the analysis.

If :math:\mathrm{wt}[i-1] = 0.0, the :math:i\ th observation is not included in the analysis.

If :math:\mathrm{weight} = \texttt{'U'}, :math:\mathrm{wt} is not referenced.

**tol** : float
The value of :math:\mathrm{tol} is used to decide if the variables are of full rank and, if not, what is the rank of the variables. The smaller the value of :math:\mathrm{tol} the stricter the criterion for selecting the singular value decomposition. If a non-negative value of :math:\mathrm{tol} less than machine precision is entered, the square root of machine precision is used instead.

**Returns**
**nig** : int, ndarray, shape :math:\left(\mathrm{ng}\right)
:math:\mathrm{nig}[\textit{j}-1] gives the number of observations in group :math:\textit{j}, for :math:\textit{j} = 1,2,\ldots,n_g.

**cvm** : float, ndarray, shape :math:\left(\mathrm{ng}, \textit{nx}\right)
:math:\mathrm{cvm}[\textit{i}-1,\textit{j}-1] contains the mean of the :math:\textit{j}\ th canonical variate for the :math:\textit{i}\ th group, for :math:\textit{j} = 1,2,\ldots,l, for :math:\textit{i} = 1,2,\ldots,n_g; the remaining columns, if any, are used as workspace.

**e** : float, ndarray, shape :math:\left(\min\left(\textit{nx},{\mathrm{ng}-1}\right), 6\right)
The statistics of the canonical variate analysis.

:math:\mathrm{e}[i-1,0]

The canonical correlations, :math:\delta_{\textit{i}}, for :math:\textit{i} = 1,2,\ldots,l.

:math:\mathrm{e}[i-1,1]

The eigenvalues of the within-group sum of squares matrix, :math:\lambda_{\textit{i}}^2, for :math:\textit{i} = 1,2,\ldots,l.

:math:\mathrm{e}[i-1,2]

The proportion of variation explained by the :math:\textit{i}\ th canonical variate, for :math:\textit{i} = 1,2,\ldots,l.

:math:\mathrm{e}[i-1,3]

The :math:\chi^2 statistic for the :math:\textit{i}\ th canonical variate, for :math:\textit{i} = 1,2,\ldots,l.

:math:\mathrm{e}[i-1,4]

The degrees of freedom for :math:\chi^2 statistic for the :math:\textit{i}\ th canonical variate, for :math:\textit{i} = 1,2,\ldots,l.

:math:\mathrm{e}[i-1,5]

The significance level for the :math:\chi^2 statistic for the :math:\textit{i}\ th canonical variate, for :math:\textit{i} = 1,2,\ldots,l.

**ncv** : int
The number of canonical variates, :math:l. This will be the minimum of :math:n_g-1 and the rank of :math:\mathrm{x}.

**cvx** : float, ndarray, shape :math:\left(\textit{nx}, \mathrm{ng}-1\right)
The canonical variate loadings. :math:\mathrm{cvx}[\textit{i}-1,\textit{j}-1] contains the loading coefficient for the :math:\textit{i}\ th variable on the :math:\textit{j}\ th canonical variate, for :math:\textit{j} = 1,2,\ldots,l, for :math:\textit{i} = 1,2,\ldots,n_x; the remaining columns, if any, are used as workspace.

**irankx** : int
The rank of the dependent variables.

If the variables are of full rank then :math:\mathrm{irankx} = \textit{nx}.

If the variables are not of full rank then :math:\mathrm{irankx} is an estimate of the rank of the dependent variables. :math:\mathrm{irankx} is calculated as the number of singular values greater than :math:\mathrm{tol}\times \text{(largest singular value)}.

.. _g03ac-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:\mathrm{tol} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{tol}\geq 0.0.

(errno :math:1)
On entry, :math:\mathrm{weight} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{weight} = \texttt{'U'}, :math:\texttt{'W'} or :math:\texttt{'V'}.

(errno :math:1)
On entry, :math:\textit{lde} = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{min}\left(\textit{nx}, {\mathrm{ng}-1}\right) = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{lde}\geq \mathrm{min}\left(\textit{nx}, {\mathrm{ng}-1}\right).

(errno :math:1)
On entry, :math:n = \langle\mathit{\boldsymbol{value}}\rangle and :math:\textit{nx}+\mathrm{ng} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:n\geq \textit{nx}+\mathrm{ng}.

(errno :math:1)
On entry, :math:m = \langle\mathit{\boldsymbol{value}}\rangle and :math:\textit{nx} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:m\geq \textit{nx}.

(errno :math:1)
On entry, :math:\mathrm{ng} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{ng}\geq 2.

(errno :math:1)
On entry, :math:\textit{nx} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{nx}\geq 1.

(errno :math:2)
On entry, :math:i = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{wt}[i-1] < 0.0.

Constraint: :math:\mathrm{wt}[i-1]\geq 0.0.

(errno :math:3)
On entry, :math:i = \langle\mathit{\boldsymbol{value}}\rangle, :math:\mathrm{ing}[i-1] = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{ng} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:1\leq \mathrm{ing}[i-1]\leq \mathrm{ng}.

(errno :math:4)
On entry, :math:\textit{nx} = \langle\mathit{\boldsymbol{value}}\rangle, expected :math:\text{value} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{nx} must be consistent with :math:\mathrm{isx}.

(errno :math:5)
The singular value decomposition has failed to converge.

(errno :math:7)
The effective number of observations is less than the effective number of groups plus number of variables.

(errno :math:7)
Less than :math:2 groups have nonzero membership.

**Warns**
**NagAlgorithmicWarning**
(errno :math:6)
A canonical correlation is equal to :math:1.0.

(errno :math:8)
The rank of :math:\mathrm{x} is :math:0.

.. _g03ac-py2-py-notes:

**Notes**
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

Let a sample of :math:n observations on :math:n_x variables in a data matrix come from :math:n_g groups with :math:n_1,n_2,\ldots,n_{n_g} observations in each group, :math:\sum n_i = n.
Canonical variate analysis finds the linear combination of the :math:n_x variables that maximizes the ratio of between-group to within-group variation.
The variables formed, the canonical variates can then be used to discriminate between groups.

The canonical variates can be calculated from the eigenvectors of the within-group sums of squares and cross-products matrix.
However, canon_var calculates the canonical variates by means of a singular value decomposition (SVD) of a matrix :math:V.
Let the data matrix with variable (column) means subtracted be :math:X, and let its rank be :math:k; then the :math:k by (:math:n_g-1) matrix :math:V is given by:

.. math::
V = Q_X^{\mathrm{T}}Q_g\text{,}

where :math:Q_g is an :math:n\times \left(n_g-1\right) orthogonal matrix that defines the groups and :math:Q_X is the first :math:k rows of the orthogonal matrix :math:Q either from the :math:QR decomposition of :math:X:

.. math::
X = QR

if :math:X is of full column rank, i.e., :math:k = n_x, else from the SVD of :math:X:

.. math::
X = QDP^{\mathrm{T}}\text{.}

Let the SVD of :math:V be:

.. math::
V = U_x\Delta U_g^{\mathrm{T}}

then the nonzero elements of the diagonal matrix :math:\Delta, :math:\delta_{\textit{i}}, for :math:\textit{i} = 1,2,\ldots,l, are the :math:l canonical correlations associated with the :math:l = \mathrm{min}\left(k, {n_g-1}\right) canonical variates, where :math:l = \mathrm{min}\left(k, n_g\right).

The eigenvalues, :math:\lambda_i^2, of the within-group sums of squares matrix are given by:

.. math::
\lambda_i^2 = \frac{{\delta_i^2}}{{1-\delta_i^2}}

and the value of :math:\pi_i = \lambda_i^2/\sum \lambda_i^2 gives the proportion of variation explained by the :math:i\ th canonical variate.
The values of the :math:\pi_i's give an indication as to how many canonical variates are needed to adequately describe the data, i.e., the dimensionality of the problem.

To test for a significant dimensionality greater than :math:i the :math:\chi^2 statistic:

.. math::
\left(n-1-n_g-\frac{1}{2}\left(k-n_g\right)\right)\sum_{{j = i+1}}^l\log\left(1+\lambda_j^2\right)

can be used.
This is asymptotically distributed as a :math:\chi^2-distribution with :math:\left(k-i\right)\left(n_g-1-i\right) degrees of freedom.
If the test for :math:i = h is not significant, then the remaining tests for :math:i > h should be ignored.

The loadings for the canonical variates are calculated from the matrix :math:U_x.
This matrix is scaled so that the canonical variates have unit within-group variance.

In addition to the canonical variates loadings the means for each canonical variate are calculated for each group.

Weights can be used with the analysis, in which case the weighted means are subtracted from each column and then each row is scaled by an amount :math:\sqrt{w_i}, where :math:w_i is the weight for the :math:i\ th observation (row).

.. _g03ac-py2-py-references:

**References**
Chatfield, C and Collins, A J, 1980, Introduction to Multivariate Analysis, Chapman and Hall

Gnanadesikan, R, 1977, Methods for Statistical Data Analysis of Multivariate Observations, Wiley

Hammarling, S, 1985, The singular value decomposition in multivariate statistics, SIGNUM Newsl. (20(3)), 2--25

Kendall, M G and Stuart, A, 1969, The Advanced Theory of Statistics (Volume 1), (3rd Edition), Griffin
"""
raise NotImplementedError

[docs]def canon_corr(z, isz, nx, ny, mcv, tol, wt=None):
r"""
canon_corr performs canonical correlation analysis upon input data matrices.

For full information please refer to the NAG Library document for g03ad

**Parameters**
**z** : float, array-like, shape :math:\left(n, m\right)
:math:\mathrm{z}[\textit{i}-1,\textit{j}-1] must contain the :math:\textit{i}\ th observation for the :math:\textit{j}\ th variable, for :math:\textit{j} = 1,2,\ldots,m, for :math:\textit{i} = 1,2,\ldots,n.

Both :math:x and :math:y variables are to be included in :math:\mathrm{z}, the indicator array, :math:\mathrm{isz}, being used to assign the variables in :math:\mathrm{z} to the :math:x or :math:y sets as appropriate.

**isz** : int, array-like, shape :math:\left(m\right)
:math:\mathrm{isz}[j-1] indicates whether or not the :math:j\ th variable is included in the analysis and to which set of variables it belongs.

:math:\mathrm{isz}[j-1] > 0

The variable contained in the :math:j\ th column of :math:\mathrm{z} is included as an :math:x variable in the analysis.

:math:\mathrm{isz}[j-1] < 0

The variable contained in the :math:j\ th column of :math:\mathrm{z} is included as a :math:y variable in the analysis.

:math:\mathrm{isz}[j-1] = 0

The variable contained in the :math:j\ th column of :math:\mathrm{z} is not included in the analysis.

**nx** : int
The number of :math:x variables in the analysis, :math:n_x.

**ny** : int
The number of :math:y variables in the analysis, :math:n_y.

**mcv** : int
An upper limit to the number of canonical variates.

**tol** : float
The value of :math:\mathrm{tol} is used to decide if the variables are of full rank and, if not, what is the rank of the variables. The smaller the value of :math:\mathrm{tol} the stricter the criterion for selecting the singular value decomposition. If a non-negative value of :math:\mathrm{tol} less than machine precision is entered, the square root of machine precision is used instead.

**wt** : None or float, array-like, shape :math:\left(n\right), optional
The elements of :math:\mathrm{wt} must contain the weights to be used in the analysis. The effective number of observations is the sum of the weights. If :math:\mathrm{wt}[i-1] = 0.0 then the :math:i\ th observation is not included in the analysis.

If weights are not provided then :math:\mathrm{wt} must be set to **None** and the effective number of observations is :math:\textit{n}.

**Returns**
**e** : float, ndarray, shape :math:\left(\min\left(\mathrm{nx},\mathrm{ny}\right), 6\right)
The statistics of the canonical variate analysis.

:math:\mathrm{e}[i-1,0]

The canonical correlations, :math:\delta_{\textit{i}}, for :math:\textit{i} = 1,2,\ldots,l.

:math:\mathrm{e}[i-1,1]

The eigenvalues of :math:\Sigma, :math:\lambda_{\textit{i}}^2, for :math:\textit{i} = 1,2,\ldots,l.

:math:\mathrm{e}[i-1,2]

The proportion of variation explained by the :math:\textit{i}\ th canonical variate, for :math:\textit{i} = 1,2,\ldots,l.

:math:\mathrm{e}[i-1,3]

The :math:\chi^2 statistic for the :math:\textit{i}\ th canonical variate, for :math:\textit{i} = 1,2,\ldots,l.

:math:\mathrm{e}[i-1,4]

The degrees of freedom for :math:\chi^2 statistic for the :math:\textit{i}\ th canonical variate, for :math:\textit{i} = 1,2,\ldots,l.

:math:\mathrm{e}[i-1,5]

The significance level for the :math:\chi^2 statistic for the :math:\textit{i}\ th canonical variate, for :math:\textit{i} = 1,2,\ldots,l.

**ncv** : int
The number of canonical correlations, :math:l. This will be the minimum of the rank of :math:\mathrm{X} and the rank of :math:\mathrm{Y}.

**cvx** : float, ndarray, shape :math:\left(\mathrm{nx}, \mathrm{mcv}\right)
The canonical variate loadings for the :math:x variables. :math:\mathrm{cvx}[i-1,j-1] contains the loading coefficient for the :math:i\ th :math:x variable on the :math:j\ th canonical variate.

**cvy** : float, ndarray, shape :math:\left(\mathrm{ny}, \mathrm{mcv}\right)
The canonical variate loadings for the :math:y variables. :math:\mathrm{cvy}[i-1,j-1] contains the loading coefficient for the :math:i\ th :math:y variable on the :math:j\ th canonical variate.

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:\mathrm{mcv} = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{min}\left(\mathrm{nx}, \mathrm{ny}\right) = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{mcv}\geq \mathrm{min}\left(\mathrm{nx}, \mathrm{ny}\right).

(errno :math:1)
On entry, :math:\mathrm{tol} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{tol}\geq 0.0.

(errno :math:1)
On entry, :math:\textit{weight} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{weight} = \texttt{'U'} or :math:\texttt{'W'}.

(errno :math:1)
On entry, :math:\textit{lde} = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{min}\left(\mathrm{nx}, \mathrm{ny}\right) = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{lde}\geq \mathrm{min}\left(\mathrm{nx}, \mathrm{ny}\right).

(errno :math:1)
On entry, :math:n = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{nx}+\mathrm{ny} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:n > \mathrm{nx}+\mathrm{ny}.

(errno :math:1)
On entry, :math:m = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{nx}+\mathrm{ny} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:m\geq \mathrm{nx}+\mathrm{ny}.

(errno :math:1)
On entry, :math:\mathrm{ny} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{ny}\geq 1.

(errno :math:1)
On entry, :math:\mathrm{nx} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{nx}\geq 1.

(errno :math:2)
On entry, :math:i = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{wt}[i-1] < 0.0.

Constraint: :math:\mathrm{wt}[\textit{i}-1]\geq 0.0.

(errno :math:3)
On entry, :math:\mathrm{ny} = \langle\mathit{\boldsymbol{value}}\rangle, expected :math:\text{value} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{ny} must be consistent with :math:\mathrm{isz}.

(errno :math:3)
On entry, :math:\mathrm{nx} = \langle\mathit{\boldsymbol{value}}\rangle, expected :math:\text{value} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{nx} must be consistent with :math:\mathrm{isz}.

(errno :math:4)
On entry, the effective number of observations is less than :math:\mathrm{nx}+\mathrm{ny}+1.

(errno :math:5)
The singular value decomposition has failed to converge.

**Warns**
**NagAlgorithmicWarning**
(errno :math:6)
A canonical correlation is equal to :math:1.0.

(errno :math:7)
On entry, the rank of the :math:Y matrix is :math:0.

(errno :math:7)
On entry, the rank of the :math:X matrix is :math:0.

**Notes**
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

Let there be two sets of variables, :math:x and :math:y.
For a sample of :math:n observations on :math:n_x variables in a data matrix :math:X and :math:n_y variables in a data matrix :math:Y, canonical correlation analysis seeks to find a small number of linear combinations of each set of variables in order to explain or summarise the relationships between them.
The variables thus formed are known as canonical variates.

Let the variance-covariance matrix of the two datasets be

.. math::
\begin{pmatrix}S_{{xx}}&S_{{xy}}\\S_{{yx}}&S_{{yy}}\end{pmatrix}

and let

.. math::
\Sigma = S_{{yy}}^{-1}S_{{yx}}S_{{xx}}^{-1}S_{{xy}}

then the canonical correlations can be calculated from the eigenvalues of the matrix :math:\Sigma.
However, canon_corr calculates the canonical correlations by means of a singular value decomposition (SVD) of a matrix :math:V.
If the rank of the data matrix :math:X is :math:k_x and the rank of the data matrix :math:Y is :math:k_y, and both :math:X and :math:Y have had variable (column) means subtracted then the :math:k_x\times k_y matrix :math:V is given by:

.. math::
V = Q_x^{\mathrm{T}}Q_y\text{,}

where :math:Q_x is the first :math:k_x columns of the orthogonal matrix :math:Q either from the :math:QR decomposition of :math:X if :math:X is of full column rank, i.e., :math:k_x = n_x:

.. math::
X = Q_xR_x

or from the SVD of :math:X if :math:k_x < n_x:

.. math::
X = Q_xD_xP_x^{\mathrm{T}}\text{.}

Similarly :math:Q_y is the first :math:k_y columns of the orthogonal matrix :math:Q either from the :math:QR decomposition of :math:Y if :math:Y is of full column rank, i.e., :math:k_y = n_y:

.. math::
Y = Q_yR_y

or from the SVD of :math:Y if :math:k_y < n_y:

.. math::
Y = Q_yD_yP_y^{\mathrm{T}}\text{.}

Let the SVD of :math:V be:

.. math::
V = U_x\Delta U_y^{\mathrm{T}}

then the nonzero elements of the diagonal matrix :math:\Delta, :math:\delta_{\textit{i}}, for :math:\textit{i} = 1,2,\ldots,l, are the :math:l canonical correlations associated with the :math:l canonical variates, where :math:l = \mathrm{min}\left(k_x, k_y\right).

The eigenvalues, :math:\lambda_i^2, of the matrix :math:\Sigma are given by:

.. math::
\lambda_i^2 = \delta_i^2\text{.}

The value of :math:\pi_i = \lambda_i^2/\sum \lambda_i^2 gives the proportion of variation explained by the :math:i\ th canonical variate.
The values of the :math:\pi_i's give an indication as to how many canonical variates are needed to adequately describe the data, i.e., the dimensionality of the problem.

To test for a significant dimensionality greater than :math:i the :math:\chi^2 statistic:

.. math::
\left(n-\frac{1}{2}\left(k_x+k_y+3\right)\right)\sum_{{j = i+1}}^l\log\left(1-\delta_j^2\right)

can be used.
This is asymptotically distributed as a :math:\chi^2-distribution with :math:\left(k_x-i\right)\left(k_y-i\right) degrees of freedom.
If the test for :math:i = k_{\mathrm{min}} is not significant, then the remaining tests for :math:i > k_{\mathrm{min}} should be ignored.

The loadings for the canonical variates are calculated from the matrices :math:U_x and :math:U_y respectively.
These matrices are scaled so that the canonical variates have unit variance.

**References**
Hastings, N A J and Peacock, J B, 1975, Statistical Distributions, Butterworth

Kendall, M G and Stuart, A, 1976, The Advanced Theory of Statistics (Volume 3), (3rd Edition), Griffin

Morrison, D F, 1967, Multivariate Statistical Methods, McGraw--Hill
"""
raise NotImplementedError

[docs]def rot_orthomax(stand, g, fl, acc=0.00001, maxit=30):
r"""
rot_orthomax computes orthogonal rotations for a matrix of loadings using a generalized orthomax criterion.

.. _g03ba-py2-py-doc:

For full information please refer to the NAG Library document for g03ba

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03baf.html

.. _g03ba-py2-py-parameters:

**Parameters**
**stand** : str, length 1
Indicates if the matrix of loadings is to be row standardized before rotation.

:math:\mathrm{stand} = \texttt{'S'}

:math:\mathrm{stand} = \texttt{'U'}

**g** : float
:math:\gamma, the criterion constant with :math:\gamma = 1.0 giving varimax rotations and :math:\gamma = 0.0 giving quartimax rotations.

**fl** : float, array-like, shape :math:\left(\textit{nvar}, k\right)
:math:\Lambda, the matrix of loadings. :math:\mathrm{fl}[\textit{i}-1,\textit{j}-1] must contain the loading for the :math:\textit{i}\ th variable on the :math:\textit{j}\ th factor, for :math:\textit{j} = 1,2,\ldots,k, for :math:\textit{i} = 1,2,\ldots,p.

**acc** : float, optional
Indicates the accuracy required. The iterative procedure of Cooley and Lohnes (1971) will be stopped and the final refinement computed when the change in :math:V is less than :math:\mathrm{acc}\times \mathrm{max}\left(1.0, V\right). If :math:\mathrm{acc} is greater than or equal to :math:0.0 but less than machine precision or if :math:\mathrm{acc} is greater than :math:1.0, machine precision will be used instead.

**maxit** : int, optional
The maximum number of iterations.

**Returns**
**fl** : float, ndarray, shape :math:\left(\textit{nvar}, k\right)
If :math:\mathrm{stand} = \texttt{'S'}, the elements of :math:\mathrm{fl} are standardized so that the sum of squared elements for each row is :math:1.0 and then after the computation of the rotations are rescaled; this may lead to slight differences between the input and output values of :math:\mathrm{fl}.

If :math:\mathrm{stand} = \texttt{'U'}, :math:\mathrm{fl} will be unchanged on exit.

**flr** : float, ndarray, shape :math:\left(\textit{nvar}, k\right)
The rotated matrix of loadings, :math:\Lambda^*. :math:\mathrm{flr}[\textit{i}-1,\textit{j}-1] will contain the rotated loading for the :math:\textit{i}\ th variable on the :math:\textit{j}\ th factor, for :math:\textit{j} = 1,2,\ldots,k, for :math:\textit{i} = 1,2,\ldots,p.

**r** : float, ndarray, shape :math:\left(k, k\right)
The matrix of rotations, :math:R.

**itera** : int
The number of iterations performed.

.. _g03ba-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:\mathrm{maxit} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{maxit} > 0.

(errno :math:1)
On entry, :math:\textit{nvar} = \langle\mathit{\boldsymbol{value}}\rangle and :math:k = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{nvar}\geq k.

(errno :math:1)
On entry, :math:k = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:k\geq 2.

(errno :math:1)
On entry, :math:\mathrm{g} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{g}\geq 0.0.

(errno :math:1)
On entry, :math:\mathrm{stand} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{stand} = \texttt{'S'} or :math:\texttt{'U'}.

(errno :math:1)
On entry, :math:\mathrm{acc} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{acc}\geq 0.0.

(errno :math:2)
The singular value decomposition has failed to converge.

(errno :math:3)
The algorithm to find R has failed to reach the required accuracy in :math:\mathrm{maxit} iterations.

.. _g03ba-py2-py-notes:

**Notes**
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

Let :math:\Lambda be the :math:p\times k matrix of loadings from a variable-directed multivariate method, e.g., canonical variate analysis or factor analysis.
This matrix represents the relationship between the original :math:p variables and the :math:k orthogonal linear combinations of these variables, the canonical variates or factors.
The latter are only unique up to a rotation in the :math:k-dimensional space they define.
A rotation can then be found that simplifies the structure of the matrix of loadings, and hence the relationship between the original and the derived variables.
That is, the elements, :math:\lambda_{{ij}}^*, of the rotated matrix, :math:\Lambda^*, are either relatively large or small.
The rotations may be found by minimizing the criterion:

.. math::
V = \sum_{{j = 1}}^k\sum_{{i = 1}}^p\left(\lambda_{{ij}}^*\right)^4-\frac{\gamma }{p}\sum_{{j = 1}}^k\left[\sum_{{i = 1}}^p\left(\lambda_{{ij}}^*\right)^2\right]^2

where the constant :math:\gamma gives a family of rotations with :math:\gamma = 1 giving varimax rotations and :math:\gamma = 0 giving quartimax rotations.

It is generally advised that factor loadings should be standardized, so that the sum of squared elements for each row is one, before computing the rotations.

The matrix of rotations, :math:R, such that :math:\Lambda^* = \Lambda R, is computed using first an algorithm based on that described by Cooley and Lohnes (1971), which involves the pairwise rotation of the factors.
Then a final refinement is made using a method similar to that described by Lawley and Maxwell (1971), but instead of the eigenvalue decomposition, the algorithm has been adapted to incorporate a singular value decomposition.

.. _g03ba-py2-py-references:

**References**
Cooley, W C and Lohnes, P R, 1971, Multivariate Data Analysis, Wiley

Lawley, D N and Maxwell, A E, 1971, Factor Analysis as a Statistical Method, (2nd Edition), Butterworths
"""
raise NotImplementedError

[docs]def rot_procrustes(stand, pscale, x, y):
r"""
rot_procrustes computes Procrustes rotations in which an orthogonal rotation is found so that a transformed matrix best matches a target matrix.

.. _g03bc-py2-py-doc:

For full information please refer to the NAG Library document for g03bc

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03bcf.html

.. _g03bc-py2-py-parameters:

**Parameters**
**stand** : str, length 1
Indicates if translation/normalization is required.

:math:\mathrm{stand} = \texttt{'N'}

There is no translation or normalization.

:math:\mathrm{stand} = \texttt{'Z'}

There is translation to the origin (i.e., to zero).

:math:\mathrm{stand} = \texttt{'C'}

There is translation to origin and then to the :math:Y centroid after rotation.

:math:\mathrm{stand} = \texttt{'U'}

There is unit normalization.

:math:\mathrm{stand} = \texttt{'S'}

There is translation and normalization (i.e., there is standardization).

:math:\mathrm{stand} = \texttt{'M'}

There is translation and normalization to :math:Y scale, then translation to the :math:Y centroid after rotation (i.e., they are matched).

**pscale** : str, length 1
Indicates if least squares scaling is to be applied after rotation.

:math:\mathrm{pscale} = \texttt{'S'}

Scaling is applied.

:math:\mathrm{pscale} = \texttt{'U'}

No scaling is applied.

**x** : float, array-like, shape :math:\left(n, m\right)
:math:X, the matrix to be rotated.

**y** : float, array-like, shape :math:\left(n, m\right)
The target matrix, :math:\mathrm{y}.

**Returns**
**x** : float, ndarray, shape :math:\left(n, m\right)
If :math:\mathrm{stand} = \texttt{'N'}, :math:\mathrm{x} will be unchanged.

If :math:\mathrm{stand} = \texttt{'Z'}, :math:\texttt{'C'}, :math:\texttt{'S'} or :math:\texttt{'M'}, :math:\mathrm{x} will be translated to have zero column means.

If :math:\mathrm{stand} = \texttt{'U'} or :math:\texttt{'S'}, :math:\mathrm{x} will be scaled to have unit sum of squares.

If :math:\mathrm{stand} = \texttt{'M'}, :math:\mathrm{x} will be scaled to have the same sum of squares as :math:\mathrm{y}.

**y** : float, ndarray, shape :math:\left(n, m\right)
If :math:\mathrm{stand} = \texttt{'N'}, :math:\mathrm{y} will be unchanged.

If :math:\mathrm{stand} = \texttt{'Z'} or :math:\texttt{'S'}, :math:\mathrm{y} will be translated to have zero column means.

If :math:\mathrm{stand} = \texttt{'U'} or :math:\texttt{'S'}, :math:\mathrm{y} will be scaled to have unit sum of squares.

If :math:\mathrm{stand} = \texttt{'C'} or :math:\texttt{'M'}, :math:\mathrm{y} will be translated and then after rotation translated back.

The output :math:\mathrm{y} should be the same as the input :math:\mathrm{y} except for rounding errors.

**yhat** : float, ndarray, shape :math:\left(n, m\right)
The fitted matrix, :math:\hat{Y}.

**r** : float, ndarray, shape :math:\left(m, m\right)
The matrix of rotations, :math:R, see Further Comments <https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03bcf.html#fcomments>__.

**alpha** : float
If :math:\mathrm{pscale} = \texttt{'S'} the scaling factor, :math:\alpha; otherwise :math:\mathrm{alpha} is not set.

The residual sum of squares.

**res** : float, ndarray, shape :math:\left(n\right)
The residuals, :math:r_{\textit{i}}, for :math:\textit{i} = 1,2,\ldots,n.

.. _g03bc-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:\mathrm{pscale} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{pscale} = \texttt{'S'} or :math:\texttt{'U'}.

(errno :math:1)
On entry, :math:\mathrm{stand} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{stand} = \texttt{'N'}, :math:\texttt{'Z'}, :math:\texttt{'C'}, :math:\texttt{'U'}, :math:\texttt{'S'} or :math:\texttt{'M'}.

(errno :math:1)
On entry, :math:m = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:m > 0.

(errno :math:1)
On entry, :math:n = \langle\mathit{\boldsymbol{value}}\rangle and :math:m = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:n\geq m.

(errno :math:2)
Only one distinct point (centred at zero) in :math:\mathrm{y} array.

(errno :math:2)
Only one distinct point (centred at zero) in :math:\mathrm{x} array.

(errno :math:3)
:math:\mathrm{yhat} contains only zero-points when least squares scaling is applied.

(errno :math:4)
The singular value decomposition has failed to converge.

.. _g03bc-py2-py-notes:

**Notes**
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

Let :math:X and :math:Y be :math:n\times m matrices.
They can be considered as representing sets of :math:n points in an :math:m-dimensional space.
The :math:X matrix may be a matrix of loadings from say factor or canonical variate analysis, and the :math:Y matrix may be a postulated pattern matrix or the loadings from a different sample.
The problem is to relate the two sets of points without disturbing the relationships between the points in each set.
This can be achieved by translating, rotating and scaling the sets of points.
The :math:Y matrix is considered as the target matrix and the :math:X matrix is rotated to match that matrix.

First the two sets of points are translated so that their centroids are at the origin to give :math:X_c and :math:Y_c, i.e., the matrices will have zero column means.
Then the rotation of the translated :math:X_c matrix which minimizes the sum of squared distances between corresponding points in the two sets is found.
This is computed from the singular value decomposition of the matrix:

.. math::
X_c^{\mathrm{T}}Y_c = UDV^{\mathrm{T}}\text{,}

where :math:U and :math:V are orthogonal matrices and :math:D is a diagonal matrix.
The matrix of rotations, :math:R, is computed as:

.. math::
R = UV^{\mathrm{T}}\text{.}

After rotation, a scaling or dilation factor, :math:\alpha, may be estimated by least squares.
Thus, the final set of points that best match :math:Y_c is given by:

.. math::
\hat{Y}_c = \alpha X_cR\text{.}

Before rotation, both sets of points may be normalized to have unit sums of squares or the :math:X matrix may be normalized to have the same sum of squares as the :math:Y matrix.
After rotation, the results may be translated to the original :math:Y centroid.

The :math:i\ th residual, :math:r_i, is given by the distance between the point given in the :math:i\ th row of :math:Y and the point given in the :math:i\ th row of :math:\hat{Y}.
The residual sum of squares is also computed.

.. _g03bc-py2-py-references:

**References**
Krzanowski, W J, 1990, Principles of Multivariate Analysis, Oxford University Press

Lawley, D N and Maxwell, A E, 1971, Factor Analysis as a Statistical Method, (2nd Edition), Butterworths
"""
raise NotImplementedError

[docs]def rot_promax(stand, x, ro, power, io_manager=None):
r"""
rot_promax calculates a ProMax rotation, given information following an orthogonal rotation.

.. _g03bd-py2-py-doc:

For full information please refer to the NAG Library document for g03bd

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03bdf.html

.. _g03bd-py2-py-parameters:

**Parameters**
**stand** : str, length 1

:math:\mathrm{stand} = \texttt{'S'}

Rows of :math:Y are (Kaiser) normalized by the communalities of the loadings.

:math:\mathrm{stand} = \texttt{'U'}

Rows are not normalized.

**x** : float, array-like, shape :math:\left(n, m\right)
The loadings matrix following an orthogonal rotation, :math:X.

**ro** : float, array-like, shape :math:\left(m, m\right)
The orthogonal rotation matrix, :math:O.

**power** : float
:math:p, the value of exponent.

**io_manager** : FileObjManager, optional
Manager for I/O in this routine.

**Returns**
**fp** : float, ndarray, shape :math:\left(n, m\right)
The factor pattern matrix, :math:P.

**r** : float, ndarray, shape :math:\left(m, m\right)
The ProMax rotation matrix, :math:R.

**phi** : float, ndarray, shape :math:\left(m, m\right)
The matrix of inter-factor correlations, :math:\Phi.

**fs** : float, ndarray, shape :math:\left(n, m\right)
The factor structure matrix, :math:S.

.. _g03bd-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:\mathrm{stand} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{stand} = \texttt{'U'} or :math:\texttt{'S'}.

(errno :math:1)
On entry, :math:m = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:m\geq 1.

(errno :math:1)
On entry, :math:\mathrm{power} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{power} > 1.0.

(errno :math:2)
On entry, :math:n = \langle\mathit{\boldsymbol{value}}\rangle and :math:m = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:n\geq m.

(errno :math:20)
The singular value decomposition has failed to converge.

(errno :math:100)
An internal error has occurred in this function. Check the function call and any array sizes. If the call is correct then please contact NAG <https://www.nag.com>__ for assistance.

.. _g03bd-py2-py-notes:

**Notes**
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

Let :math:X and :math:Y denote :math:n\times m matrices each representing a set of :math:n points in an :math:m-dimensional space.
The :math:X matrix is a matrix of loadings as returned by :meth:rot_orthomax, that is following an orthogonal rotation of a loadings matrix :math:Z.
The target matrix :math:Y is calculated as a power transformation of :math:X that preserves the sign of the loadings.
Let :math:X_{{ij}} and :math:Y_{{ij}} denote the :math:\left(i,j\right)\ th element of matrices :math:X and :math:Y.
Given a value greater than :math:1 for the exponent :math:p:

.. math::
Y_{{ij}} = \delta_{{ij}}\left\lVert X_{{ij}}\right\rVert^p\text{,}

for

:math:i = 1,2,\ldots,n;

:math:j = 1,2,\ldots,m;

:math:\delta_{{ij}} = \left\{\begin{array}{c}-1 \text{, if } X_{{ij}} < 0 \text{; }\\ 1 \text{, otherwise.}\end{array}\right.

The above power transformation tends to increase the difference between high and low values of loadings and is intended to increase the interpretability of a solution.

In the second step a solution of:

.. math::
XW = Y\text{,}\quad X,Y \in \mathbb{R}^{{n\times m}}\text{, }W \in \mathbb{R}^{{m\times m}}\text{,}

is found for :math:W in the least squares sense by use of singular value decomposition of the orthogonal loadings :math:X.
The ProMax rotation matrix :math:R is then given by

.. math::
R = OW\tilde{W}\text{,}\quad O\text{, }\tilde{W} \in \mathbb{R}^{{m\times m}}\text{,}

where :math:O is the rotation matrix from an orthogonal transformation, and :math:\tilde{W} is a matrix with the square root of diagonal elements of :math:\left(W^\mathrm{T}W\right)^{-1} on its diagonal and zeros elsewhere.

The ProMax factor pattern matrix :math:P is given by

.. math::
P = XW\tilde{W}\text{,}\quad P \in \mathbb{R}^{{n\times m}}\text{;}

the inter-factor correlations :math:\Phi are given by

.. math::
\Phi = \left(Q^\mathrm{T}Q\right)^{-1}\text{,}\quad \Phi \in \mathbb{R}^{{m\times m}}\text{;}

where :math:Q = W\tilde{W}; and the factor structure :math:S is given by

.. math::
S = P\Phi \text{,}\quad S \in \mathbb{R}^{{n\times m}}\text{.}

Optionally, the rows of target matrix :math:Y can be scaled by the communalities of loadings.
"""
raise NotImplementedError

[docs]def factor(matrix, n, x, nvar, isx, nfac, iop, wt=None, io_manager=None):
r"""
factor computes the maximum likelihood estimates of the parameters of a factor analysis model.
Either the data matrix or a correlation/covariance matrix may be input.

.. _g03ca-py2-py-doc:

For full information please refer to the NAG Library document for g03ca

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03caf.html

.. _g03ca-py2-py-parameters:

**Parameters**
**matrix** : str, length 1
Selects the type of matrix on which factor analysis is to be performed.

:math:\mathrm{matrix} = \texttt{'D'}

The data matrix will be input in :math:\mathrm{x} and factor analysis will be computed for the correlation matrix.

:math:\mathrm{matrix} = \texttt{'S'}

The data matrix will be input in :math:\mathrm{x} and factor analysis will be computed for the covariance matrix, i.e., the results are scaled as described in Further Comments <https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03caf.html#fcomments>__.

:math:\mathrm{matrix} = \texttt{'C'}

The correlation/variance-covariance matrix will be input in :math:\mathrm{x} and factor analysis computed for this matrix.

See Further Comments <https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03caf.html#fcomments>__.

**n** : int
If :math:\mathrm{matrix} = \texttt{'D'} or :math:\texttt{'S'} the number of observations in the data array :math:\mathrm{x}.

If :math:\mathrm{matrix} = \texttt{'C'} the (effective) number of observations used in computing the (possibly weighted) correlation/variance-covariance matrix input in :math:\mathrm{x}.

**x** : float, array-like, shape :math:\left(:, m\right)
Note: the required extent for this argument in dimension 1 is determined as follows: if :math:\mathrm{matrix}\text{ in } (\texttt{'D'}, \texttt{'S'}): :math:\mathrm{n}; if :math:\mathrm{matrix}=\texttt{'C'}: :math:m; otherwise: :math:0.

The input matrix.

If :math:\mathrm{matrix} = \texttt{'D'} or :math:\texttt{'S'}, :math:\mathrm{x} must contain the data matrix, i.e., :math:\mathrm{x}[\textit{i}-1,\textit{j}-1] must contain the :math:\textit{i}\ th observation for the :math:\textit{j}\ th variable, for :math:\textit{j} = 1,2,\ldots,m, for :math:\textit{i} = 1,2,\ldots,n.

If :math:\mathrm{matrix} = \texttt{'C'}, :math:\mathrm{x} must contain the correlation or variance-covariance matrix.

Only the upper triangular part is required.

**nvar** : int
:math:p, the number of variables in the factor analysis.

**isx** : int, array-like, shape :math:\left(m\right)
:math:\mathrm{isx}[\textit{j}-1] indicates whether or not the :math:\textit{j}\ th variable is included in the factor analysis. If :math:\mathrm{isx}[\textit{j}-1]\geq 1, the variable represented by the :math:\textit{j}\ th column of :math:\mathrm{x} is included in the analysis; otherwise it is excluded, for :math:\textit{j} = 1,2,\ldots,m.

**nfac** : int
:math:k, the number of factors.

**iop** : int, array-like, shape :math:\left(5\right)
Options for the optimization. There are four options to be set:

.. rst-class:: nag-rules-none nag-align-left

+-----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|:math:\textit{iprint}|controls iteration monitoring;                                                                                                                                                                                                                                                                                                                                                                                                    |
+-----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|                       |if :math:\textit{iprint}\leq 0, there is no printing of information else if :math:\textit{iprint} > 0, information is printed at every iprint iterations. The information printed consists of the value of :math:F\left(\Psi \right) at that iteration, the number of evaluations of :math:F\left(\Psi \right), the current estimates of the communalities and an indication of whether or not they are at the boundary.|
+-----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|:math:\textit{maxfun}|the maximum number of function evaluations.                                                                                                                                                                                                                                                                                                                                                                                       |
+-----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|:math:\textit{acc}   |the required accuracy for the estimates of :math:\psi_i.                                                                                                                                                                                                                                                                                                                                                                        |
+-----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|:math:\textit{eps}   |a lower bound for the values of :math:\psi, see :ref:Notes <g03ca-py2-py-notes>.                                                                                                                                                                                                                                                                                                                                              |
+-----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Let :math:\epsilon = \text{machine precision} then if :math:\mathrm{iop} = 0, the following default values are used:

:math:\textit{iprint} = -1

:math:\textit{maxfun} = 100p

:math:\textit{acc} = 10\sqrt{\epsilon }

:math:\textit{eps} = \epsilon

If :math:\mathrm{iop}\neq 0, then

:math:\textit{iprint} = \mathrm{iop}

:math:\textit{maxfun} = \mathrm{iop}

:math:\textit{acc} = 10^{{-l}} where :math:l = \mathrm{iop}

:math:\textit{eps} = 10^{{-l}} where :math:l = \mathrm{iop}

**wt** : None or float, array-like, shape :math:\left(:\right), optional
Note: the required length for this argument is determined as follows: if :math:\textit{weight}=\texttt{'W'}\text{ and }\mathrm{matrix}\text{ in } (\texttt{'D'}, \texttt{'S'}): :math:\mathrm{n}; otherwise: :math:1.

If :math:\mathrm{wt}\text{ is not }\mathbf{None} and :math:\mathrm{matrix} = \texttt{'D'} or :math:\texttt{'S'}, :math:\mathrm{wt} must contain the weights to be used in the factor analysis. The effective number of observations in the analysis will then be the sum of weights. If :math:\mathrm{wt}[i-1] = 0.0, the :math:i\ th observation is not included in the analysis.

If :math:\textit{weight} = \texttt{'U'} or :math:\mathrm{matrix} = \texttt{'C'}, :math:\mathrm{wt} is not referenced and the effective number of observations is :math:n.

**io_manager** : FileObjManager, optional
Manager for I/O in this routine.

**Returns**
**e** : float, ndarray, shape :math:\left(\mathrm{nvar}\right)
The eigenvalues :math:\theta_{\textit{i}}, for :math:\textit{i} = 1,2,\ldots,p.

**stat** : float, ndarray, shape :math:\left(4\right)
The test statistics.

:math:\mathrm{stat}

Contains the value :math:F\left(\hat{\Psi }\right).

:math:\mathrm{stat}

Contains the test statistic, :math:\chi^2.

:math:\mathrm{stat}

Contains the degrees of freedom associated with the test statistic.

:math:\mathrm{stat}

Contains the significance level.

**com** : float, ndarray, shape :math:\left(\mathrm{nvar}\right)
The communalities.

**psi** : float, ndarray, shape :math:\left(\mathrm{nvar}\right)
The estimates of :math:\psi_{\textit{i}}, for :math:\textit{i} = 1,2,\ldots,p.

**res** : float, ndarray, shape :math:\left(\mathrm{nvar}\times \left(\mathrm{nvar}-1\right)/2\right)
The residual correlations. The residual correlation for the :math:i\ th and :math:j\ th variables is stored in :math:\mathrm{res}[ \left(j-1\right) \left(j-2\right) /2+i -1], :math:i < j.

**fl** : float, ndarray, shape :math:\left(\mathrm{nvar}, \mathrm{nfac}\right)
The factor loadings. :math:\mathrm{fl}[\textit{i}-1,\textit{j}-1] contains :math:\lambda_{{\textit{i}\textit{j}}}, for :math:\textit{j} = 1,2,\ldots,k, for :math:\textit{i} = 1,2,\ldots,p.

.. _g03ca-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:\mathrm{iop} \neq 1 and :math:\mathrm{iop} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:1\leq \textit{eps}\leq \text{machine precision}.

(errno :math:1)
On entry, :math:\mathrm{iop} \neq 1 and :math:\mathrm{iop} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:1\leq \textit{acc}\leq \text{machine precision}.

(errno :math:1)
On entry, :math:\mathrm{iop} \neq 1 and :math:\mathrm{iop} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{maxfun}\geq 1.

(errno :math:1)
On entry, :math:\mathrm{nfac} = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{nvar} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{nfac}\leq \mathrm{nvar}.

(errno :math:1)
On entry, :math:\mathrm{nfac} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{nfac}\geq 1.

(errno :math:1)
On entry, :math:m = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{nvar} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:m\geq \mathrm{nvar}.

(errno :math:1)
On entry, :math:\mathrm{matrix} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{matrix} = \texttt{'D'}, :math:\texttt{'S'} or :math:\texttt{'C'}.

(errno :math:1)
On entry, :math:\textit{weight} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: when :math:\mathrm{matrix} = \texttt{'D'} or :math:\texttt{'S'}, :math:\textit{weight} = \texttt{'U'} or :math:\texttt{'W'}.

(errno :math:1)
On entry, :math:\mathrm{n} = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{nvar} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{n} > \mathrm{nvar}.

(errno :math:1)
On entry, :math:\mathrm{nvar} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{nvar} > 1.

(errno :math:2)
On entry, :math:i = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{wt}[i-1] < 0.0.

Constraint: :math:\mathrm{wt}[i-1]\geq 0.0.

(errno :math:3)
The effective number of observations :math:\text{}\leq 1.

(errno :math:3)
The number of variables :math:\text{}\geq \text{} number of included observations.

(errno :math:3)
On entry, :math:\mathrm{nvar} = \langle\mathit{\boldsymbol{value}}\rangle and :math:\langle\mathit{\boldsymbol{value}}\rangle values of :math:\mathrm{isx} > 0

Constraint: exactly :math:\mathrm{nvar} elements of :math:\mathrm{isx} > 0.

(errno :math:4)
Two eigenvalues of :math:S^* are equal.

(errno :math:4)
On entry, the data matrix is not of full column rank or the input correlation/covariance matrix is not positive definite.

(errno :math:5)
The singular value decomposition has failed to converge.

(errno :math:6)
The estimation procedure has failed to converge in :math:\langle\mathit{\boldsymbol{value}}\rangle iterations.

**Warns**
**NagAlgorithmicWarning**
(errno :math:7)
The convergence is not certain but a lower point could not be found.

.. _g03ca-py2-py-notes:

**Notes**
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

Let :math:p variables, :math:x_1,x_2,\ldots,x_p, with variance-covariance matrix :math:\Sigma be observed.
The aim of factor analysis is to account for the covariances in these :math:p variables in terms of a smaller number, :math:k, of hypothetical variables, or factors, :math:f_1,f_2,\ldots,f_k.
These are assumed to be independent and to have unit variance.
The relationship between the observed variables and the factors is given by the model:

.. math::
x_i = \sum_{{j = 1}}^k\lambda_{{ij}}f_j+e_i\text{, }\quad i = 1,2,\ldots,p

where :math:\lambda_{{\textit{i}\textit{j}}}, for :math:\textit{j} = 1,2,\ldots,k, for :math:\textit{i} = 1,2,\ldots,p, are the factor loadings and :math:e_{\textit{i}}, for :math:\textit{i} = 1,2,\ldots,p, are independent random variables with variances :math:\psi_{\textit{i}}, for :math:\textit{i} = 1,2,\ldots,p.
The :math:\psi_i represent the unique component of the variation of each observed variable.
The proportion of variation for each variable accounted for by the factors is known as the communality.
For this function it is assumed that both the :math:k factors and the :math:e_i's follow independent Normal distributions.

The model for the variance-covariance matrix, :math:\Sigma, can be written as:

.. math::
\Sigma = \Lambda \Lambda^{\mathrm{T}}+\Psi

where :math:\Lambda is the matrix of the factor loadings, :math:\lambda_{{ij}}, and :math:\Psi is a diagonal matrix of unique variances, :math:\psi_{\textit{i}}, for :math:\textit{i} = 1,2,\ldots,p.

The estimation of the parameters of the model, :math:\Lambda and :math:\Psi, by maximum likelihood is described by Lawley and Maxwell (1971).
The log-likelihood is:

.. math::
-\frac{1}{2}\left(n-1\right)\log\left(\left\lvert \Sigma \right\rvert \right)-\frac{1}{2}\left(n-1\right)\mathrm{trace}\left(S, \Sigma^{-1}\right)+\text{constant,}

where :math:n is the number of observations, :math:S is the sample variance-covariance matrix or, if weights are used, :math:S is the weighted sample variance-covariance matrix and :math:n is the effective number of observations, that is, the sum of the weights.
The constant is independent of the parameters of the model.
A two stage maximization is employed.
It makes use of the function :math:F\left(\Psi \right), which is, up to a constant, :math:-2/\left(n-1\right) times the log-likelihood maximized over :math:\Lambda.
This is then minimized with respect to :math:\Psi to give the estimates, :math:\hat{\Psi }, of :math:\Psi.
The function :math:F\left(\Psi \right) can be written as:

.. math::
F\left(\Psi \right) = \sum_{{j = k+1}}^p\left(\theta_j-\log\left(\theta_j\right)\right)-\left(p-k\right)

where values :math:\theta_{\textit{j}}, for :math:\textit{j} = 1,2,\ldots,p are the eigenvalues of the matrix:

.. math::
S^* = \Psi^{{-1/2}}S\Psi^{{-1/2}}\text{.}

The estimates :math:\hat{\Lambda }, of :math:\Lambda, are then given by scaling the eigenvectors of :math:S^*, which are denoted by :math:V:

.. math::
\hat{\Lambda } = \Psi^{{1/2}}V\left(\Theta -I\right)^{{1/2}}\text{.}

where :math:\Theta is the diagonal matrix with elements :math:\theta_i, and :math:I is the identity matrix.

The minimization of :math:F\left(\Psi \right) is performed using :meth:opt.bounds_mod_deriv2_comp <naginterfaces.library.opt.bounds_mod_deriv2_comp> which uses a modified Newton algorithm.
The computation of the Hessian matrix is described by Clark (1970).
However, instead of using the eigenvalue decomposition of the matrix :math:S^* as described above, the singular value decomposition of the matrix :math:R\Psi^{{-1/2}} is used, where :math:R is obtained either from the :math:QR decomposition of the (scaled) mean centred data matrix or from the Cholesky decomposition of the correlation/covariance matrix.
The function :meth:opt.bounds_mod_deriv2_comp <naginterfaces.library.opt.bounds_mod_deriv2_comp> ensures that the values of :math:\psi_i are greater than a given small positive quantity, :math:\delta, so that the communality is always less than :math:1.
This avoids the so called Heywood cases.

In addition to the values of :math:\Lambda, :math:\Psi and the communalities, factor returns the residual correlations, i.e., the off-diagonal elements of :math:C-\left(\Lambda \Lambda^{\mathrm{T}}+\Psi \right) where :math:C is the sample correlation matrix. factor also returns the test statistic:

.. math::
\chi^2 = \left[n-1-\left(2p+5\right)/6-2k/3\right]F\left(\hat{\Psi }\right)

which can be used to test the goodness-of-fit of the model (1) <https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03caf.html#eqn1>__, see Lawley and Maxwell (1971) and Morrison (1967).

.. _g03ca-py2-py-references:

**References**
Clark, M R B, 1970, A rapidly convergent method for maximum likelihood factor analysis, British J. Math. Statist. Psych.

Hammarling, S, 1985, The singular value decomposition in multivariate statistics, SIGNUM Newsl. (20(3)), 2--25

Lawley, D N and Maxwell, A E, 1971, Factor Analysis as a Statistical Method, (2nd Edition), Butterworths

Morrison, D F, 1967, Multivariate Statistical Methods, McGraw--Hill
"""
raise NotImplementedError

[docs]def factor_score(method, rotate, fl, psi, e, r):
r"""
factor_score computes factor score coefficients from the result of fitting a factor analysis model by maximum likelihood as performed by :meth:factor.

.. _g03cc-py2-py-doc:

For full information please refer to the NAG Library document for g03cc

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03ccf.html

.. _g03cc-py2-py-parameters:

**Parameters**
**method** : str, length 1
Indicates which method is to be used to compute the factor score coefficients.

:math:\mathrm{method} = \texttt{'R'}

The regression method is used.

:math:\mathrm{method} = \texttt{'B'}

Bartlett's method is used.

**rotate** : str, length 1
Indicates whether a rotation is to be applied.

:math:\mathrm{rotate} = \texttt{'R'}

A rotation will be applied to the coefficients and the rotation matrix, :math:R, must be given in :math:\mathrm{r}.

:math:\mathrm{rotate} = \texttt{'U'}

No rotation is applied.

**fl** : float, array-like, shape :math:\left(\textit{nvar}, \textit{nfac}\right)
:math:\Lambda, the matrix of unrotated factor loadings as returned by :meth:factor.

**psi** : float, array-like, shape :math:\left(\textit{nvar}\right)
The diagonal elements of :math:\Psi, as returned by :meth:factor.

**e** : float, array-like, shape :math:\left(\textit{nvar}\right)
The eigenvalues of the matrix :math:S^*, as returned by :meth:factor.

**r** : float, array-like, shape :math:\left(:, :\right)
Note: the required extent for this argument in dimension 1 is determined as follows: if :math:\mathrm{rotate}=\texttt{'R'}: :math:\textit{nfac}; otherwise: :math:1.

Note: the required extent for this argument in dimension 2 is determined as follows: if :math:\mathrm{rotate}=\texttt{'U'}: :math:1; if :math:\mathrm{rotate}=\texttt{'R'}: :math:\textit{nfac}; otherwise: :math:0.

If :math:\mathrm{rotate} = \texttt{'R'}, :math:\mathrm{r} must contain the orthogonal rotation matrix, :math:R, as returned by :meth:rot_orthomax.

If :math:\mathrm{rotate} = \texttt{'U'}, :math:\mathrm{r} need not be set.

**Returns**
**fs** : float, ndarray, shape :math:\left(\textit{nvar}, \textit{nfac}\right)
The matrix of factor score coefficients, :math:\Phi. :math:\mathrm{fs}[\textit{i}-1,\textit{j}-1] contains the factor score coefficient for the :math:\textit{j}\ th factor and the :math:\textit{i}\ th observed variable, for :math:\textit{j} = 1,2,\ldots,k, for :math:\textit{i} = 1,2,\ldots,p.

.. _g03cc-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:\mathrm{rotate} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{rotate} = \texttt{'R'} or :math:\texttt{'U'}.

(errno :math:1)
On entry, :math:\mathrm{method} = \langle\mathit{\boldsymbol{value}}\rangle

Constraint: :math:\mathrm{method} = \texttt{'B'} or :math:\texttt{'R'}.

(errno :math:1)
On entry, :math:\textit{nvar} = \langle\mathit{\boldsymbol{value}}\rangle and :math:\textit{nfac} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{nvar}\geq \textit{nfac}.

(errno :math:1)
On entry, :math:\textit{nfac} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{nfac}\geq 1.

(errno :math:2)
On entry, :math:i = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{psi}[i-1]\leq 0.0.

Constraint: :math:\mathrm{psi}[i-1] > 0.0.

(errno :math:2)
On entry, :math:i = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{e}[i-1]\leq 1.0.

Constraint: :math:\mathrm{e}[i-1] > 1.0.

.. _g03cc-py2-py-notes:

**Notes**
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

A factor analysis model aims to account for the covariances among :math:p variables, observed on :math:n individuals, in terms of a smaller number, :math:k, of unobserved variables or factors.
The values of the factors for an individual are known as factor scores. :meth:factor fits the factor analysis model by maximum likelihood and returns the estimated factor loading matrix, :math:\Lambda, and the diagonal matrix of variances of the unique components, :math:\Psi.
To obtain estimates of the factors, a :math:p\times k matrix of factor score coefficients, :math:\Phi, is formed.
The estimated vector of factor scores, :math:\hat{f}, is then given by:

.. math::
\hat{f} = x^{\mathrm{T}}\Phi \text{,}

where :math:x is the vector of observed variables for an individual.

There are two commonly used methods of obtaining factor score coefficients.

The regression method:

.. math::
\Phi = \Psi^{-1}\Lambda {\left(I+\Lambda^{\mathrm{T}}\Psi^{-1}\Lambda \right)}^{-1}\text{,}

and Bartlett's method:

.. math::
\Phi = \Psi^{-1}\Lambda {\left(\Lambda^{\mathrm{T}}\Psi^{-1}\Lambda \right)}^{-1}\text{.}

See Lawley and Maxwell (1971) for details of both methods.
In the regression method as given above, it is assumed that the factors are not correlated and have unit variance; this is true for models fitted by :meth:factor.
Further, for models fitted by :meth:factor,

.. math::
\Lambda^{\mathrm{T}}\Psi^{-1}\Lambda = \Theta -I\text{,}

where :math:\Theta is the diagonal matrix of eigenvalues of the matrix :math:S^*, as described in :meth:factor.

The factors may be orthogonally rotated using an orthogonal rotation matrix, :math:R, as computed by :meth:rot_orthomax.
The factor scores for the rotated matrix are then given by :math:\Lambda R.

.. _g03cc-py2-py-references:

**References**
Lawley, D N and Maxwell, A E, 1971, Factor Analysis as a Statistical Method, (2nd Edition), Butterworths
"""
raise NotImplementedError

[docs]def discrim(x, isx, ing, ng, wt=None):
r"""
discrim computes a test statistic for the equality of within-group covariance matrices and also computes matrices for use in discriminant analysis.

.. _g03da-py2-py-doc:

For full information please refer to the NAG Library document for g03da

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03daf.html

.. _g03da-py2-py-parameters:

**Parameters**
**x** : float, array-like, shape :math:\left(n, m\right)
:math:\mathrm{x}[\textit{k}-1,\textit{l}-1] must contain the :math:\textit{k}\ th observation for the :math:\textit{l}\ th variable, for :math:\textit{l} = 1,2,\ldots,m, for :math:\textit{k} = 1,2,\ldots,n.

**isx** : int, array-like, shape :math:\left(m\right)
:math:\mathrm{isx}[l-1] indicates whether or not the :math:l\ th variable in :math:\mathrm{x} is to be included in the variance-covariance matrices.

If :math:\mathrm{isx}[\textit{l}-1] > 0 the :math:\textit{l}\ th variable is included, for :math:\textit{l} = 1,2,\ldots,m; otherwise it is not referenced.

**ing** : int, array-like, shape :math:\left(n\right)
:math:\mathrm{ing}[\textit{k}-1] indicates to which group the :math:\textit{k}\ th observation belongs, for :math:\textit{k} = 1,2,\ldots,n.

**ng** : int
The number of groups, :math:n_g.

**wt** : None or float, array-like, shape :math:\left(n\right), optional
The elements of :math:\mathrm{wt} must contain the weights to be used in the analysis and the effective number of observations for a group is the sum of the weights of the observations in that group. If :math:\mathrm{wt}[k-1] = 0.0 then the :math:k\ th observation is excluded from the calculations.

If weights are not provided then :math:\mathrm{wt} must be set to **None** and the effective number of observations for a group is the number of observations in that group.

**Returns**
**nig** : int, ndarray, shape :math:\left(\mathrm{ng}\right)
:math:\mathrm{nig}[\textit{j}-1] contains the number of observations in the :math:\textit{j}\ th group, for :math:\textit{j} = 1,2,\ldots,n_g.

**gmn** : float, ndarray, shape :math:\left(\mathrm{ng}, \textit{nvar}\right)
The :math:\textit{j}\ th row of :math:\mathrm{gmn} contains the means of the :math:p selected variables for the :math:\textit{j}\ th group, for :math:\textit{j} = 1,2,\ldots,n_g.

**det** : float, ndarray, shape :math:\left(\mathrm{ng}\right)
The logarithm of the determinants of the within-group variance-covariance matrices.

**gc** : float, ndarray, shape :math:\left(\left(\mathrm{ng}+1\right)\times \textit{nvar}\times \left(\textit{nvar}+1\right)/2\right)
The first :math:p\left(p+1\right)/2 elements of :math:\mathrm{gc} contain :math:R and the remaining :math:n_g blocks of :math:p\left(p+1\right)/2 elements contain the :math:R_j matrices. All are stored in packed form by columns.

**stat** : float
The likelihood-ratio test statistic, :math:G.

**df** : float
The degrees of freedom for the distribution of :math:G.

**sig** : float
The significance level for :math:G.

.. _g03da-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:\textit{weight} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{weight} = \texttt{'U'} or :math:\texttt{'W'}.

(errno :math:1)
On entry, :math:m = \langle\mathit{\boldsymbol{value}}\rangle and :math:\textit{nvar} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:m\geq \textit{nvar}.

(errno :math:1)
On entry, :math:n = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:n\geq 1.

(errno :math:1)
On entry, :math:\mathrm{ng} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{ng}\geq 2.

(errno :math:1)
On entry, :math:\textit{nvar} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{nvar}\geq 1.

(errno :math:2)
On entry, :math:i = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{wt}[i-1] < 0.0.

Constraint: :math:\mathrm{wt}[i-1]\geq 0.0.

(errno :math:3)
The number of observations for group :math:\langle\mathit{\boldsymbol{value}}\rangle is less than :math:\textit{nvar}.

(errno :math:3)
The effective number of observations for group :math:\langle\mathit{\boldsymbol{value}}\rangle is less than :math:1.

(errno :math:3)
On entry, :math:i = \langle\mathit{\boldsymbol{value}}\rangle, :math:\mathrm{ing}[i-1] = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{ng} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:1\leq \mathrm{ing}[i-1]\leq \mathrm{ng}.

(errno :math:3)
On entry, :math:\textit{nvar} = \langle\mathit{\boldsymbol{value}}\rangle and :math:\langle\mathit{\boldsymbol{value}}\rangle values of :math:\mathrm{isx} > 0

Constraint: exactly :math:\textit{nvar} elements of :math:\mathrm{isx} > 0.

(errno :math:4)
:math:R is not of full rank.

(errno :math:4)
:math:R_j is not of full rank for :math:j = \langle\mathit{\boldsymbol{value}}\rangle.

.. _g03da-py2-py-notes:

**Notes**
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

Let a sample of :math:n observations on :math:p variables come from :math:n_g groups with :math:n_j observations in the :math:j\ th group and :math:\sum n_j = n.
If the data is assumed to follow a multivariate Normal distribution with the variance-covariance matrix of the :math:j\ th group :math:\Sigma_j, then to test for equality of the variance-covariance matrices between groups, that is, :math:\Sigma_1 = \Sigma_2 = \cdots = \Sigma_{n_g} = \Sigma, the following likelihood-ratio test statistic, :math:G, can be used;

.. math::
G = C\left\{\left(n-n_g\right)\log\left(\left\lvert S\right\rvert \right)-\sum_{{j = 1}}^{n_g}\left(n_j-1\right)\log\left(\left\lvert S_j\right\rvert \right)\right\}\text{,}

where

.. math::
C = 1-\frac{{2p^2+3p-1}}{{6\left(p+1\right)\left(n_g-1\right)}}\left(\sum_{{j = 1}}^{n_g}\frac{1}{\left(n_j-1\right)}-\frac{1}{\left(n-n_g\right)}\right)\text{,}

and :math:S_j are the within-group variance-covariance matrices and :math:S is the pooled variance-covariance matrix given by

.. math::
S = \frac{{\sum_{{j = 1}}^{n_g}\left(n_j-1\right)S_j}}{\left(n-n_g\right)}\text{.}

For large :math:n, :math:G is approximately distributed as a :math:\chi^2 variable with :math:\frac{1}{2}p\left(p+1\right)\left(n_g-1\right) degrees of freedom, see Morrison (1967) for further comments.
If weights are used, then :math:S and :math:S_j are the weighted pooled and within-group variance-covariance matrices and :math:n is the effective number of observations, that is, the sum of the weights.

Instead of calculating the within-group variance-covariance matrices and then computing their determinants in order to calculate the test statistic, discrim uses a :math:QR decomposition.
The group means are subtracted from the data and then for each group, a :math:QR decomposition is computed to give an upper triangular matrix :math:R_j^*.
This matrix can be scaled to give a matrix :math:R_j such that :math:S_j = R_j^{\mathrm{T}}R_j.
The pooled :math:R matrix is then computed from the :math:R_j matrices.
The values of :math:\left\lvert S\right\rvert and the :math:\left\lvert S_j\right\rvert can then be calculated from the diagonal elements of :math:R and the :math:R_j.

This approach means that the Mahalanobis squared distances for a vector observation :math:x can be computed as :math:z^{\mathrm{T}}z, where :math:R_jz = \left(x-\bar{x}_j\right), :math:\bar{x}_j being the vector of means of the :math:j\ th group.
These distances can be calculated by :meth:discrim_mahal.
The distances are used in discriminant analysis and :meth:discrim_group uses the results of discrim to perform several different types of discriminant analysis.
The differences between the discriminant methods are, in part, due to whether or not the within-group variance-covariance matrices are equal.

.. _g03da-py2-py-references:

**References**
Aitchison, J and Dunsmore, I R, 1975, Statistical Prediction Analysis, Cambridge

Kendall, M G and Stuart, A, 1976, The Advanced Theory of Statistics (Volume 3), (3rd Edition), Griffin

Krzanowski, W J, 1990, Principles of Multivariate Analysis, Oxford University Press

Morrison, D F, 1967, Multivariate Statistical Methods, McGraw--Hill

--------
:meth:naginterfaces.library.examples.mv.discrim_group_ex.main
"""
raise NotImplementedError

[docs]def discrim_mahal(equal, mode, gmn, gc, nobs, isx, x):
r"""
discrim_mahal computes Mahalanobis squared distances for group or pooled variance-covariance matrices.
It is intended for use after :meth:discrim.

.. _g03db-py2-py-doc:

For full information please refer to the NAG Library document for g03db

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03dbf.html

.. _g03db-py2-py-parameters:

**Parameters**
**equal** : str, length 1
Indicates whether or not the within-group variance-covariance matrices are assumed to be equal and the pooled variance-covariance matrix used.

:math:\mathrm{equal} = \texttt{'E'}

The within-group variance-covariance matrices are assumed equal and the matrix :math:R stored in the first :math:p\left(p+1\right)/2 elements of :math:\mathrm{gc} is used.

:math:\mathrm{equal} = \texttt{'U'}

The within-group variance-covariance matrices are assumed to be unequal and the matrices :math:R_{\textit{j}}, for :math:\textit{j} = 1,2,\ldots,n_g, stored in the remainder of :math:\mathrm{gc} are used.

**mode** : str, length 1
Indicates whether distances from sample points are to be calculated or distances between the group means.

:math:\mathrm{mode} = \texttt{'S'}

The distances between the sample points given in :math:\mathrm{x} and the group means are calculated.

:math:\mathrm{mode} = \texttt{'M'}

The distances between the group means will be calculated.

**gmn** : float, array-like, shape :math:\left(\textit{ng}, \textit{nvar}\right)
The :math:\textit{j}\ th row of :math:\mathrm{gmn} contains the means of the :math:p selected variables for the :math:\textit{j}\ th group, for :math:\textit{j} = 1,2,\ldots,n_g. These are returned by :meth:discrim.

**gc** : float, array-like, shape :math:\left(\left(\textit{ng}+1\right)\times \textit{nvar}\times \left(\textit{nvar}+1\right)/2\right)
The first :math:p\left(p+1\right)/2 elements of :math:\mathrm{gc} should contain the upper triangular matrix :math:R and the next :math:n_g blocks of :math:p\left(p+1\right)/2 elements should contain the upper triangular matrices :math:R_j. All matrices must be stored packed by column. These matrices are returned by :meth:discrim. If :math:\mathrm{equal} = \texttt{'E'} only the first :math:p\left(p+1\right)/2 elements are referenced, if :math:\mathrm{equal} = \texttt{'U'} only the elements :math:p\left(p+1\right)/2+1 to :math:\left(n_g+1\right)p\left(p+1\right)/2 are referenced.

**nobs** : int
If :math:\mathrm{mode} = \texttt{'S'}, the number of sample points in :math:\mathrm{x} for which distances are to be calculated.

If :math:\mathrm{mode} = \texttt{'M'}, :math:\mathrm{nobs} is not referenced.

**isx** : int, array-like, shape :math:\left(m\right)
If :math:\mathrm{mode} = \texttt{'S'}, :math:\mathrm{isx}[\textit{l}-1] indicates if the :math:\textit{l}\ th variable in :math:\mathrm{x} is to be included in the distance calculations. If :math:\mathrm{isx}[\textit{l}-1] > 0 the :math:\textit{l}\ th variable is included, for :math:\textit{l} = 1,2,\ldots,m; otherwise the :math:\textit{l}\ th variable is not referenced.

If :math:\mathrm{mode} = \texttt{'M'}, :math:\mathrm{isx} is not referenced.

**x** : float, array-like, shape :math:\left(:, m\right)
Note: the required extent for this argument in dimension 1 is determined as follows: if :math:\mathrm{mode}=\texttt{'S'}: :math:\mathrm{nobs}; otherwise: :math:1.

If :math:\mathrm{mode} = \texttt{'S'} the :math:\textit{k}\ th row of :math:\mathrm{x} must contain :math:x_{\textit{k}}. That is :math:\mathrm{x}[\textit{k}-1,\textit{l}-1] must contain the :math:\textit{k}\ th sample value for the :math:\textit{l}\ th variable, for :math:\textit{l} = 1,2,\ldots,m, for :math:\textit{k} = 1,2,\ldots,\mathrm{nobs}. Otherwise :math:\mathrm{x} is not referenced.

**Returns**
**d** : float, ndarray, shape :math:\left(:, \textit{ng}\right)
The squared distances.

If :math:\mathrm{mode} = \texttt{'S'}, :math:\mathrm{d}[\textit{k}-1,\textit{j}-1] contains the squared distance of the :math:\textit{k}\ th sample point from the :math:\textit{j}\ th group mean, :math:D_{{\textit{k}\textit{j}}}^2, for :math:\textit{j} = 1,2,\ldots,n_g, for :math:\textit{k} = 1,2,\ldots,\mathrm{nobs}.

If :math:\mathrm{mode} = \texttt{'M'} and :math:\mathrm{equal} = \texttt{'U'}, :math:\mathrm{d}[\textit{i}-1,\textit{j}-1] contains the squared distance between the :math:\textit{i}\ th mean and the :math:\textit{j}\ th mean, :math:D_{{\textit{i}\textit{j}}}^2, for :math:\textit{j} = 1,2,\ldots,\textit{i}-1,\textit{i}+1,\ldots,n_g, for :math:\textit{i} = 1,2,\ldots,n_g.

The elements :math:\mathrm{d}[\textit{i}-1,\textit{i}-1] are not referenced, for :math:\textit{i} = 1,2,\ldots,n_g.

If :math:\mathrm{mode} = \texttt{'M'} and :math:\mathrm{equal} = \texttt{'E'}, :math:\mathrm{d}[\textit{i}-1,\textit{j}-1] contains the squared distance between the :math:\textit{i}\ th mean and the :math:\textit{j}\ th mean, :math:D_{{\textit{i}\textit{j}}}^2, for :math:\textit{j} = 1,2,\ldots,\textit{i}-1, for :math:\textit{i} = 1,2,\ldots,n_g.

Since :math:D_{{\textit{i}\textit{j}}} = D_{{\textit{j}\textit{i}}} the elements :math:\mathrm{d}[\textit{i}-1,\textit{j}-1] are not referenced, for :math:\textit{j} = \textit{i}+1,\ldots,n_g, for :math:\textit{i} = 1,2,\ldots,n_g.

.. _g03db-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:\mathrm{mode} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{mode} = \texttt{'M'} or :math:\texttt{'S'}.

(errno :math:1)
On entry, :math:\mathrm{equal} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{equal} = \texttt{'E'} or :math:\texttt{'U'}.

(errno :math:1)
On entry, :math:m = \langle\mathit{\boldsymbol{value}}\rangle and :math:\textit{nvar} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:m\geq \textit{nvar}.

(errno :math:1)
On entry, :math:\mathrm{nobs} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{nobs}\geq 1.

(errno :math:1)
On entry, :math:\textit{ng} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{ng}\geq 2.

(errno :math:1)
On entry, :math:\textit{nvar} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{nvar}\geq 1.

(errno :math:2)
On entry, :math:\textit{nvar} = \langle\mathit{\boldsymbol{value}}\rangle and :math:\langle\mathit{\boldsymbol{value}}\rangle values of :math:\mathrm{isx} > 0.

Constraint: exactly :math:\textit{nvar} elements of :math:\mathrm{isx} > 0.

(errno :math:2)
On entry, diagonal element :math:\langle\mathit{\boldsymbol{value}}\rangle of :math:R_j = 0 for :math:j = \langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:2)
On entry, diagonal element :math:\langle\mathit{\boldsymbol{value}}\rangle of :math:R = 0.

.. _g03db-py2-py-notes:

**Notes**
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

Consider :math:p variables observed on :math:n_g populations or groups.
Let :math:\bar{x}_j be the sample mean and :math:S_j the within-group variance-covariance matrix for the :math:j\ th group and let :math:x_k be the :math:k\ th sample point in a dataset.
A measure of the distance of the point from the :math:j\ th population or group is given by the Mahalanobis distance, :math:D_{{kj}}:

.. math::
D_{{kj}}^2 = \left(x_k-\bar{x}_j\right)^\mathrm{T}S_j^{-1}\left(x_k-\bar{x}_j\right)\text{.}

If the pooled estimated of the variance-covariance matrix :math:S is used rather than the within-group variance-covariance matrices, then the distance is:

.. math::
D_{{kj}}^2 = \left(x_k-\bar{x}_j\right)^\mathrm{T}S^{-1}\left(x_k-\bar{x}_j\right)\text{.}

Instead of using the variance-covariance matrices :math:S and :math:S_j, discrim_mahal uses the upper triangular matrices :math:R and :math:R_j supplied by :meth:discrim such that :math:S = R^{\mathrm{T}}R and :math:S_j = R_j^{\mathrm{T}}R_j. :math:D_{{kj}}^2 can then be calculated as :math:z^{\mathrm{T}}z where :math:R_jz = \left(x_k-\bar{x}_j\right) or :math:Rz = \left(x_k-\bar{x}_j\right) as appropriate.

A particular case is when the distance between the group or population means is to be estimated.
The Mahalanobis squared distance between the :math:i\ th and :math:j\ th groups is:

.. math::
D_{{ij}}^2 = \left(\bar{x}_i-\bar{x}_j\right)^\mathrm{T}S_j^{-1}\left(\bar{x}_i-\bar{x}_j\right)

or

.. math::
D_{{ij}}^2 = \left(\bar{x}_i-\bar{x}_j\right)^\mathrm{T}S^{-1}\left(\bar{x}_i-\bar{x}_j\right)\text{.}

**Note:**  :math:D_{{jj}}^2 = 0 and that in the case when the pooled variance-covariance matrix is used :math:D_{{ij}}^2 = D_{{ji}}^2 so in this case only the lower triangular values of :math:D_{{ij}}^2, :math:i > j, are computed.

.. _g03db-py2-py-references:

**References**
Aitchison, J and Dunsmore, I R, 1975, Statistical Prediction Analysis, Cambridge

Kendall, M G and Stuart, A, 1976, The Advanced Theory of Statistics (Volume 3), (3rd Edition), Griffin

Krzanowski, W J, 1990, Principles of Multivariate Analysis, Oxford University Press
"""
raise NotImplementedError

[docs]def discrim_group(typ, equal, priors, nig, gmn, gc, det, isx, x, prior, atiq):
r"""
discrim_group allocates observations to groups according to selected rules.
It is intended for use after :meth:discrim.

.. _g03dc-py2-py-doc:

For full information please refer to the NAG Library document for g03dc

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03dcf.html

.. _g03dc-py2-py-parameters:

**Parameters**
**typ** : str, length 1
Whether the estimative or predictive approach is used.

:math:\mathrm{typ} = \texttt{'E'}

The estimative approach is used.

:math:\mathrm{typ} = \texttt{'P'}

The predictive approach is used.

**equal** : str, length 1
Indicates whether or not the within-group variance-covariance matrices are assumed to be equal and the pooled variance-covariance matrix used.

:math:\mathrm{equal} = \texttt{'E'}

The within-group variance-covariance matrices are assumed equal and the matrix :math:R stored in the first :math:p\left(p+1\right)/2 elements of :math:\mathrm{gc} is used.

:math:\mathrm{equal} = \texttt{'U'}

The within-group variance-covariance matrices are assumed to be unequal and the matrices :math:R_{\textit{i}}, for :math:\textit{i} = 1,2,\ldots,n_g, stored in the remainder of :math:\mathrm{gc} are used.

**priors** : str, length 1
Indicates the form of the prior probabilities to be used.

:math:\mathrm{priors} = \texttt{'E'}

Equal prior probabilities are used.

:math:\mathrm{priors} = \texttt{'P'}

Prior probabilities proportional to the group sizes in the training set, :math:n_j, are used.

:math:\mathrm{priors} = \texttt{'I'}

The prior probabilities are input in :math:\mathrm{prior}.

**nig** : int, array-like, shape :math:\left(\textit{ng}\right)
The number of observations in each group in the training set, :math:n_j.

**gmn** : float, array-like, shape :math:\left(\textit{ng}, \textit{nvar}\right)
The :math:\textit{j}\ th row of :math:\mathrm{gmn} contains the means of the :math:p variables for the :math:\textit{j}\ th group, for :math:\textit{j} = 1,2,\ldots,n_j. These are returned by :meth:discrim.

**gc** : float, array-like, shape :math:\left(\left(\textit{ng}+1\right)\times \textit{nvar}\times \left(\textit{nvar}+1\right)/2\right)
The first :math:p\left(p+1\right)/2 elements of :math:\mathrm{gc} should contain the upper triangular matrix :math:R and the next :math:n_g blocks of :math:p\left(p+1\right)/2 elements should contain the upper triangular matrices :math:R_j.

All matrices must be stored packed by column.

These matrices are returned by :meth:discrim.

If :math:\mathrm{equal} = \texttt{'E'} only the first :math:p\left(p+1\right)/2 elements are referenced, if :math:\mathrm{equal} = \texttt{'U'} only the elements :math:p\left(p+1\right)/2+1 to :math:\left(n_g+1\right)p\left(p+1\right)/2 are referenced.

**det** : float, array-like, shape :math:\left(\textit{ng}\right)
If :math:\mathrm{equal} = \texttt{'U'}. the logarithms of the determinants of the within-group variance-covariance matrices as returned by :meth:discrim. Otherwise :math:\mathrm{det} is not referenced.

**isx** : int, array-like, shape :math:\left(m\right)
:math:\mathrm{isx}[l-1] indicates if the :math:l\ th variable in :math:\mathrm{x} is to be included in the distance calculations.

If :math:\mathrm{isx}[\textit{l}-1] > 0, the :math:\textit{l}\ th variable is included, for :math:\textit{l} = 1,2,\ldots,m; otherwise the :math:\textit{l}\ th variable is not referenced.

**x** : float, array-like, shape :math:\left(\textit{nobs}, m\right)
:math:\mathrm{x}[\textit{k}-1,\textit{l}-1] must contain the :math:\textit{k}\ th observation for the :math:\textit{l}\ th variable, for :math:\textit{l} = 1,2,\ldots,m, for :math:\textit{k} = 1,2,\ldots,\textit{nobs}.

**prior** : float, array-like, shape :math:\left(\textit{ng}\right)
If :math:\mathrm{priors} = \texttt{'I'}, the prior probabilities for the :math:n_g groups.

**atiq** : bool
:math:\mathrm{atiq} must be :math:\mathbf{True} if atypicality indices are required. If :math:\mathrm{atiq} is :math:\mathbf{False} the array :math:\mathrm{ati} is not set.

**Returns**
**prior** : float, ndarray, shape :math:\left(\textit{ng}\right)
If :math:\mathrm{priors} = \texttt{'P'}, the computed prior probabilities in proportion to group sizes for the :math:n_g groups.

If :math:\mathrm{priors} = \texttt{'I'}, the input prior probabilities will be unchanged.

If :math:\mathrm{priors} = \texttt{'E'}, :math:\mathrm{prior} is not set.

**p** : float, ndarray, shape :math:\left(\textit{nobs}, \textit{ng}\right)
:math:\mathrm{p}[\textit{k}-1,\textit{j}-1] contains the posterior probability :math:p_{{\textit{k}\textit{j}}} for allocating the :math:\textit{k}\ th observation to the :math:\textit{j}\ th group, for :math:\textit{j} = 1,2,\ldots,n_g, for :math:\textit{k} = 1,2,\ldots,\textit{nobs}.

**iag** : int, ndarray, shape :math:\left(\textit{nobs}\right)
The groups to which the observations have been allocated.

**ati** : float, ndarray, shape :math:\left(\textit{nobs}, :\right)
If :math:\mathrm{atiq} is :math:\mathbf{True}, :math:\mathrm{ati}[\textit{k}-1,\textit{j}-1] will contain the predictive atypicality index for the :math:\textit{k}\ th observation with respect to the :math:\textit{j}\ th group, for :math:\textit{j} = 1,2,\ldots,n_g, for :math:\textit{k} = 1,2,\ldots,\textit{nobs}.

If :math:\mathrm{atiq} is :math:\mathbf{False}, :math:\mathrm{ati} is not set.

.. _g03dc-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:\mathrm{typ} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{typ} = \texttt{'E'} or :math:\texttt{'P'}.

(errno :math:1)
On entry, :math:\mathrm{priors} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{priors} = \texttt{'E'}, :math:\texttt{'I'} or :math:\texttt{'P'}.

(errno :math:1)
On entry, :math:\mathrm{equal} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{equal} = \texttt{'E'} or :math:\texttt{'U'}.

(errno :math:1)
On entry, :math:m = \langle\mathit{\boldsymbol{value}}\rangle and :math:\textit{nvar} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:m\geq \textit{nvar}.

(errno :math:1)
On entry, :math:\textit{nobs} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{nobs}\geq 1.

(errno :math:1)
On entry, :math:\textit{ng} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{ng}\geq 2.

(errno :math:1)
On entry, :math:\textit{nvar} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{nvar}\geq 1.

(errno :math:2)
On entry, :math:i = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{nig}[i-1] = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{nig}[i-1] > 0.

(errno :math:2)
On entry, :math:\sum_i\left(\mathrm{nig}[i]\right)\leq \textit{ng}+\textit{nvar}.

(errno :math:2)
On entry, :math:i = \langle\mathit{\boldsymbol{value}}\rangle, :math:\mathrm{nig}[i-1] = \langle\mathit{\boldsymbol{value}}\rangle and :math:\textit{nvar} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{nig}[i-1] < \textit{nvar}.

(errno :math:2)
On entry, :math:\textit{nvar} = \langle\mathit{\boldsymbol{value}}\rangle and :math:\langle\mathit{\boldsymbol{value}}\rangle values of :math:\mathrm{isx} > 0.

Constraint: exactly :math:\textit{nvar} elements of :math:\mathrm{isx} > 0.

(errno :math:3)
On entry, :math:\sum_j\left(\mathrm{prior}[j]\right)\neq 1.0.

(errno :math:3)
On entry, :math:j = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{prior}[j-1] = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{prior}[j-1] > 0.

(errno :math:4)
On entry, a diagonal element of :math:R or :math:R_j is zero.

.. _g03dc-py2-py-notes:

**Notes**
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

Discriminant analysis is concerned with the allocation of observations to groups using information from other observations whose group membership is known, :math:X_t; these are called the training set.
Consider :math:p variables observed on :math:n_g populations or groups.
Let :math:\bar{x}_j be the sample mean and :math:S_j the within-group variance-covariance matrix for the :math:j\ th group; these are calculated from a training set of :math:n observations with :math:n_j observations in the :math:j\ th group, and let :math:x_k be the :math:k\ th observation from the set of observations to be allocated to the :math:n_g groups.
The observation can be allocated to a group according to a selected rule.
The allocation rule or discriminant function will be based on the distance of the observation from an estimate of the location of the groups, usually the group means.
A measure of the distance of the observation from the :math:j\ th group mean is given by the Mahalanobis distance, :math:D_{{kj}}:

.. math::
D_{{kj}}^2 = \left(x_k-\bar{x}_j\right)^\mathrm{T}S_j^{-1}\left(x_k-\bar{x}_j\right)\text{.}

If the pooled estimate of the variance-covariance matrix :math:S is used rather than the within-group variance-covariance matrices, then the distance is:

.. math::
D_{{kj}}^2 = \left(x_k-\bar{x}_j\right)^\mathrm{T}S^{-1}\left(x_k-\bar{x}_j\right)\text{.}

Instead of using the variance-covariance matrices :math:S and :math:S_j, discrim_group uses the upper triangular matrices :math:R and :math:R_j supplied by :meth:discrim such that :math:S = R^{\mathrm{T}}R and :math:S_j = R_j^{\mathrm{T}}R_j. :math:D_{{kj}}^2 can then be calculated as :math:z^{\mathrm{T}}z where :math:R^\mathrm{T}_jz = \left(x_k-x_j\right) or :math:R^\mathrm{T}z = \left(x_k-x\right) as appropriate.

In addition to the distances, a set of prior probabilities of group membership, :math:\pi_j, for :math:j = 1,2,\ldots,n_g, may be used, with :math:\sum \pi_j = 1.
The prior probabilities reflect your view as to the likelihood of the observations coming from the different groups.
Two common cases for prior probabilities are :math:\pi_1 = \pi_2 = \cdots = \pi_{n_g}, that is, equal prior probabilities, and :math:\pi_{\textit{j}} = n_{\textit{j}}/n, for :math:\textit{j} = 1,2,\ldots,n_g, that is, prior probabilities proportional to the number of observations in the groups in the training set.

discrim_group uses one of four allocation rules.
In all four rules the :math:p variables are assumed to follow a multivariate Normal distribution with mean :math:\mu_j and variance-covariance matrix :math:\Sigma_j if the observation comes from the :math:j\ th group.
The different rules depend on whether or not the within-group variance-covariance matrices are assumed equal, i.e., :math:\Sigma_1 = \Sigma_2 = \cdots = \Sigma_{n_g}, and whether a predictive or estimative approach is used.
If :math:p\left({x_k | \mu_j}, \Sigma_j\right) is the probability of observing the observation :math:x_k from group :math:j, then the posterior probability of belonging to group :math:j is:

.. math::
p\left({j | x_k}, \mu_j, \Sigma_j\right)\propto p\left({x_k | \mu_j}, \Sigma_j\right)\pi_j\text{.}

In the estimative approach, the parameters :math:\mu_j and :math:\Sigma_j in (3) <https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03dcf.html#eqn3>__ are replaced by their estimates calculated from :math:X_t.
In the predictive approach, a non-informative prior distribution is used for the parameters and a posterior distribution for the parameters, :math:p\left(\mu_j, {\Sigma_j | X_t}\right), is found.
A predictive distribution is then obtained by integrating :math:p\left({j | x_k}, \mu_j, \Sigma_j\right)p\left(\mu_j, {\Sigma_j | X}\right) over the parameter space.
This predictive distribution then replaces :math:p\left({x_k | \mu_j}, \Sigma_j\right) in (3) <https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03dcf.html#eqn3>__.
See Aitchison and Dunsmore (1975), Aitchison et al. (1977) and Moran and Murphy (1979) for further details.

The observation is allocated to the group with the highest posterior probability.
Denoting the posterior probabilities, :math:p\left({j | x_k}, \mu_j, \Sigma_j\right), by :math:q_j, the four allocation rules are:

(i) Estimative with equal variance-covariance matrices -- Linear Discrimination

.. math::
\log\left(q_j\right)\propto -\frac{1}{2}D_{{kj}}^2+\log\left(\pi_j\right)

(#) Estimative with unequal variance-covariance matrices -- Quadratic Discrimination

.. math::
\log\left(q_j\right)\propto -\frac{1}{2}D_{{kj}}^2+\log\left(\pi_j\right)-\frac{1}{2}\log\left(\left\lvert S_j\right\rvert \right)

(#) Predictive with equal variance-covariance matrices

.. math::
q_j^{{-1}}\propto {\left(\left(n_j+1\right)/n_j\right)}^{{p/2}}\left\{1+\left[n_j/\left(\left(n-n_g\right)\left(n_j+1\right)\right)\right]D_{{kj}}^2\right\}^{{\left(n+1-n_g\right)/2}}

(#) Predictive with unequal variance-covariance matrices

.. math::
q_j^{{-1}}\propto C\left\{\left(\left(n_j^2-1\right)/n_j\right)\left\lvert S_j\right\rvert \right\}^{{p/2}}\left\{1+\left(n_j/\left(n_j^2-1\right)\right)D_{{kj}}^2\right\}^{{n_j/2}}\text{,}

where

.. math::
C = \frac{{\Gamma \left(\frac{1}{2}\left(n_j-p\right)\right)}}{{\Gamma \left(\frac{1}{2}n_j\right)}}\text{.}

In the above the appropriate value of :math:D_{{kj}}^2 from (1) <https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03dcf.html#eqn1>__ and (2) <https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03dcf.html#eqn1>__ is used.
The values of the :math:q_j are standardized so that,

.. math::
\sum_{{j = 1}}^{n_g}q_j = 1\text{.}

Moran and Murphy (1979) show the similarity between the predictive methods and methods based upon likelihood ratio tests.

In addition to allocating the observation to a group, discrim_group computes an atypicality index, :math:I_j\left(x_k\right).
The predictive atypicality index is returned, irrespective of the value of the parameter :math:\mathrm{typ}.
This represents the probability of obtaining an observation more typical of group :math:j than the observed :math:x_k (see Aitchison and Dunsmore (1975) and Aitchison et al. (1977)).
The atypicality index is computed for unequal within-group variance-covariance matrices as:

.. math::
I_j\left(x_k\right) = P\left({B\leq z:\frac{1}{2}p}, {\frac{1}{2}\left(n_j-p\right)}\right)

where :math:P\left({B\leq \beta :a}, b\right) is the lower tail probability from a beta distribution and

.. math::
z = D_{{kj}}^2/\left(D_{{kj}}^2+\left(n_j^2-1\right)/n_j\right)\text{,}

and for equal within-group variance-covariance matrices as:

.. math::
I_j\left(x_k\right) = P\left({B\leq z:\frac{1}{2}p}, {\frac{1}{2}\left(n-n_g-p+1\right)}\right)\text{,}

with

.. math::
z = D_{{kj}}^2/\left(D_{{kj}}^2+\left(n-n_g\right)\left(n_j+1\right)/n_j\right)\text{.}

If :math:I_j\left(x_k\right) is close to :math:1 for all groups it indicates that the observation may come from a grouping not represented in the training set. Moran and Murphy (1979) provide a frequentist interpretation of :math:I_j\left(x_k\right).

.. _g03dc-py2-py-references:

**References**
Aitchison, J and Dunsmore, I R, 1975, Statistical Prediction Analysis, Cambridge

Aitchison, J, Habbema, J D F and Kay, J W, 1977, A critical comparison of two methods of statistical discrimination, Appl. Statist. (26), 15--25

Kendall, M G and Stuart, A, 1976, The Advanced Theory of Statistics (Volume 3), (3rd Edition), Griffin

Krzanowski, W J, 1990, Principles of Multivariate Analysis, Oxford University Press

Moran, M A and Murphy, B J, 1979, A closer look at two alternative methods of statistical discrimination, Appl. Statist. (28), 223--232

Morrison, D F, 1967, Multivariate Statistical Methods, McGraw--Hill

--------
:meth:naginterfaces.library.examples.mv.discrim_group_ex.main
"""
raise NotImplementedError

[docs]def distance_mat(update, dist, scal, x, isx, s, d):
r"""
distance_mat computes a distance (dissimilarity) matrix.

.. _g03ea-py2-py-doc:

For full information please refer to the NAG Library document for g03ea

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03eaf.html

.. _g03ea-py2-py-parameters:

**Parameters**
**update** : str, length 1
Indicates whether or not an existing matrix is to be updated.

:math:\mathrm{update} = \texttt{'U'}

The matrix :math:D is updated and distances are added to :math:D.

:math:\mathrm{update} = \texttt{'I'}

The matrix :math:D is initialized to zero before the distances are added to :math:D.

**dist** : str, length 1
Indicates which type of distances are computed.

:math:\mathrm{dist} = \texttt{'A'}

Absolute distances.

:math:\mathrm{dist} = \texttt{'E'}

Euclidean distances.

:math:\mathrm{dist} = \texttt{'S'}

Euclidean squared distances.

**scal** : str, length 1
Indicates the standardization of the variables to be used.

:math:\mathrm{scal} = \texttt{'S'}

Standard deviation.

:math:\mathrm{scal} = \texttt{'R'}

Range.

:math:\mathrm{scal} = \texttt{'G'}

Standardizations given in array :math:\mathrm{s}.

:math:\mathrm{scal} = \texttt{'U'}

Unscaled.

**x** : float, array-like, shape :math:\left(n, m\right)
:math:\mathrm{x}[\textit{i}-1,\textit{j}-1] must contain the value of the :math:\textit{j}\ th variable for the :math:\textit{i}\ th object, for :math:\textit{j} = 1,2,\ldots,m, for :math:\textit{i} = 1,2,\ldots,n.

**isx** : int, array-like, shape :math:\left(m\right)
:math:\mathrm{isx}[j-1] indicates whether or not the :math:j\ th variable in :math:\mathrm{x} is to be included in the distance computations.

If :math:\mathrm{isx}[\textit{j}-1] > 0 the :math:\textit{j}\ th variable is included, for :math:\textit{j} = 1,2,\ldots,m; otherwise it is not referenced.

**s** : float, array-like, shape :math:\left(m\right)
If :math:\mathrm{scal} = \texttt{'G'} and :math:\mathrm{isx}[\textit{j}-1] > 0 then :math:\mathrm{s}[\textit{j}-1] must contain the scaling for variable :math:\textit{j}, for :math:\textit{j} = 1,2,\ldots,m.

**d** : float, array-like, shape :math:\left(n\times \left(n-1\right)/2\right)
If :math:\mathrm{update} = \texttt{'U'}, :math:\mathrm{d} must contain the strictly lower triangle of the distance matrix :math:D to be updated. :math:D must be stored packed by rows, i.e., :math:\mathrm{d}[ \left(i-1\right) \left(i-2\right) /2+j -1], :math:i > j must contain :math:d_{{ij}}.

If :math:\mathrm{update} = \texttt{'I'}, :math:\mathrm{d} need not be set.

**Returns**
**s** : float, ndarray, shape :math:\left(m\right)
If :math:\mathrm{scal} = \texttt{'S'} and :math:\mathrm{isx}[j-1] > 0 then :math:\mathrm{s}[j-1] contains the standard deviation of the variable in the :math:j\ th column of :math:\mathrm{x}.

If :math:\mathrm{scal} = \texttt{'R'} and :math:\mathrm{isx}[j-1] > 0, :math:\mathrm{s}[j-1] contains the range of the variable in the :math:j\ th column of :math:\mathrm{x}.

If :math:\mathrm{scal} = \texttt{'U'} and :math:\mathrm{isx}[j-1] > 0, :math:\mathrm{s}[j-1] = 1.0.

If :math:\mathrm{scal} = \texttt{'G'}, :math:\mathrm{s} is unchanged.

**d** : float, ndarray, shape :math:\left(n\times \left(n-1\right)/2\right)
The strictly lower triangle of the distance matrix :math:D stored packed by rows, i.e., :math:d_{{ij}} is contained in :math:\mathrm{d}[ \left(i-1\right) \left(i-2\right) /2+j -1], :math:i > j.

.. _g03ea-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:\mathrm{scal} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{scal} = \texttt{'S'}, :math:\texttt{'R'}, :math:\texttt{'G'} or :math:\texttt{'U'}.

(errno :math:1)
On entry, :math:\mathrm{update} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{update} = \texttt{'U'} or :math:\texttt{'I'}.

(errno :math:1)
On entry, :math:\mathrm{dist} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{dist} = \texttt{'A'}, :math:\texttt{'E'} or :math:\texttt{'S'}

(errno :math:1)
On entry, :math:m = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:m > 0.

(errno :math:1)
On entry, :math:n = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:n\geq 2.

(errno :math:2)
On entry, at least one element of :math:\mathrm{s}\leq 0.0.

(errno :math:2)
Variable :math:\langle\mathit{\boldsymbol{value}}\rangle is constant.

(errno :math:2)
On entry, at least one element of :math:\mathrm{d} < 0.0.

(errno :math:2)
On entry, :math:\mathrm{isx} does not contain a positive element.

.. _g03ea-py2-py-notes:

**Notes**
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

Given :math:n objects, a distance or dissimilarity matrix is a symmetric matrix with zero diagonal elements such that the :math:ij\ th element represents how far apart or how dissimilar the :math:i\ th and :math:j\ th objects are.

Let :math:X be an :math:n\times p data matrix of observations of :math:p variables on :math:n objects, then the distance between object :math:j and object :math:k, :math:d_{{jk}}, can be defined as:

.. math::
d_{{jk}} = \left\{\sum_{{i = 1}}^pD\left({x_{{ji}}/s_i}, {x_{{ki}}/s_i}\right)\right\}^{\alpha }\text{,}

where :math:x_{{ji}} and :math:x_{{ki}} are the :math:ji\ th and :math:ki\ th elements of :math:X, :math:s_i is a standardization for the :math:i\ th variable and :math:D\left(u, v\right) is a suitable function.
Three functions are provided in distance_mat.

(a) Euclidean distance: :math:D\left(u, v\right) = \left(u-v\right)^2 and :math:\alpha = \frac{1}{2}.

(#) Euclidean squared distance: :math:D\left(u, v\right) = \left(u-v\right)^2 and :math:\alpha = 1.

(#) Absolute distance (city block metric): :math:D\left(u, v\right) = \left\lvert u-v\right\rvert and :math:\alpha = 1.

Three standardizations are available.

(a) Standard deviation: :math:s_i = \sqrt{\sum_{{j = 1}}^n\left(x_{{ji}}-\bar{x}\right)^2/\left(n-1\right)}

(#) Range: :math:s_i = \mathrm{max}\left(x_{{1i}}, x_{{2i}}, \ldots, x_{{ni}}\right)-\mathrm{min}\left(x_{{1i}}, x_{{2i}}, \ldots, x_{{ni}}\right)

(#) User-supplied values of :math:s_i.

In addition to the above distances there are a large number of other dissimilarity measures, particularly for dichotomous variables (see Krzanowski (1990) and Everitt (1974)).
For the dichotomous case these measures are simple to compute and can, if suitable scaling is used, be combined with the distances computed by distance_mat using the updating option.

Dissimilarity measures for variables can be based on the correlation coefficient for continuous variables and contingency table statistics for dichotomous data, see submodule :mod:~naginterfaces.library.correg and submodule :mod:~naginterfaces.library.contab respectively.

distance_mat returns the strictly lower triangle of the distance matrix.

.. _g03ea-py2-py-references:

**References**
Everitt, B S, 1974, Cluster Analysis, Heinemann

Krzanowski, W J, 1990, Principles of Multivariate Analysis, Oxford University Press
"""
raise NotImplementedError

[docs]def distance_mat_2(update, scal, stype, p, x, isv, y, sx, sy=None, d=None):
r"""
distance_mat_2 computes a distance (dissimilarity) matrix between two sets of observations.

.. _g03eb-py2-py-doc:

For full information please refer to the NAG Library document for g03eb

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03ebf.html

.. _g03eb-py2-py-parameters:

**Parameters**
**update** : str, length 1
Indicates whether or not an existing matrix is to be updated.

:math:\mathrm{update} = \texttt{'U'}

The matrix :math:D is updated and distances are added to :math:D.

:math:\mathrm{update} = \texttt{'I'}

The matrix :math:D is initialized to zero before the distances are added to :math:D.

**scal** : str, length 1
Indicates the standardization of the variables to be used.

:math:\mathrm{scal} = \texttt{'S'}

Standard deviation.

:math:\mathrm{scal} = \texttt{'R'}

Range.

:math:\mathrm{scal} = \texttt{'G'}

Standardizations given in array :math:\mathrm{sx} (and posibly :math:\mathrm{sy}).

:math:\mathrm{scal} = \texttt{'U'}

Unscaled.

**stype** : str, length 1
Indicates how the standardization of the variables treats the two sets of observations.

:math:\mathrm{stype} = \texttt{'A'}

Amalgamated.

:math:\mathrm{stype} = \texttt{'I'}

Independent.

:math:\mathrm{stype} = \texttt{'X'}

Standardization is based purley on observations in :math:\mathrm{x}.

**p** : float
The order :math:p of the Minkowski distance metric.

**x** : float, array-like, shape :math:\left(m, l\right)
:math:\mathrm{x}[\textit{i}-1,\textit{k}-1] must contain the value of the :math:\textit{k}\ th variable for the :math:\textit{i}\ th observation in the first set of observations, for :math:\textit{k} = 1,2,\ldots,l, for :math:\textit{i} = 1,2,\ldots,m.

**isv** : int, array-like, shape :math:\left(l\right)
:math:\mathrm{isv}[j-1] indicates whether or not the :math:j\ th variable in :math:\mathrm{x} and :math:\mathrm{y} is to be included in the distance computations.

If :math:\mathrm{isv}[\textit{k}-1] = 0 the :math:\textit{j}\ th variable is not included., for :math:\textit{k} = 1,2,\ldots,l.

If :math:\mathrm{isv}[\textit{k}-1]\geq 1 the :math:\textit{j}\ th variable is included, for :math:\textit{k} = 1,2,\ldots,l

**y** : float, array-like, shape :math:\left(n, l\right)
:math:\mathrm{y}[\textit{j}-1,\textit{k}-1] must contain the value of the :math:\textit{k}\ th variable for the :math:\textit{j}\ th observation in the second set of observations, for :math:\textit{k} = 1,2,\ldots,l, for :math:\textit{j} = 1,2,\ldots,n.

**sx** : float, array-like, shape :math:\left(l\right)
If :math:\mathrm{scal} = \texttt{'G'} and :math:\mathrm{isv}[\textit{k}-1] > 0 then :math:\mathrm{sx}[\textit{k}-1] must contain the scaling for variable :math:\textit{k}, for :math:\textit{k} = 1,2,\ldots,l.

**sy** : None or float, array-like, shape :math:\left(:\right), optional
Note: the required length for this argument is determined as follows: if :math:\mathrm{stype}=\texttt{'I'}\text{ and }\mathrm{scal}\neq\texttt{'U'}: :math:l; otherwise: :math:0.

If :math:\mathrm{scal} = \texttt{'G'} and :math:\mathrm{stype} = \texttt{'I'} and :math:\mathrm{isv}[\textit{k}-1] > 0 then :math:\mathrm{sy}[\textit{k}-1] must contain the scaling for variable :math:\textit{k} , for :math:\textit{k} = 1,2,\ldots,l.

If :math:\mathrm{stype} \neq \texttt{'I'}, or :math:\mathrm{scal} = \texttt{'U'} then :math:\mathrm{sy} is not referenced and may be **None**.

**d** : None or float, array-like, shape :math:\left(m, n\right), optional
The :math:m\times n distance matrix :math:D.

If :math:\mathrm{update} = \texttt{'U'}, :math:\mathrm{d} need not be set.

**Returns**
**sx** : float, ndarray, shape :math:\left(l\right)
If :math:\mathrm{scal} = \texttt{'S'} and :math:\mathrm{isv}[k-1] > 0 then :math:\mathrm{sx}[k-1] contains the standard deviation of the variable in the :math:k\ th column of :math:\mathrm{x}.

If :math:\mathrm{scal} = \texttt{'R'} and :math:\mathrm{isv}[k-1] > 0, :math:\mathrm{sx}[k-1] contains the range of the variable in the :math:j\ th column of :math:\mathrm{x}.

If :math:\mathrm{scal} = \texttt{'U'} and :math:\mathrm{isv}[k-1] > 0, :math:\mathrm{sx}[k-1] = 1.0.

If :math:\mathrm{scal} = \texttt{'G'}, :math:\mathrm{sx} is unchanged.

**sy** : None or float, ndarray, shape :math:\left(:\right)
If :math:\mathrm{scal} = \texttt{'S'} and :math:\mathrm{stype} = \texttt{'I'} and :math:\mathrm{isv}[k-1] > 0 then :math:\mathrm{sy}[k-1] contains the standard deviation of the variable in the :math:k\ th column of :math:\mathrm{x}.

If :math:\mathrm{scal} = \texttt{'R'} and :math:\mathrm{stype} = \texttt{'I'} and :math:\mathrm{isv}[k-1] > 0, :math:\mathrm{sy}[k-1] contains the range of the variable in the :math:j\ th column of :math:\mathrm{x}.

If :math:\mathrm{scal} = \texttt{'U'}, :math:\mathrm{sy} is unchanged.

If :math:\mathrm{scal} = \texttt{'G'}, :math:\mathrm{sy} is unchanged.

**d** : float, ndarray, shape :math:\left(m, n\right)
The (possibly updated) distance matrix :math:D.

.. _g03eb-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:\mathrm{update} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{update} = \texttt{'U'} or :math:\texttt{'I'}.

(errno :math:2)
On entry, :math:\mathrm{scal} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{scal} = \texttt{'S'}, :math:\texttt{'R'}, :math:\texttt{'G'} or :math:\texttt{'U'}.

(errno :math:3)
On entry, :math:\mathrm{stype} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{stype} = \texttt{'A'}, :math:\texttt{'I'} or :math:\texttt{'X'}.

(errno :math:4)
On entry, :math:\mathrm{p} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{p}\geq 1.0.

(errno :math:5)
On entry, :math:m = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:m\geq 1.

(errno :math:6)
On entry, :math:n = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:n\geq 1.

(errno :math:7)
On entry, :math:l = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:l\geq 1.

(errno :math:8)
Variable :math:\langle\mathit{\boldsymbol{value}}\rangle is constant.

(errno :math:10)
On entry, :math:\mathrm{isv} does not contain a positive element.

(errno :math:13)
On entry, at least one element of :math:\mathrm{sx}\leq 0.0 or :math:\mathrm{sy}\leq 0.0.

(errno :math:13)
On entry, at least one element of :math:\mathrm{sx}\leq 0.0.

(errno :math:15)
On entry, at least one element of :math:\mathrm{d} < 0.0.

.. _g03eb-py2-py-notes:

**Notes**
Given two sets of observations on :math:l variables, a distance matrix is such that the :math:ij\ th element represents how far apart or how dissimilar the :math:i\ th observation from the first set and the :math:j\ th observation from the second set are.

Let :math:X and :math:Y be :math:m\times l and :math:n\times l data matrices of :math:m and :math:n observations, respectively, on :math:l variables.
The distance between observation :math:i from :math:X and observation :math:j from :math:Y, :math:d_{{ij}}, is most commonly defined in terms of the scaled Minkowski :math:p-norm:

.. math::
d_{{ij}} = \left\{\sum_{{k = 1}}^p\left(\left\lvert x_{{ik}}/\textit{sx}_i-y_{{jk}}/\textit{sy}_j\right\rvert \right)^p\right\}^{{1/p}}\text{,}

where :math:x_{{ik}} and :math:y_{{jk}} are the :math:ik\ th and :math:jk\ th elements of :math:X and :math:Y respectively, :math:\textit{sx}_i is a standardization for the :math:i\ th variable in :math:X, :math:\textit{sy}_j is a standardization for the :math:j\ th variable in :math:Y, and :math:p is the order of the Minkowski norm.

Three standardizations (scalings) for the variables are available.

(a) Standard deviation: :math:\textit{sx}_i = \sqrt{\sum_{{k = 1}}^n\left(x_{{ji}}-\bar{x}\right)^2/\left(n-1\right)}

(#) Range: :math:\textit{sx}_i = \mathrm{max}\left(x_{{1i}}, x_{{2i}}, \ldots, x_{{ni}}\right)-\mathrm{min}\left(x_{{1i}}, x_{{2i}}, \ldots, x_{{ni}}\right)

(#) User-supplied values of :math:\textit{sx}_i.

In addition to the Minkowski measure there are a large number of other dissimilarity measures, particularly for dichotomous variables (see Krzanowski (1990) and Everitt (1974)).
For the dichotomous case these measures are simple to compute and can, if suitable scaling is used, be combined with the distances computed by distance_mat_2 using the updating option.

Dissimilarity measures for variables can be based on the correlation coefficient for continuous variables and contingency table statistics for dichotomous data, see submodule :mod:~naginterfaces.library.correg and submodule :mod:~naginterfaces.library.contab respectively.

distance_mat_2 returns the full rectangular distance matrix.

.. _g03eb-py2-py-references:

**References**
Everitt, B S, 1974, Cluster Analysis, Heinemann

Krzanowski, W J, 1990, Principles of Multivariate Analysis, Oxford University Press
"""
raise NotImplementedError

[docs]def cluster_hier(method, n, d):
r"""
cluster_hier performs hierarchical cluster analysis.

.. _g03ec-py2-py-doc:

For full information please refer to the NAG Library document for g03ec

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03ecf.html

.. _g03ec-py2-py-parameters:

**Parameters**
**method** : int
Indicates which clustering method is used.

:math:\mathrm{method} = 1

:math:\mathrm{method} = 2

:math:\mathrm{method} = 3

Group average.

:math:\mathrm{method} = 4

Centroid.

:math:\mathrm{method} = 5

Median.

:math:\mathrm{method} = 6

Minimum variance.

**n** : int
:math:n, the number of objects.

**d** : float, array-like, shape :math:\left(\mathrm{n}\times \left(\mathrm{n}-1\right)/2\right)
The strictly lower triangle of the distance matrix. :math:D must be stored packed by rows, i.e., :math:\mathrm{d}[ \left(i-1\right) \left(i-2\right) /2+j -1], :math:i > j must contain :math:d_{{ij}}.

**Returns**
**d** : float, ndarray, shape :math:\left(\mathrm{n}\times \left(\mathrm{n}-1\right)/2\right)
Is overwritten.

**ilc** : int, ndarray, shape :math:\left(\mathrm{n}-1\right)
:math:\mathrm{ilc}[\textit{l}-1] contains the number, :math:j, of the cluster merged with cluster :math:k (see :math:\mathrm{iuc}), :math:j < k, at step :math:\textit{l}, for :math:\textit{l} = 1,2,\ldots,n-1.

**iuc** : int, ndarray, shape :math:\left(\mathrm{n}-1\right)
:math:\mathrm{iuc}[\textit{l}-1] contains the number, :math:k, of the cluster merged with cluster :math:j, :math:j < k, at step :math:\textit{l}, for :math:\textit{l} = 1,2,\ldots,n-1.

**cd** : float, ndarray, shape :math:\left(\mathrm{n}-1\right)
:math:\mathrm{cd}[\textit{l}-1] contains the distance :math:d_{{jk}}, between clusters :math:j and :math:k, :math:j < k, merged at step :math:\textit{l}, for :math:\textit{l} = 1,2,\ldots,n-1.

**iord** : int, ndarray, shape :math:\left(\mathrm{n}\right)
The objects in dendrogram order.

**dord** : float, ndarray, shape :math:\left(\mathrm{n}\right)
The clustering distances corresponding to the order in :math:\mathrm{iord}. :math:\mathrm{dord}[\textit{l}-1] contains the distance at which cluster :math:\mathrm{iord}[\textit{l}-1] and :math:\mathrm{iord}[\textit{l}] merge, for :math:\textit{l} = 1,2,\ldots,n-1. :math:\mathrm{dord}[n-1] contains the maximum distance.

.. _g03ec-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:\mathrm{n} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{n}\geq 2.

(errno :math:1)
On entry, :math:\mathrm{method} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{method} = 1, :math:2, :math:3, :math:4, :math:5 or :math:6.

(errno :math:2)
On entry, at least one element of :math:\mathrm{d} is negative.

(errno :math:3)
Minimum cluster distance not increasing, dendrogram invalid.

.. _g03ec-py2-py-notes:

**Notes**
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

Given a distance or dissimilarity matrix for :math:n objects (see :meth:distance_mat), cluster analysis aims to group the :math:n objects into a number of more or less homogeneous groups or clusters.
With agglomerative clustering methods, a hierarchical tree is produced by starting with :math:n clusters, each with a single object and then at each of :math:n-1 stages, merging two clusters to form a larger cluster, until all objects are in a single cluster.
This process may be represented by a dendrogram (see :meth:cluster_hier_dendrogram).

At each stage, the clusters that are nearest are merged, methods differ as to how the distances between the new cluster and other clusters are computed.
For three clusters :math:i, :math:j and :math:k let :math:n_i, :math:n_j and :math:n_k be the number of objects in each cluster and let :math:d_{{ij}}, :math:d_{{ik}} and :math:d_{{jk}} be the distances between the clusters.
Let clusters :math:j and :math:k be merged to give cluster :math:jk, then the distance from cluster :math:i to cluster :math:jk, :math:d_{{i.jk}} can be computed in the following ways.

(1) Single link or nearest neighbour : :math:d_{{i.jk}} = \mathrm{min}\left(d_{{ij}}, d_{{ik}}\right).

(#) Complete link or furthest neighbour : :math:d_{{i.jk}} = \mathrm{max}\left(d_{{ij}}, d_{{ik}}\right).

(#) Group average : :math:d_{{i.jk}} = \frac{n_j}{{n_j+n_k}}d_{{ij}}+\frac{n_k}{{n_j+n_k}}d_{{ik}}.

(#) Centroid : :math:d_{{i.jk}} = \frac{n_j}{{n_j+n_k}}d_{{ij}}+\frac{n_k}{{n_j+n_k}}d_{{ik}}-\frac{{n_jn_k}}{\left(n_j+n_k\right)^2}d_{{jk}}.

(#) Median : :math:d_{{i.jk}} = \frac{1}{2}d_{{ij}}+\frac{1}{2}d_{{ik}}-\frac{1}{4}d_{{jk}}.

(#) Minimum variance : :math:d_{{i.jk}} = \left\{\left(n_i+n_j\right)d_{{ij}}+\left(n_i+n_k\right)d_{{ik}}-n_id_{{jk}}\right\}/\left(n_i+n_j+n_k\right).

For further details see Everitt (1974) and Krzanowski (1990).

If the clusters are numbered :math:1,2,\ldots,n then, for convenience, if clusters :math:j and :math:k, :math:j < k, merge then the new cluster will be referred to as cluster :math:j.
Information on the clustering history is given by the values of :math:j, :math:k and :math:d_{{jk}} for each of the :math:n-1 clustering steps.
In order to produce a dendrogram, the ordering of the objects such that the clusters that merge are adjacent is required.
This ordering is computed so that the first element is :math:1.
The associated distances with this ordering are also computed.

.. _g03ec-py2-py-references:

**References**
Everitt, B S, 1974, Cluster Analysis, Heinemann

Krzanowski, W J, 1990, Principles of Multivariate Analysis, Oxford University Press
"""
raise NotImplementedError

[docs]def cluster_kmeans(x, isx, cmeans, wt=None, maxit=10):
r"""
cluster_kmeans performs :math:K-means cluster analysis.

.. _g03ef-py2-py-doc:

For full information please refer to the NAG Library document for g03ef

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03eff.html

.. _g03ef-py2-py-parameters:

**Parameters**
**x** : float, array-like, shape :math:\left(n, m\right)
:math:\mathrm{x}[\textit{i}-1,\textit{j}-1] must contain the value of the :math:\textit{j}\ th variable for the :math:\textit{i}\ th object, for :math:\textit{j} = 1,2,\ldots,m, for :math:\textit{i} = 1,2,\ldots,n.

**isx** : int, array-like, shape :math:\left(m\right)
:math:\mathrm{isx}[\textit{j}-1] indicates whether or not the :math:\textit{j}\ th variable is to be included in the analysis. If :math:\mathrm{isx}[\textit{j}-1] > 0, the variable contained in the :math:\textit{j}\ th column of :math:\mathrm{x} is included, for :math:\textit{j} = 1,2,\ldots,m.

**cmeans** : float, array-like, shape :math:\left(k, \textit{nvar}\right)
:math:\mathrm{cmeans}[\textit{i}-1,\textit{j}-1] must contain the value of the :math:\textit{j}\ th variable for the :math:\textit{i}\ th initial cluster centre, for :math:\textit{j} = 1,2,\ldots,p, for :math:\textit{i} = 1,2,\ldots,K.

**wt** : None or float, array-like, shape :math:\left(n\right), optional
If :math:\mathrm{wt}\text{ is not }\mathbf{None}, the first :math:n elements of :math:\mathrm{wt} must contain the weights to be used.

If :math:\mathrm{wt}[i-1] = 0.0, the :math:i\ th observation is not included in the analysis.

The effective number of observation is the sum of the weights.

If :math:\mathrm{wt}\text{ is }\mathbf{None}, :math:\mathrm{wt} is not referenced and the effective number of observations is :math:n.

**maxit** : int, optional
The maximum number of iterations allowed in the analysis.

**Returns**
**cmeans** : float, ndarray, shape :math:\left(k, \textit{nvar}\right)
:math:\mathrm{cmeans}[\textit{i}-1,\textit{j}-1] contains the value of the :math:\textit{j}\ th variable for the :math:\textit{i}\ th computed cluster centre, for :math:\textit{j} = 1,2,\ldots,p, for :math:\textit{i} = 1,2,\ldots,K.

**inc** : int, ndarray, shape :math:\left(n\right)
:math:\mathrm{inc}[\textit{i}-1] contains the cluster to which the :math:\textit{i}\ th object has been allocated, for :math:\textit{i} = 1,2,\ldots,n.

**nic** : int, ndarray, shape :math:\left(k\right)
:math:\mathrm{nic}[\textit{i}-1] contains the number of objects in the :math:\textit{i}\ th cluster, for :math:\textit{i} = 1,2,\ldots,K.

**css** : float, ndarray, shape :math:\left(k\right)
:math:\mathrm{css}[\textit{i}-1] contains the within-cluster (weighted) sum of squares of the :math:\textit{i}\ th cluster, for :math:\textit{i} = 1,2,\ldots,K.

**csw** : float, ndarray, shape :math:\left(k\right)
:math:\mathrm{csw}[\textit{i}-1] contains the within-cluster sum of weights of the :math:\textit{i}\ th cluster, for :math:\textit{i} = 1,2,\ldots,K. If :math:\textit{weight} = \texttt{'U'}, the sum of weights is the number of objects in the cluster.

.. _g03ef-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:\mathrm{maxit} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{maxit} > 0.

(errno :math:1)
On entry, :math:k = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:k\geq 2.

(errno :math:1)
On entry, :math:m = \langle\mathit{\boldsymbol{value}}\rangle and :math:\textit{nvar} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:m\geq \textit{nvar}.

(errno :math:1)
On entry, :math:\textit{nvar} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{nvar}\geq 1.

(errno :math:1)
On entry, :math:n = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:n > 1.

(errno :math:1)
On entry, :math:\textit{weight} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{weight} = \texttt{'U'} or :math:\texttt{'W'}.

(errno :math:2)
On entry, :math:\mathrm{wt} has less than two positive values.

(errno :math:2)
On entry, :math:i = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{wt}[i-1] < 0.0.

Constraint: :math:\mathrm{wt}[i-1]\geq 0.0.

(errno :math:3)
On entry, :math:\textit{nvar} = \langle\mathit{\boldsymbol{value}}\rangle and :math:\langle\mathit{\boldsymbol{value}}\rangle values of :math:\mathrm{isx} > 0.

Constraint: exactly :math:\textit{nvar} elements of :math:\mathrm{isx} > 0.

(errno :math:4)
At least one cluster is empty after the initial assignment.

(errno :math:5)
Convergence has not been achieved within the maximum number of iterations :math:\mathrm{maxit} = \langle\mathit{\boldsymbol{value}}\rangle.

.. _g03ef-py2-py-notes:

**Notes**
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

Given :math:n objects with :math:p variables measured on each object, :math:x_{{\textit{i}\textit{j}}}, for :math:\textit{j} = 1,2,\ldots,p, for :math:\textit{i} = 1,2,\ldots,n, cluster_kmeans allocates each object to one of :math:K groups or clusters to minimize the within-cluster sum of squares:

.. math::
\sum_{{k = 1}}^K\sum_{{i \in S_k}}\sum_{{j = 1}}^p\left(x_{{ij}}-\bar{x}_{{kj}}\right)^2\text{,}

where :math:S_k is the set of objects in the :math:k\ th cluster and :math:\bar{x}_{{kj}} is the mean for the variable :math:j over cluster :math:k.
This is often known as :math:K-means clustering.

In addition to the data matrix, a :math:K\times p matrix giving the initial cluster centres for the :math:K clusters is required.
The objects are then initially allocated to the cluster with the nearest cluster mean.
Given the initial allocation, the procedure is to iteratively search for the :math:K-partition with locally optimal within-cluster sum of squares by moving points from one cluster to another.

Optionally, weights for each object, :math:w_i, can be used so that the clustering is based on within-cluster weighted sums of squares:

.. math::
\sum_{{k = 1}}^K\sum_{{i \in S_k}}\sum_{{j = 1}}^pw_i\left(x_{{ij}}-\tilde{x}_{{kj}}\right)^2\text{,}

where :math:\tilde{x}_{{kj}} is the weighted mean for variable :math:j over cluster :math:k.

The function is based on the algorithm of Hartigan and Wong (1979).

.. _g03ef-py2-py-references:

**References**
Everitt, B S, 1974, Cluster Analysis, Heinemann

Hartigan, J A and Wong, M A, 1979, Algorithm AS 136: A K-means clustering algorithm, Appl. Statist. (28), 100--108

Kendall, M G and Stuart, A, 1976, The Advanced Theory of Statistics (Volume 3), (3rd Edition), Griffin

Krzanowski, W J, 1990, Principles of Multivariate Analysis, Oxford University Press

--------
:meth:naginterfaces.library.examples.mv.cluster_kmeans_ex.main
"""
raise NotImplementedError

[docs]def cluster_hier_dendrogram(orient, dord, dmin, dstep, nsym):
r"""
cluster_hier_dendrogram produces a dendrogram from the results of :meth:cluster_hier.

.. _g03eh-py2-py-doc:

For full information please refer to the NAG Library document for g03eh

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03ehf.html

.. _g03eh-py2-py-parameters:

**Parameters**
**orient** : str, length 1
Indicates which orientation the dendrogram is to take.

:math:\mathrm{orient} = \texttt{'N'}

The end points of the dendrogram are to the north.

:math:\mathrm{orient} = \texttt{'S'}

The end points of the dendrogram are to the south.

:math:\mathrm{orient} = \texttt{'E'}

The end points of the dendrogram are to the east.

:math:\mathrm{orient} = \texttt{'W'}

The end points of the dendrogram are to the west.

**dord** : float, array-like, shape :math:\left(n\right)
The array :math:\mathrm{dord} as output by :meth:cluster_hier. :math:\mathrm{dord} contains the distances, in dendrogram order, at which clustering takes place.

**dmin** : float
The clustering distance at which the dendrogram begins.

**dstep** : float
The distance represented by one symbol of the dendrogram.

**nsym** : int
The number of character positions used in the dendrogram. Hence the clustering distance at which the dendrogram terminates is given by :math:\mathrm{dmin}+\mathrm{nsym}\times \mathrm{dstep}.

**Returns**
**c** : str, ndarray, shape :math:\left(:\right)
The elements of :math:\mathrm{c} contain consecutive lines of the dendrogram.

.. _g03eh-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, the number of characters that can be stored in each element of :math:\mathrm{c} is insufficient for the requested orientation.

(errno :math:1)
On entry, :math:\mathrm{orient} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{orient} = \texttt{'N'}, :math:\texttt{'S'}, :math:\texttt{'E'} or :math:\texttt{'W'}.

(errno :math:1)
On entry, :math:\mathrm{dstep} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{dstep} > 0.0.

(errno :math:1)
On entry, :math:\mathrm{dmin} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{dmin}\geq 0.0.

(errno :math:1)
On entry, :math:\mathrm{nsym} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{nsym}\geq 1.

(errno :math:1)
On entry, :math:n = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:n > 2.

(errno :math:2)
On entry, :math:\mathrm{dord}[i-1] > \mathrm{dord}[n-1], :math:i < n.

.. _g03eh-py2-py-notes:

**Notes**
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

Hierarchical cluster analysis, as performed by :meth:cluster_hier, can be represented by a tree that shows at which distance the clusters merge.
Such a tree is known as a dendrogram.
See Everitt (1974) and Krzanowski (1990) for examples of dendrograms.
A simple example is,

[figure omitted]

The end points of the dendrogram represent the objects that have been clustered.
They should be in a suitable order as given by :meth:cluster_hier.
Object :math:1 is always the first object.
In the example above the height represents the distance at which the clusters merge.

The dendrogram is produced in a character array using the ordering and distances provided by :meth:cluster_hier.
Suitable characters are used to represent parts of the tree.

There are four possible orientations for the dendrogram.
The example above has the end points at the bottom of the diagram which will be referred to as south.
If the dendrogram was the other way around with the end points at the top of the diagram then the orientation would be north.
If the end points are at the left-hand or right-hand side of the diagram the orientation is west or east.
Different symbols are used for east/west and north/south orientations.

.. _g03eh-py2-py-references:

**References**
Everitt, B S, 1974, Cluster Analysis, Heinemann

Krzanowski, W J, 1990, Principles of Multivariate Analysis, Oxford University Press
"""
raise NotImplementedError

[docs]def cluster_hier_indicator(cd, iord, dord, k, dlevel):
r"""
cluster_hier_indicator computes a cluster indicator variable from the results of :meth:cluster_hier.

.. _g03ej-py2-py-doc:

For full information please refer to the NAG Library document for g03ej

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03ejf.html

.. _g03ej-py2-py-parameters:

**Parameters**
**cd** : float, array-like, shape :math:\left(n-1\right)
The clustering distances in increasing order as returned by :meth:cluster_hier.

**iord** : int, array-like, shape :math:\left(n\right)
The objects in dendrogram order as returned by :meth:cluster_hier.

**dord** : float, array-like, shape :math:\left(n\right)
The clustering distances corresponding to the order in :math:\mathrm{iord}.

**k** : int
Indicates if a specified number of clusters is required.

If :math:\mathrm{k} > 0 then cluster_hier_indicator will attempt to find :math:\mathrm{k} clusters.

If :math:\mathrm{k}\leq 0 then cluster_hier_indicator will find the clusters based on the distance given in :math:\mathrm{dlevel}.

**dlevel** : float
If :math:\mathrm{k}\leq 0, :math:\mathrm{dlevel} must contain the distance at which clusters are produced. Otherwise :math:\mathrm{dlevel} need not be set.

**Returns**
**k** : int
The number of clusters produced, :math:k.

**dlevel** : float
If :math:\mathrm{k} > 0 on entry, :math:\mathrm{dlevel} contains the distance at which the required number of clusters are found. Otherwise :math:\mathrm{dlevel} remains unchanged.

**ic** : int, ndarray, shape :math:\left(n\right)
:math:\mathrm{ic}[\textit{i}-1] indicates to which of :math:k clusters the :math:\textit{i}\ th object belongs, for :math:\textit{i} = 1,2,\ldots,n.

.. _g03ej-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:n = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:n\geq 2.

(errno :math:1)
On entry, :math:\mathrm{k} < 0 and :math:\mathrm{dlevel}\leq 0.0.

(errno :math:1)
On entry, :math:\mathrm{k} = \langle\mathit{\boldsymbol{value}}\rangle and :math:n = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{k}\leq n.

(errno :math:2)
On entry the values of :math:\mathrm{dord} and :math:\mathrm{cd} are not compatible.

(errno :math:2)
On entry the values of :math:\mathrm{cd} are not in increasing order.

**Warns**
**NagAlgorithmicWarning**
(errno :math:3)
No clustering is performed when :math:\mathrm{k} = n.

(errno :math:3)
All data is merged when :math:\mathrm{k} = 1.

(errno :math:3)
All data merged into one cluster at :math:\mathrm{dlevel}, :math:\mathrm{dlevel} = \langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:3)
No clustering takes place below :math:\mathrm{dlevel}, :math:\mathrm{dlevel} = \langle\mathit{\boldsymbol{value}}\rangle.

(errno :math:4)
The precise number of clusters requested is not possible because of tied clustering distances.

.. _g03ej-py2-py-notes:

**Notes**
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

Given a distance or dissimilarity matrix for :math:n objects, cluster analysis aims to group the :math:n objects into a number of more or less homogeneous groups or clusters.
With agglomerative clustering methods (see :meth:cluster_hier), a hierarchical tree is produced by starting with :math:n clusters each with a single object and then at each of :math:n-1 stages, merging two clusters to form a larger cluster until all objects are in a single cluster. cluster_hier_indicator takes the information from the tree and produces the clusters that exist at a given distance.
This is equivalent to taking the dendrogram (see :meth:cluster_hier_dendrogram) and drawing a line across at a given distance to produce clusters.

As an alternative to giving the distance at which clusters are required, you can specify the number of clusters required and cluster_hier_indicator will compute the corresponding distance.
However, it may not be possible to compute the number of clusters required due to ties in the distance matrix.

If there are :math:k clusters then the indicator variable will assign a value between :math:1 and :math:k to each object to indicate to which cluster it belongs.
Object :math:1 always belongs to cluster :math:1.

.. _g03ej-py2-py-references:

**References**
Everitt, B S, 1974, Cluster Analysis, Heinemann

Krzanowski, W J, 1990, Principles of Multivariate Analysis, Oxford University Press
"""
raise NotImplementedError

[docs]def multidimscal_metric(roots, n, d, ndim):
r"""
multidimscal_metric performs a principal coordinate analysis also known as classical metric scaling.

.. _g03fa-py2-py-doc:

For full information please refer to the NAG Library document for g03fa

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03faf.html

.. _g03fa-py2-py-parameters:

**Parameters**
**roots** : str, length 1
Indicates if all the eigenvalues are to be computed or just the :math:\mathrm{ndim} largest.

:math:\mathrm{roots} = \texttt{'A'}

All the eigenvalues are computed.

:math:\mathrm{roots} = \texttt{'L'}

Only the largest :math:\mathrm{ndim} eigenvalues are computed.

**n** : int
:math:n, the number of objects in the distance matrix.

**d** : float, array-like, shape :math:\left(\mathrm{n}\times \left(\mathrm{n}-1\right)/2\right)
The lower triangle of the distance matrix :math:D stored packed by rows. That is :math:\mathrm{d}[ \left(i-1\right) \times \left(i-2\right) /2+j -1] must contain :math:d_{{ij}} for :math:i = 2,3,\ldots,n\text{;}j = 1,2,\ldots,i-1.

**ndim** : int
:math:k, the number of dimensions used to represent the data.

**Returns**
**x** : float, ndarray, shape :math:\left(\mathrm{n}, \mathrm{ndim}\right)
The :math:i\ th row contains :math:k coordinates for the :math:i\ th point, :math:i = 1,2,\ldots,n.

**eigval** : float, ndarray, shape :math:\left(\mathrm{n}\right)
If :math:\mathrm{roots} = \texttt{'A'}, :math:\mathrm{eigval} contains the :math:n scaled eigenvalues of the matrix :math:E.

If :math:\mathrm{roots} = \texttt{'L'}, :math:\mathrm{eigval} contains the largest :math:k scaled eigenvalues of the matrix :math:E.

In both cases the eigenvalues are divided by the sum of the eigenvalues (that is, the trace of :math:E).

.. _g03fa-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:\mathrm{roots} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{roots} = \texttt{'A'} or :math:\texttt{'L'}.

(errno :math:1)
On entry, :math:\mathrm{n} = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{ndim} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{n} > \mathrm{ndim}.

(errno :math:1)
On entry, :math:\mathrm{ndim} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{ndim}\geq 1.

(errno :math:2)
On entry, all the elements of :math:\mathrm{d} = 0.0.

(errno :math:2)
On entry, :math:i = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{d}[i-1] < 0.0.

Constraint: :math:\mathrm{d}[i-1]\geq 0.0.

(errno :math:3)
There are less than :math:\mathrm{ndim} eigenvalues greater than zero.

(errno :math:4)
The computation of the eigenvalues or eigenvectors has failed.

.. _g03fa-py2-py-notes:

**Notes**
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

For a set of :math:n objects a distance matrix :math:D can be calculated such that :math:d_{{ij}} is a measure of how 'far apart' are objects :math:i and :math:j (see :meth:distance_mat for example).
Principal coordinate analysis or metric scaling starts with a distance matrix and finds points :math:X in Euclidean space such that those points have the same distance matrix.
The aim is to find a small number of dimensions, :math:k≪\left(n-1\right), that provide an adequate representation of the distances.

The principal coordinates of the points are computed from the eigenvectors of the matrix :math:E where :math:e_{{ij}} = -1/2\left(d_{{ij}}^2-d_{{i.}}^2-d_{{.j}}^2+d_{{..}}^2\right) with :math:d_{{i.}}^2 denoting the average of :math:d_{{ij}}^2 over the suffix :math:j, etc..
The eigenvectors are then scaled by multiplying by the square root of the value of the corresponding eigenvalue.

Provided that the ordered eigenvalues, :math:\lambda_i, of the matrix :math:E are all positive, :math:\sum_{{i = 1}}^k\lambda_i/\sum_{{i = 1}}^{{n-1}}\lambda_i shows how well the data is represented in :math:k dimensions.
The eigenvalues will be non-negative if :math:E is positive semidefinite.
This will be true provided :math:d_{{ij}} satisfies the inequality: :math:d_{{ij}}\leq d_{{ik}}+d_{{jk}} for all :math:i,j,k.
If this is not the case the size of the negative eigenvalue reflects the amount of deviation from this condition and the results should be treated cautiously in the presence of large negative eigenvalues.
See Krzanowski (1990) for further discussion. multidimscal_metric provides the option for all eigenvalues to be computed so that the smallest eigenvalues can be checked.

.. _g03fa-py2-py-references:

**References**
Chatfield, C and Collins, A J, 1980, Introduction to Multivariate Analysis, Chapman and Hall

Gower, J C, 1966, Some distance properties of latent root and vector methods used in multivariate analysis, Biometrika (53), 325--338

Krzanowski, W J, 1990, Principles of Multivariate Analysis, Oxford University Press
"""
raise NotImplementedError

[docs]def multidimscal_ordinal(d, x, typ='T', itera=0, iopt=0, io_manager=None):
r"""
multidimscal_ordinal performs non-metric (ordinal) multidimensional scaling.

.. _g03fc-py2-py-doc:

For full information please refer to the NAG Library document for g03fc

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03fcf.html

.. _g03fc-py2-py-parameters:

**Parameters**
**d** : float, array-like, shape :math:\left(n\times \left(n-1\right)/2\right)
The lower triangle of the distance matrix :math:D stored packed by rows. That is :math:\mathrm{d}[ \left(\textit{i}-1\right) \times \left(\textit{i}-2\right) /2+\textit{j} -1] must contain :math:d_{{\textit{i}\textit{j}}}, for :math:\textit{j} = 1,2,\ldots,\textit{i}-1, for :math:\textit{i} = 2,3,\ldots,n. If :math:d_{{ij}} is missing then set :math:d_{{ij}} < 0; for further comments on missing values see Further Comments <https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03fcf.html#fcomments>__.

**x** : float, array-like, shape :math:\left(n, \textit{ndim}\right)
The :math:\textit{i}\ th row must contain an initial estimate of the coordinates for the :math:\textit{i}\ th point, for :math:\textit{i} = 1,2,\ldots,n. One method of computing these is to use :meth:multidimscal_metric.

**typ** : str, length 1, optional
Indicates whether :math:\textit{STRESS} or :math:\textit{SSTRESS} is to be used as the criterion.

:math:\mathrm{typ} = \texttt{'T'}

:math:\textit{STRESS} is used.

:math:\mathrm{typ} = \texttt{'S'}

:math:\textit{SSTRESS} is used.

**itera** : int, optional
The maximum number of iterations in the optimization process.

:math:\mathrm{itera} = 0

A default value of :math:50 is used.

:math:\mathrm{itera} < 0

A default value of :math:\mathrm{max}\left(50, {5nm}\right) (the default for :meth:opt.uncon_conjgrd_comp <naginterfaces.library.opt.uncon_conjgrd_comp>) is used.

**iopt** : int, optional
Selects the options, other than the number of iterations, that control the optimization.

:math:\mathrm{iopt} = 0

The tolerance :math:\epsilon is set to :math:0.00001 (Accuracy <https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03fcf.html#accuracy>__). All other values are set as described in Further Comments <https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03fcf.html#fcomments>__.

:math:\mathrm{iopt} > 0

The tolerance :math:\epsilon is set to :math:10^{{-i}} where :math:i = \mathrm{iopt}. All other values are set as described in Further Comments <https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03fcf.html#fcomments>__.

:math:\mathrm{iopt} < 0

No values are changed, therefore, the default values of :meth:opt.uncon_conjgrd_comp <naginterfaces.library.opt.uncon_conjgrd_comp> are used.

**io_manager** : FileObjManager, optional
Manager for I/O in this routine.

**Returns**
**x** : float, ndarray, shape :math:\left(n, \textit{ndim}\right)
The :math:\textit{i}\ th row contains :math:m coordinates for the :math:\textit{i}\ th point, for :math:\textit{i} = 1,2,\ldots,n.

**stress** : float
The value of :math:\textit{STRESS} or :math:\textit{SSTRESS} at the final iteration.

**dfit** : float, ndarray, shape :math:\left(2\times n\times \left(n-1\right)\right)
Auxiliary outputs.

If :math:\mathrm{typ} = \texttt{'T'}, the first :math:n\left(n-1\right)/2 elements contain the distances, :math:\hat{d_{{ij}}}, for the points returned in :math:\mathrm{x}, the second set of :math:n\left(n-1\right)/2 contains the distances :math:\hat{d_{{ij}}} ordered by the input distances, :math:d_{{ij}}, the third set of :math:n\left(n-1\right)/2 elements contains the monotonic distances, :math:\tilde{d_{{ij}}}, ordered by the input distances, :math:d_{{ij}} and the final set of :math:n\left(n-1\right)/2 elements contains fitted monotonic distances, :math:\tilde{d_{{ij}}}, for the points in :math:\mathrm{x}.

The :math:\tilde{d_{{ij}}} corresponding to distances which are input as missing are set to zero.

If :math:\mathrm{typ} = \texttt{'S'}, the results are as above except that the squared distances are returned.

Each distance matrix is stored in lower triangular packed form in the same way as the input matrix :math:D.

.. _g03fc-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:\mathrm{typ} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{typ} = \texttt{'S'} or :math:\texttt{'T'}.

(errno :math:1)
On entry, :math:n = \langle\mathit{\boldsymbol{value}}\rangle and :math:\textit{ndim} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:n > \textit{ndim}.

(errno :math:1)
On entry, :math:\textit{ndim} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\textit{ndim}\geq 1.

(errno :math:2)
On entry, all the elements of :math:\mathrm{d}\leq 0.0.

(errno :math:3)
The optimization has failed to converge in :math:\mathrm{itera} function iterations.

(errno :math:5)
The optimization cannot begin from the initial configuration.

(errno :math:6)
The optimization has failed.

**Warns**
**NagAlgorithmicWarning**
(errno :math:4)
The conditions for an acceptable solution have not been met but a lower point could not be found.

.. _g03fc-py2-py-notes:

**Notes**
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

For a set of :math:n objects, a distance or dissimilarity matrix :math:D can be calculated such that :math:d_{{ij}} is a measure of how 'far apart' the objects :math:i and :math:j are.
If :math:p variables :math:x_k have been recorded for each observation this measure may be based on Euclidean distance, :math:d_{{ij}} = \sum_{{k = 1}}^p\left(x_{{ki}}-x_{{kj}}\right)^2, or some other calculation such as the number of variables for which :math:x_{{kj}}\neq x_{{ki}}.
Alternatively, the distances may be the result of a subjective assessment.
For a given distance matrix, multidimensional scaling produces a configuration of :math:n points in a chosen number of dimensions, :math:m, such that the distance between the points in some way best matches the distance matrix.
For some distance measures, such as Euclidean distance, the size of distance is meaningful, for other measures of distance all that can be said is that one distance is greater or smaller than another.
For the former metric scaling can be used, see :meth:multidimscal_metric, for the latter, a non-metric scaling is more appropriate.

For non-metric multidimensional scaling, the criterion used to measure the closeness of the fitted distance matrix to the observed distance matrix is known as :math:\textit{STRESS}. :math:\textit{STRESS} is given by,

.. math::
\sqrt{\frac{{\sum_{{i = 1}}^n\sum_{{j = 1}}^{{i-1}}\left(\hat{d_{{ij}}}-\tilde{d_{{ij}}}\right)^2}}{{\sum_{{i = 1}}^n\sum_{{j = 1}}^{{i-1}}\hat{d_{{ij}}}^2}}}

where :math:\hat{d_{{ij}}}^2 is the Euclidean squared distance between points :math:i and :math:j and :math:\tilde{d_{{ij}}} is the fitted distance obtained when :math:\hat{d_{{ij}}} is monotonically regressed on :math:d_{{ij}}, that is :math:\tilde{d_{{ij}}} is monotonic relative to :math:d_{{ij}} and is obtained from :math:\hat{d_{{ij}}} with the smallest number of changes.
So :math:\textit{STRESS} is a measure of by how much the set of points preserve the order of the distances in the original distance matrix.
Non-metric multidimensional scaling seeks to find the set of points that minimize the :math:\textit{STRESS}.

An alternate measure is :math:\textit{SSTRESS},

.. math::
\sqrt{\frac{{\sum_{{i = 1}}^n\sum_{{j = 1}}^{{i-1}}\left(\hat{d_{{ij}}}^2-\tilde{d_{{ij}}}^2\right)^2}}{{\sum_{{i = 1}}^n\sum_{{j = 1}}^{{i-1}}\hat{d_{{ij}}}^4}}}

in which the distances in :math:\textit{STRESS} are replaced by squared distances.

In order to perform a non-metric scaling, an initial configuration of points is required.
This can be obtained from principal coordinate analysis, see :meth:multidimscal_metric.
Given an initial configuration, multidimscal_ordinal uses the optimization function :meth:opt.uncon_conjgrd_comp <naginterfaces.library.opt.uncon_conjgrd_comp> to find the configuration of points that minimizes :math:\textit{STRESS} or :math:\textit{SSTRESS}.
The function :meth:opt.uncon_conjgrd_comp <naginterfaces.library.opt.uncon_conjgrd_comp> uses a conjugate gradient algorithm. multidimscal_ordinal will find an optimum that may only be a local optimum, to be more sure of finding a global optimum several different initial configurations should be used; these can be obtained by randomly perturbing the original initial configuration using functions from submodule :mod:~naginterfaces.library.rand.

.. _g03fc-py2-py-references:

**References**
Chatfield, C and Collins, A J, 1980, Introduction to Multivariate Analysis, Chapman and Hall

Krzanowski, W J, 1990, Principles of Multivariate Analysis, Oxford University Press
"""
raise NotImplementedError

[docs]def gaussian_mixture(x, nvar, ng, sopt, isx=None, prob=None, niter=15, riter=0, tol=0.0):
r"""
gaussian_mixture performs a mixture of Normals (Gaussians) for a given (co)variance structure.

.. _g03ga-py2-py-doc:

For full information please refer to the NAG Library document for g03ga

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03gaf.html

.. _g03ga-py2-py-parameters:

**Parameters**
**x** : float, array-like, shape :math:\left(n, m\right)
:math:\mathrm{x}[\textit{i}-1,\textit{j}-1] must contain the value of the :math:\textit{j}\ th variable for the :math:\textit{i}\ th object, for :math:\textit{j} = 1,2,\ldots,m, for :math:\textit{i} = 1,2,\ldots,n.

**nvar** : int
:math:p, the number of variables included in the calculations.

**ng** : int
:math:k, the number of groups in the mixture model.

**sopt** : int
Determines the (co)variance structure:

:math:\mathrm{sopt} = 1

Groupwise covariance matrices.

:math:\mathrm{sopt} = 2

Pooled covariance matrix.

:math:\mathrm{sopt} = 3

Groupwise variances.

:math:\mathrm{sopt} = 4

Pooled variances.

:math:\mathrm{sopt} = 5

Overall variance.

**isx** : None or int, array-like, shape :math:\left(:\right), optional
Note: the required length for this argument is determined as follows: if :math:\mathrm{nvar} < m: :math:m; otherwise: :math:0.

If :math:\mathrm{nvar} = m all available variables are included in the model and :math:\mathrm{isx} is not referenced; otherwise the :math:j\ th variable will be included in the analysis if :math:\mathrm{isx}[\textit{j}-1] = 1 and excluded if :math:\mathrm{isx}[\textit{j}-1] = 0, for :math:\textit{j} = 1,2,\ldots,m.

**prob** : None or float, array-like, shape :math:\left(n, \mathrm{ng}\right), optional
If :math:\mathrm{prob} is not **None**, :math:\mathrm{prob}[i-1,j-1] is the probability that the :math:i\ th object belongs to the :math:j\ th group. (These probabilities are normalised internally.) Otherwise, the initial membership probabilities are set internally.

**niter** : int, optional
The maximum number of iterations.

**riter** : int, optional
If :math:\mathrm{riter} > 0, membership probabilities are rounded to :math:0.0 or :math:1.0 after the completion of the first :math:\mathrm{riter} iterations.

**tol** : float, optional
Iterations cease the first time an improvement in log-likelihood is less than :math:\mathrm{tol}. If :math:\mathrm{tol}\leq 0 a value of :math:10^{-3} is used.

**Returns**
**prob** : float, ndarray, shape :math:\left(n, \mathrm{ng}\right)
:math:\mathrm{prob}[i-1,j-1] is the probability of membership of the :math:i\ th object to the :math:j\ th group for the fitted model.

**niter** : int
The number of completed iterations.

**w** : float, ndarray, shape :math:\left(\mathrm{ng}\right)
:math:w_j, the mixing probability for the :math:j\ th group.

**g** : float, ndarray, shape :math:\left(\mathrm{nvar}, \mathrm{ng}\right)
:math:\mathrm{g}[i-1,j-1] gives the estimated mean of the :math:i\ th variable in the :math:j\ th group.

**s** : float, ndarray, shape :math:\left(:, :, :\right)
If :math:\mathrm{sopt} = 1, :math:\mathrm{s}[i-1,j-1,k-1] gives the :math:\left(i, j\right)\ th element of the :math:k\ th group.

If :math:\mathrm{sopt} = 2, :math:\mathrm{s}[i-1,j-1,0] gives the :math:\left(i, j\right)\ th element of the pooled covariance.

If :math:\mathrm{sopt} = 3, :math:\mathrm{s}[j-1,k-1,0] gives the :math:j\ th variance in the :math:k\ th group.

If :math:\mathrm{sopt} = 4, :math:\mathrm{s}[j-1,0,0] gives the :math:j\ th pooled variance.

If :math:\mathrm{sopt} = 5, :math:\mathrm{s}[0,0,0] gives the overall variance.

**f** : float, ndarray, shape :math:\left(n, \mathrm{ng}\right)
:math:\mathrm{f}[i-1,j-1] gives the :math:p-variate Normal (Gaussian) density of the :math:i\ th object in the :math:j\ th group.

**loglik** : float
The log-likelihood for the fitted mixture model.

.. _g03ga-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:n = \langle\mathit{\boldsymbol{value}}\rangle and :math:p = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:n > p, the number of parameters, i.e., too few objects have been supplied for the model.

(errno :math:2)
On entry, :math:m = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:m\geq 1.

(errno :math:5)
On entry, :math:\mathrm{nvar} = \langle\mathit{\boldsymbol{value}}\rangle and :math:m = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:1\leq \mathrm{nvar}\leq m.

(errno :math:6)
On entry, :math:\mathrm{nvar}\neq m and :math:\mathrm{isx} is invalid.

(errno :math:7)
On entry, :math:\mathrm{ng} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{ng}\geq 1.

(errno :math:9)
On entry, row :math:\langle\mathit{\boldsymbol{value}}\rangle of supplied :math:\mathrm{prob} does not sum to :math:1.

(errno :math:11)
On entry, :math:\mathrm{niter} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{niter}\geq 1.

(errno :math:16)
On entry, :math:\mathrm{sopt} \neq 1, :math:2, :math:3, :math:4 or :math:5.

(errno :math:44)
A covariance matrix is not positive definite, try a different initial allocation.

(errno :math:45)
An iteration cannot continue due to an empty group, try a different initial allocation.

.. _g03ga-py2-py-notes:

**Notes**
A Normal (Gaussian) mixture model is a weighted sum of :math:k group Normal densities given by,

.. math::
p\left({x | w}, \mu, \Sigma \right) = \sum_{1}^{k}{w_j}g\left({x | \mu_j}, \Sigma_j\right)\text{, }\quad x \in \mathbb{R}^p

where:

:math:x is a :math:p-dimensional object of interest;

:math:w_j is the mixture weight for the :math:j\ th group and :math:\sum_{1}^{k}{w_j} = 1;

:math:\mu_j is a :math:p-dimensional vector of means for the :math:j\ th group;

:math:\Sigma_j is the covariance structure for the :math:j\ th group;

:math:g\left(·\right) is the :math:p-variate Normal density:

.. math::
g\left({x | \mu_j}, \Sigma_j\right) = \frac{1}{{\left(2\pi \right)^{{p/2}}\left\lvert \Sigma_j\right\rvert^{{1/2}}}}\mathrm{exp}\left[-\frac{1}{2}\left(x-\mu_j\right)\Sigma_j^{-1} \left(x-\mu_j\right)^\mathrm{T}\right]\text{.}

Optionally, the (co)variance structure may be pooled (common to all groups) or calculated for each group, and may be full or diagonal.

.. _g03ga-py2-py-references:

**References**
Hartigan, J A, 1975, Clustering Algorithms, Wiley

--------
:meth:naginterfaces.library.examples.mv.gaussian_mixture_ex.main
"""
raise NotImplementedError

[docs]def z_scores(x, nvar, isx, s, e):
r"""
z_scores produces standardized values (:math:z-scores) for a data matrix.

.. _g03za-py2-py-doc:

For full information please refer to the NAG Library document for g03za

https://www.nag.com/numeric/nl/nagdoc_29.2/flhtml/g03/g03zaf.html

.. _g03za-py2-py-parameters:

**Parameters**
**x** : float, array-like, shape :math:\left(n, m\right)
:math:\mathrm{x}[\textit{i}-1,\textit{j}-1] must contain the :math:\textit{i}\ th sample point for the :math:\textit{j}\ th variable, :math:x_{{\textit{i}\textit{j}}}, for :math:\textit{j} = 1,2,\ldots,m, for :math:\textit{i} = 1,2,\ldots,n.

**nvar** : int
:math:p, the number of variables to be standardized.

**isx** : int, array-like, shape :math:\left(m\right)
:math:\mathrm{isx}[j-1] indicates whether or not the observations on the :math:j\ th variable are included in the matrix of standardized values.

If :math:\mathrm{isx}[j-1]\neq 0, the observations from the :math:j\ th variable are included.

If :math:\mathrm{isx}[j-1] = 0, the observations from the :math:j\ th variable are not included.

**s** : float, array-like, shape :math:\left(m\right)
If :math:\mathrm{isx}[j-1]\neq 0, :math:\mathrm{s}[j-1] must contain the scaling (standard deviation), :math:\sigma_j, for the :math:j\ th variable.

If :math:\mathrm{isx}[j-1] = 0, :math:\mathrm{s}[j-1] is not referenced.

**e** : float, array-like, shape :math:\left(m\right)
If :math:\mathrm{isx}[j-1]\neq 0, :math:\mathrm{e}[j-1] must contain the location shift (mean), :math:\mu_j, for the :math:j\ th variable.

If :math:\mathrm{isx}[j-1] = 0, :math:\mathrm{e}[j-1] is not referenced.

**Returns**
**z** : float, ndarray, shape :math:\left(n, \mathrm{nvar}\right)
The matrix of standardized values (:math:z-scores), :math:Z.

.. _g03za-py2-py-errors:

**Raises**
**NagValueError**
(errno :math:1)
On entry, :math:m = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{nvar} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:m\geq \mathrm{nvar}.

(errno :math:1)
On entry, :math:\mathrm{nvar} = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:\mathrm{nvar}\geq 1.

(errno :math:1)
On entry, :math:n = \langle\mathit{\boldsymbol{value}}\rangle.

Constraint: :math:n\geq 1.

(errno :math:2)
On entry, :math:\mathrm{nvar} = \langle\mathit{\boldsymbol{value}}\rangle and :math:\langle\mathit{\boldsymbol{value}}\rangle values of :math:\mathrm{isx} > 0.

Constraint: exactly :math:\mathrm{nvar} elements of :math:\mathrm{isx} > 0.

(errno :math:3)
On entry, :math:j = \langle\mathit{\boldsymbol{value}}\rangle and :math:\mathrm{s}[i-1]\leq 0.0.

Constraint: :math:\mathrm{s}[\textit{j}] > 0.0.

.. _g03za-py2-py-notes:

**Notes**
In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

For a data matrix, :math:X, consisting of :math:n observations on :math:p variables, with elements :math:x_{{ij}}, z_scores computes a matrix, :math:Z, with elements :math:z_{{ij}} such that:

.. math::
where :math:\mu_j is a location shift and :math:\sigma_j is a scaling factor.
Typically, :math:\mu_j will be the mean and :math:\sigma_j will be the standard deviation of the :math:j\ th variable and, therefore, the elements in column :math:j of :math:Z will have zero mean and unit variance.