library.mv Submodule
Module Summary
Interfaces for the NAG Mark 30.2 mv Chapter.
mv - Multivariate Methods
This module is concerned with methods for studying multivariate data. A multivariate dataset consists of several variables recorded on a number of objects or individuals. Multivariate methods can be classified as those that seek to examine the relationships between the variables (e.g., principal components), known as variable-directed methods, and those that seek to examine the relationships between the objects (e.g., cluster analysis), known as individual-directed methods.
Multiple regression is not included in this module as it involves the relationship of a single variable, known as the response variable, to the other variables in the dataset, the explanatory variables.
Routines for multiple regression are provided in submodule correg.
See Also
naginterfaces.library.examples.mv: This subpackage contains examples for the mv module. See also the Examples subsection.
Functionality Index
Canonical correlation analysis: canon_corr()
Canonical variate analysis: canon_var()
Cluster Analysis
  compute distance matrix: distance_mat()
  compute distance matrix for two data sets: distance_mat_2()
  construct clusters following cluster_hier(): cluster_hier_indicator()
  construct dendrogram following cluster_hier(): cluster_hier_dendrogram()
  Gaussian mixture model: gaussian_mixture()
  Gaussian mixture model, submatrix outputs: gaussian_mixture_ld()
  hierarchical: cluster_hier()
  K-means: cluster_kmeans()
Discriminant Analysis
  allocation of observations to groups, following discrim(): discrim_group()
  Mahalanobis squared distances, following discrim(): discrim_mahal()
  test for equality of within-group covariance matrices: discrim()
Factor Analysis
  factor score coefficients, following factor(): factor_score()
  maximum likelihood estimates of parameters: factor()
Principal component analysis: prin_comp()
Rotations
  orthogonal rotations for loading matrix: rot_orthomax()
  Procrustes rotations: rot_procrustes()
  ProMax rotations: rot_promax()
Scaling Methods
  multidimensional scaling: multidimscal_ordinal()
  principal coordinate analysis: multidimscal_metric()
Standardize values of a data matrix: z_scores()
For full information please refer to the NAG Library document
https://support.nag.com/numeric/nl/nagdoc_30.2/flhtml/g03/g03intro.html
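As an illustration of two entries in the index (standardizing a data matrix and computing a distance matrix), the following is a minimal pure-Python sketch. It does not call the NAG routines and may differ from them in conventions; the helper names `standardize` and `euclidean_distance_matrix` and the data are hypothetical, and the sketch assumes column-wise standardization by the sample mean and sample standard deviation.

```python
import math


def standardize(data):
    """Hypothetical helper: column-wise z-scores using the sample mean and
    the sample standard deviation (divisor n - 1)."""
    n = len(data)
    cols = list(zip(*data))
    means = [sum(c) / n for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in data]


def euclidean_distance_matrix(data):
    """Hypothetical helper: full symmetric matrix of Euclidean distances
    between the rows (objects) of the data matrix."""
    n = len(data)
    return [[math.dist(data[i], data[j]) for j in range(n)] for i in range(n)]


# Made-up data: three objects measured on two variables with very different scales.
data = [[1.0, 10.0], [2.0, 20.0], [3.0, 60.0]]
z = standardize(data)
d = euclidean_distance_matrix(z)
```

Standardizing first keeps the variable with the larger scale from dominating the distances, which is one common reason to combine the two steps before a cluster analysis.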
Examples
- naginterfaces.library.examples.mv.cluster_kmeans_ex.main()
Example for naginterfaces.library.mv.cluster_kmeans().
K-means cluster analysis.
>>> main()
naginterfaces.library.mv.cluster_kmeans Python Example Results.
K-means cluster analysis.
The cluster to which each point belongs:
[1, 1, 3, 2, 3, 1, 1, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 1, 1, 3]
The number of points in each cluster:
[6, 3, 11]
The within-cluster sum of weights of each cluster:
[6.00, 3.00, 11.00]
The within-cluster sum of squares of each cluster:
[46.57, 20.38, 468.90]
The final cluster centres:
[
 8.1183e+01, 1.1667e+01, 7.1500e+00, 2.0500e+00, 6.6000e+00
 4.7867e+01, 3.5800e+01, 1.6333e+01, 2.4000e+00, 6.7333e+00
 6.4045e+01, 2.5209e+01, 1.0745e+01, 2.8364e+00, 6.6545e+00
]
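To make the outputs above concrete (cluster membership, cluster sizes, final centres), here is a minimal sketch of K-means using Lloyd's algorithm in plain Python. It is an illustration only: it does not call cluster_kmeans(), the algorithm NAG uses may differ, and the function name `kmeans_sketch`, its parameters, and the data are hypothetical. Labels here are 0-based, whereas the example output above uses 1-based labels.

```python
import math


def kmeans_sketch(points, centres, max_iter=10):
    """Lloyd's algorithm: alternate between assigning each point to its
    nearest centre and recomputing each centre as the mean of its members."""
    centres = [list(c) for c in centres]
    assign = None
    for _ in range(max_iter):
        new_assign = [
            min(range(len(centres)), key=lambda k: math.dist(p, centres[k]))
            for p in points
        ]
        if new_assign == assign:
            break  # memberships are stable, so the centres would not change
        assign = new_assign
        for k in range(len(centres)):
            members = [p for p, a in zip(points, assign) if a == k]
            if members:  # leave an empty cluster's centre where it is
                centres[k] = [sum(col) / len(members) for col in zip(*members)]
    counts = [assign.count(k) for k in range(len(centres))]
    return assign, centres, counts


# Made-up data: two well-separated groups of three points each.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
assign, centres, counts = kmeans_sketch(points, centres=[[0, 0], [10, 10]])
```

The three return values correspond loosely to the first, fifth, and second blocks of the example output: membership per point, final centres, and number of points per cluster.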
- naginterfaces.library.examples.mv.discrim_group_ex.main()
Example for naginterfaces.library.mv.discrim_group().
Discriminant analysis.
>>> main()
naginterfaces.library.mv.discrim_group Python Example Results.
Discriminant analysis for diagnosis of Cushing's syndrome.
Obs       Posterior        Allocated      Atypicality
        probabilities      to group          index
  1   0.094 0.905 0.002        2       0.596 0.254 0.975
  2   0.005 0.168 0.827        3       0.952 0.836 0.018
  3   0.019 0.920 0.062        2       0.954 0.797 0.912
  4   0.697 0.303 0.000        1       0.207 0.860 0.993
  5   0.317 0.013 0.670        3       0.991 1.000 0.984
  6   0.032 0.366 0.601        3       0.981 0.978 0.887
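In the table above, each observation's allocated group coincides with the group that has the largest posterior probability. The short sketch below checks this using the printed (rounded) values; it does not reproduce how discrim_group() computes the posteriors, and the helper name `allocate` is hypothetical.

```python
# Posterior probabilities copied from the example output above (rounded to 3 d.p.).
posteriors = [
    [0.094, 0.905, 0.002],
    [0.005, 0.168, 0.827],
    [0.019, 0.920, 0.062],
    [0.697, 0.303, 0.000],
    [0.317, 0.013, 0.670],
    [0.032, 0.366, 0.601],
]


def allocate(row):
    """Hypothetical helper: 1-based index of the largest posterior probability."""
    return max(range(len(row)), key=row.__getitem__) + 1


allocated = [allocate(row) for row in posteriors]
row_sums = [sum(row) for row in posteriors]  # each close to 1, up to rounding
```

The resulting list matches the "Allocated to group" column of the example output.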
- naginterfaces.library.examples.mv.gaussian_mixture_ex.main()
Example for naginterfaces.library.mv.gaussian_mixture().
Fits a mixture of Gaussians for a given (co)variance structure.
>>> main()
naginterfaces.library.mv.gaussian_mixture Python Example Results.
Fits a Gaussian mixture model with pooled covariance structure to New Haven schools test data.
The final membership probabilities are:
[
 9.50176891e-01, 4.98231095e-02
 3.32590884e-06, 9.99996674e-01
 9.99613355e-01, 3.86644659e-04
 9.99920087e-01, 7.99127116e-05
 3.89990173e-02, 9.61000983e-01
 9.32704894e-01, 6.72951064e-02
 9.88809712e-01, 1.11902877e-02
 4.12521422e-03, 9.95874786e-01
 9.72521486e-01, 2.74785140e-02
 9.99691952e-01, 3.08048285e-04
 2.17221867e-01, 7.82778133e-01
 7.69380852e-01, 2.30619148e-01
 9.99973063e-01, 2.69370297e-05
 6.11334389e-03, 9.93886656e-01
 4.41893305e-02, 9.55810670e-01
 3.50057883e-04, 9.99649942e-01
 9.99902971e-01, 9.70286734e-05
 4.02698414e-05, 9.99959730e-01
 9.73798317e-01, 2.62016830e-02
 3.02036785e-04, 9.99697963e-01
 6.94705604e-02, 9.30529440e-01
 4.16030240e-03, 9.95839698e-01
 3.08391490e-02, 9.69160851e-01
 9.91157909e-01, 8.84209120e-03
 4.15339034e-04, 9.99584661e-01
]
The log-likehood is -29.683060270297.
There were 14 function evaluations required.
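The membership probabilities and log-likelihood reported above can be illustrated with a one-dimensional, two-component mixture sharing a single (pooled) variance. This is a minimal sketch under those simplifying assumptions, not the NAG implementation: it evaluates the quantities for fixed, made-up parameters rather than fitting them, and all function names and values are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def membership_probabilities(xs, weights, means, var):
    """For each observation, the posterior probability of each component:
    weight * density, normalized so that every row sums to one."""
    rows = []
    for x in xs:
        joint = [w * normal_pdf(x, m, var) for w, m in zip(weights, means)]
        total = sum(joint)
        rows.append([j / total for j in joint])
    return rows


def log_likelihood(xs, weights, means, var):
    """Sum over observations of the log of the mixture density."""
    return sum(
        math.log(sum(w * normal_pdf(x, m, var) for w, m in zip(weights, means)))
        for x in xs
    )


# Made-up parameters: equal weights, means at -1 and +1, pooled variance 1.
xs = [-2.0, 0.0, 2.0]
probs = membership_probabilities(xs, weights=[0.5, 0.5], means=[-1.0, 1.0], var=1.0)
loglik = log_likelihood(xs, weights=[0.5, 0.5], means=[-1.0, 1.0], var=1.0)
```

As in the example output, each row of `probs` holds one observation's probabilities of belonging to each component; an observation midway between the two means gets equal probabilities.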
- naginterfaces.library.examples.mv.prin_comp_ex.main()
Example for naginterfaces.library.mv.prin_comp().
Perform a principal component analysis on a data matrix; both the principal component loadings and the principal component scores are returned.
>>> main()
naginterfaces.library.mv.prin_comp Python Example Results.
Perform an unweighted principal component analysis on a dataset from Cooley and Lohnes (1971).
The statistics of the principal component analysis are:
Eigenvalues  Percentage  Cumulative  Chisq   DF      Sig
             variation   variation
[
 8.2739, 0.6515, 0.6515, 8.6127, 5.0000, 0.1255
 3.6761, 0.2895, 0.9410, 4.1183, 2.0000, 0.1276
 0.7499, 0.0590, 1.0000, 0.0000, 0.0000, 0.0000
]
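The second and third columns of the table above can be recovered from the first: each eigenvalue divided by the sum of the eigenvalues gives the share of variation explained, and the running total gives the cumulative share. The sketch below checks this with the printed (rounded) eigenvalues; it does not call prin_comp(). Note that, despite the "Percentage" label, the printed values are proportions between 0 and 1.

```python
from itertools import accumulate

# Eigenvalues copied from the example output above (rounded to 4 d.p.).
eigenvalues = [8.2739, 3.6761, 0.7499]

total = sum(eigenvalues)
proportion = [ev / total for ev in eigenvalues]  # share of variation per component
cumulative = list(accumulate(proportion))        # running share of variation

for ev, p, c in zip(eigenvalues, proportion, cumulative):
    print(f"{ev:8.4f} {p:8.4f} {c:8.4f}")
```

Here the first two components together account for about 94% of the total variation, which is the kind of summary often used when deciding how many components to retain.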