library.correg Submodule
Module Summary
Interfaces for the NAG Mark 30.2 correg Chapter.
correg - Correlation and Regression Analysis
This module is concerned with two techniques, correlation analysis and regression modelling, both of which aim to determine the inter-relationships among two or more variables.
Other modules of the NAG Library which cover similar problems are submodule fit and submodule opt. Submodule fit functions may be used to fit linear models by criteria other than least squares and also for polynomial regression; submodule opt functions may be used to fit nonlinear models and linearly constrained linear models.
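As a quick orientation, the following minimal sketch illustrates the two techniques with plain NumPy rather than the NAG interfaces: a product-moment correlation matrix and an ordinary least-squares regression fit. The data are made up for illustration.

    import numpy as np

    # Toy data: ten cases of three variables (illustrative only).
    rng = np.random.default_rng(0)
    x = rng.normal(size=(10, 3))
    x[:, 2] += 0.5 * x[:, 0]  # induce some correlation

    # Correlation analysis: product-moment (Pearson) correlation matrix.
    print(np.corrcoef(x, rowvar=False))

    # Regression modelling: least-squares fit of x[:, 2] on the other two.
    design = np.column_stack([np.ones(len(x)), x[:, :2]])
    coef = np.linalg.lstsq(design, x[:, 2], rcond=None)[0]
    print(coef)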
See Also
naginterfaces.library.examples.correg: This subpackage contains examples for the correg module. See also the Examples subsection.
Functionality Index
Correlation-like coefficients
all variables
casewise treatment of missing values:
coeffs_zero_miss_case()
no missing values:
coeffs_zero()
pairwise treatment of missing values:
coeffs_zero_miss_pair()
subset of variables
casewise treatment of missing values:
coeffs_zero_subset_miss_case()
no missing values:
coeffs_zero_subset()
pairwise treatment of missing values:
coeffs_zero_subset_miss_pair()
Generalized linear models
binomial errors:
glm_binomial()
computes estimable function:
glm_estfunc()
gamma errors:
glm_gamma()
Normal errors:
glm_normal()
Poisson errors:
glm_poisson()
prediction:
glm_predict()
transform model parameters:
glm_constrain()
Hierarchical mixed effects regression
initiation:
mixeff_hier_init()
using maximum likelihood:
mixeff_hier_ml()
using restricted maximum likelihood:
mixeff_hier_reml()
Least angle regression (includes LASSO)
Additional parameter calculation:
lars_param()
Model fitting
Cross-product matrix:
lars_xtx()
Raw data:
lars()
Linear mixed effects regression
fitting (via REML or ML):
lmm_fit()
initiation:
lmm_init()
initiation, combine:
lmm_init_combine()
via maximum likelihood (ML):
mixeff_ml()
via restricted maximum likelihood (REML):
mixeff_reml()
Multiple linear regression
from correlation coefficients:
linregm_coeffs_const()
from correlation-like coefficients:
linregm_coeffs_noconst()
Multiple linear regression/General linear model
add/delete observation from model:
linregm_obs_edit()
add independent variable to model:
linregm_var_add()
computes estimable function:
linregm_estfunc()
delete independent variable from model:
linregm_var_del()
general linear regression model:
linregm_fit()
regression for new dependent variable:
linregm_fit_newvar()
regression parameters from updated model:
linregm_update()
transform model parameters:
linregm_constrain()
Nearest correlation matrix
fixed elements:
corrmat_fixed()
fixed submatrix:
corrmat_shrinking()
k-factor structure:
corrmat_nearest_kfactor()
method of Qi and Sun
element-wise weights:
corrmat_h_weight()
unweighted, unbounded:
corrmat_nearest()
weighted norm:
corrmat_nearest_bounded()
rank-constrained:
corrmat_nearest_rank()
shrinkage method:
corrmat_target()
Non-parametric rank correlation (Kendall and/or Spearman)
missing values
casewise treatment of missing values
overwriting input data:
coeffs_kspearman_miss_case_overwrite()
preserving input data:
coeffs_kspearman_miss_case()
pairwise treatment of missing values:
coeffs_kspearman_miss_pair()
no missing values
overwriting input data:
coeffs_kspearman_overwrite()
preserving input data:
coeffs_kspearman()
Partial least squares
calculates predictions given an estimated PLS model:
pls_pred()
fits a PLS model for a given number of factors:
pls_fit()
orthogonal scores using SVD:
pls_svd()
orthogonal scores using Wold’s method:
pls_wold()
Product-moment correlation
correlation coefficients, all variables
casewise treatment of missing values:
coeffs_pearson_miss_case()
no missing values:
coeffs_pearson()
pairwise treatment of missing values:
coeffs_pearson_miss_pair()
correlation coefficients, subset of variables
casewise treatment of missing values:
coeffs_pearson_subset_miss_case()
no missing values:
coeffs_pearson_subset()
pairwise treatment of missing values:
coeffs_pearson_subset_miss_pair()
correlation matrix
compute correlation and covariance matrices:
corrmat()
compute from sum of squares matrix:
ssqmat_to_corrmat()
compute partial correlation and covariance matrices:
corrmat_partial()
sum of squares matrix
combine:
ssqmat_combine()
compute:
ssqmat()
update:
ssqmat_update()
Quantile regression
linear
comprehensive:
quantile_linreg()
simple:
quantile_linreg_easy()
Residuals
Durbin–Watson test:
linregm_stat_durbwat()
standardized residuals and influence statistics:
linregm_stat_resinf()
Ridge regression
ridge parameter(s) supplied:
ridge()
ridge parameter optimized:
ridge_opt()
Robust correlation
Huber’s method:
robustm_corr_huber()
user-supplied weight function only:
robustm_corr_user()
user-supplied weight function plus derivatives:
robustm_corr_user_deriv()
Robust regression
compute weights for use with robustm_user():
robustm_wts()
standard M-estimates:
robustm()
user-supplied weight functions:
robustm_user()
variance-covariance matrix following robustm_user():
robustm_user_varmat()
Selecting regression model
all possible regressions:
linregm_rssq()
forward selection:
linregm_fit_onestep()
R^2 and Cp statistics:
linregm_rssq_stat()
Service functions
for multiple linear regression
reorder elements from vectors and matrices:
linregm_service_reorder()
select elements from vectors and matrices:
linregm_service_select()
general option getting function:
optget()
general option setting function:
optset()
Simple linear regression
no intercept:
linregs_noconst()
no intercept with missing values:
linregs_noconst_miss()
with intercept:
linregs_const()
with intercept and with missing values:
linregs_const_miss()
Stepwise linear regression
Clarke’s sweep algorithm:
linregm_fit_stepwise()
For full information please refer to the NAG Library document
https://support.nag.com/numeric/nl/nagdoc_30.2/flhtml/g02/g02intro.html
Examples
- naginterfaces.library.examples.correg.coeffs_kspearman_miss_case_ex.main()
  Example for naginterfaces.library.correg.coeffs_kspearman_miss_case().
  Kendall and Spearman rank correlation coefficients.

  >>> main()
  naginterfaces.library.correg.coeffs_kspearman_miss_case Python Example Results.
  Kendall and Spearman rank correlation coefficients.
  Observations:
  [  1.70,  1.00,  0.50
     2.80,  4.00,  3.00
     0.60,  6.00,  2.50
     1.80,  9.00,  6.00
     0.99,  4.00,  2.50
     1.40,  2.00,  5.50
     1.80,  9.00,  7.50
     2.50,  7.00,  0.00
     0.99,  5.00,  3.00
  ]
  Correlation coefficients:
  [  1.0000,  0.2941,  0.4058
     0.1429,  1.0000,  0.7537
     0.2760,  0.5521,  1.0000
  ]
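  The calling code for this example is not reproduced above, so the sketch below illustrates the same statistics with SciPy instead of the NAG interface: Kendall and Spearman coefficients after casewise deletion of missing values. The toy data and the use of NaN as the missing-value marker are assumptions for illustration.

    import numpy as np
    from scipy import stats

    # Toy data with one missing value marked as NaN (illustrative only).
    x = np.array([[1.7, 1.0, 0.5],
                  [2.8, 4.0, 3.0],
                  [0.6, np.nan, 2.5],
                  [1.8, 9.0, 6.0],
                  [1.4, 2.0, 5.5]])

    # Casewise treatment: drop every row that contains a missing value.
    complete = x[~np.isnan(x).any(axis=1)]

    # Rank correlation between the first two columns.
    tau, _ = stats.kendalltau(complete[:, 0], complete[:, 1])
    rho, _ = stats.spearmanr(complete[:, 0], complete[:, 1])
    print(f"Kendall tau: {tau:.4f}, Spearman rho: {rho:.4f}")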
- naginterfaces.library.examples.correg.corrmat_nearest_ex.main()
  Example for naginterfaces.library.correg.corrmat_nearest().
  Find a nearest correlation matrix.

  >>> main()
  naginterfaces.library.correg.corrmat_nearest Python Example Results.
  The Frobenius-nearest correlation matrix to a given square matrix.
  Symmetric nearest correlation matrix X:
  [  1.00e+00
    -8.08e-01,  1.00e+00
     1.92e-01, -6.56e-01,  1.00e+00
     1.07e-01,  1.92e-01, -8.08e-01,  1.00e+00
  ]
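  For readers curious about the computation itself, the sketch below is a simplified alternating-projections iteration in the spirit of Higham's method for the Frobenius-nearest correlation matrix. It is not the NAG implementation, and the input matrix is an assumed indefinite test matrix.

    import numpy as np

    def nearest_correlation(g, n_iter=100):
        """Alternating projections with Dykstra's correction (sketch only)."""
        y = np.asarray(g, dtype=float).copy()
        ds = np.zeros_like(y)
        for _ in range(n_iter):
            r = y - ds
            # Project onto the cone of positive semidefinite matrices.
            w, v = np.linalg.eigh((r + r.T) / 2)
            x = (v * np.clip(w, 0, None)) @ v.T
            ds = x - r
            # Project onto the set of matrices with unit diagonal.
            y = x.copy()
            np.fill_diagonal(y, 1.0)
        return y

    g = np.array([[2.0, -1.0, 0.0, 0.0],
                  [-1.0, 2.0, -1.0, 0.0],
                  [0.0, -1.0, 2.0, -1.0],
                  [0.0, 0.0, -1.0, 2.0]])
    print(nearest_correlation(g).round(3))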
- naginterfaces.library.examples.correg.glm_binomial_ex.main()
  Example for naginterfaces.library.correg.glm_binomial().
  Use k-fold cross validation to estimate the true positive and negative rates of a prediction from a logistic regression model.
  The data used in this example was simulated.

  >>> main()
  naginterfaces.library.correg.glm_binomial Python Example Results.
  Use k-fold cross validation to estimate the true positive and negative rates of a prediction from a logistic regression model.
                 Observed
            --------------------------
  Predicted | Negative  Positive  Total
  --------------------------------------
  Negative  |       19         6     25
  Positive  |        3        12     15
  Total     |       22        18     40
  True Positive Rate (Sensitivity): 0.67
  True Negative Rate (Specificity): 0.86
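  As a hedged sketch of the modelling step only (the logistic regression, not the k-fold bookkeeping), the code below fits a binomial GLM with a logit link using statsmodels rather than the NAG interface, then tabulates sensitivity and specificity at a 0.5 threshold. The simulated data are assumptions for illustration.

    import numpy as np
    import statsmodels.api as sm

    # Simulated data: one covariate and a binary response (illustrative only).
    rng = np.random.default_rng(1)
    x = rng.normal(size=40)
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.5 + 2.0 * x))))

    # Binomial GLM with the default logit link.
    exog = sm.add_constant(x)
    res = sm.GLM(y, exog, family=sm.families.Binomial()).fit()

    # Classify at a 0.5 threshold and report the two rates.
    pred = (res.predict(exog) >= 0.5).astype(int)
    sens = np.sum((pred == 1) & (y == 1)) / np.sum(y == 1)
    spec = np.sum((pred == 0) & (y == 0)) / np.sum(y == 0)
    print(f"sensitivity: {sens:.2f}, specificity: {spec:.2f}")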
- naginterfaces.library.examples.correg.glm_normal_ex.main()
  Example for naginterfaces.library.correg.glm_normal().
  Fits a generalized linear model with Normal errors.

  >>> main()
  naginterfaces.library.correg.glm_normal Python Example Results.
  Fits a generalized linear model with Normal errors.
  Fitted model summary:
  RSS is 3.872e-01
  Degrees of freedom 3
  Term          Estimate     Standard Error
  Variable: 0   -2.387e-02   2.779e-03
  Variable: 1    6.381e-02   2.638e-03
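  A generalized linear model with Normal errors and the identity link reduces to ordinary least squares, so a minimal NumPy sketch of that special case looks as follows (the NAG routine additionally supports other link functions, offsets and weights). The data are made up.

    import numpy as np

    # Toy data: one explanatory variable plus an intercept (illustrative only).
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([1.1, 1.9, 3.2, 3.9, 5.1])

    # Normal errors with the identity link: ordinary least squares.
    design = np.column_stack([np.ones_like(x), x])
    beta, rss, _, _ = np.linalg.lstsq(design, y, rcond=None)
    print("estimates:", np.round(beta, 4))
    print("residual sum of squares:", float(rss[0]))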
- naginterfaces.library.examples.correg.lars_ex.main()
  Example for naginterfaces.library.correg.lars().
  Least angle regression.

  >>> main()
  naginterfaces.library.correg.lars Python Example Results.
  Least angle regression.
  Step    Parameter Estimate
  -----------------------------------------------------------------
     1    0.000   0.000   3.125   0.000   0.000   0.000
     2    0.000   0.000   3.792   0.000   0.000  -0.713
     3   -0.446   0.000   3.998   0.000   0.000  -1.151
     4   -0.628  -0.295   4.098   0.000   0.000  -1.466
     5   -1.060  -1.056   4.110  -0.864   0.000  -1.948
     6   -1.073  -1.132   4.118  -0.935  -0.059  -1.981
  -----------------------------------------------------------------
  alpha: -50.037
  -----------------------------------------------------------------
  Step      Sum       RSS      df       Cp        Ck   Step Size
  -----------------------------------------------------------------
     1    72.446  8929.855     2    13.355   123.227     72.446
     2   103.385  6404.701     3     7.054    50.781     24.841
     3   126.243  5258.247     4     5.286    30.836     16.225
     4   145.277  4657.051     5     5.309    19.319     11.587
     5   198.223  3959.401     6     5.016    12.266     24.520
     6   203.529  3954.571     7     7.000     0.910      2.198
  -----------------------------------------------------------------
  sigma^2: 304.198
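  The same kind of coefficient path can be traced with open-source tools; the sketch below uses scikit-learn's lars_path on made-up data as an illustration of least angle regression, not as a substitute for the NAG interface or its Cp/Ck diagnostics.

    import numpy as np
    from sklearn.linear_model import lars_path

    # Toy data: 20 observations on 6 explanatory variables (illustrative only).
    rng = np.random.default_rng(2)
    x = rng.normal(size=(20, 6))
    y = x @ np.array([0.0, 0.0, 4.0, 0.0, 0.0, -2.0]) + rng.normal(size=20)

    # Coefficient path of least angle regression; one column per step.
    alphas, active, coefs = lars_path(x, y, method="lar")
    for step, beta in enumerate(coefs.T[1:], start=1):
        print(step, np.round(beta, 3))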
- naginterfaces.library.examples.correg.lars_param_ex.main()
  Example for naginterfaces.library.correg.lars_param().
  Least angle regression, additional parameter estimates.

  >>> main()
  naginterfaces.library.correg.lars_param Python Example Results.
  Parameter Estimates from lars_xtx
  Step    Parameter Estimate
  -----------------------------------------------------------------
     1    0.000   0.000   3.125   0.000   0.000   0.000
     2    0.000   0.000   3.792   0.000   0.000  -0.713
     3   -0.446   0.000   3.998   0.000   0.000  -1.151
     4   -0.628  -0.295   4.098   0.000   0.000  -1.466
     5   -1.060  -1.056   4.110  -0.864   0.000  -1.948
     6   -1.073  -1.132   4.118  -0.935  -0.059  -1.981
  Additional Parameter Estimates from lars_param
   nk     Parameter Estimate
  -----------------------------------------------------------------
   0.2    0.000   0.000   0.625   0.000   0.000   0.000
   1.2    0.000   0.000   3.258   0.000   0.000  -0.143
   3.2   -0.483  -0.059   4.018   0.000   0.000  -1.214
   4.5   -0.844  -0.676   4.104  -0.432   0.000  -1.707
   5.2   -1.062  -1.071   4.112  -0.878  -0.012  -1.955
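  The LARS coefficient path is piecewise linear between steps, so estimates at a fractional position nk along the path are linear interpolations of the solutions at the neighbouring steps; the sketch below reproduces that idea in NumPy for an assumed path array (it mimics the purpose of lars_param(), not its interface).

    import numpy as np

    # Assumed coefficient path, one row per LARS step (step 0 is all zeros).
    path = np.array([[0.000, 0.000, 0.000],
                     [0.000, 3.125, 0.000],
                     [0.000, 3.792, -0.713],
                     [-0.446, 3.998, -1.151]])

    def estimate_at(nk, path):
        """Interpolate the piecewise-linear path at fractional step nk."""
        lo = int(np.floor(nk))
        hi = min(lo + 1, len(path) - 1)
        frac = nk - lo
        return (1.0 - frac) * path[lo] + frac * path[hi]

    print(np.round(estimate_at(1.2, path), 3))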
- naginterfaces.library.examples.correg.linregm_fit_ex.main()
  Example for naginterfaces.library.correg.linregm_fit().
  Fit a general (multiple) linear regression model.

  >>> main()
  naginterfaces.library.correg.linregm_fit Python Example Results.
  Fit a general (multiple) linear regression model.
  Fitted model summary:
  Model is not of full rank
  Rank: 4
  RSS is 2.223e+01
  Degrees of freedom 8
  Term          Estimate     Standard Error
  Intercept:    3.056e+01    3.849e-01
  Variable: 1   5.447e+00    8.390e-01
  Variable: 2   6.743e+00    8.390e-01
  Variable: 3   1.105e+01    8.390e-01
  Variable: 4   7.320e+00    8.390e-01
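  For comparison, a rank-deficient general linear model can be fitted in NumPy as below; lstsq returns the minimum-norm solution, whereas the NAG routine reports an SVD-based solution together with estimable functions, so aliased parameters are resolved differently. The one-way layout data are assumptions for illustration.

    import numpy as np

    # One-way layout: an intercept plus a dummy column for each of 4 levels,
    # so the 12 x 5 design matrix has rank 4 (not full rank).
    levels = np.repeat(np.arange(4), 3)
    design = np.column_stack([np.ones(levels.size),
                              (levels[:, None] == np.arange(4)).astype(float)])
    y = np.array([33.6, 34.9, 36.2, 37.7, 36.5, 37.9,
                  41.1, 42.3, 41.6, 38.4, 37.3, 38.2])

    # Minimum-norm least-squares estimates for the rank-deficient model.
    beta, _, rank, _ = np.linalg.lstsq(design, y, rcond=None)
    print("rank:", rank)
    print("estimates:", np.round(beta, 3))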
- naginterfaces.library.examples.correg.linregm_fit_stepwise_ex.main()
  Example for naginterfaces.library.correg.linregm_fit_stepwise().
  Stepwise linear regression.

  >>> main()
  naginterfaces.library.correg.linregm_fit_stepwise Python Example Results.
  Stepwise linear regression.
  Starting Stepwise Selection
  Forward Selection
  Variable 1 Variance ratio = 1.260E+01
  Variable 2 Variance ratio = 2.196E+01
  Variable 3 Variance ratio = 4.403E+00
  Variable 4 Variance ratio = 2.280E+01
  Adding variable 4 to model
  Backward Selection
  Variable 4 Variance ratio = 2.280E+01
  Keeping all current variables
  Forward Selection
  Variable 1 Variance ratio = 1.082E+02
  Variable 2 Variance ratio = 1.725E-01
  Variable 3 Variance ratio = 4.029E+01
  Adding variable 1 to model
  Backward Selection
  Variable 1 Variance ratio = 1.082E+02
  Variable 4 Variance ratio = 1.593E+02
  Keeping all current variables
  Forward Selection
  Variable 2 Variance ratio = 5.026E+00
  Variable 3 Variance ratio = 4.236E+00
  Adding variable 2 to model
  Backward Selection
  Variable 1 Variance ratio = 1.540E+02
  Variable 2 Variance ratio = 5.026E+00
  Variable 4 Variance ratio = 1.863E+00
  Dropping variable 4 from model
  Forward Selection
  Variable 3 Variance ratio = 1.832E+00
  Variable 4 Variance ratio = 1.863E+00
  Finished Stepwise Selection
  Fitted model summary:
  Term          Estimate     Standard Error
  Intercept:    5.258e+01    2.294e+00
  Variable: 1   1.468e+00    1.213e-01
  Variable: 2   6.623e-01    4.585e-02
  RMS is 5.790e+00
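  Stepwise selection by variance-ratio (F) tests, as traced above, can be approximated with greedy feature selection; the sketch below uses scikit-learn's SequentialFeatureSelector with cross-validated scoring on made-up data. It performs forward selection only, unlike Clarke's sweep-based procedure, which also reconsiders and drops variables.

    import numpy as np
    from sklearn.feature_selection import SequentialFeatureSelector
    from sklearn.linear_model import LinearRegression

    # Toy data: 13 observations on 4 candidate explanatory variables.
    rng = np.random.default_rng(4)
    x = rng.normal(size=(13, 4))
    y = 52.6 + 1.5 * x[:, 0] + 0.7 * x[:, 1] + rng.normal(scale=0.5, size=13)

    # Greedy forward selection of two variables by cross-validated fit.
    selector = SequentialFeatureSelector(LinearRegression(),
                                         n_features_to_select=2,
                                         direction="forward").fit(x, y)
    print("selected columns:", np.flatnonzero(selector.get_support()))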
- naginterfaces.library.examples.correg.lmm_init_combine_ex.main()
  Example for naginterfaces.library.correg.lmm_init_combine(), including calls to naginterfaces.library.blgm.lm_submodel(), naginterfaces.library.blgm.lm_describe_data(), naginterfaces.library.correg.lmm_init(), naginterfaces.library.correg.lmm_fit() and naginterfaces.library.blgm.handle_free().
  Multi-level linear mixed effects regression model using restricted maximum likelihood.
  The data used in this example was simulated.

  >>> main()
  naginterfaces.library.correg.lmm_init_combine Python Example Results.
  Linear mixed effects regression model using REML.
  Random Parameter Estimates
  ==========================
  Estimate  Standard  Label
            Error
     0.683     0.506  V7 . V12 (Lvl 1)
  ...
     0.504     2.693  V4 (Lvl 3) . V11 (Lvl 3) . V10 (Lvl 2) . V12 (Lvl 3)
  Fixed Parameter Estimates
  =========================
  Estimate  Standard  Label
            Error
     1.643     2.460  Intercept
    -1.622     0.855  V1 (Lvl 2)
    -2.482     1.142  V2 (Lvl 2)
     0.462     1.214  V2 (Lvl 3)
  Variance Components
  ===================
  Estimate  Label
     0.563  V7 . V12
     5.820  V8 . V12
    10.860  V9 . V12
    19.628  V5 . V11 . V12
    40.534  V6 . V11 . V12
    36.323  V3 . V11 . V10 . V12
    12.451  V4 . V11 . V10 . V12
  Sigma^2 = 0.003
  -2 Log Likelihood = 608.195
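  A much simpler member of the same model family, a random-intercept linear mixed model fitted by REML, can be sketched with statsmodels as below; this is an illustration of the technique, not of the NAG multi-level interface, and the two-level simulated data are assumptions.

    import numpy as np
    import statsmodels.api as sm

    # Simulated two-level data: 6 observations in each of 10 groups.
    rng = np.random.default_rng(5)
    groups = np.repeat(np.arange(10), 6)
    x = rng.normal(size=groups.size)
    y = (1.6 + 0.8 * x
         + rng.normal(scale=2.0, size=10)[groups]   # random intercepts
         + rng.normal(size=groups.size))            # residual noise

    # Random-intercept linear mixed model fitted by REML.
    res = sm.MixedLM(y, sm.add_constant(x), groups=groups).fit(reml=True)
    print(res.summary())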
- naginterfaces.library.examples.correg.pls_ex.main()
  Example for naginterfaces.library.correg.pls().
  Partial least squares (PLS) regression using singular value decomposition, parameter estimates, predictions.

  >>> main()
  naginterfaces.library.correg.pls Python Example Results.
  PLS regression using SVD; param. estimates; predictions.
  Begin regression.
  Begin estimation.
  Begin prediction.
  Predictions:
  [  0.2133
     0.5153
     0.1438
     0.4460
     0.1716
     2.4808
     0.0963
     1.4476
    -0.1546
    -0.5492
     0.5393
     0.2685
    -1.1333
     1.7974
     0.4972
  ]
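  An equivalent open-source fit-and-predict workflow is shown below with scikit-learn's PLSRegression on made-up data; it illustrates partial least squares generally and is not the NAG interface.

    import numpy as np
    from sklearn.cross_decomposition import PLSRegression

    # Toy data: 15 observations, 5 predictors, 1 response (illustrative only).
    rng = np.random.default_rng(6)
    x = rng.normal(size=(15, 5))
    y = x @ np.array([0.4, 0.0, -0.3, 0.0, 0.2]) + 0.1 * rng.normal(size=15)

    # Fit a two-factor PLS model and predict the training responses.
    pls = PLSRegression(n_components=2).fit(x, y)
    print(np.round(pls.predict(x).ravel(), 4))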
- naginterfaces.library.examples.correg.quantile_linreg_ex.main()
  Example for naginterfaces.library.correg.quantile_linreg().
  Multiple linear quantile regression (comprehensive interface).

  >>> main()
  naginterfaces.library.correg.quantile_linreg Python Example Results.
  Quantile regression model fitted to Engels' 1857 study of household expenditure on food.
  Quantile: 0.100
        Lower      Parameter  Upper
        Limit      Estimate   Limit
  0    74.946     110.142    145.337
  1     0.370       0.402      0.433
  Covariance matrix:
  [  3.191e+02
    -2.541e-01,  2.587e-04
  ]
  Quantile: 0.250
        Lower      Parameter  Upper
        Limit      Estimate   Limit
  0    64.232      95.483    126.735
  1     0.446       0.474      0.502
  Covariance matrix:
  [  2.516e+02
    -2.004e-01,  2.039e-04
  ]
  Quantile: 0.500
        Lower      Parameter  Upper
        Limit      Estimate   Limit
  0    55.399      81.482    107.566
  1     0.537       0.560      0.584
  Covariance matrix:
  [  1.753e+02
    -1.396e-01,  1.421e-04
  ]
  Quantile: 0.750
        Lower      Parameter  Upper
        Limit      Estimate   Limit
  0    41.372      62.396     83.421
  1     0.625       0.644      0.663
  Covariance matrix:
  [  1.139e+02
    -9.068e-02,  9.230e-05
  ]
  Quantile: 0.900
        Lower      Parameter  Upper
        Limit      Estimate   Limit
  0    26.829      67.351    107.873
  1     0.650       0.686      0.723
  Covariance matrix:
  [  4.230e+02
    -3.369e-01,  3.429e-04
  ]
  First 10 residuals:
            Quantile
  Obs.       0.10000     0.25000     0.50000     0.75000     0.90000
     1     -23.10718   -38.84219   -61.00711   -77.14462   -99.86551
     2     -16.70358   -41.20981   -73.81193  -100.11463  -127.96277
     3      13.48419   -37.04518  -100.61322  -157.07478  -200.13481
     4      36.09526     4.52393   -36.48522   -70.97584  -102.95390
     5      83.74310    44.08476    -6.54743   -50.41028   -87.11562
     6     143.66660    89.90799    22.49734   -37.70668   -82.65437
     7     187.39134   142.05288    84.66171    34.21603    -5.80963
     8     196.90443   140.73220    70.44951     7.44831   -38.91027
     9     194.55254   114.45726    15.70761   -75.01861  -135.36147
    10     105.62394    12.32563  -102.13482  -208.16238  -276.22311
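  Linear quantile regression at several quantiles can also be sketched with statsmodels' QuantReg, as below; the income/expenditure style data are simulated and the sketch does not reproduce the confidence limits or covariance matrices of the comprehensive NAG interface.

    import numpy as np
    import statsmodels.api as sm

    # Simulated household income and food expenditure (illustrative only).
    rng = np.random.default_rng(7)
    income = rng.uniform(300.0, 3000.0, size=100)
    food = 80.0 + 0.5 * income + rng.normal(scale=0.1 * income)

    # Fit a linear quantile regression at each requested quantile.
    exog = sm.add_constant(income)
    for tau in (0.10, 0.25, 0.50, 0.75, 0.90):
        res = sm.QuantReg(food, exog).fit(q=tau)
        print(tau, np.round(res.params, 3))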
- naginterfaces.library.examples.correg.ridge_opt_ex.main()
  Example for naginterfaces.library.correg.ridge_opt().
  Ridge regression, optimizing a ridge regression parameter.

  >>> main()
  naginterfaces.library.correg.ridge_opt Python Example Results.
  Ridge regression optimizing GCV prediction error for a body fat model.
  Value of ridge parameter: 0.0712
  Sum of squares of residuals: 1.0917e+02
  Degrees of freedom: 16
  Number of effective parameters: 2.9059
  Parameter estimates
   1   20.1950
   2    9.7934
   3    9.9576
   4   -2.0125
  Number of iterations: 6
  Ridge parameter minimises GCV
  Estimated prediction errors:
  GCV    = 7.4718
  UEV    = 6.3862
  FPE    = 7.3141
  BIC    = 8.2380
  LOO CV = 7.5495
  Residuals
   1   -1.9894
   2    3.5469
   3   -3.0392
   4   -3.0309
   5   -0.1899
   6   -0.3146
   7    0.9775
   8    4.0157
   9    2.5332
  10   -2.3560
  11    0.5446
  12    2.3989
  13   -4.0876
  14    3.2778
  15    0.2894
  16    0.7330
  17   -0.7116
  18   -0.6092
  19   -2.9995
  20    1.0110
  Variance inflation factors
   1    0.2928
   2    0.4162
   3    0.8089
>>> main() naginterfaces.library.correg.ridge_opt Python Example Results. Ridge regression optimizing GCV prediction error for a body fat model. Value of ridge parameter: 0.0712 Sum of squares of residuals: 1.0917e+02 Degrees of freedom: 16 Number of effective parameters: 2.9059 Parameter estimates 1 20.1950 2 9.7934 3 9.9576 4 -2.0125 Number of iterations: 6 Ridge parameter minimises GCV Estimated prediction errors: GCV = 7.4718 UEV = 6.3862 FPE = 7.3141 BIC = 8.2380 LOO CV = 7.5495 Residuals 1 -1.9894 2 3.5469 3 -3.0392 4 -3.0309 5 -0.1899 6 -0.3146 7 0.9775 8 4.0157 9 2.5332 10 -2.3560 11 0.5446 12 2.3989 13 -4.0876 14 3.2778 15 0.2894 16 0.7330 17 -0.7116 18 -0.6092 19 -2.9995 20 1.0110 Variance inflation factors 1 0.2928 2 0.4162 3 0.8089