naginterfaces.library.correg.linregm_rssq_stat
- naginterfaces.library.correg.linregm_rssq_stat(n, sigsq, tss, nterms, rss, mean='M')
linregm_rssq_stat calculates $R^2$ and $C_p$-values from the residual sums of squares for a series of linear regression models.
For full information please refer to the NAG Library document for g02ec
https://support.nag.com/numeric/nl/nagdoc_30.2/flhtml/g02/g02ecf.html
- Parameters
- n : int
$n$, the number of observations used in the regression model.
- sigsq : float
The best estimate of the true variance of the errors, $\hat{\sigma}^2$.
- tss : float
The total sum of squares for the regression model.
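- nterms : int, array-like, shape $(\mathit{nmod})$
$\mathrm{nterms}[i-1]$ must contain the number of independent variables (not counting the mean) fitted to the $i$th model, for $i = 1, 2, \ldots, \mathit{nmod}$, where $\mathit{nmod}$ is the number of models.
- rss : float, array-like, shape $(\mathit{nmod})$
$\mathrm{rss}[i-1]$ must contain the residual sum of squares for the $i$th model.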
- mean : str, length 1, optional
Indicates if a mean term is to be included.
mean = 'M': a mean term, intercept, will be included in the model.
mean = 'Z': the model will pass through the origin, zero-point.
- Returns
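- rsq : float, ndarray, shape $(\mathit{nmod})$
$\mathrm{rsq}[i-1]$ contains the $R^2$-value for the $i$th model, for $i = 1, 2, \ldots, \mathit{nmod}$.
- cp : float, ndarray, shape $(\mathit{nmod})$
$\mathrm{cp}[i-1]$ contains the $C_p$-value for the $i$th model, for $i = 1, 2, \ldots, \mathit{nmod}$.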
- Raises
- NagValueError
- (errno )
On entry, mean = ⟨value⟩.
Constraint: mean = 'M' or 'Z'.
- (errno )
On entry, nmod = ⟨value⟩.
Constraint: nmod > 0.
- (errno )
On entry, sigsq = ⟨value⟩.
Constraint: sigsq > 0.0.
- (errno )
On entry, tss = ⟨value⟩.
Constraint: tss > 0.0.
- (errno )
On entry: the number of parameters, $p$, is ⟨value⟩ and n = ⟨value⟩.
Constraint: .
- (errno )
On entry, $\mathrm{rss}[i]$ = ⟨value⟩ and tss = ⟨value⟩.
Constraint: $\mathrm{rss}[i] \le \mathrm{tss}$, for all $i$.
- (errno )
A value of $C_p$ is less than $0.0$. This may occur if sigsq is too large or if rss, n or IP are incorrect.
- Notes
When selecting a linear regression model for a set of observations a balance has to be found between the number of independent variables in the model and fit as measured by the residual sum of squares. The more variables included the smaller will be the residual sum of squares. Two statistics can help in selecting the best model.
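$R^2$ represents the proportion of variation in the dependent variable that is explained by the independent variables.
$$R^2 = \frac{\text{Regression Sum of Squares}}{\text{Total Sum of Squares}},$$
where
$\text{Total Sum of Squares} = \mathrm{tss} = \sum (y - \bar{y})^2$ (if mean is fitted, otherwise $\mathrm{tss} = \sum y^2$) and
$\text{Regression Sum of Squares} = \mathrm{RegSS} = \mathrm{tss} - \mathrm{rss}$, where $\mathrm{rss} = \text{Residual Sum of Squares} = \sum (y - \hat{y})^2$.
The $R^2$-values can be examined to find a model with a high $R^2$-value but with a small number of independent variables.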
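The $R^2$ definition above can be sketched in plain NumPy. This is only an illustration of the arithmetic, not the NAG implementation: the helper name `r_squared`, its argument checks, and the sample numbers are all assumptions made up for this example.

```python
import numpy as np


def r_squared(tss, rss):
    """Return R^2 = (tss - rss) / tss for each model (hypothetical helper)."""
    rss = np.asarray(rss, dtype=float)
    # Assumed sanity checks: a positive total sum of squares, and no
    # residual sum of squares larger than the total.
    if tss <= 0.0:
        raise ValueError("tss must be positive")
    if np.any(rss > tss):
        raise ValueError("each rss must not exceed tss")
    # Regression sum of squares divided by total sum of squares.
    return (tss - rss) / tss


# Made-up values: one total sum of squares, three candidate models.
tss = 100.0
rss = [40.0, 25.0, 20.0]
rsq = r_squared(tss, rss)
print(rsq)
```

Because adding variables can only reduce the residual sum of squares, the values in `rsq` never decrease as nested models grow, which is why $R^2$ alone cannot pick a model and is weighed against the number of variables.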
$C_p$ statistic.
$$C_p = \frac{\mathrm{rss}}{\hat{\sigma}^2} - (n - 2p),$$
where $p$ is the number of parameters (including the mean) in the model and $\hat{\sigma}^2$ is an estimate of the true variance of the errors. This can often be obtained from fitting the full model.
A well fitting model will have $C_p \simeq p$. $C_p$ is often plotted against $p$ to see which models are closest to the $C_p = p$ line.
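As a rough sketch of the $C_p$ calculation, assuming the usual form $C_p = \mathrm{rss}/\hat{\sigma}^2 - (n - 2p)$, the statistic can be computed with NumPy and compared against the $C_p = p$ line. The helper `cp_statistic`, the choice to estimate the error variance from the largest model, and the data are illustrative assumptions, not part of naginterfaces.

```python
import numpy as np


def cp_statistic(n, sigsq, nterms, rss, mean=True):
    """Return (p, cp) with cp = rss / sigsq - (n - 2 * p) (hypothetical helper)."""
    nterms = np.asarray(nterms, dtype=int)
    rss = np.asarray(rss, dtype=float)
    # p counts the fitted parameters, including the mean term when present.
    p = nterms + (1 if mean else 0)
    cp = rss / sigsq - (n - 2 * p)
    return p, cp


# Made-up data: 20 observations, three models with 1, 2 and 3 variables.
n = 20
nterms = [1, 2, 3]
rss = [40.0, 25.0, 20.0]

# Assumption: estimate the error variance from the largest (full) model,
# as its residual sum of squares over its residual degrees of freedom.
sigsq = rss[-1] / (n - (nterms[-1] + 1))

p, cp = cp_statistic(n, sigsq, nterms, rss)

# Index of the model whose C_p lies closest to the C_p = p line.
closest = int(np.argmin(np.abs(cp - p)))
print(p, cp, closest)
```

When the variance is estimated from the full model in this way, that model has $C_p = p$ by construction, so the comparison is mainly useful for judging how far the smaller models sit from the line.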
linregm_rssq_stat may be called after linregm_rssq(), which calculates the residual sums of squares for all possible linear regression models.
- References
Draper, N R and Smith, H, 1985, Applied Regression Analysis, (2nd Edition), Wiley
Weisberg, S, 1985, Applied Linear Regression, Wiley