NAG CL Interface
g02cbc (linregs_noconst)
1
Purpose
g02cbc performs a simple linear regression with or without a constant term. The data is optionally weighted, and confidence intervals are calculated for the predicted and average values of y at a given x.
2
Specification
void g02cbc (Nag_SumSquare mean,
             Integer n,
             const double x[],
             const double y[],
             const double wt[],
             double clm,
             double clp,
             double yhat[],
             double yml[],
             double ymu[],
             double yl[],
             double yu[],
             double h[],
             double res[],
             double *rms,
             NagError *fail)
The function may be called by the names: g02cbc, nag_correg_linregs_noconst or nag_regress_confid_interval.
3
Description
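g02cbc fits a straight line model of the form,
$$E(y) = a + bx$$
where $E(y)$ is the expected value of the variable $y$, to the data points
$$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$$
such that
$$y_i = a + b x_i + e_i, \quad i = 1, 2, \ldots, n,$$
where the $e_i$ values are independent random errors. The $i$th data point may have an associated weight $w_i$. The values of $a$ and $b$ are estimated by minimizing $\sum_{i=1}^{n} w_i e_i^2$ (if the weights option is not selected then $w_i = 1$). The fitted values $\hat{y}_i$ are calculated using
$$\hat{y}_i = \hat{a} + \hat{b} x_i$$
where
$$\hat{a} = \bar{y} - \hat{b}\bar{x}, \qquad \hat{b} = \frac{\sum_{i=1}^{n} w_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} w_i (x_i - \bar{x})^2}$$
and the weighted means $\bar{x}$ and $\bar{y}$ are given by
$$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}, \qquad \bar{y} = \frac{\sum_{i=1}^{n} w_i y_i}{\sum_{i=1}^{n} w_i}.$$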
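The estimates above can be sketched in plain C. This is an illustrative re-implementation of the formulas, not the library's own code; the name `fit_line` and its return convention are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Weighted least-squares fit of y = a + b*x (hypothetical helper).
   wt may be NULL, in which case every weight is taken as 1.
   Returns 0 on success, -1 if the weights sum to zero or all x coincide. */
int fit_line(size_t n, const double x[], const double y[],
             const double wt[], double *a, double *b)
{
    double sw = 0.0, swx = 0.0, swy = 0.0;
    double sxx = 0.0, sxy = 0.0;
    double xbar, ybar;
    size_t i;

    assert(n >= 2);

    /* Weighted means of x and y. */
    for (i = 0; i < n; i++) {
        double w = wt ? wt[i] : 1.0;
        sw += w;
        swx += w * x[i];
        swy += w * y[i];
    }
    if (sw <= 0.0)
        return -1;
    xbar = swx / sw;
    ybar = swy / sw;

    /* Weighted sums of squares and cross-products about the means. */
    for (i = 0; i < n; i++) {
        double w = wt ? wt[i] : 1.0;
        double dx = x[i] - xbar;
        sxx += w * dx * dx;
        sxy += w * dx * (y[i] - ybar);
    }
    if (sxx == 0.0)
        return -1;

    *b = sxy / sxx;          /* slope estimate */
    *a = ybar - *b * xbar;   /* intercept estimate */
    return 0;
}
```

With integer weights the result matches a fit in which each point is repeated that many times, which is the interpretation of the weights given under the wt argument.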
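The residuals of the regression are calculated using
$$res_i = y_i - \hat{y}_i$$
and the residual mean square about the regression, $rms$, is determined using
$$rms = \frac{\sum_{i=1}^{n} w_i (y_i - \hat{y}_i)^2}{df}$$
where $df$ (the number of degrees of freedom) has the following values
- $df = \sum_{i=1}^{n} w_i - 2$ where mean = Nag_AboutMean
- $df = \sum_{i=1}^{n} w_i - 1$ where mean = Nag_AboutZero.
Note: the weights should be scaled to give the required degrees of freedom.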
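The residuals and the residual mean square can be sketched as follows. The function name `residual_mean_square` and the `about_mean` flag are hypothetical; the coefficients `a` and `b` are assumed to have been estimated already.

```c
#include <assert.h>
#include <stddef.h>

/* Fills res[i] = y[i] - (a + b*x[i]) and returns sum(w*res^2)/df,
   with df = sum(w) - 2 when a constant term was fitted (about_mean != 0)
   and df = sum(w) - 1 otherwise. wt may be NULL (unit weights). */
double residual_mean_square(size_t n, const double x[], const double y[],
                            const double wt[], double a, double b,
                            int about_mean, double res[])
{
    double sw = 0.0, ss = 0.0, df;
    size_t i;

    for (i = 0; i < n; i++) {
        double w = wt ? wt[i] : 1.0;
        res[i] = y[i] - (a + b * x[i]);
        sw += w;
        ss += w * res[i] * res[i];
    }
    df = sw - (about_mean ? 2.0 : 1.0);
    assert(df > 0.0);   /* the sum of weights must exceed the parameter count */
    return ss / df;
}
```

Because the degrees of freedom come from the sum of the weights rather than from the number of points, multiplying every weight by a constant changes the result. That is why the scaling of the weights matters.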
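The predicted value of $y$ at a given value of $x$, $x_i$, is given by
$$\hat{y}_i = \hat{a} + \hat{b} x_i;$$
this prediction has a standard error
$$serr_{pred} = \sqrt{rms}\,\sqrt{1 + \frac{1}{\sum_j w_j} + \frac{(x_i - \bar{x})^2}{\sum_j w_j (x_j - \bar{x})^2}}.$$
The $(1-\alpha)$ confidence interval for this estimation of $y$ is given by
$$\hat{y}_i \pm t_{df}(1 - \alpha/2) \cdot serr_{pred}$$
where $t_{df}(1 - \alpha/2)$ refers to the $(1 - \alpha/2)$ point of the $t$ distribution with $df$ degrees of freedom (e.g., when $df = 20$ and $\alpha = 0.1$, $t_{20}(0.95) \approx 1.725$). If you specify the probability clp, so that $\alpha = 1 - clp$, then the lower limit of this interval is
$$yl_i = \hat{y}_i - t_{df}((1 + clp)/2) \cdot serr_{pred}$$
and the upper limit is
$$yu_i = \hat{y}_i + t_{df}((1 + clp)/2) \cdot serr_{pred}.$$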
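The mean value of $y$ at $x_i$ is estimated by the fitted value $\hat{y}_i$. This has a standard error of
$$serr_{arg} = \sqrt{rms}\,\sqrt{\frac{1}{\sum_j w_j} + \frac{(x_i - \bar{x})^2}{\sum_j w_j (x_j - \bar{x})^2}}$$
and a $(1-\alpha)$ confidence interval is given by
$$\hat{y}_i \pm t_{df}(1 - \alpha/2) \cdot serr_{arg}.$$
If you specify the probability clm, so that $\alpha = 1 - clm$, then the lower limit of this interval is
$$yml_i = \hat{y}_i - t_{df}((1 + clm)/2) \cdot serr_{arg}$$
and the upper limit is
$$ymu_i = \hat{y}_i + t_{df}((1 + clm)/2) \cdot serr_{arg}.$$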
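The leverage, $h_i$, is a measure of the influence a value $x_i$ has on the fitted line at that point, $\hat{y}_i$. The leverage is given by
$$h_i = \frac{1}{\sum_j w_j} + \frac{(x_i - \bar{x})^2}{\sum_j w_j (x_j - \bar{x})^2}$$
so it can be seen that the standard errors of the mean and of the prediction are
$$serr_{arg} = \sqrt{rms}\,\sqrt{h_i}, \qquad serr_{pred} = \sqrt{rms}\,\sqrt{1 + h_i}.$$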
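As a rough illustration, the leverage can be computed directly from the $x$ values and weights. This sketch assumes the form $h_i = 1/\sum_j w_j + (x_i - \bar{x})^2 / \sum_j w_j (x_j - \bar{x})^2$, which is the form consistent with the standard errors in this section; whether the library output applies any further weighting is not confirmed here. The function name `leverage` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Fills h[i] = 1/sum(w) + (x[i]-xbar)^2 / sum(w*(x-xbar)^2).
   wt may be NULL (unit weights). Assumed form, see the lead-in. */
void leverage(size_t n, const double x[], const double wt[], double h[])
{
    double sw = 0.0, swx = 0.0, sxx = 0.0, xbar;
    size_t i;

    for (i = 0; i < n; i++) {
        double w = wt ? wt[i] : 1.0;
        sw += w;
        swx += w * x[i];
    }
    assert(sw > 0.0);
    xbar = swx / sw;

    for (i = 0; i < n; i++) {
        double w = wt ? wt[i] : 1.0;
        double d = x[i] - xbar;
        sxx += w * d * d;
    }
    assert(sxx > 0.0);   /* the x values must not all be identical */

    for (i = 0; i < n; i++) {
        double d = x[i] - xbar;
        h[i] = 1.0 / sw + d * d / sxx;
    }
}
```

With unit weights the leverages sum to 2, the number of fitted parameters, and points far from the mean of $x$ receive the largest values.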
Similar formulae can be derived for the case when the line goes through the origin, that is $a = 0$.
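For the through-origin case, the slope that minimizes $\sum w_i (y_i - b x_i)^2$ is the standard least-squares result $\hat{b} = \sum w_i x_i y_i / \sum w_i x_i^2$. The sketch below uses that textbook formula, which is not taken from the library documentation; `fit_through_origin` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Weighted least-squares slope for the model y = b*x (no constant term).
   wt may be NULL (unit weights).
   Returns 0 on success, -1 if sum(w*x^2) is not positive. */
int fit_through_origin(size_t n, const double x[], const double y[],
                       const double wt[], double *b)
{
    double sxx = 0.0, sxy = 0.0;
    size_t i;

    assert(n >= 1);

    /* Sums are taken about zero rather than about the means. */
    for (i = 0; i < n; i++) {
        double w = wt ? wt[i] : 1.0;
        sxx += w * x[i] * x[i];
        sxy += w * x[i] * y[i];
    }
    if (sxx <= 0.0)
        return -1;

    *b = sxy / sxx;
    return 0;
}
```

Only one parameter is estimated in this model, which matches the $df = \sum w_i - 1$ case described earlier.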
4
References
Snedecor G W and Cochran W G (1967) Statistical Methods Iowa State University Press
5
Arguments
-
1: mean – Nag_SumSquare
Input
-
On entry: indicates whether g02cbc is to include a constant term in the regression.
- mean = Nag_AboutMean: the constant term, $a$, is included.
- mean = Nag_AboutZero: the constant term, $a$, is not included, i.e., $a = 0$.
Constraint: mean = Nag_AboutMean or Nag_AboutZero.
-
2: n – Integer
Input
-
On entry: $n$, the number of observations.
Constraints:
- if mean = Nag_AboutMean, n ≥ 2;
- if mean = Nag_AboutZero, n ≥ 1.
-
3: x[n] – const double
Input
-
On entry: observations on the independent variable, $x$.
Constraint: the values of $x$ must not all be identical.
-
4: y[n] – const double
Input
-
On entry: observations on the dependent variable, $y$.
-
5: wt[n] – const double
Input
-
On entry: if weighted estimates are required then wt must contain the weights to be used in the weighted regression. Usually wt[i-1] will be an integral value corresponding to the number of observations associated with the $i$th data point, or zero if the $i$th data point is to be ignored. The sum of the weights therefore represents the effective total number of observations used to create the regression line.
If weights are not provided then wt must be set to NULL and the effective number of observations is n.
Constraint: if wt is not NULL, wt[i-1] ≥ 0.0, for $i = 1, 2, \ldots, n$.
-
6: clm – double
Input
-
On entry: the confidence level for the confidence intervals for the mean.
Constraint: 0.0 < clm < 1.0.
-
7: clp – double
Input
-
On entry: the confidence level for the prediction intervals.
Constraint: 0.0 < clp < 1.0.
-
8: yhat[n] – double
Output
-
On exit: the fitted values, $\hat{y}_i$.
-
9: yml[n] – double
Output
-
On exit: yml[i-1] contains the lower limit of the confidence interval for the regression line at x[i-1].
-
10: ymu[n] – double
Output
-
On exit: ymu[i-1] contains the upper limit of the confidence interval for the regression line at x[i-1].
-
11: yl[n] – double
Output
-
On exit: yl[i-1] contains the lower limit of the confidence interval for the individual $y$ value at x[i-1].
-
12: yu[n] – double
Output
-
On exit: yu[i-1] contains the upper limit of the confidence interval for the individual $y$ value at x[i-1].
-
13: h[n] – double
Output
-
On exit: the leverage of each observation on the regression.
-
14: res[n] – double
Output
-
On exit: the residuals of the regression.
-
15: rms – double *
Output
-
On exit: the residual mean square about the regression.
-
16: fail – NagError *
Input/Output
-
The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).
6
Error Indicators and Warnings
- NE_BAD_PARAM
-
On entry, argument mean had an illegal value.
- NE_INT_ARG_LT
-
On entry, n = ⟨value⟩.
Constraint: if mean = Nag_AboutMean, n ≥ 2.
On entry, n = ⟨value⟩.
Constraint: if mean = Nag_AboutZero, n ≥ 1.
- NE_NEG_WEIGHT
-
On entry, at least one of the weights is negative.
- NE_REAL_ARG_GE
-
On entry, clm must not be greater than or equal to 1.0: clm = ⟨value⟩.
On entry, clp must not be greater than or equal to 1.0: clp = ⟨value⟩.
- NE_REAL_ARG_LE
-
On entry, clm must not be less than or equal to 0.0: clm = ⟨value⟩.
On entry, clp must not be less than or equal to 0.0: clp = ⟨value⟩.
- NE_SW_LOW
-
On entry, the sum of elements of wt must be greater than 1.0 if mean = Nag_AboutZero and 2.0 if mean = Nag_AboutMean.
- NE_WT_LOW
-
On entry, wt must contain at least 1 positive element if mean = Nag_AboutZero or at least 2 positive elements if mean = Nag_AboutMean.
- NE_X_IDEN
-
On entry, all elements of x are equal.
- NW_RMS_EQ_ZERO
-
Residual mean sum of squares is zero, i.e., a perfect fit was obtained.
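The input conditions listed above can be sketched as a standalone pre-check. This is an illustration only: the `check_result` codes and the `check_inputs` name are hypothetical, and neither the order of the checks nor the decision to test for identical $x$ values only when `n >= 2` is taken from the library. The sum-of-weights thresholds (1 without a constant term, 2 with one) are the values needed to keep the degrees of freedom positive.

```c
#include <stddef.h>

typedef enum {
    CHECK_OK,
    CHECK_N_TOO_SMALL,
    CHECK_NEG_WEIGHT,
    CHECK_TOO_FEW_POSITIVE,
    CHECK_SUM_W_LOW,
    CHECK_BAD_LEVEL,
    CHECK_X_IDENTICAL
} check_result;

/* about_mean != 0 means a constant term is fitted.
   wt may be NULL, meaning unit weights. */
check_result check_inputs(int about_mean, size_t n, const double x[],
                          const double wt[], double clm, double clp)
{
    size_t min_count = about_mean ? 2 : 1;
    size_t positive = 0, i;
    double sum_w = 0.0;
    int all_equal = 1;

    if (n < min_count)
        return CHECK_N_TOO_SMALL;

    for (i = 0; i < n; i++) {
        double w = wt ? wt[i] : 1.0;
        if (w < 0.0)
            return CHECK_NEG_WEIGHT;
        if (w > 0.0)
            positive++;
        sum_w += w;
    }
    if (positive < min_count)
        return CHECK_TOO_FEW_POSITIVE;
    if (sum_w <= (double)min_count)   /* degrees of freedom must be positive */
        return CHECK_SUM_W_LOW;

    if (clm <= 0.0 || clm >= 1.0 || clp <= 0.0 || clp >= 1.0)
        return CHECK_BAD_LEVEL;

    /* Assumption: the identical-x check is applied only when n >= 2. */
    for (i = 1; i < n; i++) {
        if (x[i] != x[0])
            all_equal = 0;
    }
    if (n >= 2 && all_equal)
        return CHECK_X_IDENTICAL;

    return CHECK_OK;
}
```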
7
Accuracy
The computations are believed to be stable.
8
Parallelism and Performance
g02cbc is not threaded in any implementation.
9
Further Comments
None.
10
Example
A program to calculate the fitted value of $y$ and the upper and lower limits of the confidence interval for the regression line as well as the individual $y$ values.
10.1
Program Text
10.2
Program Data
10.3
Program Results