NAG Library Function Document
nag_regress_confid_interval (g02cbc)
1 Purpose
nag_regress_confid_interval (g02cbc) performs a simple linear regression with or without a constant term. The data is optionally weighted, and confidence intervals are calculated for the predicted and average values of y at a given x.
2 Specification
#include <nag.h>
#include <nagg02.h>

void nag_regress_confid_interval (Nag_SumSquare mean,
    Integer n,
    const double x[],
    const double y[],
    const double wt[],
    double clm,
    double clp,
    double yhat[],
    double yml[],
    double ymu[],
    double yl[],
    double yu[],
    double h[],
    double res[],
    double *rms,
    NagError *fail)
3 Description
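nag_regress_confid_interval (g02cbc) fits a straight line model of the form
$$E(y) = a + bx,$$
where $E(y)$ is the expected value of the variable $y$, to the data points
$$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n),$$
such that
$$y_i = a + b x_i + e_i, \quad i = 1, 2, \ldots, n,$$
where the $e_i$ values are independent random errors. The $i$th data point may have an associated weight $w_i$. The values of $a$ and $b$ are estimated by minimizing $\sum_{i=1}^{n} w_i e_i^2$ (if the weights option is not selected then $w_i = 1$). The fitted values $\hat{y}_i$ are calculated using
$$\hat{y}_i = \hat{a} + \hat{b} x_i,$$
where
$$\hat{a} = \bar{y} - \hat{b}\bar{x}, \qquad \hat{b} = \frac{\sum_{i=1}^{n} w_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} w_i (x_i - \bar{x})^2},$$
and the weighted means $\bar{x}$ and $\bar{y}$ are given by
$$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} \quad \text{and} \quad \bar{y} = \frac{\sum_{i=1}^{n} w_i y_i}{\sum_{i=1}^{n} w_i}.$$
The residuals of the regression are calculated using
$$\mathrm{res}_i = y_i - \hat{y}_i$$
and the residual mean square about the regression, $rms$, is determined using
$$rms = \frac{\sum_{i=1}^{n} w_i (y_i - \hat{y}_i)^2}{df},$$
where $df$ (the number of degrees of freedom) has the following values
- $df = \sum_{i=1}^{n} w_i - 2$ where mean = Nag_AboutMean
- $df = \sum_{i=1}^{n} w_i - 1$ where mean = Nag_AboutZero.
Note: the weights should be scaled to give the required degrees of freedom.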
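The fitting step can be illustrated with a short, self-contained C sketch. It is not the NAG implementation: the `LineFit` struct and the `fit_line` function are hypothetical names introduced here, and only the case with a constant term is covered. It computes the weighted means, the slope and intercept, the residuals, and the residual mean square with the sum of the weights minus 2 as the degrees of freedom.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type for this sketch; not part of the NAG interface. */
typedef struct {
    double a;    /* intercept estimate */
    double b;    /* slope estimate */
    double xbar; /* weighted mean of x */
    double sw;   /* sum of the weights */
    double sxx;  /* sum of w_i * (x_i - xbar)^2 */
    double rms;  /* residual mean square */
    double df;   /* degrees of freedom: sum of weights minus 2 */
} LineFit;

/* Weighted straight-line fit with a constant term.
   wt may be NULL, in which case every weight is taken as 1.
   yhat[] and res[] (length n) receive fitted values and residuals. */
static LineFit fit_line(size_t n, const double x[], const double y[],
                        const double wt[], double yhat[], double res[])
{
    LineFit f = {0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
    double swx = 0.0, swy = 0.0, sxy = 0.0, rss = 0.0, ybar;
    size_t i;

    assert(n >= 2);
    for (i = 0; i < n; i++) {
        double w = wt ? wt[i] : 1.0;
        f.sw += w;
        swx += w * x[i];
        swy += w * y[i];
    }
    f.xbar = swx / f.sw;
    ybar = swy / f.sw;

    for (i = 0; i < n; i++) {
        double w = wt ? wt[i] : 1.0;
        double dx = x[i] - f.xbar;
        f.sxx += w * dx * dx;
        sxy += w * dx * (y[i] - ybar);
    }
    assert(f.sxx > 0.0); /* the x values must not all be identical */
    f.b = sxy / f.sxx;
    f.a = ybar - f.b * f.xbar;

    for (i = 0; i < n; i++) {
        double w = wt ? wt[i] : 1.0;
        yhat[i] = f.a + f.b * x[i];
        res[i] = y[i] - yhat[i];
        rss += w * res[i] * res[i];
    }
    f.df = f.sw - 2.0;
    assert(f.df > 0.0); /* otherwise the residual mean square is undefined */
    f.rms = rss / f.df;
    return f;
}
```

For the four unweighted points (1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), working the formulas by hand gives a slope of 1.94, an intercept of 0.15 and a residual mean square of 0.041 on 2 degrees of freedom.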
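The function calculates the predicted value of $y$ for a given value of $x$, $x_i$, which is
$$\hat{y}_i = \hat{a} + \hat{b} x_i;$$
this prediction has a standard error
$$\mathrm{serr}_{\mathrm{pred}} = \sqrt{rms}\,\sqrt{1 + \frac{1}{\sum_{j=1}^{n} w_j} + \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} w_j (x_j - \bar{x})^2}}.$$
The $(1 - \alpha)$ confidence interval for this estimation of $y$ is given by
$$\hat{y}_i \pm t_{df}(1 - \alpha/2) \cdot \mathrm{serr}_{\mathrm{pred}},$$
where $t_{df}(1 - \alpha/2)$ refers to the $(1 - \alpha/2)$ point of the $t$ distribution with $df$ degrees of freedom (e.g., when $df = 20$ and $\alpha = 0.1$, $t_{20}(0.95) = 1.725$). For example, if you specify the probability clp = 0.6 (i.e., $\alpha = 0.4$) then the lower limit of this interval is
$$\hat{y}_i - t_{df}(0.8) \cdot \mathrm{serr}_{\mathrm{pred}}$$
and the upper limit is
$$\hat{y}_i + t_{df}(0.8) \cdot \mathrm{serr}_{\mathrm{pred}}.$$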
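The mean value of $y$ at $x_i$ is estimated by the fitted value $\hat{y}_i$. This has a standard error of
$$\mathrm{serr}_{\mathrm{mean}} = \sqrt{rms}\,\sqrt{\frac{1}{\sum_{j=1}^{n} w_j} + \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} w_j (x_j - \bar{x})^2}}$$
and a $(1 - \alpha)$ confidence interval is given by
$$\hat{y}_i \pm t_{df}(1 - \alpha/2) \cdot \mathrm{serr}_{\mathrm{mean}}.$$
For example, if you specify the probability clm = 0.6 (i.e., $\alpha = 0.4$) then the lower limit of this interval is
$$\hat{y}_i - t_{df}(0.8) \cdot \mathrm{serr}_{\mathrm{mean}}$$
and the upper limit is
$$\hat{y}_i + t_{df}(0.8) \cdot \mathrm{serr}_{\mathrm{mean}}.$$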
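The leverage, $h_i$, is a measure of the influence a value $y_i$ has on the fitted line at that point, $\hat{y}_i$. The leverage is given by
$$h_i = \frac{w_i}{\sum_{j=1}^{n} w_j} + \frac{w_i (x_i - \bar{x})^2}{\sum_{j=1}^{n} w_j (x_j - \bar{x})^2},$$
so it can be seen that the standard errors of the fitted mean value and of the prediction can be written as
$$\mathrm{serr}_{\mathrm{mean}} = \sqrt{rms}\,\sqrt{\frac{h_i}{w_i}}, \qquad \mathrm{serr}_{\mathrm{pred}} = \sqrt{rms}\,\sqrt{1 + \frac{h_i}{w_i}}.$$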
Similar formulae can be derived for the case when the line goes through the origin, that is, $a = 0$.
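The document does not list the through-origin formulae. As a sketch, the standard weighted least squares results for the model $y_i = b x_i + e_i$ are given below; it is an assumption that the function uses exactly these forms.

```latex
\hat{b} = \frac{\sum_{i=1}^{n} w_i x_i y_i}{\sum_{i=1}^{n} w_i x_i^2},
\qquad
\hat{y}_i = \hat{b} x_i,
\qquad
rms = \frac{\sum_{i=1}^{n} w_i (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} w_i - 1},

h_i = \frac{w_i x_i^2}{\sum_{j=1}^{n} w_j x_j^2},
\qquad
\mathrm{serr}_{\mathrm{mean}} = \sqrt{rms}\,\sqrt{\frac{h_i}{w_i}},
\qquad
\mathrm{serr}_{\mathrm{pred}} = \sqrt{rms}\,\sqrt{1 + \frac{h_i}{w_i}}.
```

Only one parameter is estimated in this case, so the leverages sum to 1 and the degrees of freedom are the sum of the weights minus 1.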
4 References
Snedecor G W and Cochran W G (1967) Statistical Methods Iowa State University Press
5 Arguments
- 1: mean – Nag_SumSquare, Input
On entry: indicates whether nag_regress_confid_interval (g02cbc) is to include a constant term in the regression.
  - mean = Nag_AboutMean: the constant term, $a$, is included.
  - mean = Nag_AboutZero: the constant term, $a$, is not included, i.e., $a = 0$.
Constraint: mean = Nag_AboutMean or Nag_AboutZero.
- 2: n – Integer, Input
On entry: $n$, the number of observations.
Constraints:
  - if mean = Nag_AboutMean, n ≥ 2;
  - if mean = Nag_AboutZero, n ≥ 1.
- 3: x[n] – const double, Input
On entry: observations on the independent variable, $x$.
Constraint: the values of $x$ must not all be identical.
- 4: y[n] – const double, Input
On entry: observations on the dependent variable, $y$.
- 5: wt[n] – const double, Input
On entry: if weighted estimates are required then wt must contain the weights to be used in the weighted regression. Usually wt[i−1] will be an integral value corresponding to the number of observations associated with the $i$th data point, or zero if the $i$th data point is to be ignored. The sum of the weights therefore represents the effective total number of observations used to create the regression line.
If weights are not provided then wt must be set to NULL and the effective number of observations is n.
Constraint: if wt is not NULL, wt[i−1] ≥ 0.0, for $i = 1, 2, \ldots, n$.
- 6: clm – double, Input
On entry: the confidence level for the confidence intervals for the mean.
Constraint: 0.0 < clm < 1.0.
- 7: clp – double, Input
On entry: the confidence level for the prediction intervals.
Constraint: 0.0 < clp < 1.0.
- 8: yhat[n] – double, Output
On exit: the fitted values, $\hat{y}_i$.
- 9: yml[n] – double, Output
On exit: yml[i−1] contains the lower limit of the confidence interval for the regression line at $x_i$.
- 10: ymu[n] – double, Output
On exit: ymu[i−1] contains the upper limit of the confidence interval for the regression line at $x_i$.
- 11: yl[n] – double, Output
On exit: yl[i−1] contains the lower limit of the confidence interval for the individual y value at $x_i$.
- 12: yu[n] – double, Output
On exit: yu[i−1] contains the upper limit of the confidence interval for the individual y value at $x_i$.
- 13: h[n] – double, Output
On exit: the leverage of each observation on the regression.
- 14: res[n] – double, Output
On exit: the residuals of the regression.
- 15: rms – double *, Output
On exit: the residual mean square about the regression.
- 16: fail – NagError *, Input/Output
The NAG error argument (see Section 3.6 in the Essential Introduction).
6 Error Indicators and Warnings
- NE_BAD_PARAM
On entry, argument mean had an illegal value.
- NE_INT_ARG_LT
On entry, n = ⟨value⟩.
Constraint: if mean = Nag_AboutMean, n ≥ 2.
On entry, n = ⟨value⟩.
Constraint: if mean = Nag_AboutZero, n ≥ 1.
- NE_NEG_WEIGHT
On entry, at least one of the weights is negative.
- NE_REAL_ARG_GE
On entry, clm must not be greater than or equal to 1.0: clm = ⟨value⟩.
On entry, clp must not be greater than or equal to 1.0: clp = ⟨value⟩.
- NE_REAL_ARG_LE
On entry, clm must not be less than or equal to 0.0: clm = ⟨value⟩.
On entry, clp must not be less than or equal to 0.0: clp = ⟨value⟩.
- NE_SW_LOW
On entry, the sum of elements of wt must be greater than 1.0 if mean = Nag_AboutZero and 2.0 if mean = Nag_AboutMean.
- NE_WT_LOW
On entry, wt must contain at least 1 positive element if mean = Nag_AboutZero or at least 2 positive elements if mean = Nag_AboutMean.
- NE_X_IDEN
On entry, all elements of x are equal.
- NW_RMS_EQ_ZERO
Residual mean sum of squares is zero, i.e., a perfect fit was obtained.
7 Accuracy
The computations are believed to be stable.
8 Parallelism and Performance
Not applicable.
9 Further Comments
None.
10 Example
A program to calculate the fitted value of y and the upper and lower limits of the confidence interval for the regression line as well as the individual y values.
10.1 Program Text
Program Text (g02cbce.c)
10.2 Program Data
Program Data (g02cbce.d)
10.3 Program Results
Program Results (g02cbce.r)