The general linear regression model is defined by
$$y = X\beta + \varepsilon,$$
where
$y$ is a vector of length $n$ of the dependent variable,
$X$ is an $n$ by $p$ matrix of the independent variables,
$\beta$ is a vector of length $p$ of unknown parameters,
and $\varepsilon$ is a vector of length $n$ of unknown random errors such that $\operatorname{var}\varepsilon = \sigma^2 I$.
The residuals are given by
$$r = y - \hat{y} = y - X\hat{\beta}.$$
The fitted values, $\hat{y} = X\hat{\beta}$, can be written as $Hy$ for an $n$ by $n$ matrix $H$. The $i$th diagonal element of $H$, $h_i$, gives a measure of the influence of the $i$th value of the independent variables on the fitted regression model. The values of $r$ and the $h_i$ are returned by g02dac.
g02fac calculates statistics which help to indicate whether an observation is extreme and has an undue influence on the fit of the regression model. Two types of standardized residual are calculated:
(a) The $i$th residual is standardized by its variance when the estimate of $\sigma^2$, $s^2$, is calculated from all the data; known as internal studentization.
(b) The $i$th residual is standardized by its variance when the estimate of $\sigma^2$, $s_{-i}^2$, is calculated from the data excluding the $i$th observation; known as external studentization.
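Written out, the two definitions above correspond to the standard formulas below. The labels $RS_i$ and $RS_i^{*}$ are introduced here for convenience and are an assumption about notation; $r_i$ is the $i$th residual, $h_i$ the $i$th diagonal element of $H$, $s^2$ the residual mean square from all $n$ observations, and $s_{-i}^2$ the estimate of $\sigma^2$ with the $i$th observation excluded.

$$
\begin{aligned}
RS_i &= \frac{r_i}{s\sqrt{1-h_i}}, \\
RS_i^{*} &= \frac{r_i}{s_{-i}\sqrt{1-h_i}} = RS_i\,\sqrt{\frac{n-p-1}{n-p-RS_i^{2}}}.
\end{aligned}
$$

The second equality follows from $(n-p-1)\,s_{-i}^2 = (n-p)\,s^2 - r_i^2/(1-h_i)$, so the externally studentized residual can be obtained without refitting the model.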
The two measures of influence are:
(a) Cook's $D$
(b) Atkinson's $T$
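For reference, the usual textbook forms of these two measures are given below, with $RS_i$ and $RS_i^{*}$ denoting the internally and externally studentized residuals described above (this notation is an assumption, chosen for illustration):

$$
D_i = \frac{1}{p}\,RS_i^{2}\,\frac{h_i}{1-h_i},
\qquad
T_i = \left|RS_i^{*}\right|\sqrt{\frac{n-p}{p}\cdot\frac{h_i}{1-h_i}}.
$$

Both grow with the leverage ratio $h_i/(1-h_i)$, so an observation with a moderate residual can still be flagged as influential if its $h_i$ is close to 1.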
Atkinson A C (1981) Two graphical displays for outlying and influential observations in regression Biometrika 68 13–20
1: n – Integer, Input
On entry: the number of observations included in the regression, $n$.
Constraint: n > ip + 1.
2: ip – Integer, Input
On entry: the number of linear parameters estimated in the regression model, $p$.
Constraint: ip ≥ 1.
3: nres – Integer, Input
On entry: the number of residuals.
Constraint: 1 ≤ nres ≤ n.
4: res[nres] – const double, Input
On entry: the residuals, $r_i$.
5: h[nres] – const double, Input
On entry: the diagonal elements of $H$, $h_i$, corresponding to the residuals in res.
Constraint: 0.0 < h[i] < 1.0, for i = 0, 1, ..., nres − 1.
6: rms – double, Input
On entry: the estimate of $\sigma^2$ based on all $n$ observations, $s^2$, i.e., the residual mean square.
Constraint: rms > 0.0.
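7: sres[nres × 4] – double, Output
On exit: the standardized residuals and influence statistics.
For the observation with residual given in res[i − 1], for i = 1, 2, ..., nres:
- sres[(i − 1) × 4] is the internally studentized residual
- sres[(i − 1) × 4 + 1] is the externally studentized residual
- sres[(i − 1) × 4 + 2] is Cook's $D$ statistic
- sres[(i − 1) × 4 + 3] is Atkinson's $T$ statistic.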
8: fail – NagError *, Input/Output
The NAG error argument (see Section 7 in the Introduction to the NAG Library CL Interface).
Accuracy is sufficient for all practical purposes.
A set of 24 residuals and $h_i$ values from an 11-parameter model fitted to the cloud seeding data considered in Cook and Weisberg (1982) are input, and the standardized residuals and influence statistics are calculated and printed for the first 10 observations.