Integer, Intent (In)	::	nobs, nvar, levels(nvar), lvnames
Integer, Intent (Inout)	::	ifail
Character (*), Intent (In)	::	vnames(lvnames)
Type (c_ptr), Intent (Inout)	::	hddesc

C Header Interface

#include <nag.h>

void	g22ybf_ (void *hddesc, const Integer nobs, const Integer nvar, const Integer levels[], const Integer lvnames, const char vnames[], Integer *ifail, const Charlen length_vnames)

The routine may be called by the names g22ybf or nagf_blgm_lm_describe_data.
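The C header above passes vnames as a single const char array together with a trailing length_vnames argument. The following is a minimal sketch of how such a buffer might be prepared, assuming the usual Fortran convention that a Character(*) array is stored as contiguous fixed-width, blank-padded fields with no NUL terminators. The helper pack_names is hypothetical and is not part of the NAG Library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a NAG routine).
 * Packs n NUL-terminated names into one flat buffer of fixed-width,
 * blank-padded fields. width is the field width, i.e. the value that
 * would accompany the buffer as length_vnames. Names longer than width
 * are cut off. out must have room for n * width bytes; no NUL
 * terminator is written. */
static void pack_names(const char *const names[], size_t n,
                       size_t width, char *out)
{
    for (size_t j = 0; j < n; ++j) {
        char *field = out + j * width;
        size_t len = strlen(names[j]);
        if (len > width)
            len = width;
        memset(field, ' ', width);     /* blank-pad the whole field */
        memcpy(field, names[j], len);  /* then copy the name over it */
    }
}
```

Under these assumptions, a buffer packed with a width of 50 would be supplied as vnames, with 50 supplied as length_vnames.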

3 Description

Let $D$ denote a data matrix with $n$ observations on $m_{d}$ independent variables, denoted $V_{1}, V_{2}, \dots, V_{m_{d}}$. The $j$th independent variable, $V_{j}$, can be classified as either binary, categorical, ordinal or continuous, where:

Binary: $V_{j}$ can take the value $1$ or $0$ .
Categorical: $V_{j}$ can take one of $L_{j}$ distinct values or levels. Each level represents a discrete category but does not necessarily imply an ordering. The value used to represent each level is, therefore, arbitrary and, by convention and for convenience, is taken to be the integers from $1$ to $L_{j}$ .
Ordinal: As with a categorical variable, $V_{j}$ can take one of $L_{j}$ distinct values or levels. However, unlike a categorical variable, the levels of an ordinal variable imply an ordering and hence the value used to represent each level is not arbitrary. For example, $V_{j} = 4$ implies a value that is twice as large as $V_{j} = 2$.
Continuous: $V_{j}$ can take any real value.

g22ybf returns a G22 handle containing a description of a data matrix, $D$. The data matrix makes no distinction between binary, ordinal or continuous variables.
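Because binary, ordinal and continuous variables are not distinguished, only categorical variables need a level count greater than one in the levels argument. The following is a minimal sketch of that rule. The enumeration var_kind and the function levels_entry are hypothetical names introduced for illustration only.

```c
/* Hypothetical classification of an independent variable. */
typedef enum {
    VAR_BINARY,
    VAR_CATEGORICAL,
    VAR_ORDINAL,
    VAR_CONTINUOUS
} var_kind;

/* Returns the value to store in levels(j) for a variable of the given
 * kind. ncategories is L_j and is only used for categorical variables.
 * Returns 0 if a categorical variable has fewer than one level, which
 * would violate the constraint levels(j) >= 1. */
static int levels_entry(var_kind kind, int ncategories)
{
    if (kind == VAR_CATEGORICAL)
        return ncategories >= 1 ? ncategories : 0;
    return 1; /* binary, ordinal and continuous are all described by 1 */
}
```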

A name can also be assigned to each variable. If names are not supplied then the default vector of names, $\{$'V1', 'V2', $\dots\}$, is used.

4 References

None.

5 Arguments

1: $hddesc$ – Type (c_ptr) Input/Output: On entry: must be set to c_null_ptr; alternatively, an existing G22 handle may be supplied, in which case this routine will destroy the supplied G22 handle as if g22zaf had been called.

On exit: holds a G22 handle to the internal data structure containing a description of the data matrix, $D$ . You must not change the G22 handle other than through the routines in Chapter G22.
2: $nobs$ – Integer Input: On entry: $n$ , the number of observations in the data matrix, $D$ .

Constraint: $nobs \geq 0$ .
3: $nvar$ – Integer Input: On entry: $m_{d}$ , the number of variables in the data matrix, $D$ .

Constraint: $nvar \geq 0$ .
4: $levels (nvar)$ – Integer array Input: On entry: $levels (j)$ contains the number of levels associated with the $j$ th variable of the data matrix, for $j = 1, 2, \dots, nvar$ .
If the $j$th variable is binary, ordinal or continuous, $levels (j)$ should be set to $1$; otherwise $levels (j)$ should be set to the number of levels associated with the $j$th variable, and the corresponding column of the data matrix is assumed to take the values $1$ to $levels (j)$.

Constraint: $levels (i) \geq 1$ , for $i = 1, 2, \dots, nvar$ .
5: $lvnames$ – Integer Input: On entry: the number of variable names supplied in vnames.

Constraint: $lvnames = 0$ or $nvar$ .
6: $vnames (lvnames)$ – Character(*) array Input: On entry: if $lvnames \neq 0$ , $vnames (j)$ must contain the name of the $j$ th variable, for $j = 1, 2, \dots, nvar$ . If $lvnames = 0$ , vnames is not referenced.
The names supplied in vnames should be at most $50$ characters long and be unique. If a name longer than $50$ characters is supplied it will be truncated.

Variable names must not contain any of the characters +.*-:^()@.
7: $ifail$ – Integer Input/Output: On entry: ifail must be set to $0$, $-1$ or $1$ to set behaviour on detection of an error; these values have no effect when no error is detected.
A value of $0$ causes an error message to be printed and program execution to be halted; otherwise program execution continues. A value of $-1$ means that an error message is printed while a value of $1$ means that it is not.

If halting is not appropriate, the value $-1$ or $1$ is recommended. If message printing is undesirable, then the value $1$ is recommended. Otherwise, the value $0$ is recommended. When the value $-1$ or $1$ is used it is essential to test the value of ifail on exit.

On exit: $ifail = 0$ unless the routine detects an error or a warning has been flagged (see Section 6).
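The restrictions on vnames (no forbidden characters, at most 50 characters, unique names) can be checked before the call. The following is a minimal sketch of such checks. The function names are hypothetical, and the comparison is an assumption: the source does not say whether the library treats case or trailing blanks as significant when deciding whether two names are the same.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NAME_LEN 50 /* maximum name length stated for vnames */

/* Characters that variable names must not contain. */
static const char FORBIDDEN[] = "+.*-:^()@";

/* Returns 1 if name contains none of the forbidden characters. */
static int name_chars_ok(const char *name)
{
    return strpbrk(name, FORBIDDEN) == NULL;
}

/* Returns 1 if name is longer than 50 characters and would be truncated. */
static int name_would_truncate(const char *name)
{
    return strlen(name) > MAX_NAME_LEN;
}

/* Returns 1 if no two names agree in their first 50 characters, so that
 * the names remain distinct after any truncation (assumes a plain,
 * case-sensitive byte comparison). */
static int names_unique(const char *const names[], size_t n)
{
    for (size_t i = 0; i < n; ++i)
        for (size_t j = i + 1; j < n; ++j)
            if (strncmp(names[i], names[j], MAX_NAME_LEN) == 0)
                return 0;
    return 1;
}
```

Checks of this kind correspond to the conditions reported through ifail = 61, 62, 63 and 64.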

6 Error Indicators and Warnings

If on entry $ifail = 0$ or $-1$, explanatory error messages are output on the current error message unit (as defined by x04aaf).

Errors or warnings detected by the routine:

$ifail = 11$: On entry, hddesc is not c_null_ptr or a recognised G22 handle.

$ifail = 21$: On entry, $nobs = ⟨ value ⟩$ .
Constraint: $nobs \geq 0$ .

$ifail = 31$: On entry, $nvar = ⟨ value ⟩$ .
Constraint: $nvar \geq 0$ .

$ifail = 41$: On entry, $j = ⟨ value ⟩$ and $levels (j) = ⟨ value ⟩$ .
Constraint: $levels (i) \geq 1$ .

$ifail = 51$: On entry, $lvnames = ⟨ value ⟩$ and $nvar = ⟨ value ⟩$ .
Constraint: $lvnames = 0$ or $nvar$ .

$ifail = 61$: On entry, variable name $i$ contains one or more invalid characters, $i = ⟨ value ⟩$ .

$ifail = 62$: On entry, variable names $i$ and $j$ are not unique, $i = ⟨ value ⟩$ and $j = ⟨ value ⟩$ .

$ifail = 63$: On entry, variable names $i$ and $j$ are not unique (possibly due to truncation), $i = ⟨ value ⟩$ and $j = ⟨ value ⟩$ .
Maximum variable name length is $50$ .

$ifail = 64$: At least one variable name was truncated to $50$ characters. Each truncated name is unique and will be used in all output.

$ifail = -99$: An unexpected error has been triggered by this routine. Please contact NAG.
See Section 7 in the Introduction to the NAG Library FL Interface for further information.

$ifail = -399$: Your licence key may have expired or may not have been installed correctly.
See Section 8 in the Introduction to the NAG Library FL Interface for further information.

$ifail = -999$: Dynamic memory allocation failed.
See Section 9 in the Introduction to the NAG Library FL Interface for further information.
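When ifail is set to $-1$ or $1$ on entry, the value on exit has to be tested by the caller. The following is a minimal sketch of how the exit values listed above might be grouped. The enumeration and function names are hypothetical, and treating ifail = 64 as a warning rather than an error is an assumption based only on its wording (the truncated names are still used).

```c
/* Hypothetical grouping of the ifail exit values listed in this section. */
typedef enum {
    G22_OK,           /* ifail = 0 */
    G22_WARNING,      /* ifail = 64 (assumed to be a warning) */
    G22_BAD_ARGUMENT, /* ifail = 11, 21, 31, 41, 51, 61, 62, 63 */
    G22_SYSTEM_ERROR, /* ifail = -99, -399, -999 */
    G22_UNKNOWN       /* any value not listed in this section */
} ifail_class;

/* Maps an ifail exit value to one of the groups above. */
static ifail_class classify_ifail(int ifail)
{
    switch (ifail) {
    case 0:
        return G22_OK;
    case 64:
        return G22_WARNING;
    case 11: case 21: case 31: case 41:
    case 51: case 61: case 62: case 63:
        return G22_BAD_ARGUMENT;
    case -99: case -399: case -999:
        return G22_SYSTEM_ERROR;
    default:
        return G22_UNKNOWN;
    }
}
```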

7 Accuracy

Not applicable.

8 Parallelism and Performance

g22ybf is not threaded in any implementation.

9 Further Comments

9.1 Internal Changes

Internal changes have been made to this routine as follows:

At Mark 27: Functionality has been expanded allowing the routine to be used with g02jhf. An additional argument has been added to the interface and some of the error exits have been renumbered.

For details of all known issues which have been reported for the NAG Library please refer to the Known Issues.

10 Example

This example performs a linear regression using g02daf. The linear regression model is defined via a text string which is parsed using g22yaf. The corresponding design matrix associated with the model and the dataset described via a call to g22ybf is generated using g22ycf.

Verbose labels for the parameters of the model are constructed using information returned in vinfo by g22ydf.

See also the examples in g22yaf, g22ycf and g22ydf.

11 Optional Parameters

As well as the optional parameters common to all G22 handles described in g22zmf and g22znf, a number of additional optional parameters can be specified for a G22 handle holding the description of a data matrix as returned by g22ybf in hddesc.

Each writeable optional parameter has an associated default value; to set any of them to a non-default value, use g22zmf. The value of an optional parameter can be queried using g22znf.

The remainder of this section can be skipped if you wish to use the default values for all optional parameters.

The following is a list of the optional parameters available. A full description of each optional parameter is provided in Section 11.1.

Number of Observations
Number of Variables
Storage Order

11.1 Description of the Optional Parameters

For each option, we give a summary line, a description of the optional parameter and details of constraints.

The summary line contains:

a parameter value, where the letters $a$ , $i$ and $r$ denote options that take character, integer and real values respectively;
the default value.

Keywords and character values are case and white space insensitive.
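As an illustration of case and white space insensitivity, the following minimal sketch compares two strings while ignoring letter case and all white space. The function option_equal is hypothetical, and the assumption that every white space character is ignored (rather than, for example, only repeated or surrounding blanks) is not confirmed by the source.

```c
#include <ctype.h>

/* Hypothetical helper (not a NAG routine).
 * Returns 1 if a and b are equal when letter case and all white space
 * are ignored, and 0 otherwise. */
static int option_equal(const char *a, const char *b)
{
    for (;;) {
        /* skip white space on both sides */
        while (*a && isspace((unsigned char)*a))
            ++a;
        while (*b && isspace((unsigned char)*b))
            ++b;
        if (*a == '\0' || *b == '\0')
            return *a == *b; /* equal only if both strings are exhausted */
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return 0;
        ++a;
        ++b;
    }
}
```

Under this reading, "Storage Order", "storage order" and "STORAGE  ORDER" would all be treated as the same keyword.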

Number of Observations    $i$

$n$, the number of observations in the data matrix.

Number of Variables    $i$    Read Only

If queried, this optional parameter will return $m_{d}$, the number of variables in the data matrix.

Storage Order    $a$    Default $=$ OBSVAR

This optional parameter states how the data matrix, $D$, will be stored in its input array.

Storage Order = OBSVAR: $D_{ij}$, the value for the $j$th variable of the $i$th observation of the data matrix, is stored in $dat (i, j)$.
Storage Order = VAROBS: $D_{ij}$, the value for the $j$th variable of the $i$th observation of the data matrix, is stored in $dat (j, i)$.

Here dat is the input parameter of the same name in g22ycf.

Constraint: Storage Order = OBSVAR or VAROBS.
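The two storage orders differ only in which index of dat runs over observations. The following is a minimal sketch of the corresponding memory offsets, relying on the fact that Fortran arrays are stored in column-major order. The names storage_order, dat_offset and the leading dimension ld are hypothetical and introduced for illustration only.

```c
#include <stddef.h>

/* Hypothetical mirror of the two Storage Order values. */
typedef enum { OBSVAR, VAROBS } storage_order;

/* Returns the zero-based offset of D_ij within a column-major array dat
 * whose leading dimension (number of rows as declared) is ld.
 * i is the 1-based observation index, j the 1-based variable index. */
static size_t dat_offset(storage_order order, size_t i, size_t j, size_t ld)
{
    size_t row = (order == OBSVAR) ? i : j; /* first index of dat  */
    size_t col = (order == OBSVAR) ? j : i; /* second index of dat */
    return (row - 1) + (col - 1) * ld;
}
```

For a data matrix with three observations and two variables, OBSVAR with ld = 3 places consecutive observations of one variable next to each other in memory, whereas VAROBS with ld = 2 places the variables of one observation next to each other.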

NAG Library Manual, Mark 27.2


NAG FL Interface g22ybf (lm_describe_data)


1 Purpose

2 Specification