naginterfaces.library.anova.random

naginterfaces.library.anova.random(y, iblock, nt, it, tol, irdf)

random computes the analysis of variance and treatment means and standard errors for a randomized block or completely randomized design.

For full information please refer to the NAG Library document for g04bb:

https://support.nag.com/numeric/nl/nagdoc_30.3/flhtml/g04/g04bbf.html

Parameters

y : float, array-like, shape $(n)$

The observations, in the order described by $\mathrm{iblock}$ and $\mathrm{nt}$.

iblock : int

Indicates the block structure.

$\mathrm{abs}(\mathrm{iblock}) \leq 1$

There are no blocks, i.e., it is a completely randomized design.

$\mathrm{iblock} \geq 2$

There are $\mathrm{iblock}$ blocks and the data should be input by blocks, i.e., $y$ must contain the observations for block $1$ followed by the observations for block $2$, etc.

$\mathrm{iblock} \leq -2$

There are $\mathrm{abs}(\mathrm{iblock})$ blocks and the data is input in parallel with respect to blocks, i.e., $y[0]$ must contain the first observation for block $1$, $y[1]$ must contain the first observation for block $2$, $\dots$, $y[\mathrm{abs}(\mathrm{iblock}) - 1]$ must contain the first observation for block $\mathrm{abs}(\mathrm{iblock})$, $y[\mathrm{abs}(\mathrm{iblock})]$ must contain the second observation for block $1$, etc.
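The two input orderings can be related with a small NumPy sketch. The data values and variable names here are illustrative assumptions, not part of the NAG interface; the sketch only shows how a by-blocks vector would be rearranged into the parallel ordering.

```python
import numpy as np

# Illustrative data (an assumption): 3 blocks with 2 plots each.
# Ordering for iblock = 3: all of block 1, then block 2, then block 3.
by_blocks = np.array([11.0, 12.0, 21.0, 22.0, 31.0, 32.0])
b = 3
k = by_blocks.size // b  # plots per block

# View as (block, plot), transpose to (plot, block) and flatten.
# Ordering for iblock = -3: first plot of every block, then second plot, ...
parallel = by_blocks.reshape(b, k).T.ravel()
```

With this ordering, `parallel[b]` holds the second observation for block 1, which matches the indexing described for negative `iblock`.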

nt : int

The number of treatments. If only blocks are required in the analysis then set $\mathrm{nt} = 1$.

it : int, array-like, shape $(:)$

Note: the required length for this argument is determined as follows: if $\mathrm{nt} \geq 2$: $n$; otherwise: $1$.

$\mathrm{it}[i - 1]$ indicates which of the $\mathrm{nt}$ treatments plot $i$ received, for $i = 1, 2, \dots, n$.

If $\mathrm{nt} = 1$, $\mathrm{it}$ is not referenced.
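As an illustration of the treatment indicator, the following sketch builds `it` for a hypothetical complete block design in which every block lists treatments 1 to `nt` in the same order (by-blocks input). The design and the replication check are assumptions made for the example only.

```python
import numpy as np

nt = 3  # number of treatments (illustrative)
b = 4   # number of blocks (illustrative)

# Treatment labels are 1-based: 1, 2, ..., nt, repeated once per block.
it = np.tile(np.arange(1, nt + 1), b)

# Replications per treatment; index 0 is dropped because labels start at 1.
reps = np.bincount(it, minlength=nt + 1)[1:]

# Every label must lie in 1..nt and every treatment must appear at least once.
labels_ok = bool(np.all((it >= 1) & (it <= nt)) and np.all(reps > 0))
```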

tol : float

The tolerance value used to check for zero eigenvalues of the matrix $Ω$. If $\mathrm{tol} = 0.0$ a default value of $10^{-5}$ is used.

irdf : int

An adjustment to the degrees of freedom for the residual and total.

$\mathrm{irdf} \geq 1$

The degrees of freedom for the total is set to $n - \mathrm{irdf}$ and the residual degrees of freedom adjusted accordingly.

$\mathrm{irdf} = 0$

The degrees of freedom for the total is set to $n - 1$, as usual.

Returns

gmean : float

The grand mean, $\hat{μ}$.

bmean : float, ndarray, shape $(|\mathrm{iblock}|)$

If $\mathrm{abs}(\mathrm{iblock}) \geq 2$, $\mathrm{bmean}[j - 1]$ contains the mean for the $j$th block, $\hat{β}_{j}$, for $j = 1, 2, \dots, b$.

tmean : float, ndarray, shape $(\mathrm{nt})$

If $\mathrm{nt} \geq 2$, $\mathrm{tmean}[l - 1]$ contains the (adjusted) mean for the $l$th treatment, $\hat{μ}^{*} + \hat{τ}_{l}$, for $l = 1, 2, \dots, t$, where $\hat{μ}^{*}$ is the mean of the treatment adjusted observations, $y_{ij(l)} - \hat{τ}_{l}$.

tabl : float, ndarray, shape $(4, 5)$

The analysis of variance table. Column 1 contains the degrees of freedom, column 2 the sum of squares, and where appropriate, column 3 the mean squares, column 4 the $F$-statistic and column 5 the significance level of the $F$-statistic. Row 1 is for Blocks, row 2 for Treatments, row 3 for Residual and row 4 for Total. Mean squares are computed for all but the Total row; $F$-statistics and significance are computed for Treatments and Blocks, if present. Any unfilled cells are set to zero.

c : float, ndarray, shape $(\mathrm{nt}, \mathrm{nt})$

If $\mathrm{nt} \geq 2$, the upper triangular part of $c$ contains the variance-covariance matrix of the treatment effects, the strictly lower triangular part contains the standard errors of the difference between two treatment effects (means), i.e., $c[i - 1, j - 1]$ contains the covariance of treatment $i$ and $j$ if $j \geq i$ and the standard error of the difference between treatment $i$ and $j$ if $j < i$, for $j = 1, 2, \dots, t$, for $i = 1, 2, \dots, t$.

irep : int, ndarray, shape $(\mathrm{nt})$

If $\mathrm{nt} \geq 2$, the treatment replications, $R_{ll}$, for $l = 1, 2, \dots, \mathrm{nt}$.

r : float, ndarray, shape $(n)$

The residuals, $r_{i}$, for $i = 1, 2, \dots, n$.

ef : float, ndarray, shape $(\mathrm{nt})$

If $\mathrm{nt} \geq 2$, the canonical efficiency factors.

Raises

NagValueError

(errno $1$)

On entry, $\mathrm{irdf} = \langle\mathit{value}\rangle$.

Constraint: $\mathrm{irdf} \geq 0$.

(errno $1$)

On entry, $\mathrm{tol} = \langle\mathit{value}\rangle$.

Constraint: $\mathrm{tol} \geq 0.0$.

(errno $1$)

On entry, no blocks or treatments in model.

(errno $1$)

On entry, $\mathrm{nt} = \langle\mathit{value}\rangle$.

Constraint: $\mathrm{nt} \geq 1$.

(errno $1$)

On entry, $n = \langle\mathit{value}\rangle$.

Constraint: $n \geq 2$.

(errno $2$)

On entry, $n = \langle\mathit{value}\rangle$ and $\mathrm{iblock} = \langle\mathit{value}\rangle$.

Constraint: $n \geq 2$ and if $\mathrm{abs}(\mathrm{iblock}) \geq 2$, $n$ must be a multiple of $\mathrm{abs}(\mathrm{iblock})$.

(errno $3$)

On entry, at least one treatment is not present. Treatment $\langle\mathit{value}\rangle$ does not appear in $\mathrm{it}$.

(errno $3$)

On entry, $i = \langle\mathit{value}\rangle$, $\mathrm{it}[i - 1] = \langle\mathit{value}\rangle$ and $\mathrm{nt} = \langle\mathit{value}\rangle$.

Constraint: $1 \leq \mathrm{it}[i - 1] \leq \mathrm{nt}$.

(errno $4$)

On entry, the values in $y$ are constant.

(errno $5$)

The computation of the eigenvalues has failed to converge.

(errno $5$)

A computed standard error is zero.
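The input constraints listed above can be checked before calling the routine. The helper below is a hypothetical sketch, not part of naginterfaces: the name `check_inputs`, the use of `ValueError`, and the order of the checks are assumptions. The "no blocks or treatments" condition is inferred from the parameter descriptions as $\mathrm{abs}(\mathrm{iblock}) \leq 1$ together with $\mathrm{nt} = 1$.

```python
import numpy as np

def check_inputs(y, iblock, nt, it, tol, irdf):
    """Hypothetical pre-check mirroring the documented input constraints."""
    y = np.asarray(y, dtype=float)
    n = y.size
    b = abs(iblock)
    if irdf < 0:
        raise ValueError("irdf must be >= 0")
    if tol < 0.0:
        raise ValueError("tol must be >= 0.0")
    if nt < 1:
        raise ValueError("nt must be >= 1")
    if b <= 1 and nt == 1:
        raise ValueError("no blocks or treatments in model")
    if n < 2:
        raise ValueError("n must be >= 2")
    if b >= 2 and n % b != 0:
        raise ValueError("n must be a multiple of abs(iblock)")
    if nt >= 2:
        it = np.asarray(it)
        if it.size != n:
            raise ValueError("it must have length n when nt >= 2")
        if np.any((it < 1) | (it > nt)):
            raise ValueError("each it[i-1] must satisfy 1 <= it[i-1] <= nt")
        missing = np.setdiff1d(np.arange(1, nt + 1), it)
        if missing.size > 0:
            raise ValueError(f"treatment {missing[0]} does not appear in it")
    if np.all(y == y[0]):
        raise ValueError("the values in y are constant")
```

Failures that depend on the computation itself (eigenvalue convergence, a zero standard error) and the warning conditions cannot be detected this way.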

Warns

NagAlgorithmicWarning

(errno $6$ ): The treatments are totally confounded with blocks.
(errno $7$ ): The residual degrees of freedom is zero.
(errno $7$ ): The residual mean square is zero.
(errno $8$ ): The design is disconnected.

Notes

In the NAG Library the traditional C interface for this routine uses a different algorithmic base. Please contact NAG if you have any questions about compatibility.

In a completely randomized design, experimental material is divided into a number of units, or plots, to which a treatment can be applied. In a randomized block design the units are grouped into blocks so that the variation within blocks is less than the variation between blocks. If every treatment is applied to one plot in each block it is a complete block design. If there are fewer plots per block than treatments then the design will be an incomplete block design and may be balanced or partially balanced.

For a completely randomized design, with $t$ treatments and $n_{j}$ plots for the $j$th treatment, the linear model is

$$y_{ij} = μ + τ_{j} + e_{ij}, \quad j = 1, 2, \dots, t \text{ and } i = 1, 2, \dots, n_{j},$$

where $y_{ij}$ is the $i$th observation for the $j$th treatment, $μ$ is the overall mean, $τ_{j}$ is the effect of the $j$th treatment and $e_{ij}$ is the random error term. For a randomized block design, with $t$ treatments and $b$ blocks of $k$ plots, the linear model is

$$y_{ij(l)} = μ + β_{i} + τ_{l} + e_{ij}, \quad i = 1, 2, \dots, b, \; j = 1, 2, \dots, k \text{ and } l = 1, 2, \dots, t,$$

where $β_{i}$ is the effect of the $i$th block and the $ij(l)$ notation indicates that the $l$th treatment is applied to the $j$th plot in the $i$th block.

The completely randomized design gives rise to a one-way analysis of variance. The treatments do not have to be equally replicated, i.e., do not have to occur the same number of times. First the overall mean, $\hat{μ}$, is computed and subtracted from the observations to give $y_{ij}^{'} = y_{ij} - \hat{μ}$. The estimated treatment effects, $\hat{τ}_{j}$, are then computed as the treatment means of the mean adjusted observations, $y_{ij}^{'}$, and the treatment sum of squares is computed as the sum over treatments of the squared treatment totals of the $y_{ij}^{'}$, each divided by the number of observations in that total, $n_{j}$. The final residuals are computed as $r_{ij} = y_{ij}^{'} - \hat{τ}_{j}$, and, from the residuals, the residual sum of squares is calculated.
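The one-way computation described above can be sketched in NumPy as follows. This is an illustration of the steps in the text, not the routine's implementation; the data, the treatment labels and all variable names are assumptions.

```python
import numpy as np

# Illustrative data (assumed): 8 plots, 3 unequally replicated treatments.
y = np.array([4.0, 6.0, 5.0, 9.0, 11.0, 1.0, 3.0, 2.0])
it = np.array([1, 1, 1, 2, 2, 3, 3, 3])  # 1-based treatment labels
nt = 3

mu_hat = y.mean()                # overall mean
y_adj = y - mu_hat               # mean adjusted observations y'

n_j = np.bincount(it - 1, minlength=nt)                     # replications
totals = np.bincount(it - 1, weights=y_adj, minlength=nt)   # treatment totals of y'
tau_hat = totals / n_j           # estimated treatment effects

ss_treat = np.sum(totals ** 2 / n_j)   # treatment sum of squares
resid = y_adj - tau_hat[it - 1]        # final residuals
ss_resid = np.sum(resid ** 2)          # residual sum of squares
ss_total = np.sum(y_adj ** 2)          # total sum of squares about the mean
```

In this sketch the treatment and residual sums of squares add up to the total sum of squares, which gives a quick consistency check.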

For the randomized block design the mean is computed and subtracted from the observations to give $y_{ij(l)}^{'} = y_{ij(l)} - \hat{μ}$. The estimated block effects, ignoring treatment effects, $\hat{β}_{i}$, are then computed using the block means of the $y_{ij(l)}^{'}$ and the unadjusted sum of squares computed as the sum of squared block totals for the $y_{ij(l)}^{'}$ divided by number of plots per block, $k$. The block adjusted observations are then computed as $y_{ij(l)}^{''} = y_{ij(l)}^{'} - \hat{β}_{i}$. In the case of the complete block design, with the same replication for each treatment within each block, the blocks and treatments are orthogonal, and so the treatment effects are estimated as the treatment means of the block adjusted observations, $y_{ij(l)}^{''}$. The treatment sum of squares is computed as the sum of squared treatment totals of the $y_{ij(l)}^{''}$ divided by the number of replicates to the treatments, $r = bk/t$. Finally the residuals, and hence the residual sum of squares, are given by $r_{ij(l)} = y_{ij(l)}^{''} - \hat{τ}_{l}$.
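For a complete block design the same steps can be sketched on a two-dimensional array. The layout (rows as blocks, columns as treatments, one plot per treatment per block) and the data are assumptions chosen for readability; the routine itself takes a flat `y`.

```python
import numpy as np

# Illustrative data (assumed): b = 3 blocks (rows), t = 2 treatments (columns).
y = np.array([[10.0, 14.0],
              [8.0, 13.0],
              [12.0, 15.0]])
b, t = y.shape
k = t              # plots per block in this complete design
r = b * k // t     # replicates per treatment, r = bk/t

mu_hat = y.mean()
y1 = y - mu_hat                              # y'
beta_hat = y1.mean(axis=1)                   # block effects, ignoring treatments
ss_block = np.sum(y1.sum(axis=1) ** 2) / k   # unadjusted block sum of squares

y2 = y1 - beta_hat[:, None]                  # block adjusted observations y''
treat_totals = y2.sum(axis=0)
tau_hat = treat_totals / r                   # treatment effects
ss_treat = np.sum(treat_totals ** 2) / r     # treatment sum of squares

resid = y2 - tau_hat[None, :]                # residuals
ss_resid = np.sum(resid ** 2)
```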

For a design without the same replication for each treatment within each block the treatments and the blocks will not be orthogonal, so the treatments adjusted for blocks need to be computed. The adjusted treatment effects are found as the solution to the equations

$$(R - N N^{T}/k)\,\hat{τ} = q,$$

where $q$ is the vector of the treatment totals for block adjusted observations, $y_{ij(l)}^{''}$, $R$ is a diagonal matrix with $R_{ll}$ equal to the number of times the $l$th treatment is replicated, and $N$ is the $t \times b$ incidence matrix, with $N_{lj}$ equal to the number of times treatment $l$ occurs in block $j$. The solution to the equations can be written as

$$\hat{τ} = Ω q$$

where $Ω$ is a generalized inverse of $(R - N N^{T}/k)$. The solution is found from the eigenvalue decomposition of $(R - N N^{T}/k)$. The residuals are first calculated by subtracting the estimated treatment effects from the block adjusted observations to give $r_{ij(l)}^{'} = y_{ij(l)}^{''} - \hat{τ}_{l}$. However, since only the unadjusted block effects have been removed and blocks and treatments are not orthogonal, the block means of the $r_{ij(l)}^{'}$ have to be subtracted to give the correct residuals, $r_{ij(l)}$, and residual sum of squares.
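The adjusted analysis can be sketched for a small balanced incomplete block design. Everything specific here is an assumption for illustration: the noise-free data, by-blocks input with equal block size `k`, and the use of `tol` as an absolute threshold on the eigenvalues (the exact test used by the routine is not stated in this document).

```python
import numpy as np

# Illustrative design (assumed): t = 3 treatments in b = 3 blocks of k = 2 plots,
# blocks contain treatments (1, 2), (1, 3), (2, 3); data are input by blocks.
y = np.array([13.0, 10.0, 12.0, 9.0, 8.0, 8.0])
it = np.array([1, 2, 1, 3, 2, 3])
b, k, t = 3, 2, 3
tol = 1e-5  # assumed absolute threshold for "zero" eigenvalues

block = np.repeat(np.arange(b), k)  # 0-based block index of each plot
trt = it - 1                        # 0-based treatment index of each plot

y1 = y - y.mean()                                 # y'
beta_hat = np.bincount(block, weights=y1) / k     # unadjusted block effects
y2 = y1 - beta_hat[block]                         # block adjusted observations y''

q = np.bincount(trt, weights=y2, minlength=t)     # treatment totals of y''
R = np.diag(np.bincount(trt, minlength=t).astype(float))
N = np.zeros((t, b))
np.add.at(N, (trt, block), 1.0)                   # incidence matrix

A = R - N @ N.T / k
w, V = np.linalg.eigh(A)                          # eigenvalue decomposition
keep = w > tol
omega = (V[:, keep] / w[keep]) @ V[:, keep].T     # generalized inverse of A
tau_hat = omega @ q                               # adjusted treatment effects

r1 = y2 - tau_hat[trt]                            # first-stage residuals r'
resid = r1 - (np.bincount(block, weights=r1) / k)[block]  # corrected residuals
```

Because the illustrative data were generated without error from block and treatment effects, the corrected residuals in this sketch are zero and the treatment effects are recovered exactly.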

The mean squares are computed as the sum of squares divided by the degrees of freedom. The degrees of freedom for the unadjusted blocks is $b - 1$. For the completely randomized and the complete block designs the degrees of freedom for the treatments is $t - 1$; in the general case it is the rank of the matrix $Ω$. The $F$-statistic given by the ratio of the treatment mean square to the residual mean square tests the hypothesis

$$H_{0} : τ_{1} = τ_{2} = \dots = τ_{t} = 0.$$
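The mean squares and the $F$-statistic follow directly from the sums of squares and degrees of freedom. The numbers below are illustrative assumptions (a one-way layout with 3 treatments and 8 plots), not output from the routine.

```python
# Illustrative sums of squares and degrees of freedom (assumed values).
ss_treat, df_treat = 76.875, 2   # t - 1 = 2
ss_resid, df_resid = 6.0, 5      # n - t = 5

ms_treat = ss_treat / df_treat   # treatment mean square
ms_resid = ss_resid / df_resid   # residual mean square, s^2
f_stat = ms_treat / ms_resid     # F-statistic for H0

# The significance level is the upper tail of the F distribution with
# (df_treat, df_resid) degrees of freedom; if SciPy is available it could be
# obtained with scipy.stats.f.sf(f_stat, df_treat, df_resid).
```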

The standard errors for the difference in treatment effects, or treatment means, for the completely randomized or the complete block designs, are given by:

$$\mathrm{se}(\hat{τ}_{j} - \hat{τ}_{j^{*}}) = \sqrt{\left(\frac{1}{n_{j}} + \frac{1}{n_{j^{*}}}\right) s^{2}}$$

where $s^{2}$ is the residual mean square and $n_{j} = n_{j^{*}} = b$ in the complete block design. In the general case the variances of the treatment effects are given by

$$\mathrm{var}(\hat{τ}) = Ω s^{2}$$

from which the appropriate standard errors of the difference between treatment effects or the difference between adjusted means can be calculated.
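Given a variance-covariance matrix of the treatment effects, the variance of a difference is $V_{ii} + V_{jj} - 2 V_{ij}$, and its square root is the standard error. The sketch below packs both into one matrix in the same layout as described for the returned `c` (covariances on and above the diagonal, standard errors strictly below). The helper name and the input values are assumptions; the example uses a completely randomized design, where the matrix is diagonal with entries $s^{2}/n_{j}$.

```python
import numpy as np

def pack_cov_and_se(cov):
    """Hypothetical helper: covariances in the upper triangle (with diagonal),
    standard errors of differences in the strictly lower triangle."""
    d = np.diag(cov)
    var_diff = d[:, None] + d[None, :] - 2.0 * cov   # var(tau_i - tau_j)
    se = np.sqrt(np.clip(var_diff, 0.0, None))       # guard tiny negative rounding
    return np.triu(cov) + np.tril(se, k=-1)

# Illustrative values (assumed): residual mean square and replications.
s2 = 1.2
n_j = np.array([3, 2, 3])
cov = np.diag(s2 / n_j)
c = pack_cov_and_se(cov)
```

For these values the entry `c[1, 0]` equals $\sqrt{(1/3 + 1/2) \times 1.2}$, in agreement with the formula for the completely randomized design.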

In the complete block design all the information on the treatment effects is given by the within block analysis. In other designs there may be a loss of information due to the non-orthogonality of treatments and blocks. The efficiency of the within block analysis in these cases is given by the (canonical) efficiency factors. These are the nonzero eigenvalues of the matrix $(R - N N^{T}/k)$, divided by the number of replicates in the case of equal replication, or by the mean of the number of replicates in the unequally replicated case; see John (1987). If more than one eigenvalue is zero then the design is said to be disconnected and some treatments can only be compared using a between block analysis.
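The efficiency factors can be sketched directly from the incidence matrix. The design below (3 treatments in 3 blocks of 2 plots) and the absolute threshold `tol` are assumptions for illustration; for a balanced incomplete block design the factors should all equal $t(k-1)/(k(t-1))$, here $0.75$.

```python
import numpy as np

# Illustrative incidence matrix (assumed): rows are treatments, columns are blocks.
N = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1]], dtype=float)
k = 2                                  # plots per block
reps = N.sum(axis=1)                   # replications R_ll
A = np.diag(reps) - N @ N.T / k

tol = 1e-5                             # assumed threshold for zero eigenvalues
w = np.linalg.eigvalsh(A)              # eigenvalues in ascending order
ef = w[w > tol] / reps.mean()          # canonical efficiency factors
n_zero = int(np.sum(w <= tol))
disconnected = n_zero > 1              # more than one zero eigenvalue
```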

References

Cochran, W G and Cox, G M, 1957, Experimental Designs, Wiley

Davis, O L, 1978, The Design and Analysis of Industrial Experiments, Longman

John, J A, 1987, Cyclic Designs, Chapman and Hall

John, J A and Quenouille, M H, 1977, Experiments: Design and Analysis, Griffin

Searle, S R, 1971, Linear Models, Wiley
