Introduction

The NAG Library Manual is the principal documentation for the NAG AD Library and full details of how to use the NAG Library and its documentation can be found in How to Use the NAG Library. Details specific to the NAG AD Library are given in the following sections and in the X10 Chapter Introduction.

The NAG AD Library is designed to be compatible with the AD tool dco. Using the NAG AD Library in conjunction with dco provides the best user experience. However, the use of dco is not essential; the NAG AD Library can be used on its own or in conjunction with any other AD solution. Other than in Section 5, the description assumes no dco licence.

Let us assume that you regularly run numerical simulations of a scalar objective $y$ (for example, the price of a financial product) depending on $n$ uncertain arguments ${x}_{i}$ (for example, market volatilities). Suppose that a single run of the given implementation of the (pricing) function $y=f\left({x}_{1},{x}_{2},\dots ,{x}_{n}\right)$ as a program takes one minute on the available computer. In addition to the value $y$, you might be interested
in the sensitivity of the value $y$ to small changes in the input values; that is, the derivatives of $y$ with respect to all the ${x}_{i}$. The rudimentary approach to obtaining an approximation to one of these derivatives is to perturb one of the input values ${x}_{j}$, obtain a perturbed output value $\hat{y}$ and calculate finite differences approximations based on the perturbed and original values. The NAG AD Library offers an alternative approach by computing, principally, the adjoint of $f$ for retrieving the derivatives. Finite difference approximation of the corresponding $n$ gradient entries requires $\mathcal{O}\left(n\right)$ evaluations of $f$. Their accuracy suffers from truncation and/or numerical effects due to cancellation and rounding in finite precision floating-point arithmetic. Moreover, if, for example, $n=\mathrm{1000}$, the calculation of the approximated gradient with finite differences will take at least $\mathrm{1001}$ times the time for the primal calculation (normal forward evaluations with no additional derivative evaluations, also referred to as passive calculation). In adjoint mode this computation can be expected to take less than $\mathrm{20}$ (often less than $\mathrm{10}$) times the time for the accumulation of the same gradient with machine accuracy.

To introduce adjoint derivatives and their advantages, we first briefly illustrate the tangent model and a commonly used implementation of this model which approximates by finite differences. For more detailed information on adjoints in general, see Dunford et al. (1971), and for algorithmic adjoints by algorithmic differentiation in particular, see Griewank and Walther (2008) and Naumann (2012).

For a continuously differentiable function (the primal)

and assuming distinct inputs and outputs, the **tangent model** is defined as

with tangents ${x}^{\left(1\right)}\in {\mathbb{R}}^{n}$ and ${y}^{\left(1\right)}\in {\mathbb{R}}^{m}$, and the tangent function ${f}^{\left(1\right)}:{\mathbb{R}}^{n}\times {\mathbb{R}}^{n}\to {\mathbb{R}}^{m}$. The symbol ${\nabla}_{x}$ stands for the full derivative tensor of $f$ with respect to $x$, i.e., the Jacobian. The tangent model calculates a weighted sum of the columns of the Jacobian matrix, which corresponds to a directional derivative of $f$ in direction ${x}^{\left(1\right)}$.

$$y:=f\left(x\right)\text{\hspace{1em} with}f:{\mathbb{R}}^{n}\to {\mathbb{R}}^{m}\text{,}$$ |

$${y}^{\left(1\right)}:={f}^{\left(1\right)}\left(x,{x}^{\left(1\right)}\right)={\nabla}_{x}f\left(x\right){x}^{\left(1\right)}\text{,}$$ |

The tangent model can easily be approximated using finite differences by perturbing the input into a scaled direction ${x}^{\left(1\right)}$ with appropriate scaling factor $h$ and calculating the difference quotient

$$\frac{f\left(x+h{x}^{\left(1\right)}\right)-f\left(x\right)}{h}\approx {\nabla}_{x}f\left(x\right){x}^{\left(1\right)}\text{.}$$ |

Using the tangent model to compute the full Jacobian matrix, we need to calculate the directional derivatives in the direction of the $n$ Cartesian basis vectors in ${\mathbb{R}}^{n}$. This approach delivers the Jacobian **column-by-column**. Since it requires the evaluation of $f$ at the original point $x$ in addition to the $n$ perturbed points, it has a computational **complexity of $\mathcal{O}\left(n\right)\xb7\text{cost}\left(f\right)$**, which corresponds to $n$ evaluations of the tangent model.

The **adjoint model** is defined as

with adjoint variables ${x}_{\left(1\right)}\in {\mathbb{R}}^{n}$ and ${y}_{\left(1\right)}\in {\mathbb{R}}^{m}$ (corresponding to respective primal variables $x$ and $y$), and the adjoint function ${f}_{\left(1\right)}:{\mathbb{R}}^{n}\times {\mathbb{R}}^{n}\times {\mathbb{R}}^{m}\to {\mathbb{R}}^{n}\times {\mathbb{R}}^{m}$ which calculates the product of the transposed Jacobian with ${y}_{\left(1\right)}$, and which sets ${y}_{\left(1\right)}$ to zero on output. The calculated weighted sum of rows of the Jacobian matrix corresponds to an adjoint directional derivative of $f$ in the adjoint direction ${y}_{\left(1\right)}$.

$$\left(\begin{array}{c}{x}_{\left(1\right)}\\ {y}_{\left(1\right)}\end{array}\right):={f}_{\left(1\right)}\left(x,{x}_{\left(1\right)},{y}_{\left(1\right)}\right)=\left(\begin{array}{c}{x}_{\left(1\right)}+{\left[{\nabla}_{x}f\left(x\right)\right]}^{\mathrm{T}}{y}_{\left(1\right)}\\ 0\end{array}\right)\text{,}$$ |

The computation of the full Jacobian matrix requires adjoint directional derivatives in the direction of the $m$ Cartesian basis vectors in ${\mathbb{R}}^{m}$. This approach delivers the Jacobian **row-by-row** and has a computational **complexity of $\mathcal{O}\left(m\right)\xb7\text{cost}\left(f\right)$**. This is an enormous advantage over the tangent model whenever $R\xb7m<n$, where $R$ quantifies the implementation overhead of the adjoint. For example, in many optimization problems, a single scalar objective value ($m=1$) is computed from a large number of input model arguments ($n$ large).

There is no finite difference approximation for the adjoint model since computing the adjoints efficiently requires a data flow reversal of the primal. Adjoint code development is therefore a non-trivial task; however, having an adjoint library available can be a powerful advantage. As an alternative to hand-writing adjoint code, algorithmic differentiation (AD) is a widespread technique also used to generate parts of the NAG AD Library routines. For this purpose, AD by overloading is implemented in C++ by the AD tool dco/c++, and in Fortran by a module interfacing that tool. See NAG and Algorithmic Differentiation. Please note that in Fortran, the module interfacing dco/c++ has been developed to work with the Fortran code in the NAG Library which adheres to a strict set of standards using a subset of modern Fortran syntax. The module works for the set of Fortran example programs provided in Section 10 of each routine document because the example programs also adhere to these standards.

As indicated in the previous section, each real-valued variable in the domain of the functional (primal variable), e.g., $x$, has a corresponding adjoint variable, e.g., ${x}_{\left(1\right)}$. The NAG AD Library interface uses special data types (called active data types) to reference the primal variable and the adjoint variable. The respective components are accessible via functions acting on these active data types.

See Section 3.3 in the X10 Chapter Introduction for details of how routines in the NAG AD Library are named; in particular what ‘_a1w_’ means in data types and routine names.

Calculating adjoints using the NAG AD Library requires the following basic steps. Users of dco will be familiar with this procedure. In the following, we assume pure inputs and outputs (no overwriting), i.e., a function $y:=f\left(x\right)\text{.}$
(Please see Section 4.4 for further details on how to handle variables that are both input and output.)

The adjoint model then calculates

$${x}_{\left(1\right)}:={x}_{\left(1\right)}+{\left[{\nabla}_{x}f\left(x\right)\right]}^{\mathrm{T}}{y}_{\left(1\right)}\text{.}$$ |

The basic procedure around the calling of NAG AD Library routines is (note that IR = Internal Representation):

- 1.allocate global memory for internal data / IR (API call)
- 2.copy routine input data into active data types (Note: this is just an assignment statement in C++ and Fortran)
- 3.register input variables to IR (API call) (Note: only register those inputs with regard to which adjoints are required)
- 4.call library routine(s)
- 5.increment adjoints of outputs ${y}_{\left(1\right)}$ (API call)
- 6.calculate adjoints of inputs by interpreting the IR (API call)
- 7.get adjoints of inputs ${x}_{\left(1\right)}$ (API call)
- 8.free internal memory (API call)

This section describes the interfaces of routines provided to perform the above steps (except step 4.) The NAG AD Library should be considered as an extension of the NAG FL Interface, and as such the interfaces are very similar across the three programming languages considered: C, C++ and Fortran. Since the NAG AD Library is designed to be compatible with dco, there is a corresponding dco/c++ interface for C++ available; see Section 5. The dco/c++ interface can be used without having a dco/c++ licence by using dco/c++/light, which is part of the NAG AD Library. Using dco/c++/light, though not having the overloading functionality of full dco/c++, provides an easy transition to full dco/c++.

C / C++

#include <nag.h> #include <nagad.h>

Fortran

Use iso_c_binding, Only: c_ptr Use nagad_library Use nag_library, Only: nag_wp

The following basic types represent those for real-valued variables and their first order adjoints in working precision.

Type |
C/C++ |
Fortran |

Primal values | double | Real (Kind=nag_wp) |

Adjoint | nagad_a1w_w_rtype | Type (nagad_a1w_w_rtype) |

Accessing primal values from active type:

Variable |
C |
C++ |
Fortran |

x | x.value | nagad_a1w_get_value(x) | x%value |

Initialize IR

C / C++: |
void nagad_a1w_ir_create(); |

Fortran: |
Subroutine nagad_a1w_ir_create() |

Destroy IR

C / C++: |
void nagad_a1w_ir_remove(); |

Fortran: |
Subroutine nagad_a1w_ir_remove() |

Register active variable (with respect to which we want to differentiate )

C / C++: |
void nagad_a1w_ir_register_variable(nagad_a1w_w_rtype*); |

Fortran: |
Subroutine nagad_a1w_ir_register_variable(x) Type (nagad_a1w_w_rtype) :: x |

Calculate adjoints of registered variables

C: |
void nagad_a1w_ir_interpret_adjoint(Integer* ifail); |

C++: |
void nagad_a1w_ir_interpret_adjoint(Integer &ifail); |

Fortran: |
Subroutine nagad_a1w_ir_interpret_adjoint(ifail) Integer, Intent (Inout) :: ifail |

Increment the adjoint component of active type (transposed Jacobians are applied to the set of increments, which are zero by default)

C / C++: |
void nagad_a1w_inc_derivative(const nagad_a1w_w_rtype* x, double inc); |

Fortran: |
Subroutine nagad_a1w_inc_derivative(x, ax) Type (nagad_a1w_w_rtype) :: x Real (Kind=nag_wp), Intent (In) :: ax |

Extract the derivatives of active types with respect to a given registered variable

C / C++: |
double nagad_a1w_get_derivative(const nagad_a1w_w_rtype x); |

Fortran: |
Function nagad_a1w_get_derivative(x) Real (Kind=nag_wp) :: nagad_a1w_get_derivative Type (nagad_a1w_w_rtype), Intent (In) :: x |

To use the NAG AD Library, the X10 Chapter Introduction is essential reading. This chapter contains routines for handling data objects used in the NAG AD Library. In particular, it provides a mechanism for choosing whether to compute adjoints algorithmically or, symobolically (see Section 3.2.2 in the X10 Chapter Introduction), to compute adjoints symbolically.

By default differentiation is performed algorithmically; in this case derivative calculations are performed in the algorithmic computational mode. For those routines (as listed in Section 3.2.2 in the X10 Chapter Introduction) that have symbolic adjoints available, the symbolic computational mode can be set.

A simple example program, in C, C++ and Fortran, of using the above interfaces is shown in Section 10 in **x10aa_a1w_f**. Additionally, each computational routine in the NAG AD Library contains an example program in C++ and Fortran; C examples also exist for some of these.

For an example of the interface differences between a routine in the NAG Library and its corresponding routine in the NAG AD Library, compare f07caf and f07ca_a1w_f. The example programs in Section 10 in f07ca_a1w_f show how the NAG AD Library routine is called in relation to the basic interface calls.

Many routines require user-supplied subroutines, e.g., the zero finder routine c05ay_a1w_f. Since the output value of the routine depends on the implementation of the user-supplied routine, the derivative (i.e., the adjoint) does so as well. For c05ay_a1w_f the solution and its derivative depend on the residual function. Note that for those primal routines with function arguments, those arguments have been transformed into subroutines for the NAG AD Library equivalent routine. The AD variant subroutine argument has an extra output argument (**ret**) to provide the return value. This extra return value argument is consistently placed immediately before any user workspace arrays. For example compare the primal function (with function argument) d01fbf and the NAG AD equivalent routine d01fb_a1w_f.

In the algorithmic computational mode, if the primal calculations of the supplied routine consist of simple arithmetic operations and intrinsic functions then the conversion from primal callback to adjoint callback is simply a matter (in C++ with dco/c++) of performing the same operations on the basic AD types, see Section 10 in c05ay_a1w_f. Otherwise (C or C++ without dco/c++) the procedure, enumerated below, must be implemented in the procedure argument. This involves writing an additional callback (for adjoints) which always has the same fixed interface and which is run only during the adjoint interpretation phase. This additional callback is referred to, within the NAG AD Library, as the companion callback to the supplied procedure argument.

In the symbolic computational mode, the computation needs to be broken down into four stages, where the stage to be computed is determined by a callback mode. This is described in more detail in Section 2.4 in the X10 Chapter Introduction and Section 3 in **x10bd_a1w_f**.

The following description assumes an original user-supplied function calculating
$z:=g\left(p,x\right)\text{,}$
where p is assumed to be an argument passed via the ruser variable and x is the current state. The adjoint model then calculates

and

$${p}_{\left(1\right)}:={p}_{\left(1\right)}+{\left[{\nabla}_{p}g\left(p,x\right)\right]}^{\mathrm{T}}{z}_{\left(1\right)}$$ |

$${x}_{\left(1\right)}:={x}_{\left(1\right)}+{\left[{\nabla}_{x}g\left(p,x\right)\right]}^{\mathrm{T}}{z}_{\left(1\right)}\text{.}$$ |

It is recommended that the example programs listed in Section 10 in c05ay_a1w_f and Section 10 in e04gb_a1w_f be examined to see how procedure arguments in symbolic mode are handled for a simple case; in particular it shows how to handle both algorithmic and symbolic computational modes.The basic procedure for the symbolic adjoint computation of a procedure argument using a companion adjoint callback is

- 1.create callback data object (API call in C++/Fortran)
- 2.write input arguments to callback data object e.g., $p$ and $x$ (API call in C++/Fortran)
- 3.calculate primal values $z$
- 4.register output variables (API call)
- 5.write registered variables to callback data object (API call in C++/Fortran)
- 6.insert companion callback in IR (API call).

The basic procedure of the companion adjoint callback is

- 1.read data from callback data object (API call in C++/Fortran)
- 2.get adjoints of outputs ${z}_{\left(1\right)}$ (API call)
- 3.calculate adjoint increments of inputs ${p}_{\left(1\right)}^{+}={\left[{\nabla}_{p}g\left(p,x\right)\right]}^{\mathrm{T}}{z}_{\left(1\right)}$ and ${x}_{\left(1\right)}^{+}={\left[{\nabla}_{x}g\left(p,x\right)\right]}^{\mathrm{T}}{z}_{\left(1\right)}$
- 4.increment adjoints of inputs ${p}_{\left(1\right)}+={p}_{\left(1\right)}^{+}$ and ${x}_{\left(1\right)}+={x}_{\left(1\right)}^{+}$ (API call).

As a particular example of calculating adjoint increments, if

then

and

$$z:=g\left(p,x\right)=\mathrm{sin}\left(p{x}^{2}\right)$$ |

$${\nabla}_{x}g\left(p,x\right)=2\mathrm{cos}\left(z\right)xp$$ |

$${\nabla}_{p}g\left(p,x\right)=\mathrm{cos}\left(z\right){x}^{2}\text{.}$$ |

Since we use AD (algorithmic differentiation) software internally to automatically generate the adjoint code, in standard configuration all adjoint routines compute algorithmic (or discrete) adjoints. Symbolic adjoints on the other hand cannot be generated automatically and thus have to be written by hand. The benefit of symbolic adjoints is that it may exploit mathematical properties of a NAG Library routine yielding a more efficient adjoint calculation. The possible efficiency gains through symbolic adjoints depend on the specific routine. Examples for huge benefits are the linear and the nonlinear solvers. See Giles (2008) for more on the symbolic treatment of a linear solver and Naumann et al. (2015) for more on the symbolic treatment of a nonlinear solver. Section 3.1 in f07ca_a1w_f provides a description of how the adjoint is computed symbolically for that routine. Similar descriptions are provided in each routine document for which a symbolic adjoint is available. See Section 3.2.2 in the X10 Chapter Introduction for a list of routines for which this mode is available.

In the previous sections, we assumed pure inputs and outputs, i.e., no overwriting of variables. When overwriting variables, accessing adjoints later may result in wrong values. This is due to the fact, that adjoints are accessed via a reference inside the active variable. When a variable gets overwritten, this reference changes.

See Section 10 in f07ca_a1w_f which shows how copies are made of active input/output variables to obtain correct derivative values.

The NAG AD Library uses the same error reporting mechanism as the main Library as described in Section 4 in the Introduction to the NAG Library FL Interface.
There may be additional routine specific error exits for NAG AD Library routines, relating to cases where the adjoint could not be computed accurately, and two additional error exits common to all routines in the NAG AD Library as described in the following subsections.

In the case where a routine detects a failure to dynamically allocate sufficient memory, the routine will set an error condition, setting $\mathbf{ifail}=-899$, and exit with an appropriate error message.

Internal calls to Library routines are checked for error exits even when these exits are not to be expected. Should an unexpected error exit occur the routine will set an error condition by setting **ifail** and exit with an appropriate error message. The value $\mathbf{ifail}=-89$ is set for unexpected error detection.

dco is available as the licenced tool dco/c++. It implements Algorithmic Differentiation using operator overloading with features presented in its documentation (see NAG and Algorithmic Differentiation).

As already mentioned, the interface of the NAG AD Library is compatible with a dco solution. To be more specific, the data types and the functions acting on those types are binary compatible with dco. The interfaces given in Section 4.2.1 and those from the X10 Chapter Introduction therefore have a corresponding dco/c++ interface. This interface also comes with dco/c++/light, which is part of the NAG AD Library. dco/c++/light can only be used in conjuction with the NAG AD Library.

dco/c++ has various configuration options at compile time. When using dco/c++ or dco/c++/light interfaces, you have to use the correct configuration, i.e., the default. Do not set any compile time flags as described in Section 2.2 of the dco/c++ User Guide (full documentation is available only with dco). If you require any of the flags, please contact NAG.

Algorithmic differentiation can be used within the NAG Library in two different ways, as AD of the NAG Library and AD for the NAG Library.

As described in this document, the NAG AD Library makes it possible for you to calculate derivatives of NAG Library routines. As shown, the AD interface requires use of special data types. It is advantageous to the understanding of these types if you are familiar with how dco/c++ works, i.e., have read the documentation of dco/c++. In addition, those special data types are in fact binary compatible with the dco/c++ data types. This means that the NAG AD Library routines can be seamlessly integrated into an overall dco/c++ solution.

dco/c++ is particularly convenient for those routines with supplied procedure arguments. In algorithmic computational mode, the conversion from NAG Library to NAG AD Library version of this procedure argument is implemented simply by replacing active types (e.g., double) by the corresponding dco/c++ or nagad data types. Nothing further is required in this case.

For example, if a procedure argument for a NAG Library routine looked like:

void fcn(const double *x, double *z, Integer iuser[], double ruser[]) { z = sin(ruser[0]*x*x) - 0.1; }then the NAG AD Library equivalent routine is simply

void fcn_a1w(void **ad_handle, const nagad_a1w_w_rtype *x, nagad_a1w_w_rtype *z, Integer iuser[], nagad_a1w_w_rtype ruser[]) { z = sin(ruser[0]*x*x) - 0.1; }or, using dco/c++ types,

void fcn_a1w(void **ad_handle, const dco::ga1s<double>::type *x, dco::ga1s<double>::type *z, Integer iuser[], dco::ga1s<double>::type ruser[]) { z = sin(ruser[0]*x*x) - 0.1; }

Many routines within the NAG Library require derivative information of a user-supplied function or subroutine. dco/c++ can deliver those derivatives automatically by using operator overloading on the function implementation. Depending on the required derivatives (e.g., Jacobian or Gradient), tangent or adjoint modes can be used. In the NAG AD Library, the unconstrained minimization routine e04gb_a1w_f has an objective function argument (**lsqfun**) for which Jacobian matrix values can be returned. Therefore, the supplied procedure argument can create its own IR, register input variables and interpret adjoints of the objective function in order to return Jacobian values.

As an example, see the implementation of the user-supplied function in Section 10 in e04gb_a1w_f.

The adjoint routines are collated together in the FL Interface Contents.

Each routine document provides links in a similar manner to the NAG FL Interface documentation, but note that the Chapter Introduction link is to the NAG FL Interface
as separate chapter introductions are not provided for the NAG AD Library (other than the AD-specific Chapter X10).

Each specification provides details of the Fortran and C++ header interfaces available. Hovering over an argument name will reveal a tooltip popup, and as in the NAG FL Interface is a link to Section 5 giving more details of the arguments.

As described in Section 2.1 in the X10 Chapter Introduction each routine will have a first argument (**ad_handle**) of type c_ptr which requires a Use iso_c_binding statement at the start of the calling Fortran program unit (e.g., in any Fortran example program of the NAG AD Library). Subsequent arguments that were real-valued in the primal version of the NAG Library are replaced by arguments of the same name with NAG AD type, e.g., nagad_a1w_w_rtype.

The C++ header specifications begin with a #include "nagad.h" line. If full dco/c++ is available (i.e., dco/c++ is installed and licensed) then this may be preceded by the line #include "dco.hpp" as shown in the C++ examples e.g., Section 10 in c05ay_a1w_f.

The description section contains an abbreviated description of the primal routine. When a NAG AD Library routine can use the symbolic computational mode, details of the symbolic adjoint computation will be provided following this description. A link to the Symbolic Adjoints section will also be provided in the table of contents. See for example Section 3.1 in c05ay_a1w_f.

The argument descriptions of arguments that have corresponding arguments in the primal routine are provided via a link to the description in the primal routine document. An abbreviated version of the text is also provided as a tooltip if you hover over the argument name.

Arguments that are unique to the AD interface show the description in the document, as well as the tooltip.

Each example is a variant of the example for the primal routine which has been modified to demonstrate calling the NAG AD Library. This will demonstrate the basic stages of adjoint calculation including the calling of NAG AD Library computational routines.

As for the main NAG Library the example programs are designed so that they can be fairly easily modified, and so serve as the basis for a simple program to solve your problem.

For each implementation of the Library, NAG distributes the example programs in machine-readable form, with all necessary modifications applied. Many sites make the programs accessible to you in this form. Generic forms of the programs, without implementation-specific modifications, may be obtained directly from the NAG web site. The Users' Note for your implementation will mention any special changes which need to be made to the example programs.

Note that the results obtained from running the example programs may not be identical in all implementations and may not agree exactly with the results in the Manual.

Dunford N, Schwartz J T, Bade W G and Bartle R G (1971) *Linear Operators* Wiley Interscience, New York

Giles M (2008) *Collected Matrix Derivative Results for Forward and Reverse Mode Algorithmic Differentiation* Springer

Griewank A and Walther A (2008) Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiations (2nd Edition) *SIAM*

Hascoët L, Naumann U and Pascual V (2005) ‘To be Recorded’ Analysis in Reverse-Mode Automatic Differentiation *Future Generation Computer Systems* **21(8)** 299–304 Elsevier

Naumann U (2012) The Art of Differentiating Computer Programs: An Introduction to Algorithmic Differentiation *SIAM*

Naumann U, Lotz J, Leppkes K, Towara M (2015) Algorithmic Differentiation of Numerical Methods: Tangent and Adjoint Solvers for Parameterized Systems of Nonlinear Equations *ACM Trans. Math. Softw.*