NAG AD Library

1 Introduction

The NAG AD Library implements Automatic Differentiation (AD) on a variety of algorithms from the NAG Library. It builds on the functionality of dco/c++ and delivers first and second order derivatives via forward and reverse mode AD (see NAG and Algorithmic Differentiation). A subset of the supported AD algorithms has been optimized by using symbolic differentiation (see Section 3.3.2), which gives substantial savings in runtime and memory consumption. The NAG AD Library comes with C++ interfaces which allow seamless use with dco/c++. Read about Algorithmic Differentiation in More Depth on the NAG website.
The NAG AD Library can be used from other languages and with other AD tools.  For details, please contact NAG.  Though this documentation is meant to be self-contained, full details of how to use the NAG Library and its documentation may be of interest and can be found in How to Use the NAG Library.

2 The NAG AD Library and dco/c++

The Library is designed to work seamlessly with dco/c++; however, being a compiled library, it supports only a subset of the dco/c++ features. The following data types are supported to compute the primal (the underlying non-AD algorithm) and first and second order derivatives:
Primal (or passive values)    double
First order tangent           dco::gt1s<double>::type
First order adjoint           dco::ga1s<double>::type
Second order tangent          dco::gt1s<dco::gt1s<double>::type>::type
Second order adjoint          dco::ga1s<dco::gt1s<double>::type>::type
Not all routines have second order versions. Where these exist, they are highlighted on the Chapter Contents pages.
Additionally, the Library supports chunk and blob tapes (see the dco/c++ documentation for more details) by setting a preprocessor define:
Tape type     Preprocessor define
blob tape     none
chunk tape    -DDCO_CHUNK_TAPE
Blob tape gives the best performance but requires the user to set the maximum tape size upfront. On Linux, huge tape sizes can be specified safely since the operating system only allocates memory when it is used.
On Windows, in debug builds the operating system allocates and initializes memory as soon as the program requests it, so requesting huge tape sizes is not advised. Windows users are advised to start with chunk tape, and then consider switching to blob tape as a performance optimization.
Linux users should also consider enabling huge pages if the program ends up using a lot of tape memory. Performance can be improved when dco/c++ uses huge pages (see the corresponding sections in the dco/c++ documentation).
All dco/c++ features with the data types listed in the table above can be used. These include, for example, checkpointing, Jacobian preaccumulation, or external adjoint vectors. For more information on these and other features of dco/c++, see the NAG website. If you require dco/c++ features which are currently not supported by the NAG AD Library, e.g., modulo adjoint vector, please email your suggestions to NAG.

3 Using the NAG AD Library

The package consists of a set of C++ headers and precompiled libraries. Details on how to build and link an executable are given in the Users' Note.

3.1 Routine Interfaces

The C++ interfaces of the routines are overloaded and can be used to compute primal values (the underlying non-AD algorithm) with double type, as well as first and second order derivatives with dco/c++ data types (the supported types are in Section 2).  Every routine comes with an example using each of the types. See f07ca for an example.

3.2 Handle Object

All routines take as the first argument a nag::ad::handle_t object. This can be used to switch between algorithmic and symbolic differentiation strategies. The same object is also forwarded as the first argument to any user-supplied callbacks. The callback should only need to interact with this object if users are implementing a nag::ad::symbolic_expert strategy (see below). Otherwise, the nag::ad::handle_t object can be ignored inside the user callback.
nag::ad::handle_t has constructors (default, copy, move) as well as assignment operators (copy, move). The default constructor and the copy operations may throw exceptions of type std::bad_alloc or nag::ad::exception_t, which derives from std::exception. When the default constructor detects an incompatibility between the included dco/c++ and the Library backend, it throws a nag::ad::exception_t.
In addition, nag::ad::handle_t has the following member functions:
void set_strategy(strategy_e strategy)
Sets the differentiation strategy to be used with the following enum:
enum nag::ad::strategy_e {
    algorithmic, symbolic, symbolic_expert
};
See Section 3.3.3 for more details.
active_inputs_e active_inputs()
In symbolic expert mode, the user-supplied callback implementation needs to provide specific code paths depending on which set of inputs is “active” (see Section 3.3.3 below).  The callback can call this function to see which set of inputs is currently “active”. The following enum is returned: 
enum nag::ad::active_inputs_e {
    none, state, params, all
};
Explaining these values in an abstract setting leads to rather abstract documentation.  Instead, we have documented these values in the example programs of the routines which support a symbolic expert strategy.  This provides the most concrete advice on how these values should be used.
It is not necessary to create a new handle object each time a NAG routine is called. The same handle object can be reused when calling different routines. In this case, please ensure the correct/desired settings (e.g., AD strategy) are set.
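The point about reusing a handle can be sketched as follows. Everything in namespace `sketch` is a hypothetical mock written for this illustration: it mirrors the enum and the set_strategy member described above, but the `strategy()` accessor and the two routines are assumptions and are not part of the NAG AD Library.

```cpp
#include <cassert>
#include <string>

namespace sketch {

enum strategy_e { algorithmic, symbolic, symbolic_expert };

// Mock handle: stores only the strategy setting.
class handle_t {
public:
    void set_strategy(strategy_e s) { strategy_ = s; }
    // Hypothetical accessor, present only so the mock routines can query it.
    strategy_e strategy() const { return strategy_; }

private:
    strategy_e strategy_ = algorithmic;  // algorithmic is the default strategy
};

// Hypothetical routines taking the handle as their first argument; each
// reports which derivative path it would take.
inline std::string routine_a(const handle_t& h) {
    return h.strategy() == algorithmic ? "algorithmic" : "symbolic";
}
inline std::string routine_b(const handle_t& h) {
    return h.strategy() == algorithmic ? "algorithmic" : "symbolic";
}

// One handle reused across calls: a setting persists until it is changed.
inline std::string reuse_demo() {
    handle_t h;
    assert(h.strategy() == algorithmic);
    std::string log = routine_a(h);   // default strategy
    h.set_strategy(symbolic);
    log += "," + routine_a(h);        // symbolic from here on
    log += "," + routine_b(h);        // still symbolic: the setting carried over
    h.set_strategy(algorithmic);      // reset explicitly before the next call
    log += "," + routine_b(h);
    return log;
}

}  // namespace sketch
```

In this mock, the third call inherits the symbolic setting from the second, which is why the settings should be checked whenever a handle is shared between routines.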

3.3 AD Strategies 

AD strategy refers to the method for computing the derivatives. The handle object is used to configure the AD strategy to be used (see Section 3.2).

3.3.1 Algorithmic Derivatives

Since we use dco/c++ internally to generate the AD routines, by default, all routines compute algorithmic derivatives. This refers to the case where dco/c++ is used on a scalar operation level to facilitate the calculation of the derivatives.

3.3.2 Symbolic Derivatives

For some routines, properties of the underlying mathematical problem can be exploited, yielding a more efficient implementation for the derivative computation. We call this a symbolic derivative. The efficiency gains depend on the specific routine. Examples of cases with huge benefits are the linear and nonlinear solvers. For more on the symbolic treatment of linear solvers see Giles (2008) and for nonlinear solvers see Naumann et al. (2015). Descriptions of how the derivative is computed symbolically are provided in the respective routine documentation. The Chapter Contents pages highlight routines that offer symbolic derivatives.
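As an illustration of the idea for linear solvers, the adjoint of x = A^{-1} b can be obtained from one additional solve with the transposed matrix, using the results in Giles (2008): the adjoint of b is A^{-T} times the adjoint of x, and the adjoint of A is minus the outer product of the adjoint of b with x. The following is a minimal sketch for a 2x2 system; all names in namespace `sketch` are hypothetical, and this is not the Library's implementation.

```cpp
#include <array>
#include <cassert>
#include <cmath>

namespace sketch {

using Vec2 = std::array<double, 2>;
using Mat2 = std::array<std::array<double, 2>, 2>;

// Primal solve of A x = b by Cramer's rule (2x2 only).
inline Vec2 solve(const Mat2& A, const Vec2& b) {
    const double det = A[0][0] * A[1][1] - A[0][1] * A[1][0];
    assert(det != 0.0);
    Vec2 x;
    x[0] = (A[1][1] * b[0] - A[0][1] * b[1]) / det;
    x[1] = (-A[1][0] * b[0] + A[0][0] * b[1]) / det;
    return x;
}

inline Mat2 transpose(const Mat2& A) {
    Mat2 T;
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j) T[i][j] = A[j][i];
    return T;
}

struct Adjoints {
    Mat2 A_bar;
    Vec2 b_bar;
};

// Symbolic adjoint of the solve: b_bar = A^{-T} x_bar, A_bar = -b_bar x^T.
// No individual operation of the solver is recorded; one extra solve suffices.
inline Adjoints solve_adjoint(const Mat2& A, const Vec2& x, const Vec2& x_bar) {
    Adjoints adj;
    adj.b_bar = solve(transpose(A), x_bar);
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j) adj.A_bar[i][j] = -adj.b_bar[i] * x[j];
    return adj;
}

}  // namespace sketch
```

For A = [[2, 1], [1, 3]] and b = [1, 2], seeding the adjoint of x with [1, 0] gives the sensitivities of the first solution component with respect to every entry of b and A.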

3.3.3 Setting the AD Strategy

The AD strategy can be set via the handle object (see Section 3.2) with one of the supported enum values:
nag::ad::algorithmic
This is the default strategy; see Section 3.3.1.
nag::ad::symbolic
This enables the symbolic derivative implementation, if available. No further changes are required compared to the algorithmic case.
nag::ad::symbolic_expert
This strategy is available for routines with user-supplied callbacks. Derivatives are computed symbolically here as well, but the callback must provide a more specialised implementation, which usually improves overall performance. The symbolic implementation of a nonlinear solver, for example, first needs to solve the nonlinear system passively, i.e., without propagating derivatives (neither in forward mode nor in reverse mode). This means the callback can use lower-order arithmetic during this stage (double for first order and dco::gt1s<double>::type for second order), which is usually more efficient. The set of inputs is partitioned into two groups, ‘state’ and ‘parameters’, and the solver tells the callback which group is currently active. Full details on how to implement this more specialised callback can be found in the example programs of the routines which support the symbolic expert strategy.
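The two-stage idea behind the symbolic treatment of a nonlinear solver can be sketched for a scalar equation F(x, p) = 0: solve passively in double first, then obtain dx/dp from the implicit function theorem as -(dF/dp)/(dF/dx) at the solution. This is a minimal sketch only; the function F and all names in namespace `sketch` are hypothetical, and it does not model the callback interface or the active_inputs values of the Library.

```cpp
#include <cassert>
#include <cmath>

namespace sketch {

// Hypothetical residual F(x, p) = x^3 + x - p with state x and parameter p.
inline double F(double x, double p) { return x * x * x + x - p; }
inline double dF_dx(double x, double /*p*/) { return 3.0 * x * x + 1.0; }
inline double dF_dp(double /*x*/, double /*p*/) { return -1.0; }

// Stage 1: passive Newton solve in plain double, no derivative propagation.
inline double solve_passive(double p, double x0 = 1.0) {
    double x = x0;
    for (int it = 0; it < 50; ++it) {
        const double step = F(x, p) / dF_dx(x, p);
        x -= step;
        if (std::abs(step) < 1e-14) break;
    }
    return x;
}

// Stage 2: derivative at the solution via the implicit function theorem.
// Only partial derivatives of F are needed, not the Newton iterations.
inline double dx_dp(double x, double p) {
    return -dF_dp(x, p) / dF_dx(x, p);
}

// Usage check: for p = 10 the root is x = 2 and dx/dp = 1/13.
inline void check_solver() {
    const double x = solve_passive(10.0);
    assert(std::abs(x - 2.0) < 1e-12);
    assert(std::abs(dx_dp(x, 10.0) - 1.0 / 13.0) < 1e-12);
}

}  // namespace sketch
```

Because the iterations run in passive arithmetic, their cost and memory footprint do not depend on the differentiation mode; the derivative work is confined to the second stage.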

3.4 How to Handle Overwriting of Inputs

When using reverse mode AD, program variables store a virtual address to their derivative component. This virtual address changes when a variable gets overwritten. If a variable x is overwritten, and derivatives with respect to it are required (by calling dco::derivative(x)), then a copy of x must be stored before the overwrite so that correct derivatives can be retrieved at the end of the program (see, e.g., the f07ca example program). This is not specific to the NAG AD Library but is a feature of many AD tools, including dco/c++.
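The effect can be reproduced with a toy tape. The sketch below uses a hypothetical minimal tape and active type written for this illustration (not dco/c++): each active variable holds the index of its tape entry, and overwriting the variable makes it point at a new entry.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {

// Hypothetical minimal tape: each entry stores the index of its single
// argument and the local partial derivative with respect to that argument.
struct Tape {
    struct Entry {
        std::size_t arg;
        double partial;
        bool is_input;
    };
    std::vector<Entry> entries;
    std::vector<double> adjoints;

    std::size_t record_input() {
        entries.push_back(Entry{0, 0.0, true});
        return entries.size() - 1;
    }
    std::size_t record_unary(std::size_t arg, double partial) {
        entries.push_back(Entry{arg, partial, false});
        return entries.size() - 1;
    }
    // Reverse sweep: seed the output adjoint with 1 and propagate backwards.
    void interpret(std::size_t output) {
        adjoints.assign(entries.size(), 0.0);
        adjoints[output] = 1.0;
        for (std::size_t i = entries.size(); i-- > 0;) {
            if (!entries[i].is_input) {
                adjoints[entries[i].arg] += entries[i].partial * adjoints[i];
            }
        }
    }
};

// An active variable: a value plus the tape index of its adjoint.
struct Active {
    double value;
    std::size_t index;
};

inline Active square(Tape& tape, const Active& x) {
    return {x.value * x.value, tape.record_unary(x.index, 2.0 * x.value)};
}

struct Result {
    double from_copy;         // adjoint read through the saved copy
    double from_overwritten;  // adjoint read through the overwritten variable
};

inline Result overwrite_demo() {
    Tape tape;
    Active x{3.0, tape.record_input()};
    const Active x_in = x;  // copy taken before x is overwritten
    x = square(tape, x);    // x now refers to a new tape entry
    tape.interpret(x.index);
    return {tape.adjoints[x_in.index], tape.adjoints[x.index]};
}

}  // namespace sketch
```

Here `from_copy` is 6, the derivative of x squared at x = 3, whereas `from_overwritten` is 1, which is merely the seed placed on the output. Only the saved copy still addresses the adjoint of the original input.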

3.5 Error Handling

NAG AD Library routines use the same error reporting mechanism as the main Library (described in Section 4 in the Introduction to the NAG Library FL Interface). AD routines have additional error exits, which are either related to dco/c++ or to cases where the derivatives could not be computed accurately. All exit codes specific to a routine are described in its corresponding documentation.

4 References

Dunford N, Schwartz J T, Bade W G and Bartle R G (1971) Linear Operators Wiley Interscience, New York
Giles M (2008) Collected Matrix Derivative Results for Forward and Reverse Mode Algorithmic Differentiation Springer
Griewank A and Walther A (2008) Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation (2nd Edition) SIAM
Hascoët L, Naumann U and Pascual V (2005) ‘To be Recorded’ Analysis in Reverse-Mode Automatic Differentiation Future Generation Computer Systems 21(8) 299–304 Elsevier
Naumann U (2012) The Art of Differentiating Computer Programs: An Introduction to Algorithmic Differentiation SIAM
Naumann U, Lotz J, Leppkes K and Towara M (2015) Algorithmic Differentiation of Numerical Methods: Tangent and Adjoint Solvers for Parameterized Systems of Nonlinear Equations ACM Trans. Math. Softw.