NAG AD Library
Introduction
1
Introduction
The NAG AD Library implements Automatic Differentiation (AD) on a variety of algorithms from the
NAG Library. It builds on the functionality of dco/c++ and delivers first and second order
derivatives via forward and reverse mode AD (see NAG and Algorithmic Differentiation). A subset of
the supported AD algorithms has been optimized by using symbolic differentiation (see
Section 3.3.2), which gives substantial savings in runtime and memory consumption. The NAG AD
Library comes with C++ interfaces which allow seamless use with dco/c++. Read about Algorithmic
Differentiation in More Depth on the NAG website.
The NAG AD Library can be used from other languages and with other AD tools. For details, please
contact NAG. Though this documentation is meant to be self-contained, full details of how to use
the NAG Library and its documentation may be of interest and can be found in How to Use the NAG
Library.
2
The NAG AD Library and dco/c++
The Library is designed to work seamlessly with dco/c++; however, being a compiled library, it
supports only a subset of the dco/c++ features. The following data types are supported for
computing the primal (the underlying non-AD algorithm) and first and second order derivatives:
Primal (or passive values) | double |
First order tangent | dco::gt1s<double>::type |
First order adjoint | dco::ga1s<double>::type |
Second order tangent | dco::gt1s<dco::gt1s<double>::type>::type |
Second order adjoint | dco::ga1s<dco::gt1s<double>::type>::type |
First order vector types | types listed in individual routine documentation |
Second order vector types | types listed in individual routine documentation |
Not all routines have second order or vector versions. Where these exist, they are highlighted on
the Chapter Contents pages.
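The nesting pattern in the table (a second order tangent type is a tangent type instantiated over a tangent type) can be illustrated without dco/c++. The following is a minimal sketch only: Tangent<T>, cube_plus_x, first_derivative and second_derivative are hypothetical names introduced here, and the toy type is not the dco/c++ implementation. It shows how one templated function yields the primal, the first derivative and, by nesting, the second derivative.

```cpp
#include <cassert>
#include <cmath>

// Toy tangent (forward mode) type: a value plus its directional derivative.
// Hypothetical stand-in for illustration only; this is NOT a dco/c++ type.
template <typename T>
struct Tangent {
    T value;  // primal component
    T deriv;  // tangent (derivative) component
};

template <typename T>
Tangent<T> operator+(const Tangent<T>& a, const Tangent<T>& b) {
    return {a.value + b.value, a.deriv + b.deriv};
}

template <typename T>
Tangent<T> operator*(const Tangent<T>& a, const Tangent<T>& b) {
    // Product rule applied to the tangent component.
    return {a.value * b.value, a.deriv * b.value + a.value * b.deriv};
}

// Written once; works for double and for any level of Tangent nesting.
template <typename T>
T cube_plus_x(const T& x) {
    return x * x * x + x;  // f(x) = x^3 + x
}

// First order: seed the tangent of x with 1 and read the tangent of f.
double first_derivative(double x0) {
    Tangent<double> x{x0, 1.0};
    return cube_plus_x(x).deriv;
}

// Second order: nest the toy type, seed both tangent directions with 1.
double second_derivative(double x0) {
    using T1 = Tangent<double>;
    Tangent<T1> x{T1{x0, 1.0}, T1{1.0, 0.0}};
    return cube_plus_x(x).deriv.deriv;
}
```

For f(x) = x^3 + x, calling first_derivative(2.0) gives 3*2^2 + 1 = 13 and second_derivative(2.0) gives 6*2 = 12. With the Library, the corresponding dco/c++ types from the table take the place of the toy type.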
Additionally, the Library supports chunk and blob tapes (see the dco/c++ documentation for more details) by setting a preprocessor define:
Tape type | Preprocessor define |
blob tape | none |
chunk tape | -DDCO_CHUNK_TAPE |
Blob tape gives the best performance but requires the user to set the maximum tape size upfront. On
Linux, huge tape sizes can be specified safely since the operating system only allocates memory when it is
used.
When using the NAG AD Library by including nagad.h, dco.hpp gets included as well.
In case you want to use the interface of the NAG AD Library without dco/c++, you can define
NAGAD_SKIP_DCO_HPP_INCLUDE to skip the include of dco.hpp.
On Windows, in debug builds the operating system allocates and initializes memory as soon as the
program requests it, so requesting huge tape sizes is not advised. Windows users are advised to
start with chunk tape, and then consider switching to blob tape as a performance optimization.
Linux users should also consider enabling huge pages if the program ends up using a lot of tape
memory. Performance can be improved when dco/c++ uses huge pages (see the corresponding sections
in the dco/c++ documentation).
All dco/c++ features that work with the data types listed in the table above can be used. These
include, for example, checkpointing, Jacobian preaccumulation, and external adjoint vectors. For
more information on these and other features of dco/c++, see the NAG website. If you require
dco/c++ features which are currently not supported by the NAG AD Library, e.g., modulo adjoint
vector, please email your suggestions to NAG.
3
Using the NAG AD Library
The package consists of a set of C++ headers and precompiled libraries. Details on how to build and
link an executable are given in the
Users' Note.
3.1
Routine Interfaces
The C++ interfaces of the routines are overloaded and can be used to compute primal values (the
underlying non-AD algorithm) with double type, as well as first and second order derivatives with
dco/c++ data types (the supported types are listed in Section 2). Every routine comes with an
example using each of the types. See f07ca for an example.
3.2
Handle Object
All routines take as the first argument a nag::ad::handle_t object. This can be used to switch
between algorithmic and symbolic differentiation strategies. The same object is also forwarded as the first
argument to any user-supplied callbacks. The callback should only need to interact with this object if
users are implementing a nag::ad::symbolic_expert strategy (see
below). Otherwise, the nag::ad::handle_t object can be ignored inside the user callback.
nag::ad::handle_t has constructors (default, copy, move) as well as assignment operators
(copy, move). The default constructor and the copy operations may throw exceptions of
type std::bad_alloc or nag::ad::exception_t, which derives from std::exception. When the default
constructor detects an incompatibility between the included dco/c++ and the Library backend, it
throws a nag::ad::exception_t.
In addition,
nag::ad::handle_t has the following member functions:
void set_strategy(strategy_e strategy) |
Sets the differentiation strategy to be used with the following enum:
enum nag::ad::strategy_e {
algorithmic, symbolic, symbolic_expert
};
See Section 3.3.3 for more details. |
active_inputs_e active_inputs() |
In symbolic expert mode, the user-supplied callback implementation needs to provide specific code paths depending on which set of inputs is “active” (see Section 3.3.3 below). The callback can call this function to see which set of inputs is currently “active”. The following enum is returned:
enum nag::ad::active_inputs_e {
none, state, params, all
};
These values are easiest to understand in a concrete setting, so they are documented in the
example programs of the routines which support a symbolic expert strategy. Those examples give
the most concrete advice on how the values should be used. |
It is not necessary to create a new handle object each time a NAG routine is called. The same handle
object can be reused when calling different routines. In this case, please ensure the correct/desired
settings (e.g., AD strategy) are set.
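The usage pattern described in this section can be sketched as follows. So that the sketch compiles without the Library, it uses a stand-in namespace sketch that mimics only the members named above (set_strategy, active_inputs and the two enums). The accessor strategy(), the setter set_active_for_demo() and the function callback_path() are hypothetical helpers added for illustration and are not part of the documented interface. In real code the types come from nagad.h in namespace nag::ad, and what each branch has to compute is routine specific.

```cpp
#include <cassert>
#include <string>

// Stand-in for the documented interface; NOT the real nag::ad types.
namespace sketch {
enum strategy_e { algorithmic, symbolic, symbolic_expert };
enum active_inputs_e { none, state, params, all };

class handle_t {
public:
    void set_strategy(strategy_e s) { strategy_ = s; }
    active_inputs_e active_inputs() { return active_; }

    // Hypothetical helpers for this demo only (assumptions, not Library API).
    strategy_e strategy() const { return strategy_; }
    void set_active_for_demo(active_inputs_e a) { active_ = a; }

private:
    strategy_e strategy_ = algorithmic;  // algorithmic is the documented default
    active_inputs_e active_ = all;       // arbitrary initial value for the demo
};
}  // namespace sketch

// Hypothetical callback body: the handle arrives as the first argument and,
// in symbolic expert mode, selects the code path. Here each branch only
// reports which path was taken; real callbacks do routine-specific work.
const char* callback_path(sketch::handle_t& handle) {
    switch (handle.active_inputs()) {
        case sketch::none:   return "none";
        case sketch::state:  return "state";
        case sketch::params: return "params";
        case sketch::all:    return "all";
    }
    return "unknown";
}
```

A single handle can be reused across calls. Because the strategy stays in effect until it is changed, it is worth setting it explicitly before each routine call that relies on a particular strategy.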
3.3
AD Strategies
AD strategy refers to the method for computing the derivatives. The handle object is used
to configure the AD strategy to be used (see
Section 3.2).
3.3.1
Algorithmic Derivatives
Because the AD routines are generated internally with dco/c++, all routines compute algorithmic
derivatives by default. In this case dco/c++ is applied at the level of individual scalar
operations to calculate the derivatives.
3.3.2
Symbolic Derivatives
For some routines, properties of the underlying mathematical problem can be exploited, yielding a
more efficient implementation of the derivative computation. We call this a symbolic derivative.
The efficiency gains depend on the specific routine; the linear and nonlinear solvers are examples
where the benefit is large. For more on the symbolic treatment of linear solvers see Giles (2008),
and for nonlinear solvers see Naumann et al. (2015). Descriptions of how the derivative is
computed symbolically are provided in the respective routine documentation.
The Chapter Contents pages highlight routines that offer symbolic derivatives.
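As an illustration of what a symbolic derivative of a linear solver looks like, the sketch below applies the adjoint result from Giles (2008) to a 2x2 system: for x = A^{-1} b with incoming adjoint x_bar, the adjoints are b_bar = A^{-T} x_bar and A_bar = -b_bar x^T. The names solve2, transpose, SolveAdjoint and solve2_adjoint are hypothetical, and the code is a sketch of the mathematical idea rather than the Library's implementation. The point is that the adjoint costs one extra solve with the transposed matrix instead of a reverse sweep over every scalar operation of the solver.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec2 = std::array<double, 2>;
using Mat2 = std::array<std::array<double, 2>, 2>;

// Primal: solve the 2x2 system A x = b by Cramer's rule.
Vec2 solve2(const Mat2& A, const Vec2& b) {
    const double det = A[0][0] * A[1][1] - A[0][1] * A[1][0];
    Vec2 x{};
    x[0] = (b[0] * A[1][1] - A[0][1] * b[1]) / det;
    x[1] = (A[0][0] * b[1] - b[0] * A[1][0]) / det;
    return x;
}

Mat2 transpose(const Mat2& A) {
    Mat2 T{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j) T[i][j] = A[j][i];
    return T;
}

struct SolveAdjoint {
    Mat2 A_bar;  // adjoint of the matrix
    Vec2 b_bar;  // adjoint of the right-hand side
};

// Symbolic adjoint of x = A^{-1} b:
//   b_bar = A^{-T} x_bar,   A_bar = -b_bar x^T.
SolveAdjoint solve2_adjoint(const Mat2& A, const Vec2& x, const Vec2& x_bar) {
    SolveAdjoint adj{};
    adj.b_bar = solve2(transpose(A), x_bar);  // one extra transposed solve
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j) adj.A_bar[i][j] = -adj.b_bar[i] * x[j];
    return adj;
}
```

Seeding x_bar = (1, 0) returns the sensitivities of the first solution component with respect to every entry of b and A.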
3.3.3
Setting the AD Strategy
The AD strategy can be set via the handle object (see Section 3.2) with one of the supported enum
values:
nag::ad::algorithmic |
This is the default strategy, see Section 3.3.1. |
nag::ad::symbolic |
This enables the symbolic derivative implementation, if available. No further changes are required compared to the algorithmic case. |
nag::ad::symbolic_expert |
This strategy is available for routines with user-supplied callbacks. Derivatives are computed
symbolically here as well, but the callback must provide a more specialised implementation, which
usually improves overall performance. The symbolic implementation of a nonlinear solver, for
example, first needs to solve the nonlinear system passively, i.e., without propagating
derivatives (in either forward or reverse mode). During this stage the callback can therefore use
lower-order arithmetic (double in first order and dco::gt1s<double>::type in second order), which
is usually more efficient. The set of inputs is partitioned into two groups, ‘state’ and
‘parameters’, and the solver tells the callback which group is currently active. Full details on
how to implement this more specialised callback can be found in the example programs of the
routines which support the symbolic expert strategy. |
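The two-stage idea behind the symbolic treatment of a nonlinear solver can be sketched on a toy problem. The code below assumes a scalar residual F(x, p) = x^2 - p with state x and parameter p. All function names are hypothetical, and the approach shown is the textbook implicit function theorem, dx/dp = -(dF/dx)^{-1} dF/dp, not necessarily what any particular Library routine does internally. Stage 1 runs Newton's method in plain double with no derivative propagation, and stage 2 obtains the sensitivity from partial derivatives evaluated once at the solution.

```cpp
#include <cassert>
#include <cmath>

// Toy residual F(x, p) = x^2 - p: x is the 'state', p the 'parameter'.
double residual(double x, double p) { return x * x - p; }
double dF_dx(double x, double /*p*/) { return 2.0 * x; }   // partial w.r.t. state
double dF_dp(double /*x*/, double /*p*/) { return -1.0; }  // partial w.r.t. parameter

// Stage 1: solve F(x, p) = 0 passively (plain double, no derivatives
// propagated through the iteration).
double solve_passive(double p, double x0) {
    double x = x0;
    for (int it = 0; it < 50; ++it) {
        const double f = residual(x, p);
        if (std::fabs(f) < 1e-14) break;
        x -= f / dF_dx(x, p);  // Newton step
    }
    return x;
}

// Stage 2: sensitivity of the solution from the implicit function theorem,
// using partials evaluated only at the converged solution.
double solution_sensitivity(double x, double p) {
    return -dF_dp(x, p) / dF_dx(x, p);
}
```

For p = 2 the solver converges to sqrt(2), and the sensitivity equals 1 / (2 sqrt(2)), the derivative of sqrt(p). The Newton iterations themselves are never differentiated, which is why the callback can run in lower-order arithmetic during the passive stage.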
3.4
How to Handle Overwriting of Inputs
When using reverse mode AD, each program variable stores a virtual address for its derivative
component. This virtual address changes when the variable is overwritten. If a variable x is
overwritten and derivatives with respect to it are required (by calling dco::derivative(x)), then
a copy of x must be stored before it is overwritten, so that the correct derivatives can be
retrieved at the end of the program (see, e.g., the f07ca example program).
This is not specific to the NAG AD Library but is a feature of many AD tools, including dco/c++.
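The effect can be reproduced with a toy tape. The sketch below is a hypothetical, heavily simplified reverse mode implementation (Tape, Active, multiply and square_in_place are names introduced here, not dco/c++ API). Each active variable carries an index into the tape, which plays the role of the virtual address. Overwriting x with x * x gives x a new index, so the derivative with respect to the original input can only be read through a copy taken before the overwrite.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Toy tape: every entry stores two (argument index, partial derivative) pairs.
struct Tape {
    struct Entry {
        std::size_t arg[2];
        double partial[2];
    };
    std::vector<Entry> entries;

    std::size_t record(std::size_t a0, double p0, std::size_t a1, double p1) {
        Entry e{{a0, a1}, {p0, p1}};
        entries.push_back(e);
        return entries.size() - 1;
    }
    // An input depends on nothing: it references itself with zero partials.
    std::size_t register_input() {
        const std::size_t id = entries.size();
        return record(id, 0.0, id, 0.0);
    }
    // Reverse sweep: seed the output adjoint with 1 and propagate backwards.
    std::vector<double> adjoints(std::size_t output) const {
        std::vector<double> bar(entries.size(), 0.0);
        bar[output] = 1.0;
        for (std::size_t i = entries.size(); i-- > 0;)
            for (int k = 0; k < 2; ++k)
                bar[entries[i].arg[k]] += entries[i].partial[k] * bar[i];
        return bar;
    }
};

struct Active {
    double value;
    std::size_t index;  // position on the tape (the "virtual address")
};

Active multiply(Tape& t, const Active& a, const Active& b) {
    return {a.value * b.value, t.record(a.index, b.value, b.index, a.value)};
}

struct Derivatives {
    double via_copy;         // read at the index saved before the overwrite
    double via_overwritten;  // read at the index x holds after the overwrite
};

// Computes y = x^2 by overwriting x in place, then runs the reverse sweep.
Derivatives square_in_place(double x0) {
    Tape tape;
    Active x{x0, tape.register_input()};
    const Active x_in = x;     // copy keeps the input's tape index
    x = multiply(tape, x, x);  // overwrite: x now refers to a new entry
    const std::vector<double> bar = tape.adjoints(x.index);
    return {bar[x_in.index], bar[x.index]};
}
```

For x0 = 3, via_copy is the expected derivative 2 * 3 = 6, while via_overwritten is only the seed value 1, because after the overwrite x refers to the output rather than the input.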
3.5
Error Handling
NAG AD Library routines use the same error reporting mechanism as the main Library (described in
Section 4 in the Introduction to the NAG Library FL Interface). AD routines have additional error
exits, which are either related to dco/c++ or to cases where the derivatives could not be computed
accurately. All exit codes specific to a routine are described in its corresponding documentation.
4
References
Dunford N, Schwartz J T, Bade W G and Bartle R G (1971) Linear Operators Wiley Interscience, New York
Giles M (2008) Collected Matrix Derivative Results for Forward and Reverse Mode Algorithmic Differentiation Springer
Griewank A and Walther A (2008) Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation (2nd Edition) SIAM
Hascoët L, Naumann U and Pascual V (2005) ‘To be Recorded’ Analysis in Reverse-Mode Automatic Differentiation Future Generation Computer Systems 21(8) 299–304 Elsevier
Naumann U (2012) The Art of Differentiating Computer Programs: An Introduction to Algorithmic Differentiation SIAM
Naumann U, Lotz J, Leppkes K and Towara M (2015) Algorithmic Differentiation of Numerical Methods: Tangent and Adjoint Solvers for Parameterized Systems of Nonlinear Equations ACM Trans. Math. Softw.