Creating C++ Interfaces For The NAG Library: Part 2
In the first part of this series we came up with a basic list of requirements for a suite of C++ interfaces to the NAG Library and a list of restrictions we need to work under. In this article we start to look at how we can transform those lists into a template for producing the interfaces.
Underlying the majority of the routines in the NAG Library is the NAG Library Engine. The API for the Engine contains only simple (ANSI C mappable) types, with all array-like data structures (vectors, matrices etc.) supplied as contiguous memory. The Engine API is undocumented; the closest documented interface we have is our standard FL interfaces. The main difference between these and the Engine API is that the Engine interfaces tend to have more arguments.
For our C++ interfaces we would like to allow richer data structures to be supplied directly to the NAG routines. Rather than implement specific data classes we want to take a more flexible approach. Some advice on how to do this has been discussed by one of my colleagues in an earlier blog article. That article, along with the Engine API, will form the starting point for our new C++ interfaces.
There are a number of components that need to be taken into consideration when designing a new suite of interfaces. In the rest of this article we are going to concentrate mainly on one of them: how to handle array arguments. Of the four requirements we identified in the first article of this series, two can be related to the handling of arrays: the interface should not impose a rigid data structure and should be as simplified as possible.
Use of Templates
For flexibility all array arguments will be individually templated, so an interface may look like:
template <typename A, typename B>
void this_routine(A && a, B && b, double c);
The minimum amount of information that the Engine API needs from a class used as an array argument is the location of the underlying raw data. We are therefore going to assume that each class used to supply an array has a data method:
template <typename DT> DT data(void);
where DT is a pointer to an array of elements which can be statically cast into a double, std::complex<double> or a nagcpp::types::f77_integer for real, complex and integer arrays respectively. Here, nagcpp::types::f77_integer will be either an int or a long, depending on the implementation. If casting is required then the data will need to be copied; otherwise the returned pointer will be passed directly to the Engine.
As can be seen from the f77_integer type, we are currently planning on using nagcpp as the parent namespace. It will contain a number of child namespaces: currently one per chapter of the Library, plus a couple of utility namespaces.
An alternative to using a specific method like a.data to access the raw data would be to use &(a[0]) if the [] operator had been implemented. Requiring a specific method seems cleaner (and easier to document) than making use of the side effect of an operator, so we are not currently planning on allowing the use of &(a[0]). However, the code should be abstracted enough to allow this functionality to be added easily if required.
The above set-up means that STL vectors can be passed directly to the NAG routines out of the box; however, Boost matrices cannot. The Boost matrix API does not provide a method to cleanly access the raw data (i.e. the equivalent of the data method). This is one class where allowing the [] operator to be used rather than the data method may work. However, it would be using an undocumented feature of the API, so is not ideal. An alternative would be to use the Boost copy API to copy data into contiguous memory before passing it to the NAG routine. A decision on that is something we will defer to a later date.
In addition to the data method, we are going to assume that each class used to supply a multi-dimensional array (or matrix) has an is_col_major method:
bool is_col_major(void);
which returns true if the multi-dimensional array is supplied in column major order and false if it is supplied in row major order. If no such method is supplied the data will be assumed to be in column major order. Column major order is used as the default because, whilst the NAG Library Engine can accept arrays in either column or row major order, the way that various algorithms are implemented means it is often more efficient to supply data in column major order. It will be possible to override this default value on a routine-by-routine basis.
Currently we are not planning on using templates for scalar arguments.
How arrays of strings will be best handled is something that we still need to investigate.
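As a rough illustration of the two assumptions above (a data method, and an optional is_col_major method that defaults to column major when absent), the following C++11 sketch shows one way such a check could be written. The helper names raw_data and storage_is_col_major, the trait has_is_col_major and the class RowMajorMatrix are hypothetical and chosen purely for this example; they are not part of the NAG interfaces.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical trait: true if T has a callable is_col_major() member.
template <typename T, typename = void>
struct has_is_col_major : std::false_type {};

template <typename T>
struct has_is_col_major<
    T, decltype(static_cast<void>(std::declval<T &>().is_col_major()))>
    : std::true_type {};

// Hypothetical helper: location of the raw data, via the data() method.
template <typename T>
auto raw_data(T &a) -> decltype(a.data()) {
  return a.data();
}

// Use the class's own answer when is_col_major() exists ...
template <typename T>
typename std::enable_if<has_is_col_major<T>::value, bool>::type
storage_is_col_major(T &a) {
  return a.is_col_major();
}

// ... and fall back to the column major default when it does not.
template <typename T>
typename std::enable_if<!has_is_col_major<T>::value, bool>::type
storage_is_col_major(T &) {
  return true;
}

// Hypothetical user class that stores its elements in row major order.
class RowMajorMatrix {
public:
  RowMajorMatrix(std::size_t rows, std::size_t cols)
      : rows_(rows), cols_(cols), values_(rows * cols) {}
  double *data() { return values_.data(); }
  bool is_col_major() { return false; }
  std::size_t size1() { return rows_; }
  std::size_t size2() { return cols_; }

private:
  std::size_t rows_, cols_;
  std::vector<double> values_;
};
```

With these definitions, a std::vector<double> (which has data but no is_col_major) is treated as column major, while a RowMajorMatrix reports row major storage through its own method.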
Interfaces Supplied as Header Files
Because of the use of templates, the C++ interface will need to be supplied as header files as opposed to a pre-compiled library. This also has the advantage of allowing it to sit more easily on top of any library products - one of the other requirements we had for this set of interfaces.
Currently it is planned for each routine to be in a separate, stand-alone header file. Header files for each chapter will be supplied in their own directory and a combined header file will be supplied for each chapter and one more for the whole library.
Arrays Know Their Own Size
Most routines that take an array as an input argument also have one or more arguments that relate to the size of that array. As in the Python interfaces, it would be nice if these arguments could be dropped.
Because we are templating array arguments we can assume that they "know their own size". In order for this assumption to be true we need to assume that each class has some size methods:
template <typename IT> IT size(void);
template <typename IT> IT size1(void);
template <typename IT> IT size2(void);
template <typename IT> IT size3(void);
where the templated type IT is a type which can be statically cast into a nagcpp::types::f77_integer.
The size methods size1, size2 and size3 return the first, second and third dimensions of an array. They need not be implemented if the array does not have that dimension, so for a vector (one-dimensional array), only size needs to be present. For a two-dimensional array both size1 and size2 would need to be implemented, etc. The methods size and size1 can both be used to return the first dimension, with size taking precedence if both are present (but then why would you implement both!).
In addition to the size methods, we are also assuming that:
template <typename IT> IT ndims(void);
exists and returns the number of dimensions for an array. However, if ndims is not present then the value is inferred from the presence of the various size methods.
As well as allowing any arguments that are array sizes (or inferable from an array size) to be dropped from the interface, assuming that arrays "know their own size" allows us to add additional runtime checks to ensure that the supplied arrays are the correct size. Because of this we currently assume that each array in an interface has the size methods implemented, even where it would be possible to infer all array size arguments from a subset of the array arguments.
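The inference described above (the number of dimensions deduced from which size methods exist, with size taking precedence over size1) might be sketched as follows in C++11. This is only an illustration: the traits, the functions inferred_ndims and first_dim, and the class Grid2D are hypothetical names, and long is used as a stand-in for nagcpp::types::f77_integer.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical traits: one per size method, true if the method is callable.
template <typename T, typename = void> struct has_size : std::false_type {};
template <typename T>
struct has_size<T, decltype(static_cast<void>(std::declval<T &>().size()))>
    : std::true_type {};

template <typename T, typename = void> struct has_size1 : std::false_type {};
template <typename T>
struct has_size1<T, decltype(static_cast<void>(std::declval<T &>().size1()))>
    : std::true_type {};

template <typename T, typename = void> struct has_size2 : std::false_type {};
template <typename T>
struct has_size2<T, decltype(static_cast<void>(std::declval<T &>().size2()))>
    : std::true_type {};

template <typename T, typename = void> struct has_size3 : std::false_type {};
template <typename T>
struct has_size3<T, decltype(static_cast<void>(std::declval<T &>().size3()))>
    : std::true_type {};

// Number of dimensions inferred from the highest size method present.
template <typename T>
constexpr int inferred_ndims() {
  return has_size3<T>::value ? 3
         : has_size2<T>::value ? 2
         : (has_size<T>::value || has_size1<T>::value) ? 1
                                                        : 0;
}

// First dimension: size() is preferred, size1() is the fallback.
template <typename T>
long first_dim_impl(T &a, std::true_type /* has size() */) {
  return static_cast<long>(a.size());
}
template <typename T>
long first_dim_impl(T &a, std::false_type /* no size() */) {
  return static_cast<long>(a.size1());
}
template <typename T>
long first_dim(T &a) {
  return first_dim_impl(a, typename has_size<T>::type{});
}

// Hypothetical two-dimensional class exposing only size1() and size2().
struct Grid2D {
  long rows, cols;
  long size1() { return rows; }
  long size2() { return cols; }
};
```

Here a std::vector<double> is seen as one-dimensional because only size is present, whereas Grid2D is seen as two-dimensional because size2 is present but size3 is not.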
Oversized Arrays
Some of the Engine APIs allow oversized arrays to be supplied. As an example, suppose that you have a matrix and, rather than supplying it in an array with exactly the same number of rows and columns as the matrix, you supply it in the top left-hand corner of a larger array. In order for the NAG routine to access the elements of the matrix correctly it needs to know not only the size of the matrix but also the column stride, that is, the number of rows in the larger array. (Here we are assuming that the data is stored in column major order; a similar argument applies to data stored in row major order, just with the dimensions switched over.) The main reason for allowing oversized arrays to be supplied is to allow sub-arrays to be passed directly to a routine. However, there are very few cases where this is needed and the added complication, in terms of the interface and documentation, tends to outweigh the benefit.
We are currently assuming that no arrays are oversized, so in the above example we assume that the column stride is always the same as the number of rows.
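To make the role of the column stride concrete, here is a small C++ sketch of column major element access. The function name element and its zero-based indexing are assumptions made for this illustration only.

```cpp
#include <cstddef>

// Element (i, j), zero-based, of a matrix stored in column major order.
// col_stride is the distance in memory between the starts of consecutive
// columns, i.e. the number of rows of the array that holds the matrix.
inline double element(const double *a, std::size_t col_stride,
                      std::size_t i, std::size_t j) {
  return a[i + j * col_stride];
}
```

For a matrix with 2 rows stored in the top left-hand corner of an array with 3 rows, the correct column stride is 3. Under the simplifying assumption above, the array has exactly as many rows as the matrix, so the stride is just the number of rows and no separate argument is needed.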
Allocation of Output Arrays
The standard interfaces to the NAG Library require a user to pre-allocate output arrays. Because arrays are templated and the C++ interface will be supplied as headers, we can relax this requirement if we assume that each class used as an output array argument has a resize method:
template <typename IT> void resize(IT n1);
template <typename IT> void resize(IT n1, IT n2);
template <typename IT> void resize(IT n1, IT n2, IT n3);
which allocates the memory returned by the data method to the size defined by the input arguments n1, n2 and n3. If no such method is implemented then it will be assumed that the output array was pre-allocated prior to calling the NAG routine.
Arrays in Callbacks
Whilst it is possible to allow arbitrary classes (with a handful of known methods) to be accepted as array arguments to the main routine interfaces, it is not possible to do this in callback interfaces (i.e. functions written by the user that are passed as arguments to the NAG routine). This is because the input arguments to callbacks come from the NAG Library Engine and are then passed to the user's calling program. At the point that this happens there is no information on the type of classes the user supplied when calling the main routine. We therefore need to implement a class to hold array data passed from the Engine to the user when calling a callback function. These classes have currently been given the very imaginative names nagcpp::utility::array1D_ref, nagcpp::utility::array2D_ref and nagcpp::utility::array3D_ref for one, two and three-dimensional arrays. Where appropriate these classes will implement the methods described above (with the exception of the resize method) and the () operator, which will allow the elements of the array to be accessed. For example, a(i,j,k) would access element (i, j, k) of a three-dimensional array a.
Summary
Our current plan for the C++ interfaces now includes:
- Code will be supplied in header files.
- All array arguments will be templated.
- A class used to supply an array argument will be assumed (where appropriate) to have the following methods implemented:
template <typename IT> IT size(void);
template <typename IT> IT size1(void);
template <typename IT> IT size2(void);
template <typename IT> IT size3(void);
template <typename IT> IT ndims(void);
template <typename DT> DT data(void);
bool is_col_major(void);
template <typename IT> void resize(IT n1);
template <typename IT> void resize(IT n1, IT n2);
template <typename IT> void resize(IT n1, IT n2, IT n3);
- Array arguments in callbacks will be of type nagcpp::utility::array1D_ref, nagcpp::utility::array2D_ref and nagcpp::utility::array3D_ref.
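As a rough illustration of how the optional resize method in the list above could be handled for a one-dimensional output array, the C++11 sketch below resizes the array when resize exists and otherwise only checks that a pre-allocated array has the expected size. The names has_resize1 and prepare_output are hypothetical, and the size check in the fallback is an assumption about what a runtime check might look like, not a description of the planned implementation.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical trait: true if T has a callable one-argument resize().
template <typename T, typename = void>
struct has_resize1 : std::false_type {};

template <typename T>
struct has_resize1<T, decltype(static_cast<void>(std::declval<T &>().resize(
                          std::declval<std::size_t>())))> : std::true_type {};

// resize() is available: allocate the output array to the required size.
template <typename T>
bool prepare_output_impl(T &a, std::size_t n, std::true_type) {
  a.resize(n);
  return true;
}

// resize() is missing: assume the array was pre-allocated by the caller
// and only report whether its size matches what the routine needs.
template <typename T>
bool prepare_output_impl(T &a, std::size_t n, std::false_type) {
  return static_cast<std::size_t>(a.size()) == n;
}

// Returns true if the output array is ready to receive n elements.
template <typename T>
bool prepare_output(T &a, std::size_t n) {
  return prepare_output_impl(a, n, typename has_resize1<T>::type{});
}
```

In this sketch an empty std::vector<double> is grown to the requested length, while a std::array<double, 4>, which has size and data but no resize, is accepted only when four elements are requested.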