AutoDiff.h

Classes

AutoDiff -- Class that computes partial derivatives by automatic differentiation. (full description)

template <class T> class AutoDiff

Interface

Public Members

AutoDiff()

AutoDiff(const T &v)

AutoDiff(const T &v, const uInt ndiffs, const uInt n)

AutoDiff(const T &v, const uInt ndiffs)

AutoDiff(const AutoDiff<T> &other)

AutoDiff(const T &v, const Vector<T> &derivs)

~AutoDiff()

AutoDiff<T> &operator=(const T &v)

AutoDiff<T> &operator=(const AutoDiff<T> &other)

void operator*=(const AutoDiff<T> &other)

void operator/=(const AutoDiff<T> &other)

void operator+=(const AutoDiff<T> &other)

void operator-=(const AutoDiff<T> &other)

void operator*=(const T other)

void operator/=(const T other)

void operator+=(const T other)

void operator-=(const T other)

AutoDiffRep<T> *theRep()

const AutoDiffRep<T> *theRep() const

T &value()

const T &value() const

Vector<T> derivatives() const

void derivatives(Vector<T> &res) const

T &derivative(uInt which)

const T &derivative(uInt which) const

T &deriv(uInt which)

const T &deriv(uInt which) const

uInt nDerivatives() const

Bool isConstant() const

const AutoDiff<T> &ref()

Private Members

void release()

Description

Review Status

Date Reviewed:

yyyy/mm/dd

Programs:

Demos:

dAutoDiff.cc

Tests:

tAutoDiff.cc

Prerequisite

Etymology

Class that computes partial derivatives by automatic differentiation, thus AutoDiff.

Synopsis

Class that computes partial derivatives by automatic differentiation. It does this by storing the value of a function and the values of its first derivatives with respect to its independent parameters. When a mathematical operation is applied to an AutoDiff object, the derivative values of the resulting new object are computed according to chain rules of differentiation.

Suppose we have a function f(x0,x1,...,xn) and its differential is

    df = (df/dx0)*dx0 + (df/dx1)*dx1 + ... + (df/dxn)*dxn

We can build a class that has the value of the function, f(x0,x1,...,xn), and the values of the derivatives, (df/dx0), (df/dx1), ..., (df/dxn) at (x0,x1,...,xn), as class members.

Now if we have another function, g(x0,x1,...,xn) and its differential is dg = (dg/dx0)*dx0 + (dg/dx1)*dx1 + ... + (dg/dxn)*dxn, since

    d(f+g) = df + dg, 
    d(f*g) = g*df + f*dg, 
    d(f/g) = df/g - fdg/g^2,
    dsin(f) = cos(f)df, 
    ...,

we can calculate

    d(f+g), d(f*g), ...,

based on our information on

    df/dx0, df/dx1, ..., dg/dx0, dg/dx1, ..., dg/dxn.

All we need to do is to define the operators and derivatives of common mathematical functions.

To be able to use the class as an automatic differentiator of a function object, we need a templated function object, i.e. an object with:

a template <class T> T operator()(const T)
or multiple variable input like: template <class T> T operator()(const Vector<T> &)
all variables and constants used in the calculation of the function value should have been typed with T

A simple example of such a function object could be:

	template <class T> f {
	public:
	  T operator()(const T &x, const T &a, const T &b) {
	  	return a*b*x; };
 	};
      // Instantiate the following versions:
	template class f<Double>;
	template class f<AutoDiff<Double> >;

A call with values will produce the function value:

	cout << f(7.0, 2.0, 3.0) << endl;
	// will produce the value at x=7 for a=2; b=3:
	42
	// But a call indicating that we want derivatives to a and b:
	cout << f(AutoDiff<Double>(7.0), AutoDiff<Double>(2.0, 2, 0),
		  AutoDiff<Double>(3.0, 2, 1)) << endl;
	// will produce the value at x=7 for a=2; b=3:
	// and the partial derivatives wrt a and b at x=7:
	(42, [21, 14])
	// The following will calculate the derivate wrt x:
	cout << f(AutoDiff<Double>(7.0, 1, 0), AutoDiff<Double>(2.0),
		  AutoDiff<Double>(3.0)) << endl;
	(42,[6])

In actual practice, there are a few rules to obey for the structure of the function object if you want to use the function object and its derivatives in least squares fitting procedures in the Fitting module. The major one is to view the function object having 'fixed' and 'variable' parameters. I.e., rather than viewing the function as depending on parameters a, b, x (f(a,b,x)), the function is considered to be f(x; a,b), where a, b are 'fixed' parameters, and x a variable parameter. Fixed parameters should be contained in a FunctionParam container object; while the variable parameter(s) are given in the function operator(). See Function class for details.

A Gaussian spectral profile would in general have the center frequency, the width and the amplitude as fixed parameters, and the frequency as a variable. Given a spectrum, you would solve for the fixed parameters, given spectrum values. However, in other cases the role of the parameters could be reversed. An example could be a whole stack of observed (in the laboratory) spectra at different temperatures at one frequency. In that case the width would be the variable parameter, and the frequency one of the fixed (and to be solved for)parameters.

Since the calculation of the derivatives is done with simple overloading, the calculation of second (and higher) derivatives is easy. It should be noted that higher deivatives are inefficient in the current incarnation (there is no knowledge e.g. about symmetry in the Jacobian). However, it is a very good way to get the correct answers of the derivatives. In practice actual production code will be better off with specialization of the f<AutoDiff<> > implementation.

The AutoDiff class is the class the user communicates with. Alias classes (AutoDiffA and AutoDiffX) exists to make it possible to have different incarnations of a templated method (e.g. a generic one and a specialized one). See the dAutoDiff demo for an example of its use.

All operators and functions are declared in AutoDiffMath. The output operator in AutoDiffIO. The actual structure of the data block used by AutoDiff is described in AutoDiffRep.

Example

    // First a simple example.
    // We have a function of the form f(x,y,z); and want to know the
    // value of the function for x=10; y=20; z=30; and for
    // the derivatives at those point.
    // Specify the values; and indicate 3 derivatives:
    	AutoDiff<Double> x(10.0, 3, 0); 
    	AutoDiff<Double> y(20.0, 3, 1); 
    	AutoDiff<Double> z(30.0, 3, 2);
    // The result will be:
    	AutoDiff<Double> result = x*y + sin(z);
    	cout << result.value() << endl;
    // 199.012
    	cout << result.derivatives() << endl;
    // [20, 10, 0.154251]
    // Note: sin(30) = -0.988; cos(30) =  0.154251;

See for an extensive example the demo program dAutoDiff. It is based on the example given above, and shows also the use of second derivatives (which is just using AutoDiff<AutoDiff<Double> > as template argument).

 	// The function, with fixed parameters a,b:
	template <class T> class f {
	public:
	  T operator()(const T& x) { return a_p*a_p*a_p*b_p*b_p*x; }
	  void set(const T& a, const T& b) { a_p = a; b_p = b; }
	private:
	  T a_p;
	  T b_p;
	};
	// Call it with different template arguments:
	  Double a0(2), b0(3), x0(7);
	  f<Double> f0; f0.set(a0, b0);
	  cout << "Value:     " << f0(x0) << endl;
	
	  AutoDiff<Double> a1(2,2,0), b1(3,2,1), x1(7);
	  f<AutoDiff<Double> > f1; f1.set(a1, b1);
	  cout << "Diff a,b:   " << f1(x1) << endl;
	
	  AutoDiff<Double> a2(2), b2(3), x2(7,1,0);
	  f<AutoDiff<Double> > f2; f2.set(a2, b2);
	  cout << "Diff x:     " << f2(x2) << endl;
	
	  AutoDiff<AutoDiff<Double> > a3(AutoDiff<Double>(2,2,0),2,0),
	    b3(AutoDiff<Double>(3,2,1),2,1), x3(AutoDiff<Double>(7),2);
	  f<AutoDiff<AutoDiff<Double> > > f3; f3.set(a3, b3);
	  cout << "Diff2 a,b:  " << f3(x3) << endl;
	
	  AutoDiff<AutoDiff<Double> > a4(AutoDiff<Double>(2),1),
	    b4(AutoDiff<Double>(3),1),
	    x4(AutoDiff<Double>(7,1,0),1,0);
	  f<AutoDiff<AutoDiff<Double> > > f4; f4.set(a4, b4);
	  cout << "Diff2 x:    " << f4(x4) << endl;

    // Result will be:
    // Value:     504
    // Diff a,b:  (504, [756, 336])
    // Diff x:    (504, [72])
    // Diff2 a,b: ((504, [756, 336]), [(756, [756, 504]), (336, [504, 112])])
    // Diff2 x:   ((504, [72]), [(72, [0])])

    // It needed the template instantiations definitions:
	template class f<Double>;
	template class f<AutoDiff<Double> >;
	template class f<AutoDiff<AutoDiff<Double> > >;

Motivation

The creation of the class was motivated by least-squares non-linear fit where partial derivatives of a fitted function are needed. It would be tedious to create functionals for all partial derivatives of a function.

Template Type Argument Requirements (T)

any class that has the standard mathematical and comparisons defined

To Do

Nothing I know

Member Description

AutoDiff()

Construct a constant with a value of zero. Zero derivatives.

AutoDiff(const T &v)

Construct a constant with a value of v. Zero derivatives.

AutoDiff(const T &v, const uInt ndiffs, const uInt n)

A function f(x0,x1,...,xn,...) with a value of v. The total number of derivatives is ndiffs, the nth derivative is one, and all others are zero.

AutoDiff(const T &v, const uInt ndiffs)

A function f(x0,x1,...,xn,...) with a value of v. The total number of derivatives is ndiffs. All derivatives are zero.

AutoDiff(const AutoDiff<T> &other)

Construct one from another

AutoDiff(const T &v, const Vector<T> &derivs)

Construct a function f(x0,x1,...,xn) of a value v and a vector of derivatives derivs(0) = df/dx0, derivs(1) = df/dx1, ...

~AutoDiff()

AutoDiff<T> &operator=(const T &v)

Assignment operator. Assign a constant to variable. All derivatives are zero.

AutoDiff<T> &operator=(const AutoDiff<T> &other)

Assign one to another.

void operator=(const AutoDiff<T> &other)
void operator/=(const AutoDiff<T> &other)
void operator+=(const AutoDiff<T> &other)
void operator-=(const AutoDiff<T> &other)
void operator=(const T other)
void operator/=(const T other)
void operator+=(const T other)
void operator-=(const T other)

Assignment operators

AutoDiffRep<T> theRep()
const AutoDiffRep<T> theRep() const

Returns the pointer to the structure of value and derivatives.

T &value()
const T &value() const

Returns the value of the function

Vector<T> derivatives() const
void derivatives(Vector<T> &res) const

Returns a vector of the derivatives of an AutoDiff

T &derivative(uInt which)
const T &derivative(uInt which) const
T &deriv(uInt which)
const T &deriv(uInt which) const

Returns a specific derivative. The second set does not check for a valid which; the first set does through Vector addressing.

uInt nDerivatives() const

Return total number of derivatives

Bool isConstant() const

Is it a constant, i.e., with zero derivatives?

const AutoDiff<T> &ref()

Indicate that we are going to use a temporary value for the last time (e.g. at the of a function returning by value). This way superfluous copying can be circumvented.

void release()

Release a struct of value and derivative data