Suppose we have a function f(x0,x1,...,xn) and its differential is
df = (df/dx0)*dx0 + (df/dx1)*dx1 + ... + (df/dxn)*dxnWe can build a class that has the value of the function, f(x0,x1,...,xn), and the values of the derivatives, (df/dx0), (df/dx1), ..., (df/dxn) at (x0,x1,...,xn), as class members.
Now if we have another function, g(x0,x1,...,xn) and its differential is dg = (dg/dx0)*dx0 + (dg/dx1)*dx1 + ... + (dg/dxn)*dxn, since
d(f+g) = df + dg, d(f*g) = g*df + f*dg, d(f/g) = df/g - fdg/g^2, dsin(f) = cos(f)df, ...,we can calculate
d(f+g), d(f*g), ...,based on our information on
df/dx0, df/dx1, ..., dg/dx0, dg/dx1, ..., dg/dxn.All we need to do is to define the operators and derivatives of common mathematical functions.
To be able to use the class as an automatic differentiator of a function object, we need a templated function object, i.e. an object with:
template <class T> f { public: T operator()(const T &x, const T &a, const T &b) { return a*b*x; }; }; // Instantiate the following versions: template class f<Double>; template class f<AutoDiff<Double> >;A call with values will produce the function value:
cout << f(7.0, 2.0, 3.0) << endl; // will produce the value at x=7 for a=2; b=3: 42 // But a call indicating that we want derivatives to a and b: cout << f(AutoDiff<Double>(7.0), AutoDiff<Double>(2.0, 2, 0), AutoDiff<Double>(3.0, 2, 1)) << endl; // will produce the value at x=7 for a=2; b=3: // and the partial derivatives wrt a and b at x=7: (42, [21, 14]) // The following will calculate the derivate wrt x: cout << f(AutoDiff<Double>(7.0, 1, 0), AutoDiff<Double>(2.0), AutoDiff<Double>(3.0)) << endl; (42,[6])In actual practice, there are a few rules to obey for the structure of the function object if you want to use the function object and its derivatives in least squares fitting procedures in the Fitting module. The major one is to view the function object having 'fixed' and 'variable' parameters. I.e., rather than viewing the function as depending on parameters a, b, x (f(a,b,x)), the function is considered to be f(x; a,b), where a, b are 'fixed' parameters, and x a variable parameter. Fixed parameters should be contained in a FunctionParam container object; while the variable parameter(s) are given in the function operator(). See Function class for details.
A Gaussian spectral profile would in general have the center frequency, the width and the amplitude as fixed parameters, and the frequency as a variable. Given a spectrum, you would solve for the fixed parameters, given spectrum values. However, in other cases the role of the parameters could be reversed. An example could be a whole stack of observed (in the laboratory) spectra at different temperatures at one frequency. In that case the width would be the variable parameter, and the frequency one of the fixed (and to be solved for)parameters.
Since the calculation of the derivatives is done with simple overloading, the calculation of second (and higher) derivatives is easy. It should be noted that higher deivatives are inefficient in the current incarnation (there is no knowledge e.g. about symmetry in the Jacobian). However, it is a very good way to get the correct answers of the derivatives. In practice actual production code will be better off with specialization of the f<AutoDiff<> > implementation.
The AutoDiff class is the class the user communicates with. Alias classes (AutoDiffA and AutoDiffX) exists to make it possible to have different incarnations of a templated method (e.g. a generic one and a specialized one). See the dAutoDiff demo for an example of its use.
All operators and functions are declared in AutoDiffMath. The output operator in AutoDiffIO. The actual structure of the data block used by AutoDiff is described in AutoDiffRep.
// First a simple example. // We have a function of the form f(x,y,z); and want to know the // value of the function for x=10; y=20; z=30; and for // the derivatives at those point. // Specify the values; and indicate 3 derivatives: AutoDiff<Double> x(10.0, 3, 0); AutoDiff<Double> y(20.0, 3, 1); AutoDiff<Double> z(30.0, 3, 2); // The result will be: AutoDiff<Double> result = x*y + sin(z); cout << result.value() << endl; // 199.012 cout << result.derivatives() << endl; // [20, 10, 0.154251] // Note: sin(30) = -0.988; cos(30) = 0.154251;
See for an extensive example the demo program dAutoDiff. It is based on the example given above, and shows also the use of second derivatives (which is just using AutoDiff<AutoDiff<Double> > as template argument).
// The function, with fixed parameters a,b: template <class T> class f { public: T operator()(const T& x) { return a_p*a_p*a_p*b_p*b_p*x; } void set(const T& a, const T& b) { a_p = a; b_p = b; } private: T a_p; T b_p; }; // Call it with different template arguments: Double a0(2), b0(3), x0(7); f<Double> f0; f0.set(a0, b0); cout << "Value: " << f0(x0) << endl; AutoDiff<Double> a1(2,2,0), b1(3,2,1), x1(7); f<AutoDiff<Double> > f1; f1.set(a1, b1); cout << "Diff a,b: " << f1(x1) << endl; AutoDiff<Double> a2(2), b2(3), x2(7,1,0); f<AutoDiff<Double> > f2; f2.set(a2, b2); cout << "Diff x: " << f2(x2) << endl; AutoDiff<AutoDiff<Double> > a3(AutoDiff<Double>(2,2,0),2,0), b3(AutoDiff<Double>(3,2,1),2,1), x3(AutoDiff<Double>(7),2); f<AutoDiff<AutoDiff<Double> > > f3; f3.set(a3, b3); cout << "Diff2 a,b: " << f3(x3) << endl; AutoDiff<AutoDiff<Double> > a4(AutoDiff<Double>(2),1), b4(AutoDiff<Double>(3),1), x4(AutoDiff<Double>(7,1,0),1,0); f<AutoDiff<AutoDiff<Double> > > f4; f4.set(a4, b4); cout << "Diff2 x: " << f4(x4) << endl; // Result will be: // Value: 504 // Diff a,b: (504, [756, 336]) // Diff x: (504, [72]) // Diff2 a,b: ((504, [756, 336]), [(756, [756, 504]), (336, [504, 112])]) // Diff2 x: ((504, [72]), [(72, [0])]) // It needed the template instantiations definitions: template class f<Double>; template class f<AutoDiff<Double> >; template class f<AutoDiff<AutoDiff<Double> > >;
Construct a constant with a value of zero. Zero derivatives.
Construct a constant with a value of v. Zero derivatives.
A function f(x0,x1,...,xn,...) with a value of v. The total number of derivatives is ndiffs, the nth derivative is one, and all others are zero.
A function f(x0,x1,...,xn,...) with a value of v. The total number of derivatives is ndiffs. All derivatives are zero.
Construct one from another
Construct a function f(x0,x1,...,xn) of a value v and a vector of derivatives derivs(0) = df/dx0, derivs(1) = df/dx1, ...
Assignment operator. Assign a constant to variable. All derivatives are zero.
Assign one to another.
Assignment operators
Returns the pointer to the structure of value and derivatives.
Returns the value of the function
Returns a vector of the derivatives of an AutoDiff
Returns a specific derivative. The second set does not check for a valid which; the first set does through Vector addressing.
Return total number of derivatives
Release a struct of value and derivative data