Next: Calibration as a service Up: Recommendations for the AIPS++ Telescope Model Previous: Summary of the recommended Imaging Model

Splitting Calibration

My discussion of the Imaging Model could be applied to any linear system of equations. Is calibration linear? I can distinguish three levels of difficulty in calibration:

Linear

Single dish calibration often involves simple ratios of averaged data points and often can reasonably be formulated as linear algebra.

Mildly non-linear

The ANTSOL approach to interferometer calibration is formally non-linear since it solves an equation like:

$$g_i g_j^{*} = X_{ij} \qquad (1)$$

However, in practice, this can be solved quite easily using simple gradient search techniques, since the gains $g_i$ are usually all of roughly the same scale.

Strongly non-linear

A nice example arises in global fringe fitting in VLBI where strange heuristics are needed to get an initial guess for the least-squares algorithm which is used to solve for antenna-based phases and phase derivatives. While the least-squares algorithm is mildly non-linear, obtaining the initial guess is tough and highly non-linear. Another example from VLBI is the fully general polarization case.

Hence my conclusion is that calibration can range from cases where linear algebra works, to cases where full non-linear optimization is required.
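To make the mildly non-linear case concrete, here is a minimal sketch of a gradient search for equation (1). It is not the ANTSOL code: the function name `solve_gains`, the damping factor `mu`, the step scaling, and the four test gains are all assumptions chosen for illustration. Each gain is moved along the descent direction of $\sum_{i \neq j} |X_{ij} - g_i g_j^{*}|^2$, with the step scaled by the summed power of the other gains.

```python
import cmath


def solve_gains(X, n_iter=200, mu=0.5):
    """Gradient search for g_i g_j^* = X_ij (hypothetical helper, not ANTSOL).

    X is a Hermitian matrix of baseline products given as a list of lists;
    the diagonal (autocorrelations) is ignored.
    """
    n = len(X)
    g = [1.0 + 0.0j] * n  # initial guess: all gains on the same scale
    for _ in range(n_iter):
        new_g = []
        for i in range(n):
            others = [j for j in range(n) if j != i]
            # Descent direction for g_i: sum_j (X_ij - g_i g_j^*) g_j
            direction = sum((X[i][j] - g[i] * g[j].conjugate()) * g[j]
                            for j in others)
            # Scale the step by the summed power of the other gains
            scale = sum(abs(g[j]) ** 2 for j in others)
            new_g.append(g[i] + mu * direction / scale)
        g = new_g
    return g


# Assumed test gains: amplitudes near one, modest phases (radians).
true_g = [cmath.rect(a, p)
          for a, p in [(1.0, 0.0), (0.9, 0.3), (1.1, -0.2), (0.95, 0.1)]]
X = [[gi * gj.conjugate() for gj in true_g] for gi in true_g]

g = solve_gains(X)
# The overall phase is arbitrary, so compare products rather than gains.
residual = max(abs(g[i] * g[j].conjugate() - X[i][j])
               for i in range(4) for j in range(4) if i != j)
print(residual)
```

The solution is only defined up to a common phase factor, so the check compares the products $g_i g_j^{*}$ with $X_{ij}$ rather than the gains themselves. Starting from unit gains works here because the true gains are all close to that scale; a case such as fringe fitting would need a better starting point.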

Extending the linear equations used in Imaging, we could say that the observed data D is a function of a set of unknown parameters P and some ``correct'' data $\widehat{D}$ . We then have that:

$$D = C(\widehat{D}, P) \qquad (2)$$

where C is now a general non-linear function. Given the parameters P, we can predict the data perfectly. The two interesting problems are both inverse problems. The first is deriving P from knowledge of the functional form of C, measurements of the data vector D, and a prediction of $\widehat{D}$, perhaps from the Imaging Model $\widehat{D} = AI$. The second is deriving $\widehat{D}$ for real observations.

Note that we could introduce the Imaging Model explicitly by writing, instead, a relation involving the image I and the matrix A:

$$D = C(AI, P) \qquad (3)$$

However, I think that this is not worthwhile at this point and so for the moment I will continue with writing $\widehat{D}$ for AI. I will return to this point later when I discuss algorithms in more detail.

Turning to the issue of how to solve equations of this type, we can define a $\chi^2$ term to be minimized in order to derive P:

$$\chi^2 = (C(\widehat{D}, P) - D)^T \, w \, (C(\widehat{D}, P) - D) \qquad (4)$$

where w is the inverse of the covariance matrix of the errors. If the errors are independent between data points, then w is diagonal with elements:

$$w_{i,i} = \frac{1}{\sigma_i^2} \qquad (5)$$

We consider an approach to solving for the P parameters based upon the idea of optimization: many iterative algorithms update an estimate of P based upon $\chi^2$ and its gradient with respect to P:

$$\frac{\partial \chi^2}{\partial P} = 2 \left(\frac{\partial C(\widehat{D}, P)}{\partial P}\right)^T w \, (C(\widehat{D}, P) - D) \qquad (6)$$
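The step from equation (4) to equation (6) can be written out as follows. The shorthand $r$ for the residual and $J$ for the matrix of first derivatives is introduced here only for illustration. The data are assumed to be real-valued and w symmetric, as it is for an inverse covariance matrix. The last line gives the second derivative, which is needed when discussing how much derivative information a solver can use.

```latex
\begin{align*}
r(P) &= C(\widehat{D}, P) - D, \qquad
J(P) = \frac{\partial C(\widehat{D}, P)}{\partial P}, \\
\chi^2 &= r^T w \, r, \\
\frac{\partial \chi^2}{\partial P}
  &= J^T w \, r + J^T w^T r = 2 \, J^T w \, r
  \quad (\text{since } w = w^T), \\
\frac{\partial^2 \chi^2}{\partial P_k \, \partial P_l}
  &= 2 \left( J^T w \, J \right)_{kl}
   + 2 \sum_i \frac{\partial^2 C_i}{\partial P_k \, \partial P_l} \, (w \, r)_i .
\end{align*}
```

For a linear function the second term of the last line vanishes and the second derivative is constant. For a general non-linear C it depends on both the residual and the curvature of C.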

As before, we can require the services of the Telescope Model to calculate this term:

1. Use a service predict to find $C(\widehat{D}, P)$,

2. Subtract D to get $C(\widehat{D}, P) - D$,

3. Use a service solve to get:

$$\left(\frac{\partial C(\widehat{D}, P)}{\partial P}\right)^T w \, (C(\widehat{D}, P) - D) \qquad (7)$$
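The three steps above can be sketched as follows. Everything here is an assumption made for illustration: the class `DriftGainModel`, its two-parameter form $C_i = a (1 + b t_i) \widehat{D}_i$, the test numbers, and the backtracking steepest-descent loop standing in for a generic solver. Only the names predict and solve come from the text.

```python
class DriftGainModel:
    """Hypothetical Telescope Model: C_i = a * (1 + b * t_i) * Dhat_i, P = (a, b)."""

    def __init__(self, dhat, times, sigmas):
        self.dhat = dhat
        self.times = times
        self.weights = [1.0 / s ** 2 for s in sigmas]  # diagonal w, eq. (5)

    def predict(self, p):
        """Step 1: return C(Dhat, P)."""
        a, b = p
        return [a * (1.0 + b * t) * d for t, d in zip(self.times, self.dhat)]

    def solve(self, p, residual):
        """Step 3: return (dC/dP)^T w residual, eq. (7)."""
        a, b = p
        rows = list(zip(self.times, self.dhat, self.weights, residual))
        grad_a = sum((1.0 + b * t) * d * w * r for t, d, w, r in rows)
        grad_b = sum(a * t * d * w * r for t, d, w, r in rows)
        return [grad_a, grad_b]


def chi2(model, p, data):
    """Equation (4) for a diagonal weight matrix."""
    return sum(w * (c - d) ** 2
               for w, c, d in zip(model.weights, model.predict(p), data))


def steepest_descent(model, data, p0, n_iter=200):
    """Generic solver that sees the model only through predict and solve."""
    p = list(p0)
    current = chi2(model, p, data)
    for _ in range(n_iter):
        residual = [c - d for c, d in zip(model.predict(p), data)]  # steps 1, 2
        grad = [2.0 * g for g in model.solve(p, residual)]          # eq. (6)
        step = 1.0
        while step > 1e-12:  # halve the step until chi^2 decreases
            trial = [pi - step * gi for pi, gi in zip(p, grad)]
            trial_chi2 = chi2(model, trial, data)
            if trial_chi2 < current:
                p, current = trial, trial_chi2
                break
            step *= 0.5
    return p, current


# Assumed test setup: noise-free data generated with a = 2.0, b = 0.1.
model = DriftGainModel(dhat=[1.0, 2.0, 3.0, 4.0],
                       times=[0.0, 1.0, 2.0, 3.0],
                       sigmas=[0.5, 0.5, 1.0, 1.0])
data = model.predict([2.0, 0.1])
p0 = [1.0, 0.0]
p_fit, chi2_fit = steepest_descent(model, data, p0)
print(chi2(model, p0, data), chi2_fit, p_fit)
```

The solver never looks inside the model, which is the abstraction under discussion. Even in this tiny case the step size needs safeguarding and progress along the poorly constrained direction is slow.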

So far, this is all quite obvious, but is it helpful? One could imagine feeding this Telescope Model to a non-linear least-squares Solver in the same way that the Imaging Model was fed to an Imager. I can think of several objections to this scheme.

1. Non-linear optimization is much trickier than linear optimization, and so it often pays to use heuristics to help the solution along. Another way of seeing that this must be true is to note that for a general function $C(\widehat{D}, P)$ all the derivatives will be important, whereas for the linear functions used in the Imaging Model the terms beyond the second derivative $A^T A$ are irrelevant (actually zero).
2. Non-linear least-squares problems are often feasible only if the initial guess is sufficiently close to the true global minimum (since then derivatives higher than the second can be neglected). Often, however, getting an initial guess is the hardest step. An example in VLBI global fringe fitting was described above.
3. Another difference from the Imaging Model is in the dimensionality of P. In the Imaging Model, there may be $10^5$ to $10^7$ pixels, and so A could have up to $10^{14}$ elements. In calibration, the number of free parameters could be very small (even as small as one for an amplitude scale), and so one could imagine using second derivative information. This was assumed to be impossible for Imaging, but it may be vital in some particularly difficult cases of calibration.
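As a worked illustration of the last point, take the one-parameter case of a single amplitude scale. The form $C(\widehat{D}, a) = a \widehat{D}$, real-valued data, and diagonal w are assumptions made here for simplicity. The second derivative is then a single number and costs almost nothing to compute:

```latex
\begin{align*}
\chi^2(a) &= \sum_i w_{i,i} \, (a \widehat{D}_i - D_i)^2, \\
\frac{d \chi^2}{d a} &= 2 \sum_i w_{i,i} \, \widehat{D}_i \, (a \widehat{D}_i - D_i), \qquad
\frac{d^2 \chi^2}{d a^2} = 2 \sum_i w_{i,i} \, \widehat{D}_i^2, \\
a_1 &= a_0 - \frac{d\chi^2 / da}{d^2\chi^2 / da^2} \bigg|_{a_0}
     = \frac{\sum_i w_{i,i} \, \widehat{D}_i D_i}{\sum_i w_{i,i} \, \widehat{D}_i^2} .
\end{align*}
```

Because this $\chi^2$ is exactly quadratic in a, one Newton step from any starting value $a_0$ reaches the minimum. In a non-linear case with a few parameters the same second-derivative information gives only a local quadratic model, but it remains cheap to form.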

The overwhelming conclusion is that the scheme I proposed for the Imaging Model cannot easily be extended to non-linear functions such as those found in calibration. The answer to the first question (``can we abstract the calibration of Telescopes into an equation part and a solver part?'') is No.
