Version 1.9 Build 1556

Next: Image-Plane Representations Up: The Dwingeloo `UVCI' Design AIPS++ Implementation Note 181 Previous: The Design in Brief

MeasurementSet Classes

The MeasurementSet is the primary repository of instrumental data inside AIPS++. It is organized as one big Table, with all data appearing in columns of the table, and meta-information (e.g., units) attached as keywords to the table.

Data Organization

No organization of the data can be optimal for all circumstances. The major principle we adopted for organizing the MeasurementSet is to keep on one row of the table all the data that are most likely to be aggregated in order to apply common corrections. The principal result of this is that the fundamental data array has shape n_Stokes x n_channels. That is, each row contains the correlations for all polarisations and channels for a given integration for a single combination of antenna-pair, feed-pair, and spectral window.
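
The row layout described above can be pictured with a small sketch. It is illustrative only: the column names, the keyword name, and the dict-of-lists representation are assumptions made for this example, not the names defined in MeasurementSet.h or the AIPS++ Table API.

```python
def make_table(n_rows, n_stokes, n_channels):
    """Build a toy 'big table': columns hold per-row values,
    keywords hold meta-information such as units.
    All column and keyword names here are hypothetical."""
    return {
        "columns": {
            "TIME": [0.0] * n_rows,
            "ANTENNA_PAIR": [(0, 1)] * n_rows,
            "FEED_PAIR": [(0, 0)] * n_rows,
            "SPECTRAL_WINDOW": [0] * n_rows,
            # One n_Stokes x n_channels complex matrix per row.
            "DATA": [[[0j] * n_channels for _ in range(n_stokes)]
                     for _ in range(n_rows)],
        },
        "keywords": {"TIME_UNIT": "s"},
    }


def data_shape(table, row):
    """Return (n_Stokes, n_channels) for the data matrix on one row."""
    cell = table["columns"]["DATA"][row]
    return (len(cell), len(cell[0]))


table = make_table(n_rows=3, n_stokes=4, n_channels=16)
print(data_shape(table, 0))
```

Because every polarisation and channel of one integration sits in a single cell, a correction common to all of them only needs to touch one row.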

It is important to note that the MeasurementSet is a single table, not a collection of tables. Other software systems have tended to separate things which vary at different rates (per observation, per source, per integration etc.) into different tables to prevent data bloat (caused by repeating constant values for several or many rows). Besides the navigational difficulties in finding the value that corresponds to the current row in another table, a more fundamental problem is that values can vary at different rates for different instruments (and observing modes), and hence finding a particular value might require a moderately complicated runtime lookup.

The columns which are common among astronomical instruments, or unique to synthesis radiotelescopes, are listed in the file MeasurementSet.h (currently to be found in the AIPS++ system as /aips++/code/trial/implement/MeasurementSet/MeasurementSet.h). These columns include items like the data matrix, `coordinates' (time, uvw, spectral window, field, etc.), data quality measures (flags and noise estimates) and telescope specific items (system temperatures, weather data, etc.).

Any column name or keyword which is not otherwise reserved may be used by a telescope as it wishes (e.g. to record monitor data).²

Many of the above columns are either constant, or at least slowly varying if the MeasurementSet is stored in its ``natural'' order (increasing time, baselines on consecutive rows, etc.). This could cause tremendous data bloat if the constant values were repeated. A Table optimization, the Miriad storage manager, has been implemented to prevent this. (Likewise in FITS, the Single Dish convention allows a keyword to masquerade as a constant column.) If we wish to preserve the ``Big Table'' view of data, an approach such as this is necessary.
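
One way such an optimization can avoid repeating constant values is to store a value only on the row where it changes. The sketch below shows that general idea. The class name and its methods are hypothetical, and nothing here is taken from the actual storage manager's implementation.

```python
from bisect import bisect_right


class IncrementalColumn:
    """Toy column that stores a value only on the rows where it changes."""

    def __init__(self):
        self._start_rows = []  # first row of each run of equal values
        self._values = []      # the value held throughout that run
        self._n_rows = 0

    def append(self, value):
        # Open a new run only when the value differs from the previous row.
        if not self._values or self._values[-1] != value:
            self._start_rows.append(self._n_rows)
            self._values.append(value)
        self._n_rows += 1

    def __len__(self):
        return self._n_rows

    def __getitem__(self, row):
        if not 0 <= row < self._n_rows:
            raise IndexError(row)
        # Find the last run that starts at or before `row`.
        return self._values[bisect_right(self._start_rows, row) - 1]

    def stored_values(self):
        """Number of values physically kept, as opposed to logical rows."""
        return len(self._values)
```

To the user the column still looks like one value per row, which is what preserves the ``Big Table'' view; only the physical storage is compressed.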

One modest problem with the ``Big Table'' view of data is that miscellaneous monitor data (e.g., wind speed) might not be sampled at the times when the instrument emits a row. To date we just ``grid'' such data onto the nearest row. This solution seems adequate so long as we are not interested in miscellaneous data that vary more rapidly than rows are produced and that cannot be averaged.
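
A minimal sketch of nearest-row gridding follows. It assumes the row times are non-empty and sorted in increasing order and that ``nearest'' means nearest in time. The function names and the rule that a later sample overwrites an earlier one on the same row are choices made for this example only.

```python
from bisect import bisect_left


def nearest_row(row_times, t):
    """Index of the row whose time is closest to t.
    Assumes row_times is non-empty and sorted ascending."""
    i = bisect_left(row_times, t)
    if i == 0:
        return 0
    if i == len(row_times):
        return len(row_times) - 1
    # Pick the closer neighbour; ties go to the earlier row.
    return i - 1 if t - row_times[i - 1] <= row_times[i] - t else i


def grid_monitor(row_times, samples):
    """Attach each (time, value) monitor sample to its nearest row.
    Rows that receive no sample stay None; a later sample overwrites
    an earlier one that landed on the same row."""
    column = [None] * len(row_times)
    for t, value in samples:
        column[nearest_row(row_times, t)] = value
    return column
```

The overwrite rule makes the limitation in the text concrete: if the monitor value changes faster than rows are produced, all but one sample per row is lost unless the samples can be averaged instead.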

Concatenating two synthesis radiotelescope MeasurementSets should be as simple as taking the columns which they have in common, and appending one onto the other (every MeasurementSet for a synthesis telescope is required to have the minimal set of aperture synthesis columns). This concatenation could be done via a virtual table, or it could be done by physically appending the data. Concatenating a synthesis radiotelescope MS and a Single Dish MS would be a more complicated operation requiring addition of (virtual) columns.
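
The physical-append variant can be sketched as follows. Each MeasurementSet is modelled as a plain dict mapping a column name to a list of per-row values. The `required` tuple stands in for the minimal set of aperture synthesis columns, and its contents here are placeholders, not the real list.

```python
def concatenate(ms_a, ms_b, required=("TIME", "DATA")):
    """Append ms_b to ms_a, keeping only the columns both share.

    `required` is a hypothetical stand-in for the minimal column set
    that every synthesis MeasurementSet must carry.
    """
    for name in required:
        if name not in ms_a or name not in ms_b:
            raise ValueError(f"missing required column {name!r}")
    # Keep the column order of the first MeasurementSet.
    common = [name for name in ms_a if name in ms_b]
    return {name: list(ms_a[name]) + list(ms_b[name]) for name in common}


a = {"TIME": [0.0, 1.0], "DATA": ["a0", "a1"], "WIND_SPEED": [3.0, 4.0]}
b = {"TIME": [2.0], "DATA": ["b0"], "TSYS": [50.0]}
print(concatenate(a, b))
```

Telescope-specific columns that appear in only one input (WIND_SPEED and TSYS above) are dropped, which is why the synthesis-plus-Single-Dish case needs extra (virtual) columns instead.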

Adequate coordinate (``Measure'') classes are sorely missed; they would greatly clean up many of the current column definitions. Likewise, the Unit (Quantum) class should be integrated.

Iteration and Aggregation

As we have seen, the MeasurementSet is organized at a moderate level of granularity. While a row isn't ``atomic'' -- each row contains a data matrix (channels and polarizations) -- it also only contains a single time, antenna-pair, feed-pair, spectral-window, etc.

Often it is useful to aggregate data together in a larger chunk. The SynthesisMSIterator was created for this purpose. This class basically fulfills the mandate of the ``VisSet'' in [Gle94].

This class is used to iterate through a MeasurementSet in two main orders:

1. Iterate through all spectral windows before advancing to the next FIELD_ID.
2. Iterate through all the fields before going to the next spectral window.
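
The two orders amount to swapping the major and minor sort keys. The sketch below assumes rows are plain dicts with hypothetical `field_id` and `spectral_window` entries. It illustrates the ordering only and is not the iterator's interface.

```python
from itertools import groupby


def iterate(rows, order="field_major"):
    """Yield ((field_id, spectral_window), chunk) in one of two orders.

    field_major:  all spectral windows of a field before the next field.
    window_major: all fields of a spectral window before the next window.
    """
    if order == "field_major":
        def key(r):
            return (r["field_id"], r["spectral_window"])
    elif order == "window_major":
        def key(r):
            return (r["spectral_window"], r["field_id"])
    else:
        raise ValueError(order)
    # sorted() is stable, so rows keep their original relative order
    # inside each chunk.
    for k, chunk in groupby(sorted(rows, key=key), key=key):
        ids = k if order == "field_major" else (k[1], k[0])
        yield ids, list(chunk)
```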

Usually a MeasurementSet will be filled in time order, but at present this is not required. The iterator does not enforce time order. Besides presentation of selected data that results from the present iteration and aggregation, this class has convenience functions for, e.g. setting the desired Stokes type, and returning UVW in wavelengths.
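
As an illustration of the UVW convenience function, the conversion divides a baseline length by the observing wavelength, i.e. multiplies it by frequency over the speed of light. The sketch assumes UVW is stored in metres and the frequency is given in Hz. The function name is hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s (exact, by definition of the metre)


def uvw_in_wavelengths(uvw_metres, frequency_hz):
    """Convert a (u, v, w) triple from metres to wavelengths.
    Assumes the input is in metres and the frequency in Hz."""
    scale = frequency_hz / SPEED_OF_LIGHT  # 1 / wavelength, in 1/m
    return tuple(component * scale for component in uvw_metres)
```

Since each row holds many channels, such a function would in practice be evaluated per channel frequency, or at a reference frequency for the spectral window.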

It seems to one of us (BEG) that the SynMSIterator might be a natural abstract base class, as it might be a natural point to, e.g., inject simulated data into an imaging algorithm. That is, we could create an abstract base class, derive SynMSIterator from it, and recast arguments from the MeasurementSet to this new class where possible. This base class would be one very much like the VisSet.
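
The proposal can be sketched with an abstract base class and two implementations, one backed by stored rows and one producing simulated data. All class and function names below are hypothetical, and the ``algorithm'' is a trivial stand-in that only sums visibilities.

```python
from abc import ABC, abstractmethod


class VisibilitySource(ABC):
    """Hypothetical base class: anything that can hand chunks of
    visibilities to an algorithm."""

    @abstractmethod
    def chunks(self):
        """Yield lists of complex visibilities."""


class TableSource(VisibilitySource):
    """Serves visibilities that already exist in a table (here, a list)."""

    def __init__(self, rows, chunk_size):
        self._rows = list(rows)
        self._chunk_size = chunk_size

    def chunks(self):
        for start in range(0, len(self._rows), self._chunk_size):
            yield self._rows[start:start + self._chunk_size]


class SimulatedSource(VisibilitySource):
    """Serves constant visibilities, as a point source at the phase
    centre would produce."""

    def __init__(self, n_chunks, chunk_size, flux):
        self._n_chunks = n_chunks
        self._chunk_size = chunk_size
        self._flux = flux

    def chunks(self):
        for _ in range(self._n_chunks):
            yield [complex(self._flux, 0.0)] * self._chunk_size


def total_flux(source):
    """Stand-in for an imaging algorithm: it depends only on the base
    class, so real and simulated data are interchangeable."""
    return sum(sum(chunk) for chunk in source.chunks())
```

The point of the design is visible in `total_flux`: it takes the base class as its argument, so simulated data can be injected without the algorithm knowing.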

The reason the VisSet has disappeared is that in the previous implementation it turned out to be an `empty shell' with the VisSetIterator doing all the work and accessing the VisSet internals. To allow simulation of both calibration and imaging we will need to produce at least the full set of required columns for synthesis arrays. The present MSSimulator class produces a filled MeasurementSet Table on disk from a number of parameter files.

Visibility

A ``Visibility'' object has been defined which consists of the complex data matrix, time, antenna numbers, feed numbers, spectral window, calibration group, Stokes type, feed type, polarization response, and the parallactic angle of the antenna (the polarization response is the 'fixed' response for pa=0).

This is a convenient aggregation of values found on each row of a synthesis radiotelescope MeasurementSet.³ It will also play an important role in correction of synthesis MeasurementSets, described later.
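
As a rough picture of this aggregation, the sketch below gathers the listed items from one row of a column-oriented table into a single record. The field names and types are guesses made for illustration; the text lists the members but does not give their declarations.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple


@dataclass
class Visibility:
    """Hypothetical record holding the items listed in the text."""
    data: List[List[complex]]              # complex data matrix
    time: float
    antennas: Tuple[int, int]              # antenna numbers
    feeds: Tuple[int, int]                 # feed numbers
    spectral_window: int
    calibration_group: int
    stokes_type: Tuple[str, ...]
    feed_type: str
    polarization_response: List[List[complex]]  # 'fixed' response for pa=0
    parallactic_angle: float


def visibility_from_row(columns, row):
    """Gather one row of a dict-of-lists table into a Visibility.
    Columns that are not Visibility fields are ignored."""
    return Visibility(**{f.name: columns[f.name][row]
                         for f in fields(Visibility)})
```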

Default Corrector

Every MeasurementSet may have a default corrector associated with it (this corrector may be a corrector-sequence). When instantiated, a MeasurementSet always presents its raw values; the default correction, if any, may however readily be applied.

Presently the MeasurementSet only has a single default corrector. It is anticipated that this will be extended to allow for versioning.

This functionality is not fundamental -- it is a convenience built on top of the save and restore methods of the Corrector class. Details on the operation of Correctors are supplied later in this paper.
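
The behaviour described in this subsection (raw values by default, an optional default corrector that may itself be a sequence) can be sketched as below. Every name is hypothetical, and the sketch does not model the save and restore methods of the Corrector class or any versioning.

```python
class Gain:
    """Toy corrector: multiplies every value by a complex factor."""

    def __init__(self, factor):
        self.factor = factor

    def apply(self, values):
        return [self.factor * v for v in values]


class CorrectorSequence:
    """Toy corrector-sequence: applies its members in order."""

    def __init__(self, correctors):
        self.correctors = list(correctors)

    def apply(self, values):
        for corrector in self.correctors:
            values = corrector.apply(values)
        return values


class ToyMeasurementSet:
    """Holds raw data plus an optional default corrector."""

    def __init__(self, raw, default_corrector=None):
        self._raw = list(raw)
        self.default_corrector = default_corrector

    def data(self, corrected=False):
        # Raw values are what you get unless correction is requested.
        if corrected and self.default_corrector is not None:
            return self.default_corrector.apply(self._raw)
        return list(self._raw)
```

Because a sequence exposes the same `apply` method as a single corrector, the holder of the default corrector does not need to distinguish between the two cases.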
