AIPS++ Consortium User Specifications

AIPS++ Consortium Development Group
07 April 1992

(AIPS++ User Specifications Memo 115)
HTML version: 07 March 1995, 16:15:36 EST
URL: http://aips2.nrao.edu/aips++/docs/specs/115.html

Purpose

This document merges and summarizes the User Specifications produced by astronomers and programmers at each of the seven observatories in the AIPS++ consortium.

Contents

  1. Introduction
  2. General Characteristics of AIPS++
  3. Data
  4. Specific Requirements
  5. Conclusions
  6. References

1. Introduction

AIPS++ is an acronym for the Astronomical Information Processing System that is being designed and implemented by a consortium of seven radio astronomy institutions:

AIPS++ is intended to replace the AIPS (Astronomical Image Processing System) with a more modern, more extensive, and more extensible software system.

This document is mainly based upon the User Specification documents prepared by each member of the consortium, with some use of other written contributions to the User Specifications Memo series. "Distillation documents", written by the consortium members participating in the initial six-month design phase in Charlottesville, have been extensively used in the preparation of this document.


2. General Characteristics of AIPS++

2.1. Guiding principles

These specifications describe the capabilities needed in AIPS++ by astronomers who use telescopes operated by members of the consortium. We attempt to avoid expressing opinions on how such capabilities should be implemented. However, because AIPS++ should be optimized for the astronomer user, we do specify some aspects of the user interface that we consider essential.

AIPS++ must anticipate a wide range of experience within its user community. Both the user interface and the off-line documentation must address the disparate needs of novice (or occasional) users and of experienced users who may be analyzing technically demanding observations. To match the needs of users with a wide range of experience, a hierarchy of interfaces and documentation will be essential. Users will also need a hierarchy of programmability. At the lowest level of experience, this should allow them to connect major (and sometimes repetitive) steps in data processing conveniently. At the highest level, an efficient interface is needed to encourage development of new, experimental, algorithms and processing techniques.

The following principles are important in the design and implementation of AIPS++:

2.2. Scientific goals

AIPS has been an acronym for "Astronomical Image Processing System"; however, its capabilities, and users' requirements, have evolved far beyond image plane processing. AIPS++ should now be a general tool for turning telescope data, and model calculations, into scientific results. In some cases, e.g., graphics and tables, the results should be in publishable form. Most n-dimensional images are produced only as an intermediate step between raw data and useful results; however, some constitute final scientific results and require reproduction in publishable form. A similar range of purposes has evolved for single dish data in systems such as UniPOPS. The concept for AIPS++ should be that of an Astronomical Information Processing System.

In specifying a new software system, it is useful to consider what aspects of astronomical data processing have remained stable over the last 15 or so years. The most stable parts of array and single-dish processing systems are the fundamental descriptions of telescope data. For the major arrays, the basic description has accumulated more attributes (e.g. spectral channels, IFs) but it is still fundamentally a visibility data set -- samples of a spatial coherence function in some convenient spatial or temporal order. Similarly, a final image is still an array of calibrated pixel intensities in a known coordinate system, polarization, and observing frequency. The "end results" are scientifically meaningful quantities extracted from one or more such images, and published visual representations of these images. Key, and probably stable, basic ingredients of a user specification are therefore the types of data to be handled (e.g. visibility data, single dish spectra, images, image cubes).

Basic operations on parts of data sets, such as Fourier transforms, least squares fitting algorithms, plotting, display, mathematical, and other standard functions are also relatively stable. We will call these basic operations tools. The second ingredient needed by users is an itemized tool kit of basic operations from which more complex astronomical applications can be assembled.

In contrast, the algorithms used to calibrate, construct and interpret data sets and images evolve as the astronomical community acquires experience and sophistication in data and image analysis techniques. The algorithms are the least stable elements of present software. They continually evolve or are replaced (either as explicit programs or as informal procedures that may involve astronomer interaction). The algorithms are embodied in tasks which can be implemented either as specific programs in a language such as C++ or as scripts in a higher level language. Many of the tasks that are now part of the lexicon of astronomical image processing will be embodied in AIPS++ at an early stage. The tools in the kit provided by AIPS++ must, however, be easily usable by astronomers to carry out new tasks whose nature and scope may evolve rapidly with time.

In these terms, the core of AIPS++ must provide a generic toolbox operating on specific data types. Given the finite resources available, the limitations of AIPS++ should lie in the diversity of data that can be handled rather than in what can be done to these data.

2.3. General attributes of AIPS++

AIPS++ should have a good command line interface with "full" programming capability. This should be at a level that eliminates, for most astronomers, the need to write FORTRAN or C++ programs. We view the issue of who will be able to develop applications programs as one of the most important issues for the future. "Full programming" capabilities within the AIPS++ "command language" are very important; however, the use of C++ and FORTRAN "template" programs that can be run "with" AIPS++ is also important. In addition, the current plan to have many astronomers doing C++/OOP programming for AIPS++ will require special attention to astronomer-oriented documentation, programming guides, and possibly things like programming "summer schools". Assuming that everyone can learn what they need from industry-wide material aimed at professional programmers is unwise, and is likely to limit the AIPS++ pool of developers to too small a group with too little astronomical experience.

Documentation for AIPS++ should be available both on-line and in hard copy. This should have multiple levels ranging from simple "help" to extensive information, and dealing with both specific applications and individual parameters. Consistency between hard copy and on-line documentation is imperative. Multi-window environments, as mentioned above, should allow context-sensitive information to be displayed by "clicking" on appropriate items. While the implementation aspects of a UNIX "man" page might be useful, the displayed information should be easily understandable to user-astronomers.

Multiple levels of user interface would be desirable to allow for both novice users and experienced experts. User selection of the style of interaction and the range of "packages" to be used should be possible. Choice of the user interface should have no effect on the code used in processing.

Styles of user interface are difficult to decide upon, and are very dependent upon user experience and preference. The discussion in Wood (1991) is an example of a useful approach to the user interface that goes into details we have not discussed here. We recommend planning a number of available styles, and extensive user testing of each of them during early phases of AIPS++ development, as opposed to deciding upon one approach and precluding all others. The idea that the user interface is just another applications task, that can take many forms, is probably very important in planning for the future with a wide range of user needs and expertise.

A combination of the inclusion of single dish data reduction as part of the domain of AIPS++, and the increased use of "nearly real-time" data processing and remote observing for both single dish observing and arrays, makes the use of AIPS++ as an integral part of the observing process very important. This should not change or add to the processing and display needs of AIPS++, but rather adds to the richness of the tools that can be used to support the users' involvement in the observing process. In addition, the post-processing tools needed by instrumental staff to maintain their instruments have great commonality with the things a knowledgeable astronomer would like to see and do during the observing process. The observer would like:

In addition to the use of normal AIPS++ processing tasks, this list of needs makes it necessary for preparation and changing of observing programs to be immediately possible. Indeed, the preparation of observing programs may become one of the extended tasks of AIPS++ for some instruments.

The simulation of data produced by real instruments, based upon assumed models of sources, is an additional capability that is essential for AIPS++. This should be viewed as a necessary part of the testing of AIPS++ applications software (both for de-bugging and evaluating efficiency of processing), and as a tool for the astronomer that provides both more realistic preparation for observing and the necessary tools to compare models and data in AIPS++.


3. Data

3.1. The general nature of the data for AIPS++

The data will come principally from radio telescopes although AIPS++ must allow import of images and data from other wavelengths. The primary data types that are needed to support the AT, BIMA, EVN, GMRT, MERLIN, VLA, VLBA, WSRT, the future mmA, and the various instrument packages on the JCMT, GBT, the 12m and the 43m are as follows:

  1. Telescope status information
  2. Total power and phased array data sequences reflecting switched or time series observations
  3. Spectra
  4. Images
    1. Planar images at radio, optical, X-ray, etc. wavelengths
    2. Spectral cubes - images in multi-spectral regions
    3. Time cubes - time-ordered images of variable sources
  5. Coherence function (visibility) data from correlation arrays
    1. arrays with real-time delay and phase variation correction
    2. tape recording arrays where the correlator output is coherence function data for a range of time lags (or transformed frequencies)
  6. Calibration tables
  7. Data editing information
  8. Computed models
  9. Processing histories

Some of these data categories are naturally associated with each other; it is also important to be able to group some together when appropriate, e.g. in mosaicing observations. Some of these data types are either super-sets or sub-sets of others; it is important to be able to compose super-sets out of sub-sets and to decompose super-sets into their sub-sets.

It is important that the astronomer have access at all stages of data processing to the conditions under which an observation was made, and to what has been done to it in the data processing. The ability to wipe the slate clean has proven its utility over and over again in many data processing systems. Hence the database should carry both telescope-provided status information and a processing history in formats that make it easy to "start over" if processing goes awry. This supplementary information begins with data structures with telescope information as a function of time, position, or other data identifiers such as telescope name, latitude, etc. It continues with data processing history sufficient to understand, un-do, and re-do that processing.
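As an illustration only, the requirement to understand, un-do, and re-do processing can be pictured as a history log kept alongside the untouched original data. The following Python sketch is not an AIPS++ design; the names `ProcessingHistory`, `Step`, `scale`, and `flag_above` are hypothetical and chosen for this example, and real data would of course be far richer than a list of numbers.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    """One recorded processing step: a label, the operation, and its parameters."""
    name: str
    operation: Callable[..., List[Any]]
    params: Dict[str, Any]


@dataclass
class ProcessingHistory:
    """Hypothetical history log; the original data are never modified."""
    original: List[Any]
    steps: List[Step] = field(default_factory=list)
    undone: List[Step] = field(default_factory=list)

    def apply(self, name: str, operation, **params) -> None:
        """Record a new step; steps that were undone earlier are discarded."""
        self.steps.append(Step(name, operation, params))
        self.undone.clear()

    def undo(self) -> None:
        """Remove the most recent step, keeping it available for redo()."""
        if self.steps:
            self.undone.append(self.steps.pop())

    def redo(self) -> None:
        """Restore the most recently undone step."""
        if self.undone:
            self.steps.append(self.undone.pop())

    def start_over(self) -> None:
        """Wipe the slate clean: forget every processing step."""
        self.steps.clear()
        self.undone.clear()

    def current(self) -> List[Any]:
        """Replay the recorded steps, in order, on a copy of the original data."""
        data = list(self.original)
        for step in self.steps:
            data = step.operation(data, **step.params)
        return data


def scale(data, factor):
    """Example operation (assumed): multiply every sample by a gain factor."""
    return [x * factor for x in data]


def flag_above(data, limit):
    """Example operation (assumed): replace samples above `limit` with None."""
    return [x if x <= limit else None for x in data]
```

Because the current state is always recomputed by replaying the recorded steps on the original data, "starting over" amounts to emptying the list of steps. A production system would probably store intermediate results rather than replaying everything each time.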

Another view of the data relates to different uses and time scales of use. These uses lead to three major categories of software: on-line data analysis; system support software for staff operating and diagnosing the operation of the instrument; and observers' analysis software. For single telescopes the observer has often done a major fraction of data analysis at the telescope as part of the observing process. Recent hardware and networking developments have made such on-line data analysis feasible even for high data rate instruments like the large arrays and single dishes with fast sampling spectral processors. Most telescopes have, or soon will have, full remote and local analysis capability. In all these systems, data should be accessible to the user as soon as practicable, in nearly real-time. For all these reasons, data analysis software in the AIPS++ system should provide for the needs of the three categories of software mentioned above.

Different single dishes and arrays have different approaches to data handling in which similar words mean different things. These "cultural" differences in language must be directly addressed, and data descriptions and terminologies that are consistent at all telescope and development sites must be created and maintained.

3.2. User-oriented data organization

From the point of view of the user the highest level identification of the problem is what we will call a "project". Projects are aimed at obtaining answers to scientific questions. Answers to these scientific questions frequently involve obtaining data from a variety of telescopes. Some projects require radio data from both single dish and array observations from the same or different instruments, each serving a different "purpose". Observations for each instrument are organized into observing "runs" with sequences of "scans" with identical instrumental and observing parameters. Each scan contains "sub-scans" with data elements in the form of spectra, time instances of coherence function data or spectra, etc., that are associated with instances of time. Astronomers need to deal with this hierarchy of data: project, purpose, instrument, observing run, scans, and sub-scans. It would be very helpful if the astronomer could be aided in dealing with things according to this hierarchy. Data that are viewed as simple sequences of data from stand-alone telescopes leave the astronomer to impose a mental image of project/instrument/purposes and then runs/scans on the simple data elements. The future mmA will be a case where the same instrument will generate both single dish and coherence function data sets. This makes it a prime example where the same instrument will serve diverse instrumental purposes for a wide variety of "projects".
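The hierarchy of project, purpose, instrument, observing run, scan, and sub-scan can be sketched as a simple tree. The Python fragment below is a minimal illustration under our own assumptions, not a proposed AIPS++ data structure: the `Node` class, the `LEVELS` list, and the strict one-level-per-step nesting are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple

# Levels of the user-oriented hierarchy, from highest to lowest.
LEVELS = ["project", "purpose", "instrument", "run", "scan", "sub-scan"]


@dataclass
class Node:
    """One entry in the hierarchy; only sub-scans are expected to carry data."""
    level: str
    name: str
    children: List["Node"] = field(default_factory=list)
    data: Optional[List[float]] = None

    def add(self, name: str, data: Optional[List[float]] = None) -> "Node":
        """Create and return a child one level further down the hierarchy."""
        index = LEVELS.index(self.level)
        if index + 1 >= len(LEVELS):
            raise ValueError("sub-scans are the lowest level of the hierarchy")
        child = Node(LEVELS[index + 1], name, data=data)
        self.children.append(child)
        return child

    def walk(self, path: Tuple[str, ...] = ()) -> Iterator[Tuple[Tuple[str, ...], "Node"]]:
        """Yield (path, node) for every sub-scan at or below this node."""
        path = path + (self.name,)
        if self.level == LEVELS[-1]:
            yield path, self
        for child in self.children:
            yield from child.walk(path)


# Usage with made-up names: every sub-scan keeps its full context.
project = Node("project", "example project")
scan = (project.add("total flux")
               .add("single dish")
               .add("run 1")
               .add("scan 1"))
scan.add("t=0", data=[1.0, 1.1])
scan.add("t=1", data=[0.9, 1.2])
paths = [path for path, _ in project.walk()]
```

Each sub-scan is reached through a path that names its project, purpose, instrument, run, and scan, so the astronomer does not have to reconstruct that context mentally from a flat sequence of data elements.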

In this document we list preparation for observing as an AIPS++ task. This is partly because simulation, using AIPS++ processing tools, can be very useful in understanding an observing program during the planning and preparation process. In addition, it is at this stage that the user imposes the logic of project/instrument/purposes/runs/scans on the observing process, and this logic must be remembered and used as part of the data reduction and processing. If tools were available in AIPS++ to aid the user in passing on and using this logic all the way through data processing, it would be very helpful. It would be analogous to having and updating the map of a maze that can be used while passing through the maze. Data processing is very much like a maze to be negotiated for most astronomers, and assistance in dealing with the higher level purposes of data would be very useful.

The above can be described more technically by saying that data sets should have a hierarchy of descriptor (or "header") items, with descriptor items being identified by context information (such as name, position, etc., for images). These data descriptors should allow specification ranging from very large, merged data sets to basic elements like pixels or u-v data points. It should be possible to eliminate redundancy by describing information on a sufficiently high level while allowing exceptions by overriding this information at a lower level; that is, mixtures of positive and negative data/information specifications.
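One possible reading of such a descriptor hierarchy is a chain of lookups in which an item is stated once at a high level and overridden only where an exception applies. The sketch below is an assumption-laden illustration in Python; the `Descriptor` class and the item names (`telescope`, `integration_s`, `source`) are hypothetical and not taken from any consortium specification.

```python
from typing import Any, Optional


class Descriptor:
    """Hypothetical header level: items set here override those of the parent."""

    def __init__(self, parent: Optional["Descriptor"] = None, **items: Any) -> None:
        self.parent = parent
        self.items = dict(items)

    def get(self, key: str) -> Any:
        """Return the item from the lowest level that defines it."""
        node: Optional[Descriptor] = self
        while node is not None:
            if key in node.items:
                return node.items[key]
            node = node.parent
        raise KeyError(key)


# Items are stated once for the whole data set ...
dataset = Descriptor(telescope="example array", integration_s=10.0)
# ... inherited unchanged by a scan ...
scan = Descriptor(parent=dataset, source="calibrator")
# ... and overridden only for the one data point that is an exception.
point = Descriptor(parent=scan, integration_s=5.0)
```

Here `point.get("telescope")` is resolved at the data set level, while `point.get("integration_s")` is resolved at the data point itself, so no redundant copies of the common items are stored.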


4. Specific Requirements

This section contains long lists of brief descriptions of the elements of the user requirements from all the consortium specifications. It is based on a merger of the distillation of specifications by AIPS++ working groups in the areas of "User Interfaces", "UV Data System and Processing Requirements", and "Image Handling" -- and the original material from individual consortium user requirements.

4.1. User interfaces

4.1.1. General

4.1.2. Command line interface

4.1.3. Graphical user interface

The basic graphical interface should be the primary user interface for users in 1994. It should be the most attractive one for most AIPS++ users, and maybe even for experts. It should be a window-oriented graphical interface with pull-down menus (for application selection and parameter specification), multiple windows, and pop-up menus for context sensitive help. Menus for application selection and parameter specification should have pop-up sub-menus with options/parameters depending on menu context.

It would be desirable to have an advanced GUI with visual programming of applications, using icons/glyphs for individual "tool" components and connecting lines for passing of data. Sequences of graphical task and data flow connections should be capable of being saved, edited, and retrieved.

4.1.4. User documentation

Documentation for AIPS++ must be a planned part of the AIPS++ development. It should be:

4.2. Data handling

4.2.1. General considerations

As discussed earlier, a large fraction of the data processing in AIPS++ can be described as "data handling". These data can be images, single dish data sets, coherence function data sets, telescope performance data, model data, or any data set that can be imported into the system.

4.2.2. Data import and export

4.3. Nature of instrumental data

4.3.1. General

4.3.2. Single dish and summed/phased array data

4.3.3. Interferometer data

Interferometer data should be regarded as having potentially inhomogeneous antenna properties, but this should not preclude dealing with simplified cases where homogeneity can be assumed.

4.3.4. VLBI data

4.3.5. Mosaicing data

4.4. Data correction and calibration

4.4.1. General

4.4.2. Single dish and summed/phased array data

4.4.3. Interferometer data

4.4.4. Mosaic total power and interferometer data

4.4.5. Additional considerations for VLBI data

4.4.6. Data editing

4.5. Imaging and image processing

In this section we consider the formation of images from edited, calibrated data. While this is mainly image computation and deconvolution, it must be remembered that, for the user, imaging and image deconvolution are an integral part of the process of data inspection/editing, calibration, imaging, self-calibration, data/image display, spectrum/time/image analysis, and production of hard copy for publication purposes. This process must be well integrated for the convenience of the user. It should be possible to easily "mix-and-match" self-calibration, data transformation, and deconvolution "tools", for example, using CLEAN to deconvolve in the early stages, and maximum entropy later on when CLEAN begins to be less useful. This is related to the need to make self-calibration use a generic model, which could be a table of CLEAN-components, a table of Gaussian components, or an image.
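To make the idea of a generic model concrete, the following Python sketch shows three model types that all answer the same question: what visibility does the model predict at a given (u, v)? It is an illustration under stated assumptions rather than a specification. The class names, the `predict` function, the sign convention of the Fourier phase, the restriction to circular Gaussians, and the treatment of each image pixel as a point component are all our own simplifications; u and v are assumed to be in wavelengths and sky offsets l and m in radians.

```python
import cmath
import math
from typing import Iterable, List, Protocol, Sequence, Tuple


class SkyModel(Protocol):
    """Anything that can predict a visibility can serve as a calibration model."""

    def visibility(self, u: float, v: float) -> complex:
        ...


def _phase(u: float, v: float, l: float, m: float) -> complex:
    # Phase term for a component offset by (l, m); the sign is an assumed convention.
    return cmath.exp(-2j * math.pi * (u * l + v * m))


class PointComponents:
    """Table of point components (flux, l, m), e.g. as produced by CLEAN."""

    def __init__(self, components: Iterable[Tuple[float, float, float]]) -> None:
        self.components = list(components)

    def visibility(self, u: float, v: float) -> complex:
        return sum((flux * _phase(u, v, l, m) for flux, l, m in self.components), 0j)


class GaussianComponents:
    """Table of circular Gaussian components (flux, l, m, fwhm)."""

    def __init__(self, components: Iterable[Tuple[float, float, float, float]]) -> None:
        self.components = list(components)

    def visibility(self, u: float, v: float) -> complex:
        total = 0j
        for flux, l, m, fwhm in self.components:
            # Fourier transform of a circular Gaussian of the given FWHM.
            taper = math.exp(-(math.pi * fwhm) ** 2 * (u * u + v * v)
                             / (4.0 * math.log(2.0)))
            total += flux * taper * _phase(u, v, l, m)
        return total


class PixelImage:
    """Image given as rows of pixel fluxes; each pixel is treated as a point."""

    def __init__(self, pixels: Sequence[Sequence[float]], cell: float) -> None:
        self.pixels = pixels
        self.cell = cell  # pixel spacing in radians; pixel [0][0] is at l = m = 0

    def visibility(self, u: float, v: float) -> complex:
        total = 0j
        for j, row in enumerate(self.pixels):
            for i, flux in enumerate(row):
                total += flux * _phase(u, v, i * self.cell, j * self.cell)
        return total


def predict(model: SkyModel, uv_points: Iterable[Tuple[float, float]]) -> List[complex]:
    """Model visibilities for a list of (u, v) points, whatever the model type."""
    return [model.visibility(u, v) for u, v in uv_points]
```

A self-calibration step written against `predict` would not need to know which kind of model it was given, which is what allows the deconvolution method to be changed between iterations.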

4.5.1. Image and spectral image formation

4.5.2. Image transformations

In this section we list some of the image-specific transformations of data that are very general operations. Many image transformations are basically transformations of images as arrays of numbers, so we include these operations in the upcoming sections on data "structure" transformation.

4.5.3. Data "structure" transformations

Data "structures" are assumed to be 1-, 2-, 3- (or n-) dimensional aggregations of data values. Included are tabular data structures that are special two-dimensional arrays which may have different (numerical) data contents for each "column". Most of the requirements in this section should apply to any type of data, and where the operations are meaningful only for certain data types this will be noted.

4.6. Image analysis

Many of the major functions of image analysis have already been discussed under the general category of data structure transformations and analysis. This is an area of applications that is highly dependent upon astronomer specification of the needs for a particular problem. For this reason tools for this analysis, and programmability by the astronomer, are most important. A few cases that illustrate advanced problems are the following.

The extraction of information from data cubes is one of the most important, but computationally (and visually) difficult, areas of image analysis. The visualization problem in general requires both special hardware and flexible analysis software. The relation of most data cubes to spectroscopy, and the importance of radiative transfer to spectroscopy, present a basic need for the astronomer to analyze data in an environment where it is possible to compute and compare models derived from spectral radiative transfer. Since this cannot be viewed as the job of any instrumental support group, the versatile programming of computational tools is the most important thing for the astronomer who has reached this stage of "image analysis".

In addition to spectroscopy, image analysis and comparison for some problems in the future will require dynamical gas/fluid computations. For example, analyzing HI or molecular line images of galaxies as snapshots of "fluids" means the divergence, curl, and Laplacian of vector fields must be calculated to study continuity, vorticity, and viscosity.
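As a small illustration of the kind of tool this implies, the differential operators can be approximated on a regular grid by central differences. The Python sketch below is only an example under simple assumptions (a two-dimensional field sampled on a square grid with spacing `h`, values returned at interior points only); the function names are hypothetical, and the Laplacian is applied to one scalar field or vector component at a time.

```python
from typing import Callable, List

Grid = List[List[float]]


def grid(func: Callable[[float, float], float], n: int, h: float) -> Grid:
    """Sample func(x, y) on an n-by-n grid; result[j][i] is at x = i*h, y = j*h."""
    return [[func(i * h, j * h) for i in range(n)] for j in range(n)]


def divergence(vx: Grid, vy: Grid, h: float) -> Grid:
    """d(vx)/dx + d(vy)/dy at interior points (relevant to continuity)."""
    ny, nx = len(vx), len(vx[0])
    return [[(vx[j][i + 1] - vx[j][i - 1] + vy[j + 1][i] - vy[j - 1][i]) / (2 * h)
             for i in range(1, nx - 1)]
            for j in range(1, ny - 1)]


def curl_z(vx: Grid, vy: Grid, h: float) -> Grid:
    """z component of the curl, d(vy)/dx - d(vx)/dy (relevant to vorticity)."""
    ny, nx = len(vx), len(vx[0])
    return [[(vy[j][i + 1] - vy[j][i - 1] - vx[j + 1][i] + vx[j - 1][i]) / (2 * h)
             for i in range(1, nx - 1)]
            for j in range(1, ny - 1)]


def laplacian(f: Grid, h: float) -> Grid:
    """Five-point Laplacian of one scalar field or one vector component."""
    ny, nx = len(f), len(f[0])
    return [[(f[j][i + 1] + f[j][i - 1] + f[j + 1][i] + f[j - 1][i] - 4 * f[j][i])
             / (h * h)
             for i in range(1, nx - 1)]
            for j in range(1, ny - 1)]
```

For a uniformly expanding field (vx = x, vy = y) the divergence is 2 and the curl is 0 everywhere, while for solid-body rotation (vx = -y, vy = x) the curl is 2 and the divergence is 0. These simple cases are useful checks before applying such operators to observed velocity fields, where noise and blanked pixels need additional care.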

Moving source problems, particularly the difficult case of solar imaging, require special modeling or data corrections. Rotation and registration of images taken at different times and locations, and with different instruments, require special treatments dependent upon the scientific problem at hand, which is usually in the solar system domain involving the Sun, planets, asteroids, comets, etc.

4.7. Data display and recording

By data display we mean listings, plots, and "pictures" that are useful in examining data and results derived from data. By recording we mean hard copy of these data displays. The form of data display depends on the user interface. The form of data recording may depend upon printer and other hardware, so output files should be as device-independent as possible, with separate production of device-dependent files.

4.7.1. General

4.7.2. Spectral domain

4.7.3. Time domain

4.7.4. Images (including spectral)


5. Conclusions

The preparation of specifications for AIPS++ involves two major complications.

The first is that the instruments, and the types of measurements for which they are designed, are diverse and complicated at levels of detail that are important for many applications. However, this can be handled by careful attention to detail and a judicious balance between generality and those details. In this document we have dealt mainly with generalities, so technical details needed in the implementation of software for specific instrumentation must be dealt with elsewhere.

The second, and most severe, problem is that while it may be (relatively) easy to specify what we need for the science of the past and present, the most important needs are for the scientific problems of instruments and astronomers in the future. Careful consideration of what this means for a software system leads to one general conclusion: availability of computational tools, user-programmability of these tools, and easy transferability of data between software systems are the most important capabilities that one can have for the future of any scientific software system.


6. References

ATNF Staff, 1991, ATNF AIPS++ User Specifications, AIPS++ User Specifications Memo 106.

BIMA 1991, AIPS++ User Specifications: BIMA Version, AIPS++ User Specifications Memo 108.

Cornwell, T.J. 1990, (ed.), Final Report of the Software Advisory Group (SWAG), AIPS++ User Specifications Memo 102. (available only in paper form)

DRAO 1991, DRAO User Requirements -- AIPS++, AIPS++ User Specifications Memo 111.

Foster, R., Haynes, M., Heyer, M., Jewell, P., Maddalena, R.J., Matthews, H., Reich, W., and Salter, C. 1991, Requirements for Data Analysis Software for the Green Bank Telescope, GBT Memo 72.

GMRT group, 1992, GMRT Requirements Documents, AIPS++ User Specifications Memo 114. (available only in paper form)

Hjellming, R.M. 1991, Miscellaneous Suggestions for AIPS++, AIPS++ User Specifications Memo 107.

Hjellming, R.M., Bridle, A.H., Maddalena, R.J., Wood, D.O.S., Zensus, J.A. and Westpfahl, D.J. 1991, AIPS++ User Specifications: An Initial NRAO-Oriented Version, AIPS++ User Specifications Memo 105.

Liszt, H.S. 1992, A Single-Dish Data Handling Environment for AIPS++, AIPS++ User Specifications Memo 113. (available only in paper form)

Noordam, J.E. 1991, (ed.) Dutch Requirements for AIPS++, AIPS++ User Specifications Memo 112.

Shone, D.L. 1992, (ed.) Jodrell Bank User Requirements for AIPS++, AIPS++ User Specifications Memo 110.

Wood, D.O.S. 1991, The AIPS++ User Interface, AIPS++ User Specifications Memo 104. (available only in paper form)


Copyright © 1992,1995,2002 Associated Universities Inc., Washington, D.C.


abridle@nrao.edu