AIPS++ Consortium User Specifications
AIPS++ Consortium Development Group
07 April 1992
(AIPS++ User Specifications Memo 115)
HTML version: 07 March 1995, 16:15:36 EST
URL: http://aips2.nrao.edu/aips++/docs/specs/115.html
Purpose
This document merges and summarizes the User
Specifications produced by astronomers and programmers at each of
the seven observatories in the AIPS++ consortium.
Contents
- Introduction
- General Characteristics of AIPS++
- Data
- Specific Requirements
- Conclusions
- References
Introduction
AIPS++ is an acronym for the Astronomical Information
Processing System that is being designed and implemented by a
consortium of seven radio astronomy institutions:
- the Australia Telescope National Facility (ATNF),
the Herzberg Institute of Astrophysics (HIA) through the Dominion
Radio Astrophysical Observatory (DRAO),
- the National Radio Astronomy Observatory (NRAO),
- the Netherlands Foundation for Research in Astronomy (NFRA),
- the Nuffield Radio Astronomy Laboratory (NRAL),
- the Tata Institute of Fundamental Research (TIFR) through the
National Centre for Radio Astrophysics (NCRA) with GMRT headquarters at
Pune, and
- the Berkeley-Illinois-Maryland Association (BIMA).
AIPS++ is intended to replace the AIPS (Astronomical Image Processing System) with a more modern,
more extensive, and more extensible software system.
This document is mainly based upon the User Specification
documents prepared by each member of the consortium, with some use
of other written contributions to the User Specifications Memo series.
"Distillation documents'', written by the consortium members
participating in the initial six months design phase in Charlottesville,
have been extensively used in the preparation of this document.
These specifications describe the capabilities needed in AIPS++ by
astronomers who use telescopes operated by members of the consortium.
We attempt to avoid expressing opinions on how such capabilities should
be implemented. However, because AIPS++ should be optimized for the
astronomer user, we do specify some aspects of the user interface that
we consider essential.
AIPS++ must anticipate a wide range of experience within its user
community. Both the user interface and the off-line documentation must
address the disparate needs of novice (or occasional) users and of
experienced users who may be analyzing technically demanding
observations. To match the needs of users with a wide range of
experience, a hierarchy of interfaces and documentation will be
essential. Users will also need a hierarchy of programmability. At the
lowest level of experience, this should allow them to connect major (and
sometimes repetitive) steps in data processing conveniently. At the
highest level, an efficient interface is needed to encourage development
of new, experimental algorithms and processing techniques.
The following principles are important in the design and
implementation of AIPS++:
- Accountability -- Data should have associated
telescope performance ("monitor") and processing histories so their
origins and evolution can be easily reviewed and understood by
astronomers. Unnecessary structures should not be imposed on data, and
data should be accessible in both "raw" and any modified forms.
However, it should be possible to have very flexible selection of data
sub-sets.
- Astronomical terminology and concepts -- Names
and labels in the data processing system should use the common language
of astronomy and mathematics.
- Programmability -- It should be easy for users to
prepare data processing "scripts" for repetitive or multi-stage
processing, and to augment the system easily with new operations and
algorithms.
- Easy Customization -- The user should be able to
flexibly select data processing packages to be used, the style of user
interface, and environmental parameters such as directory names and
output devices.
- Hiding complexity -- Where possible the
complexity of algorithms or multi-step processing should be hidden from
the novice user.
- Confidence in results -- Data processing
diagnostics and the capability to un-do and re-do steps in the
processing are essential for telescope data so the user can understand
and have confidence in the data at any processing stage.
- Range of processing styles -- the same software
should be usable for both post-observing processing and remote or local
on-line data analysis.
- Future types of processing -- the design should
allow for future growth in network computing with use of remote
displays, remote batch processing, and parallel processing across
different machines or within machines with parallel architectures.
General Characteristics of AIPS++
AIPS has been an acronym for "Astronomical Image Processing System";
however, its capabilities, and users' requirements, have evolved far
beyond image plane processing. AIPS++ should now be a general tool for
turning telescope data, and model calculations, into scientific results.
In some cases, e.g., graphics and tables, the results should be in
publishable form. Most n-dimensional images are produced only as an
intermediate step between raw data and useful results; however, some
constitute final scientific results and require reproduction in
publishable form. A similar range of purposes has evolved for single
dish data in systems such as
UniPOPS. The concept for AIPS++ should be that of an
Astronomical
Information Processing System.
In specifying a new
software system, it is useful to consider what aspects of astronomical
data processing have remained stable over the last 15 or so years. The
most stable parts of array and single-dish processing systems are the
fundamental descriptions of telescope data. For the major
arrays, the basic description has accumulated more attributes (e.g.
spectral channels, IFs) but it is still fundamentally a visibility data
set -- samples of a spatial coherence function in some convenient
spatial or temporal order. Similarly, a final image is still an array
of calibrated pixel intensities in a known coordinate system,
polarization, and observing frequency. The "end results" are
scientifically meaningful quantities extracted from one or more such
images, and published visual representations of these images. Key, and
probably stable, basic ingredients of a user specification are therefore
the types of data to be handled (e.g. visibility data,
single dish spectra, images, image cubes).
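For reference, the stable description invoked here is the standard
narrow-field relation between the measured visibility and the sky
brightness, with A(l,m) the primary beam, I(l,m) the sky brightness,
(u,v) the baseline in wavelengths, and (l,m) direction cosines:

    V(u,v) = \int\!\!\int A(l,m)\, I(l,m)\,
             e^{-2\pi i(ul+vm)}\, \frac{dl\,dm}{\sqrt{1-l^2-m^2}}

The data system must carry sampled values of V in whatever spatial or
temporal order is convenient, without imposing unnecessary structure.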
Basic operations on
parts of data sets, such as Fourier transforms, least squares fitting
algorithms, plotting, display, mathematical, and other standard
functions are also relatively stable. We will call these basic
operations tools. The second ingredient needed by users is
an itemized tool kit of basic operations from which more complex
astronomical applications can be assembled.
In contrast, the algorithms used to calibrate, construct and
interpret data sets and images evolve as the astronomical community
acquires experience and sophistication in data and image analysis
techniques. The algorithms are the least stable elements of present
software. They continually evolve or are replaced (either as explicit
programs or as informal procedures that may involve astronomer
interaction). The algorithms are embodied in tasks which can
be implemented either as specific programs in a language such as C++ or
as scripts in a higher level language. Many of the tasks that are now
part of the lexicon of astronomical image processing will be embodied in
AIPS++ at an early stage. The tools in the kit provided by AIPS++ must,
however, be easily usable by astronomers to carry out new tasks whose
nature and scope may evolve rapidly with time.
In these terms, the core of AIPS++ must provide a generic toolbox
operating on specific data types. Given the finite resources available,
the limitations of AIPS++ should be more in the diversity of data that
can be handled rather than in what can be done to these data.
AIPS++ should have a good command line interface with "full"
programming capability. This should be at a level that eliminates, for
most astronomers, the need to write FORTRAN or C++ programs. We view
the issue of who will be able to develop applications programs as one of
the most important issues for the future. "Full programming"
capabilities with the AIPS++ "command language" is very important;
however, the use of C++ and FORTRAN "template" programs that can be run
"with" AIPS++ is also important. In addition, the current plan to have
many astronomers doing C++/OOP programming for AIPS++ will require
special attention to astronomer-oriented documentation, programming
guides, and possibly things like programming "summer schools". Assuming
that everyone can learn what they need from industry-wide material aimed
at professional programmers is unwise, and is likely to limit the AIPS++
pool of developers to too small a group with too little astronomical
experience.
Documentation for AIPS++ should be available both on-line and in
hard copy. This should have multiple levels ranging from simple "help"
to extensive information, and dealing with both specific applications
and individual parameters. Consistency between hard copy and on-line
documentation is imperative. Multi-window environments, as mentioned
above, should allow context-sensitive information to be displayed by
"clicking" on appropriate items. While the implementation aspects of a
UNIX "man" page might be useful, the displayed information should be
easily understandable to user-astronomers.
Multiple levels of user interface would be desirable to allow for
both novice users and experienced experts. User selection of the style
of interaction and the range of "packages" to be used should be
possible. Choice of the user interface should have no effect on the
code used in processing.
Styles of user interface are difficult to decide upon, and are very
dependent upon user experience and preference. The discussion in
Wood (1991) is an example of a useful approach
to the user interface that goes into details we have not discussed here.
We recommend planning a number of available styles, and extensive user
testing of each of them during early phases of AIPS++ development, as
opposed to deciding upon one approach and precluding all others. The
idea that the user interface is just another applications task, that can
take many forms, is probably very important in planning for the future
with a wide range of user needs and expertise.
A combination of the inclusion of single dish data reduction as part
of the domain of AIPS++, and the increased use of "nearly real-time"
data processing and remote observing for both single dish observing and
arrays, makes the use of AIPS++ as an integral part of the observing
process very important. This should not change the fundamental
processing and display needs of AIPS++, but it does add to the richness
of the tools that can be used to support the users' involvement in the
observing process. In addition, the post-processing tools needed by
instrumental staff to maintain their instruments have great commonality
with the things a knowledgeable astronomer would like to see and do
during the observing process. The observer would like:
- capability to see instrumental status data both at the telescope
and on remote display devices connected to the telescope by networks;
- automatic first order data editing and calibration where possible;
- as much immediate data processing and display as feasible in
real-time;
- to be able to make changes in the observing program during
observing "runs"; and
- to be able to record data processed in "nearly real-time" on
transportable media for further processing.
In addition to the use of normal AIPS++ processing tasks, this
list of needs makes it necessary for preparation and changing of
observing programs to be immediately possible. Indeed, the preparation
of observing programs may become one of the extended tasks of AIPS++ for
some instruments.
The simulation of data produced by real instruments, based upon
assumed models of sources, is an additional capability that is essential
for AIPS++. This should be viewed as a necessary part of the testing of
AIPS++ applications software (both for de-bugging and evaluating
efficiency of processing), and as a tool for the astronomer that
provides both more realistic preparation for observing and the necessary
tools to compare models and data in AIPS++.
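As an illustration of the simplest kind of simulation this implies, the
sketch below computes model visibilities for a few point components and
adds Gaussian thermal noise. It is written in C++ (the planned AIPS++
implementation language) in a modern idiom for brevity; all names are
illustrative, not actual AIPS++ interfaces.

    #include <cmath>
    #include <complex>
    #include <cstdio>
    #include <random>
    #include <vector>

    const double PI = 3.141592653589793;

    struct UV        { double u, v; };        // baseline, in wavelengths
    struct Component { double flux, l, m; };  // flux (Jy), direction cosines

    // Model visibility of a set of point components at one (u,v) sample:
    // V(u,v) = sum_k S_k exp(-2 pi i (u l_k + v m_k))
    std::complex<double> modelVis(const UV& uv,
                                  const std::vector<Component>& model) {
        std::complex<double> vis(0.0, 0.0);
        for (const Component& c : model) {
            double phase = -2.0 * PI * (uv.u * c.l + uv.v * c.m);
            vis += c.flux * std::complex<double>(std::cos(phase),
                                                 std::sin(phase));
        }
        return vis;
    }

    int main() {
        // A 1 Jy source at the field center plus a 0.3 Jy companion.
        std::vector<Component> model = { {1.0, 0.0, 0.0},
                                         {0.3, 1.0e-4, -2.0e-4} };
        std::mt19937 rng(42);
        std::normal_distribution<double> noise(0.0, 0.01); // thermal noise
        for (int k = 1; k <= 5; ++k) {
            UV uv = { 1000.0 * k, 500.0 * k };
            std::complex<double> v = modelVis(uv, model);
            v += std::complex<double>(noise(rng), noise(rng));
            std::printf("u=%8.1f v=%8.1f  Re=%8.5f Im=%8.5f\n",
                        uv.u, uv.v, v.real(), v.imag());
        }
        return 0;
    }

The same machinery, given realistic error models (pointing, atmosphere,
antenna surfaces), serves both application testing and observing
preparation.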
Data
The data will come principally from radio telescopes although AIPS++
must allow import of images and data from other wavelengths. The
primary data types that are needed to support the AT, BIMA, EVN, GMRT,
MERLIN, VLA, VLBA, WSRT, the future mmA, and the various instrument
packages on the JCMT, GBT, the 12m and the 43m are as follows:
- Telescope status information
- Total power and phased array data sequences reflecting switched or
time series observations
- Spectra
- Images
- Planar images at radio, optical, X-ray, etc. wavelengths
- Spectral cubes - images in multi-spectral regions
- Time cubes - time-ordered images of variable sources
- Coherence function (visibility) data from correlation arrays
- arrays with real-time delay and phase variation correction
- tape recording arrays where the correlator output is coherence
function data for a range of time lags (or transformed frequencies)
- Calibration tables
- Data editing information
- Computed models
- Processing histories
Some of these data categories are naturally associated with
each other; it is also important to be able to group some together when
appropriate, e.g. in mosaicing observations. Some of these data types
are either super-sets or sub-sets of others; it is important to be able
to compose super-sets out of sub-sets and to decompose super-sets into
their sub-sets.
It is important that the astronomer have access at all stages of
data processing to the conditions under which an observation was made,
and to what has been done to it in the data processing. The ability to
wipe the slate clean has proven its utility over and over again in many
data processing systems. Hence the database should carry both
telescope-provided status information and a processing history in
formats that make it easy to "start over" if processing goes awry. This
supplementary information begins with data structures with telescope
information as a function of time, position, or other data identifiers
such as telescope name, latitude, etc. It continues with data
processing history sufficient to understand, un-do, and re-do that
processing.
Another view of the data relates to different uses and time scales
of use. These uses lead to three major categories of software: on-line
data analysis; system support software for staff operating and
diagnosing the operation of the instrument; and observers' analysis
software. For single telescopes the observer has often done a major
fraction of data analysis at the telescope as part of the observing process.
Recent hardware and networking developments have made such on-line data
analysis feasible even for high data rate instruments like the large
arrays and single dishes with fast sampling spectral processors. Most
telescopes have, or soon will have, full remote and local analysis
capability. In all these systems, data should be accessible to the user as
soon as practicable in nearly real-time. For all these reasons data
analysis software in the AIPS++ system should provide for the needs of
the above-mentioned three categories of software.
Different single dishes and arrays have different approaches to
data handling in which similar words mean different things. These
"cultural" differences in language must be directly addressed, and data
descriptions and terminologies that are consistent at all telescope and
development sites must be created and maintained.
From the point of view of the user the highest level identification
of the problem is what we will call a "project". Projects are aimed at
obtaining answers to scientific questions. Answers to these scientific
questions frequently involve obtaining data from a variety of
telescopes. Some projects require radio data from both single dish and
array observations from the same or different instruments, each serving
a different "purpose". Observations for each instrument are organized
into observing "runs" with sequences of "scans" with identical
instrumental and observing parameters. Each scan contains "sub-scans"
with data elements in the form of spectra, time instances of coherence
function data or spectra, etc., that are associated with instances of
time. Astronomers need to deal with this hierarchy of data: project,
purpose, instrument, observing run, scans, and sub-scans. It would be
very helpful if the astronomer could be aided in dealing with things
according to this hierarchy. Data that are viewed as simple sequences
of data from stand-alone telescopes leave the astronomer to impose a
mental image of project/instrument/purposes and then runs/scans on the
simple data elements. The future mmA will be a case where the same
instrument will generate both single dish and coherence function data
sets. This makes it a prime example where the same instrument will
serve diverse instrumental purposes for a wide variety of "projects".
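A minimal sketch of how this hierarchy might be represented follows
(hypothetical type names in modern C++, not a proposed AIPS++ design):

    #include <string>
    #include <vector>

    // One data element: a spectrum, or a time instance of coherence data.
    struct SubScan { double time; std::vector<double> data; };

    // A scan: sub-scans taken with identical instrumental and observing
    // parameters.
    struct Scan { std::string source; std::vector<SubScan> subScans; };

    // An observing run on one instrument, serving one "purpose".
    struct Run { std::string instrument, purpose; std::vector<Scan> scans; };

    // The project: the scientific question, possibly spanning instruments.
    struct Project { std::string title; std::vector<Run> runs; };

    int main() {
        Project p;
        p.title = "HI in nearby galaxies";           // hypothetical example
        p.runs.push_back(Run{"VLA", "imaging", {}});
        return 0;
    }

Tools that select, summarize, or process data could then navigate by
these levels instead of leaving the astronomer to impose the hierarchy
mentally.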
In this document we list preparation for observing as an AIPS++
task. This is partly because simulation, using AIPS++ processing tools,
can be very useful in understanding an observing program during the
planning and preparation process. In addition, it is at this stage that
the user imposes the logic of project/instrument/purposes/runs/scans on
the observing process, and this logic must be remembered and used as
part of the data reduction and processing. If tools were available in
AIPS++ to aid the user in passing on and using this logic all the way
through data processing, it would be very helpful. It would be
analogous to having and updating the map of a maze that can be used
while passing through the maze. Data processing is very much like a
maze to be negotiated for most astronomers, and assistance in dealing
with the higher level purposes of data would be very useful.
The above can be described more technically by saying that data sets
should have a hierarchy of descriptor (or "header") items, with
descriptor items being identified by context information (such as name,
position, etc., for images). These data descriptors should allow
specification ranging from very large, merged data sets down to basic elements
like pixels or u-v data points. It should be possible to eliminate
redundancy by describing information on a sufficiently high level while
allowing exceptions by overriding this information at a lower level;
that is, mixtures of positive and negative data/information
specifications.
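One minimal way such override semantics could work is sketched below
(hypothetical names, not an actual AIPS++ interface): each descriptor
set holds its own items plus a pointer to the more general level, and
lookups fall back upward, so a value stated once at a high level applies
everywhere unless restated below it.

    #include <cstdio>
    #include <map>
    #include <string>

    struct Descriptors {
        std::map<std::string, std::string> items; // keyword -> value here
        const Descriptors* parent = nullptr;      // more general level

        // Local entries win; otherwise inherit from the higher level.
        const std::string* find(const std::string& key) const {
            auto it = items.find(key);
            if (it != items.end()) return &it->second;
            return parent ? parent->find(key) : nullptr;
        }
    };

    int main() {
        Descriptors dataset;                      // high level: stated once
        dataset.items["TELESCOPE"] = "VLA";
        dataset.items["POL"] = "RR";

        Descriptors subScan;                      // low level: the exception
        subScan.parent = &dataset;
        subScan.items["POL"] = "LL";              // overrides inherited value

        std::printf("TELESCOPE=%s POL=%s\n",
                    subScan.find("TELESCOPE")->c_str(),   // inherited: VLA
                    subScan.find("POL")->c_str());        // overridden: LL
        return 0;
    }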
Specific Requirements
This section contains long lists of brief descriptions of the
elements of the user requirements from all the consortium
specifications. It is based on a merger of the distillation of
specifications by AIPS++ working groups in the areas of "User
Interfaces", "UV Data System and Processing Requirements", and "Image
Handling" -- and the original material from
individual consortium user requirements.
- The user should be able to choose one of a variety of user
interface styles. The most important are:
- an AIPS++ command line interface (CLI) which is a programmable
interpreter;
- a basic graphical user interface (GUI) for X-windows workstations.
- a FORTRAN program interface to AIPS++ tasks at the level of the
host operating system (used by FORTRAN programmers who prefer this method
of adding software capabilities);
- Useful, but lower-priority, interfaces are:
- Execution of AIPS++ tasks at the level of the host operating system
(used mainly by programmers and expert astronomers)
- A data flow graphical processing environment
- Additional interpreter or GUI interfaces
- All interfaces should use the same language for "commands" and
parameters, and where possible have the same look-and-feel
- Users should be able to execute operating system command sequences
from any interface
- Application "task" parameters should have the following properties:
- All parameters should have assigned defaults, possibly dependent on
context
- Selected parameters should be user-assigned according to the mode
of user interface
- Parameters should NOT be global across applications, unless
specifically requested or used to specify processing environment and
scope
- Names must be consistent across applications, using astronomical
terminology where possible
- Parameters should be able to be passed between applications when
applications share the same parameters
- Checking of parameters by applications before execution starts
should be done whenever possible (the user will be warned about
inconsistent, unusual, or dangerous combinations, and, for the latter,
may refuse executions with re-entry dependent on the type of user
interface)
- It should be possible to save, edit, and restore parameter sets
for applications
- For some applications it would be useful if a task in execution
could have some parameters changed by user request
- A processing history or log should be maintained for documentation
and re-execution after editing
- Users should be able to choose any editor supported by the
operating system for editing ASCII data, parameter lists, processing
histories, etc.
- A variety of help levels, preferably context sensitive, appropriate
to novices, experts, etc.
- There must be a batch capability for all user interfaces with
capabilities to monitor, interrupt, and modify batch operations
- There must be error handling at all levels in as
user-understandable form as feasible
- It would be useful to have multi-tasking operation for CLI, batch,
and GUI interfaces
- It would be very desirable if the system could warn of high
resource usage (CPU, disk, tape) before applications are executed
(advise and request confirmations)
- Planned developments in the use of parallel processing make it very
important that one be able to develop and optimize algorithms based on
parallel architectures, particularly for large mosaicing and spectral line
applications
- Must be usable from ASCII terminals and remote network nodes; this
particularly includes PCs or terminals alternating between ASCII
and Tektronix emulation modes
- Must be programmable in the sense of supporting variable
assignment, conditional statements, control loops, string manipulation,
functions, and procedures
- Should have capabilities to build, read in, and write out
"procedures"
- Should have a beginner mode, with prompts in plain English, and an
advanced mode
- Must have a "batch" facility for execution of series of CLI
commands
- Should have command-line recall and editing facilities
- Should allow users to define their own commands and procedural
sequences of command lines
- Must have access to, and control of, AIPS++ objects and data structures
- Should have user selectable input and output data streams, and
display devices
- Must support normal arithmetic operations for intrinsic,
user-defined, and image/spectral data types
- As much as possible, applications written for the AIPS++
interpreter should look the same (to users) as those written in the
compiled language
- A very useful feature would be "un-do" operations wherever this is
feasible
The basic graphical interface should be the primary user interface
for users in 1994. It should be the most attractive one for most AIPS++
users, and maybe even for experts. It should be a window-oriented
graphical interface with pull-down menus (for application selection and
parameter specification), multiple windows, and pop-up menus for context
sensitive help. Menus for application selection and parameter
specification should have pop-up sub-menus with options/parameters
depending on menu context.
It would be desirable if there were an advanced GUI with visual
programming of applications, with icons/glyphs for individual "tool"
components and connecting lines for passing of data. Sequences of
graphical task and data flow connections should be capable of being
saved, edited, and retrieved.
Documentation for AIPS++ must be a planned part of the AIPS++
development. It should be:
- Uniform in style, which may require a central documentation editor
or a single, capable technical writer;
- Prepared by, or in real consultation with, experts on the material;
- Expressed in astronomical and mathematical terminology wherever possible;
- Distinct from programming documentation, except for AIPS++
programming guides or cook-books aimed at astronomers;
- Completely available on-line, based on the same textual material
used in printed documentation;
- Used in connection with "help", with context sensitivity and the
capability to search on keywords or names (in or out of context);
- Complete in documenting, with references, the algorithms used and
their effects on data;
- Organized in several levels, including
- User cookbooks;
- Application descriptions;
- On-line help.
As discussed earlier, a large fraction of the data processing in
AIPS++ can be described as "data handling". These data can be images,
single dish data sets, coherence function data sets, telescope
performance data, model data, or any data set that can be imported into
the system.
- Data from consortium instruments are to be imported into AIPS++
with software tailored for each telescope, producing data file(s) with
data that can be read, modified, and written using AIPS++ data I/O
routines.
- AIPS++ data files must be written to, and read from, tape and other
transportable media using AIPS++ data base I/O software.
- FITS and UVFITS data I/O to both disk and transportable media must
be supported.
- Data import and export for general cases should be supported in the
form of reading and writing tabular formats in ASCII or binary form; for
ASCII tables, both numerical and "string" columns should be supported.
- Transfer of data over networks should be supported.
- Data files in AIPS++ should be identifiable, and accessible, using
host operating system or user-written software.
- Data book-keeping software, both inside and outside AIPS++, should
be available to help users organize and otherwise deal with their data.
- Searching, summarizing, copying, and concatenating contents of
AIPS++, FITS, and UVFITS data files, with selection based upon data
parameters, must be possible for both disk files and transportable
media.
- Large data sets spanning multiple "tapes" must be supported
for AIPS++, FITS, and UVFITS data files.
- The data system file format(s) should be accessible from all
supported machines without conversion
- Distinctions such as "multi-source" and "single-source" data sets
should be avoided
- Applications should function on n-dimensional data in arbitrary
sort order, with any required sorting hidden from the user where
possible
- It should be possible to extract data subsets from a data set,
manipulate them with the AIPS++ (IDL-like) command language, treat these
data like any other data for computation and display, and optionally put
them back in the original data set
- Flexible transport of data sets or subsets to and from AIPS++ and
other major data reduction packages should be supported
- Data coordinate handling must be very general
- Support for non-regular increments in coordinate axes
- Flexible and reversible coordinate transformations must be
supported
- Coordinate and ephemeris information at the level needed for
astrometric and near field imaging should be supported
- Support for errors in data must be fundamental to the data system,
providing a basis for support of data error handling in a variety of
applications (a minimal propagation sketch follows these lists)
- Error images associated with astronomical images
- Easy generation of error models
- Error propagation through a series of data processing steps
- Properly formatted errors when data are extracted for tabulation
- The user should have access to both data values and all the
associated ("header") information that governs the interpretation of the
data
- Processing histories should be maintained for all data sets, with
easy review and re-use by the user
- It should be possible to import general astronomical data (e.g.
catalogues) into AIPS++
- For applications where the built-in tasks and command language
features are insufficient, there needs to be a program interface
(outside AIPS++) to allow the casual programmer reasonable access to the
data; some flexibility and efficiency can be sacrificed in making this
interface comparatively simple; FORTRAN programmers should be supported
- All data from single dishes or interferometers should be assumed to
involve full measurement of the electromagnetic field involving all four
Stokes parameters and the equivalent complex polarization
representations, with support for, and transformation between, both
forms
- Multiple frequency bands may be simultaneously observed (e.g., for
observing multiple lines simultaneously or multi-frequency synthesis),
with variable numbers of channels in each band
- Frequency axes may be non-linear (e.g. as produced by
acousto-optical spectrometers) and time variable
- Polarization measurements may be time switched if all polarization
measurements are not obtained simultaneously
- Data combinations for different observations may have different
numbers of spectral channels and channel widths which may need to be
accommodated within single data sets
- As much as possible the data handling system should support
generalized image handling rather than having one system for images
and another for all other data, allowing scope for "vector images",
complex images, images with associated errors, double precision images,
etc.
- The user should be able to use the host operating system's
capabilities and utilities to manage data sets, using normal file names
and directory hierarchies
- Capability to transform tabular data file formats to AIPS++
instrumental data formats will allow general importation of non-standard
types of instrumental data
- Instrumental performance and meteorological data need to be
associated with instrumental data sets either directly or with
associated data "tables"
- The data system must deal with telescope dependent instrumental
data
- Data for focal plane arrays, or multi-beam feeds, with arbitrary
geometry (e.g., field rotation during the observations) characteristics
must be supported
- Mosaicing observations may have as many as 1000 pointing centers
which must be supported for single dish data, interferometer data,
imaging, image processing, and image display
- Must support rapid time switching of polarizations, frequencies,
and pointing centers (i.e., they may change for every integration)
- Time-series data for total power measurements and visibilities must
be supported (e.g. pulsar data, time variable sources)
- Error measures or estimates (e.g., weights) should be regarded as
standard in an observation
- The data system must allow for simultaneous processing of
"associated" data sets, such as different (e.g., by calibration,
integration time, fringe fitting, etc.) versions of the "same"
observation, so the best can be selected later
- Correlation data in the form of 16-bit integers or 32-bit floating
point must be supported
- It is desirable that the data system be as extensible as possible,
including new data types
- Triple correlation data, including cases where visibilities have
different frequencies
- Optical interferometer data
- It is desirable to be able to import, search, and select data,
including spectra, images, etc., from instrumental data catalogues and
archives
- There must be support for the major types of single dish data in
the system
- 1-D spectra, both evenly and non-evenly spaced in frequency (e.g.,
taken with AOS spectrometers), where an associated 1-D array identifies
spectral frequencies
- 1-D sequence of total power (continuum) measurements, possibly
unevenly spaced and associated with a 1-D sequence of pointing positions
or time
- 1-D sequences of data values taken at arbitrary positions, times,
foci, etc. and used for tipping, continuum on-off, focusing, and
pointing observations (the previous two items can be viewed as subsets
of this more general organization scheme for data)
- 2-D matrices of data values as a function of (x-position,
y-position), (position, frequency), (frequency, time), (position, time),
(time, pulsar phase), (pulsar phase, frequency), etc., where both axes
may be non-linear or non-parameterizable
- 3-D "cube" of data values as a function of (x-position, y-position,
frequency), (x-position, y-position, radial velocity), (time,
frequency, pulsar phase), (x-position, y-position, time)
- Total power auto-correlation data must be supported
- There should be support of bit-field data for pulsar observations
- There must be data handling for fast sampling spectrometers with
from 128 to 32768 channels of data, producing very large data "cubes",
both for spectroscopy and observing where interference excision is
important
- There must be support for data from fast sampling surveys (basket
weaving, mosaic sampling)
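The error-propagation requirement above (see the forward reference in
the data handling list) can be illustrated with a minimal sketch: carry
a variance alongside every value and apply first-order propagation at
each operation. This assumes small, uncorrelated Gaussian errors; all
names are illustrative.

    #include <cmath>
    #include <cstdio>

    struct Measured { double value, var; }; // a datum and its variance

    // First-order propagation for uncorrelated errors:
    //   var(x + y) = var(x) + var(y)
    //   var(x * y) ~ y^2 var(x) + x^2 var(y)
    Measured add(Measured a, Measured b) {
        return { a.value + b.value, a.var + b.var };
    }
    Measured mul(Measured a, Measured b) {
        return { a.value * b.value,
                 b.value * b.value * a.var + a.value * a.value * b.var };
    }

    int main() {
        Measured flux = { 2.50, 0.0100 };  // 2.50 +/- 0.10 (e.g. Jy)
        Measured gain = { 1.10, 0.0004 };  // 1.10 +/- 0.02
        Measured cal  = mul(flux, gain);   // calibrated value with error
        std::printf("%.3f +/- %.3f\n", cal.value, std::sqrt(cal.var));
        return 0;
    }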
Interferometer data should be regarded as having potentially
inhomogeneous antenna properties, but this should not preclude dealing
with simplified cases where homogeneity can be assumed.
- Antenna size, system temperatures, and frequency band-passes may
differ widely
- The input data and calibration procedures may vary from antenna to
antenna
- Integration time may vary from antenna pair to antenna pair
- Visibility data handling must support many correlator formats,
including MkII, S-2, K-4, MkIII, and VLBA modes
- The data system should support the merging of correlation data and
associated calibration data from different correlators and allow the
user to deal with duplicate correlations
- VLB antennas in space will require support for orbital position
dependence including acceleration terms
- Both single dish and aperture synthesis data may need to be merged
in the mosaic imaging process
- Self-calibration with cross-referencing to data for overlapping
areas should be supported
- Data handling for multiple pointing centers, including effects of
beam shapes and pointing errors for each center, should be handled in a
convenient manner
- Data should be selectable in terms of identification with a
particular type of calibration observation
- Both standard and user-defined models of data behavior should be
usable in determining calibration information from data sets
- Instrumental behavior that affects calibration should be integrable
in the calibration process through a mixture of parameterized functions
and models in tabular form
- Data correction based upon standard and user-defined functions,
with user supplied parameters, should be possible
- Calibration and correction of data should be reversible, with the
capability to BOTH store calibration/correction information and apply it
"on-the-fly" during processing, and apply this calibration/correction
information "once and for all", creating new, calibrated data sets
- Calibration should be made as generic as possible, with
telescope-specific methods kept to a minimum
- Calibration/correction of data should be possible from derived
tables of instrumental parameters (e.g., system temperature vs. time,
gain vs. elevations), with derivation of such tables from calibration
observations
- The calibration process should include flexible averaging of
calibration data and application with interpolations or weighted
averaging, all under control of the user
- Cross-calibration from different instruments should be possible
(e.g. flux scale, pointing), particularly when data from different arrays
are to be combined
- Model fitting should be possible in both the image and u-v planes,
and it should be possible to use the resultant models for further
calibration and self-calibration
- There must be simulation programs for single dish, interferometer,
and mosaicing data bases for both planning and comparison of data with
models - with optional error generation for thermal noise, pointing
errors, primary beam errors, atmosphere, antennas surface errors,
beam-switching for total power, etc.
- Flexible spectral fitting for components and baselines;
interpolation/blanking of bad channels
- De-dispersing of spectral, long time series data for pulsars with
analysis and fitting in the intensity-frequency-time domain
- Telescope pointing and beam pattern determination and correction
- Analysis of telescope performance data: pointing,
telescope-tipping, focusing, and holographic data
- Deconvolution of "channel" shapes and "frequency-switched" data
- Analysis of telescope instrumental data in "nearly" real time
- Phased display of selected time sequence data
- Special intensity and polarization calibration of phased-array data
- Antenna-based determination of calibration and self-calibration
functions should be the primary form of calibration determination
wherever possible (the underlying relation is sketched at the end of
this list)
- There must be capability to make phase and/or amplitude
corrections of data based upon differences between the data and modeled
data sets, where the latter are usually derived from imaging of the same
or highly related data
- Redundancy in data (possibly including crossing points) should be
used whenever possible as an additional constraint on calibration and
self-calibration
- Determination of, and application of corrections for, closure
errors should be possible with flexible averaging of input closure
information
- Fringe fitting for a range of spectral channels and fringe rates
(normally only for VLBI data) should be possible by baseline, as well as
globally by antenna
- Spectra calculation from complex summing of visibilities in each
spectral channel for user-specified positions in the field of view
- Interferometric pointing, baseline, and beam pattern fitting and
related analysis
- Application and de-application of astrometric/geodetic correction
factors with complete and reversible histories
- Calibration of data for effects of the ionosphere, utilizing data
at multiple frequencies and/or external data on variations of electron
content
- Calibration for non-isoplanicity using special extensions of
self-calibration
- Calibration of mosaic data bases with necessary cross-referencing
of multi-pointing information; this makes it necessary for individual
data points to be associated with pointing center information, which is
a special problem for interferometric data
- Calibration parameterization must include specification of pointing
centers to be used, because of the need for "all-pointings-at-once"
operations; in addition, it must be possible to specify a subset of all
pointings for a given operation
- Determination and correction for pointing errors, and errors in
beam shape, using mosaic self-calibration techniques, will be important
- Special limitations on calibration for VLBI data
- Because of independent atmospheres, clocks, and LO's at each
antenna, there are uncertainties in phase referencing for fringes, and
variations in coherence time
- Because almost all geometrical effects scale with baseline
length, one needs the most accurate geometry for earth- and space-based
interferometry
- The correlated visibility functions are essentially phaseless, so
self-calibration of phase is the only possibility
- Antenna characteristics can be very different across the array,
hence one must be careful about antenna-based simplifying assumptions
- For amplitude calibration one must use the T_sys and K/Jy
determined for each antenna, and determination of these quantities is
an essential part of the calibration process
- For spectral line sources one can do amplitude calibration with
auto-correlation spectra plus calibration at one antenna
- Accurate Doppler correction for each spectral channel is essential
- Fringe-fitting with sets of data with as many baselines as possible
is essential, with the limitations that sources must be detected in a
few minutes and only bright sources can be observed unless one does
phase-referencing
- For polarization calibration, all calibration sources are resolved
and the polarized intensity distribution may not be like the total
intensity distribution, therefore one must iteratively determine both
source polarization structure and instrumental polarization
- Polarization calibration must use an ellipticity-orientation model
for feed polarization, which is non-linear and computationally
"expensive", because one must calibrate mixed linear and circular
polarization characteristics (requires very careful amplitude and phase
calibration)
- Calibration and self-calibration are parts of the same iterative
process, with both depending on deconvolved source models; this process
can take tens to more than a hundred iterations
- Calibration has baseline-dependent factors because of mismatched
frequency passbands in non-identical telescopes
- Full phase calibration is an iterative process involving limits
set by astrometry, geodesy, and weak source imaging/detection;
therefore one needs:
- very accurate geometric models, typically to at least 1/10 of a
wavelength accuracy
- knowledge of location of the Earth's pole and UT1, both of which
are generally known only after astrometric/geodetic analysis
- values of ionospheric delay as determined from measurements at
simultaneous frequencies, or external measurements of ionospheric
electron content
- measurement of properties of troposphere dry terms (from surface
meteorological measurements) and wet terms (Kalman filtering, GPS
multi-frequency satellite measurements, WVR)
- instrumental delays as determined from phase calibration signals
- knowledge of non-rigidity of the earth due to earth tides and
atmospheric loading
- Very accurate coordinate systems are required; geodesy uses a
system based on the solar system barycenter
- Full history of telescope behavior/environment and assumed
correlator model must be part of data subjected to global fringe-fitting
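The antenna-based calibration emphasized in the list above rests on a
standard factorization: for antennas i and j with complex gains g_i(t)
and g_j(t), the observed visibility is

    V_{ij}^{obs}(t) = g_i(t)\, g_j^{*}(t)\, V_{ij}^{true}(t) + \epsilon_{ij}(t)

so an array of N antennas provides N(N-1)/2 baseline measurements
constrained by only N unknown gains. Self-calibration exploits this
overdetermination by replacing V^{true} with model visibilities derived
from the current source model and solving for the g_i.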
- Data display and editing should be seen as generic tools applicable
to single dish, interferometer, and other forms of data
- Data visualization for evaluation and editing purposes should be
seen as an integral, or closely coupled, aspect of the data system
- It should be possible to do interactive editing based upon display,
with "zoom" or magnification, and menu selection of editing options
- Various "viewing strategies" should be available
- For interferometer data, baseline by baseline display (with
magnification of local areas) and interactive editing (including
multiple, simultaneous baselines) using both Intensity-time-baseline
displays and Intensity displays in u-v plane
- Displays of spectra and spectral cubes aggregated in various ways
(spectra vs. time, averaging in time, averaging of channels)
- Selection of data by specifying windows in space and/or time
- Selection of arbitrary cuts through data (e.g. circular, radial,
or a user-defined locus) through selected data coordinates
- Display of expanded data aggregates (e.g., pointing and clicking
on an average multi-channel region of data to show the component
spectrum)
- Comparison displays of generic model data (from fitted
components) with observed and/or processed data, including display of
data with model subtracted or divided
- Data editing should be reversible, with the capability to store,
apply, and un-do editing information
- Data editing should be possible on the basis of monitor/observing
log data
- Editing should be possible from "consistency check" information,
particularly where there is redundancy or (for interferometric data)
where there are crossing points in the u-v plane
- It is desirable to have parameter-driven, automated flagging for
large data sets
- Editing must be possible based upon difference between data and
models generated during self-calibration
- Data editing based upon recognition of interference patterns in
intensity-time-frequency data is very important, particularly for low
frequency observations
In this section we consider the formation of images from edited,
calibrated data. While this is mainly image computation and
deconvolution, it must be remembered that, for the user, imaging and
image deconvolution are an integral part of the process of data
inspection/editing, calibration, imaging, self-calibration, data/image
display, spectrum/time/image analysis, and production of hard copy for
publication purposes. This process must be well integrated for the
convenience of the user. It should be possible to easily
"mix-and-match" self-calibration, data transformation, and
de-convolution "tools", for example, using CLEAN to deconvolve in the
early stages, and maximum entropy later on when CLEAN begins to be less
useful. This is related to the need to make self-calibration use a
generic model, which could be a table of CLEAN-components, a table of
Gaussian components, or an image.
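The "mix-and-match" requirement suggests placing deconvolution behind a
single generic interface, so that CLEAN and maximum entropy become
interchangeable stages operating on the same generic model. A minimal
sketch follows (hypothetical class names in modern C++, not an actual
AIPS++ design; algorithm bodies are elided):

    #include <vector>

    struct Image { int nx, ny; std::vector<float> pix; };

    // Generic interface: any deconvolver improves 'model' given the
    // dirty image and point spread function.
    class Deconvolver {
    public:
        virtual ~Deconvolver() = default;
        virtual void deconvolve(const Image& dirty, const Image& psf,
                                Image& model, int iterations) = 0;
    };

    class HogbomClean : public Deconvolver {
    public:
        void deconvolve(const Image&, const Image&, Image&, int) override {
            // CLEAN minor cycles; see the sketch later in this document.
        }
    };

    class MaximumEntropy : public Deconvolver {
    public:
        void deconvolve(const Image&, const Image&, Image&, int) override {
            // MEM iterations.
        }
    };

    // Start with CLEAN, then switch to MEM when CLEAN becomes less
    // useful; the accumulated 'model' passes unchanged between the two.
    void reduce(const Image& dirty, const Image& psf, Image& model) {
        HogbomClean clean;
        MaximumEntropy mem;
        clean.deconvolve(dirty, psf, model, 1000);
        mem.deconvolve(dirty, psf, model, 30);
    }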
- Image construction from calibrated total power (beam-switched,
multi-beam, focal plane array) data sequences from single antennas and
phased arrays, with and without spectrometers, is required
- Spectral line cube formation
- Image construction using u-v data sets must be possible with a
range of capabilities
- Computation of "dirty" images and point spread functions by 2-D
FFT of selected, sorted, and gridded data with user control of data
selection, gridding algorithm and its parameters, and image parameters
(image size, cell sizes, polarization)
- Flexible computation of data cubes where the third axis is
frequency/velocity or time
- Simultaneous, multiple field imaging with un-gridded data
subtraction using MX-like algorithms
- Direct Fourier transform imaging of arbitrary (and usually small)
size fields
- Imaging after subtraction of sources
- Imaging of spectral line data sets with continuum subtraction
based upon continuum data, or continuum models
- Estimation and input of zero-spacing flux density and appropriate
weighting
- Mosaic image construction using mixture of u-v data sets and single
dish data for multiple antenna pointing centers
- Linear combination of pre-deconvolved images, weighting
determined by primary beam
- Linear mosaic algorithm with linear deconvolution (MOSLIN in SDE)
- Non-linear (MEM-based) mosaic algorithm (VTESS, UTESS in AIPS,
MOSAIC in SDE)
- Cross-calibration (enforced consistency) between data taken with
different instruments (flux scale, pointing)
- Pointing self-calibration to determine corrections for both
single dish and visibility data
- 3-D mosaicing allowing for sky curvature (non-coplanar baselines)
- Self-calibration and editing of all pointings in one processing
step
- Capability to determine the primary beam(s) from a mosaic image
and its related data sets
- Ability to deal with any primary beams in different forms
(analytic 1- and 2-D, tabular), including user modification of primary
beam models
- Imaging using multiple-frequency data sets and a user-defined model
for spectral combination "rules" must be possible
- Imaging computation should generally take multiple data sets where
this makes sense
- Imaging data selection should flexibly allow use of data sub-sets,
with data selection based upon time, antenna, frequency, and ranges of
other data (including monitor data)
- 3-D imaging of data affected by sky curvature (wide-field problem)
is essential
- Imaging wide fields larger than the isoplanatic region is essential
- Near field imaging of nearby objects like comets and asteroids must
be possible
- Special VLB imaging requirements:
- Need more accurate handling of precession, nutation, and
aberration in the u,v,w used for imaging
- Larger gaps in u-v plane data produce "dirtier" beams and greater
need for image modeling/deconvolution
- Fields of view not radially smeared due to finite bandwidths are
relatively small, so one needs "fringe-rate" imaging, and
multi-pointing processing for widely spaced sources in the field
- Near field problems for solar system objects are more important
because of the larger baselines
- Need to do Lorentz transformation to inertial reference frame
because of the relativistic distortion due to the Earth's orbital motion
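For reference, the "dirty" image and point spread function referred to
above (see the forward reference in the imaging list) are related to the
sky by the standard sampling relations: with S(u,v) the sampling
function and W(u,v) the weighting/gridding function,

    I^{D}(l,m) = F^{-1}[ S W V ],        B(l,m) = F^{-1}[ S W ]

so that I^{D} = B * (A I), a convolution of the primary-beam-attenuated
sky with the dirty beam. Recovering the sky from incompletely sampled
visibilities is what the deconvolution operations listed below address.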
In
this section we list some of the image-specific transformations of data
that are very general operations. Many image transformations are
basically transformations of images as arrays of numbers, so we include
these operations in the upcoming sections on data "structure"
transformation.
- Image de-convolution from dirty image and point-spread-function (a
minimal CLEAN sketch follows this list)
- Högbom CLEAN
- Clark-Högbom CLEAN
- Cotton-Schwab CLEAN
- Smoothness-stabilized CLEANs
- Maximum entropy
- Maximum emptiness
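A minimal sketch of the Högbom CLEAN minor cycle referred to above
(illustrative only; real implementations add search windows, beam
patches, and more careful stopping rules): repeatedly find the peak
residual, record a gain-scaled component, and subtract the shifted,
scaled point spread function.

    #include <cmath>
    #include <vector>

    struct Image {
        int nx, ny;
        std::vector<float> pix;                       // row-major
        float& at(int x, int y) { return pix[y * nx + x]; }
    };

    void hogbomClean(Image& residual, const Image& psf, Image& model,
                     int niter, float gain, float threshold) {
        int cx = psf.nx / 2, cy = psf.ny / 2;         // PSF peak (centred)
        for (int it = 0; it < niter; ++it) {
            // 1. Locate the peak absolute residual.
            int px = 0, py = 0; float peak = 0.0f;
            for (int y = 0; y < residual.ny; ++y)
                for (int x = 0; x < residual.nx; ++x)
                    if (std::fabs(residual.at(x, y)) > std::fabs(peak)) {
                        peak = residual.at(x, y); px = x; py = y;
                    }
            if (std::fabs(peak) < threshold) break;   // down to the noise
            // 2. Record a fraction (e.g. gain = 0.1) of the peak.
            float comp = gain * peak;
            model.at(px, py) += comp;
            // 3. Subtract the shifted, scaled PSF from the residual.
            for (int y = 0; y < residual.ny; ++y)
                for (int x = 0; x < residual.nx; ++x) {
                    int dx = x - px + cx, dy = y - py + cy;
                    if (dx >= 0 && dx < psf.nx && dy >= 0 && dy < psf.ny)
                        residual.at(x, y) -= comp * psf.pix[dy * psf.nx + dx];
                }
        }
    }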
Data "structures" are assumed to be 1-, 2-, 3- (or n-)
dimensional aggregations of data values. Included are tabular data
structures that are special two-dimensional arrays which may have
different (numerical) data contents for each "column". Most of the
requirements in this section should apply to any type of data, and where
the operations are meaningful only for certain data types this will be
noted.
- Extraction and creation of new data structures
- Selection of a lower-dimension structure from a higher-dimension
data structure (i.e., a vector from a plane or cube, a plane from a
cube, a cube from a 4-dimensional structure, or an n-dimensional
sub-structure from an n-dimensional structure) based on reasonable
extraction criteria must be supported.
- Selection of sub-structures as in the previous item, but with
user-selectable, arbitrary "rotation" angles and regular interpolation
in each dimension (i.e., arbitrary lines through planes, rotated
sub-planes from planes, arbitrary planes from cubes).
- Creation of large n-dimensional structures from smaller
n-dimensional structures (tessellation of planes, cubes)
- Extraction of vectors perpendicular to a curvilinear,
user-defined track in a plane
- Extraction of a new structure based on interpolation with respect
to different coordinate system
- Extraction of a new structure with different spatial/velocity/etc.
resolution, possibly based on a new coordinate system, using
convolving, fitting, or de-convolving functions
- Extraction of a new data structure based on SQL-like queries on
data values and parameters
- Generalized data structure arithmetic
- Mathematical operations and functions for numbers and
n-dimensional structures (vectors, planes, cubes, ...), allowing
creation of new data structures
- Averaging, summing, weighted summing of data structures
- General tensor arithmetic
- Unary and binary matrix operations
- Data structure creation with specification of indices and
dimensions
- Concatenation
- Inner and outer vector products
- Matrix inversion
- Spread-sheet like processing with arrays and numbers
- Specialized Operations on Data Structures
- n-dimensional cube rotation and transposition
- forward and inverse Fourier transforms for real and complex arrays
as appropriate
- non-generic Fourier transforms
- smoothing, convolving, filtering, and histogram equalization
- max/min, sigma clipping, mean, median, mode, edge operations
- differentiation (gradient, divergence, curl, Laplacian) operations
for vectors
- Interpolations through blanked (missing) areas
- "Linear" and "non-linear" registration of images
- Operations on arrays that apply/remove primary beam or gridding
correction functions
- De-convolution of channel shapes
- De-convolution of frequency-switched spectral line data
- Support for error handling in analysis tasks
- Source subtraction for standard (Gaussian) and user-defined models
- Filtering with standard (e.g. Sobel, unsharp mask) and
user-definable filters
- Source subtraction in both image and u-v domains
- Correction of data for source motion (asteroids, comets)
- Modification of planetary and solar data to remove effects of
disk emission, motion, and rotation
- Statistical Analysis of Data Structures
- Histogram displays of data in selected regions
- Noise statistics of selected areas (mean, median, mode, rms,
chi-squared, etc.)
- Power spectrum analysis
- Structure function analysis
- Cross-correlation analysis
- Defining Regions within data structures
- Interactive input of rectangular (box) and curvilinear (blotch)
areas of interest based upon pixel or coordinate specification, or
cursor "point and click"
- Capability of saving and restoring regions of interest in files
- Flexible identification of blanked-pixel (-voxel) regions
- Capability of blanking regions based on noise limitations
- Data Analysis Operations
- Spectral line fitting for components (Gaussian, user-defined) and
baselines (polynomials, sinusoids, user-defined)
- Automatic finding (fitting) of sources in images, generating
lists of source positions and intensities
- Fitting and removing source components (Gaussians, parabolas, etc.)
- Functional fitting in continuum cubes (rotation measure, spectral
curvature, user-defined)
- Fitting n-dimensional surfaces with linear and non-linear
(chi-squared) techniques
- Least squares fitting (with error analysis) to ordinary and
orthogonal polynomials, for equally spaced and unequally spaced data
- Spline fitting and interpolation
- Zonal averaging in elliptical/spheroidal rings/shells
Many of the major functions of image analysis have already been
discussed under the general category of data structure transformations
and analysis. This is an area of applications that is highly dependent
upon astronomer specification of the needs for a particular problem. For
this reason tools for this analysis, and programmability by the
astronomer, are most important. A few cases that illustrate advanced
problems are the following.
The extraction of information from data cubes is one of the most
important, but computationally (and visually) difficult areas of image
analysis. The visualization problem in general requires both special
hardware and flexible analysis software. The relation of most data
cubes to spectroscopy, and the importance of radiative transfer to
spectroscopy, presents a basic need for the astronomer to analyze data
in an environment where it is possible to compute and compare models
derived from spectral radiative transfer. Since this cannot be viewed as
the job of any instrumental support group, the versatile programming of
computational tools is the most important thing for the astronomer who
has reached this stage of "image analysis".
In addition to spectroscopy, image analysis and comparison for some
problems in the future will require dynamical gas/fluid computations.
For example, analyzing HI or molecular line images of galaxies as
snapshots of "fluids" means the divergence, curl, and Laplacian of
vector fields must be calculated to study continuity, vorticity, and
viscosity.
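As a sketch of the kind of tool this analysis requires (grid layout and
names are illustrative), divergence and the z-component of curl can be
computed from a regularly gridded 2-D velocity field by central
differences at interior points:

    #include <cstdio>
    #include <vector>

    struct Field2D {
        int nx, ny; double dx, dy;
        std::vector<double> vx, vy;              // row-major, size nx*ny
        double VX(int i, int j) const { return vx[j * nx + i]; }
        double VY(int i, int j) const { return vy[j * nx + i]; }
    };

    // Valid for interior points: 0 < i < nx-1, 0 < j < ny-1.
    double divergence(const Field2D& f, int i, int j) {
        return (f.VX(i + 1, j) - f.VX(i - 1, j)) / (2.0 * f.dx)
             + (f.VY(i, j + 1) - f.VY(i, j - 1)) / (2.0 * f.dy);
    }
    double curlZ(const Field2D& f, int i, int j) {
        return (f.VY(i + 1, j) - f.VY(i - 1, j)) / (2.0 * f.dx)
             - (f.VX(i, j + 1) - f.VX(i, j - 1)) / (2.0 * f.dy);
    }

    int main() {
        // Solid-body rotation v = (-y, x): expect div = 0, curl_z = 2.
        const int n = 5;
        Field2D f{ n, n, 1.0, 1.0,
                   std::vector<double>(n * n), std::vector<double>(n * n) };
        for (int j = 0; j < n; ++j)
            for (int i = 0; i < n; ++i) {
                f.vx[j * n + i] = -(j - n / 2);
                f.vy[j * n + i] =  (i - n / 2);
            }
        std::printf("div=%g curl=%g\n", divergence(f, 2, 2), curlZ(f, 2, 2));
        return 0;
    }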
Moving source problems, particularly the difficult case of solar
imaging, require special modeling or data corrections. Rotation and
registration of images taken at different times and locations, and with
different instruments, requires special treatments dependent upon the
scientific problem at hand, which is usually in the solar system domain
involving the Sun, planets, asteroids, comets, etc.
By data display we mean listings, plots, and "pictures" that are
useful in examining data and results derived from data. By recording we
mean hard copy of these data displays. The form of data display depends
on the user interface. The form of data recording may depend upon
printer and other hardware so output files should be as
device-independent as possible, with separate production of
device-dependent files.
- User selection of display and recording devices
- Flexible numerical data display (including output to files and
printers) in the form of numerical tables
- Flexible plotting, and data specification for, one data variable
vs. another, with optional error bars, with point type, line type, and
color differentiation for multiple plots
- Contour plots of 2-D data arrays with optional number labeling,
distinguishing negative contours and depressions, with color
differentiation where possible
- Ruled surface plots of 2-D data arrays, with color display on
that surface for another 2-D data array
- Rendered surface displays of a data cube from an n-dimensional
array, with rotation, aspect, and external lighting control
- "Opacity" summation display for a displayed data cube
- Projection of user-selected image planes, or summed images, on
the "sides" of an 3-D image "box"
- It should be possible to request diagnostic warnings if plot,
contour, etc. are below designated noise levels
- Tiled displays of 1-D and contour plots
- Calibrated wedge displays for color and grey scale
representations
- User-definable color palettes and transfer functions
- Useful "header" information should appear by default on plots,
but user-defined annotations should be possible
- Flexible overlay capabilities for comparison of different types
of data
- Capability of displaying tabular data in one or two windows with
a corresponding X-Y plot in another, with interactive identification
of points in plot with entries in table
- Plots of spectral profiles with user-defined superposition or
tiling
- Sub-windows of spectra associated with images should be
displayable as user-movable plots with respect to an image, using
either superposition or lines connecting image positions to the
spectral window display
- Contour, grey-scale, and color plots of any axis of an
n-dimensional data set as a function of any other (two) coordinates
should be possible - subsets of these would be the common
longitude-velocity plots for specific latitudes
- Flexible extraction of spectra or sequences of spectra from
user-defined regions in a spectral cube
- Display of spectra and spectral cubes with and without model or
continuum subtractions
- Image, ruled surface, etc. displays of variable data where one
axis is time
- Period phased plots of data with user-defined binning
- Image displays in windows with numerical and/or analog control of
parameters, transfer functions, and color tables
- Cursor feedback facility of numerical information in displayed
images
- Multiple image display windows (different displays of data for
the same coordinate space) and overlaying of images in a given window
- Intensity-hue display and independent RGB image superposition or
comparison (for appropriate hardware) of two or three images
- 4-D display of image information where intensity is a rendered
surface and color on that surface is coded for a fourth parameter like
rotation measure, polarization, spectral index, etc.
- User-controlled "blinking" of images
- Multi-panel displays of images related by frequency (velocity),
time, or other (third) dimensions
- Flexible "movie" displays of images as a function of frequency,
time, etc., with interactive control of speed, zoom, and pixel display
range - and optional averaging of "frames"
- Facility to return/display spectra and other data for cursor
selected points (or regions) in a spectral line "cube"
- Polarization image displays with flexible display of sensible
combinations of intensity, polarized intensity, percentage polarization,
position angle, etc.
- Image displays with histogram equalization
- Plotting of pixel values in one image vs. pixel values in another
for user-defined regions
- Optional pixel histogram displays associated with images
- Superposition of multiple coordinate grids on images - pixel,
equatorial, galactic, ecliptic, etc.
- "Smart" superposition of contours on image displays (contours
adjust grey scale or color depending on background)
- Support of "all-sky" displays of data, wrap-around contouring
- Snapshot hard copy of both separate windows and multi-window
screen displays
- Translation of image displays to input files for high quality
plotting, grey scale, and color copy devices, preserving transfer
functions and color palettes where appropriate
- Screen scratch pad capability added to images (and their
coordinate overlays), including insertion of descriptive lines,
curves, boxes, shaded areas, and text - with transfer of all to hard
copy devices
- Capability to transform on-screen displays to device independent
(or equivalent) files that can be used in manuscripts
Conclusions
The preparation of specifications for AIPS++ involves two major
complications.
The first is that the instruments, and the types of measurements for
which they are designed, are diverse and complicated at levels of detail
that are important for many applications. However, this can be handled
by careful attention to detail and a judicious balance between
generality and those details. In this document we have dealt mainly
with generalities, so technical details needed in the implementation of
software for specific instrumentation must be dealt with elsewhere.
The second, and most severe, problem is that while it may be
(relatively) easy to specify what we need for the science of the past
and present, the most important needs are for the scientific problems of
instruments and astronomers in the future. Careful consideration of
what this means for a software system leads to one general conclusion:
availability of computational tools, user-programmability of these
tools, and easy transferability of data between software systems are the
most important capabilities that one can have for the future of any
scientific software system.
References
ATNF Staff, 1991, ATNF AIPS++ User
Specifications, AIPS++ User Specifications Memo 106.
BIMA 1991,
AIPS++ User Specifications: BIMA Version,
AIPS++ User Specifications Memo 108.
Cornwell, T.J. 1990, (ed.), Final Report of the Software
Advisory Group (SWAG), AIPS++ User Specifications Memo 102.
(available only in paper form)
DRAO 1991, DRAO User Requirements --
AIPS++, AIPS++ User Specifications Memo 111.
Foster, R., Haynes, M., Heyer, M., Jewell, P., Maddalena, R.J.,
Matthews, H., Reich, W., and Salter, C. 1991,
Requirements for Data Analysis Software for the Green Bank Telescope,
GBT Memo 72.
GMRT group, 1992, GMRT Requirements Documents, AIPS++
User Specifications Memo 114. (available only in paper form)
Hjellming, R.M. 1991,
Miscellaneous Suggestions for AIPS++,
AIPS++ User Specifications Memo 107.
Hjellming, R.M., Bridle, A.H., Maddalena, R.J., Wood, D.O.S.,
Zensus, J.A., and Westpfahl, D.J. 1991,
AIPS++ User Specifications: An Initial
NRAO-Oriented Version, AIPS++ User Specifications Memo 105.
Liszt, H.S. 1992, A Single-Dish Data Handling Environment for
AIPS++, AIPS++ User Specifications Memo 113. (available only in
paper form)
Noordam, J.E. 1991, (ed.) Dutch Requirements
for AIPS++, AIPS++ User Specifications Memo 112.
Shone, D.L. 1992, (ed.) Jodrell Bank User
Requirements for AIPS++, AIPS++ User Specifications Memo 110.
Wood, D.O.S. 1991, The AIPS++ User
Interface, AIPS++ User Specifications Memo 104. (available only
in paper form)
Copyright © 1992,1995,2002 Associated Universities Inc.,
Washington, D.C.