
Chapter 10  Parallel Processing in CASA

Starting with CASA 4.5.0, a parallelized execution of a full data analysis, from data import to imaging, is possible using a new infrastructure based on the Message Passing Interface (MPI). Briefly, MPI is a standard that addresses the message-passing parallel programming model in a practical, portable, efficient, and flexible way.

10.1  The CASA parallelization scheme

In order to run one analysis on multiple processors, one can parallelize the work either by dividing the data into several parts (“partitioning”) and running a CASA instance on each part, or by using non-trivially parallelized algorithms that make use of several processors within a single CASA instance. Non-trivial parallelization is presently implemented only in certain sections of the CASA imaging code, based on OpenMP.

All other parallelization is achieved by partitioning the MeasurementSet (MS) of interest using the task partition or at import time using importasdm. The resulting partitioned MS is called a “Multi-MS” or “MMS”. Logically, an MMS has the same structure as an MS but internally it is a group of several MSs which are virtually concatenated. Virtual concatenation of several MSs or MMSs into an MMS can also be achieved via task virtualconcat.
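For instance, several MSs could be virtually concatenated into an MMS with task virtualconcat; in the sketch below the file names are purely illustrative:

  # Virtually concatenate two MSs into a Multi-MS (illustrative file names)
  virtualconcat(vis=['obs1.ms', 'obs2.ms'], concatvis='combined.mms')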

Due to the virtual concatenation, the main table of an MMS appears like the union of the main tables of all the member MSs such that when the MMS is accessed like a normal MS, processing can proceed sequentially as usual. Each member MS or “Sub-MS” of an MMS, however, is at the same time a valid MS on its own and can be processed as such. This is what happens when the MMS is accessed by a parallelized task. The partitioning of the MMS is recognized and work is started in parallel on the separate Sub-MSs, provided that the user has started CASA with mpicasa. See how to start CASA in parallel in § 10.3.1.

The internal structure of an MMS can be inspected using the task listpartition. See § 2.2.10.

10.2  Multi-MS creation

10.2.1  Partition

The partition task is the main task to create a “Multi-MS”. It takes an input Measurement Set and creates an output “Multi-MS” based on the data selection parameters.

The inputs to partition are:

#  partition :: Task to produce Multi-MSs using parallelism
vis                 =         ''        #  Name of input measurement set
outputvis           =         ''        #  Name of output measurement set
createmms           =       True        #  Should this create a multi-MS output
     separationaxis =     'auto'        #  Axis to do parallelization across
                                           (scan, spw, baseline, auto)
     numsubms       =     'auto'        #  The number of SubMSs to create (auto or any number)
     flagbackup     =       True        #  Create a backup of the FLAG column in the MMS.

datacolumn          =      'all'        #  Which data column(s) to process.
field               =         ''        #  Select field using ID(s) or name(s).
spw                 =         ''        #  Select spectral window/channels.
scan                =         ''        #  Select data by scan numbers.
antenna             =         ''        #  Select data based on antenna/baseline.
correlation         =         ''        #  Correlation: '' ==> all, correlation='XX,YY'.
timerange           =         ''        #  Select data by time range.
intent              =         ''        #  Select data by scan intent.
array               =         ''        #  Select (sub)array(s) by array ID number.
uvrange             =         ''        #  Select data by baseline length.
observation         =         ''        #  Select by observation ID(s).
feed                =         ''        #  Multi-feed numbers: Not yet implemented.

10.2.1.1  The createmms parameter

The keyword createmms is by default set to True to create an output MMS. It contains three sub-parameters: separationaxis, numsubms and flagbackup. Partition accepts four values for the separation axis: ’auto’, ’scan’, ’spw’ or ’baseline’. The default separationaxis=’auto’ will first separate the MS by spw, then by scan, trying to balance the spw and scan content in each Sub-MS while also taking into account the available fields.

The baseline axis is mostly useful for Single-Dish data. This axis will partition the MS based on the available baselines. If only the auto-correlations are wanted, combine the baseline separation axis with an antenna selection such as antenna=’*&&&’. Note that if numsubms=’auto’, the task will try to create as many Sub-MSs as the number of available servers in the cluster. If the user wants one Sub-MS for each baseline, the numsubms parameter should be set to a number higher than the number of baselines.
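As a sketch (the file names below are purely illustrative), a Single-Dish MS could be partitioned by baseline, keeping only the auto-correlations, as follows:

  # Partition by baseline, keeping only the auto-correlations
  # (file names are illustrative)
  partition(vis='sd_data.ms', outputvis='sd_data.mms', createmms=True,
            separationaxis='baseline', antenna='*&&&')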

The user may force the number of “Sub-MSs” in the output MMS by setting the sub-parameter numsubms. The default ’auto’ creates as many “Sub-MSs” as the number of engines used when starting CASA, in an optimized way. Details on how to set the number of engines using mpicasa are given in § 10.3.2.

The flagbackup sub-parameter will create a backup of the FLAG column and save it to the .flagversions file.
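Putting these sub-parameters together, a minimal sketch of a partition call (the file name and the number of Sub-MSs are illustrative) could look like:

  # Create an MMS with 16 Sub-MSs along the default ('auto') axis,
  # backing up the FLAG column (illustrative values)
  partition(vis='mydata.ms', outputvis='mydata.mms', createmms=True,
            separationaxis='auto', numsubms=16, flagbackup=True)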

10.2.2  Importasdm

Task partition has been embedded in task importasdm so that the user can already create an MMS at import time. Set the parameter createmms to True and the output of importasdm will be an MMS created with the default partition parameters. The sub-parameters separationaxis and numsubms are also available in importasdm. From this point on in the data reduction chain, tasks that have been parallelized will automatically run in parallel when they see an MMS, and tasks that are not parallelized will work on the MMS in the same way as they normally do on an MS.
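For example (the ASDM name and parameter values below are illustrative), an MMS partitioned by spw into 8 Sub-MSs could be created directly at import time:

  # Import an ASDM directly into a Multi-MS (illustrative values)
  importasdm('uid__A002_Xexample', createmms=True,
             separationaxis='spw', numsubms=8)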

10.3  Parallelization control

10.3.1  Requirements

The following requirements must be fulfilled by all the nodes to be included in the cluster:

10.3.2  Configuration and Start-Up

The main library used in CASA (4.4+) to achieve parallelization is the Message Passing Interface (MPI). MPI is already included in the CASA distribution, so users do not need to install it. The CASA distribution comes with a wrapper of the MPI executor, called mpicasa. This wrapper applies several settings behind the scenes in order to properly configure the environment for running CASA in parallel.

The “cluster”, i.e. the collection of CASA instances which will run the jobs from parallelized tasks, is set up via mpicasa, which only needs to know how many engines will be used in the processing. The simplest example is to run CASA in parallel on the localhost using the engines available in that machine. A typical example would be a desktop with 16 engines; to use half of them to run CASA in parallel, give this information to mpicasa and call CASA as usual. Example:

        mpicasa -n <number_of_engines> path_to_casa/casa <casa_options>

Where:

  1. mpicasa: Wrapper around mpirun, which can be found in the casa installation directory. Example: /home/user/casa-release-4.5.0-el6/
  2. -n: MPI option specifying the number of engines to use in the processing.
  3. number_of_engines: The number of engines to be used in the localhost machine.

    NOTE: MPI uses one engine as the MPI Client, which is where the user will see messages printed in the terminal or in the logger. The other engines are used for the parallel work and are called MPI Servers. Because of this, the number given to -n is usually the desired number of parallel servers plus one.

  4. casa: Full path to the CASA executable, casa or casapy.
  5. casa_options: CASA options such as: -c, --nogui, --log2term, etc.

It is also possible to use other nodes, which can form a “cluster”. Following the requirements from § 10.3.1, replace the “-n” option of mpicasa with a “-hostfile host_file”, as shown below:

        mpicasa -hostfile <host_file> path_to_casa/casa <casa_options>

Where:

  1. host_file: A text file containing the names of the nodes forming the cluster and the number of engines to use in each of them.


Example:

orion slots=5
antares slots=4
sirius slots=4

The above configuration file will set up a cluster comprised of three nodes (orion, antares and sirius), deploying the engines per node as follows: up to 5 engines (including the MPI Client) will be deployed on host “orion”; if the processing requires more engines, it will take up to 4 from “antares”, and once all 4 engines on “antares” are in use, it will use up to 4 engines on “sirius”.

To get help do:

        mpicasa --help

10.3.3  Examples of running CASA in parallel

The following is a list of typical examples of how to run CASA in parallel. Once CASA is started with mpicasa and the “Multi-MS” is created, there is basically no difference between running CASA in serial and in parallel. You can find an example of a parallelized analysis in the following regression script, located in a sub-directory of your CASA distribution.

alma-m100-analysis-hpc-regression.py

Example 1. Run the above regression script in parallel, using 8 engines in parallel and 1 engine as the MPI Client.

  mpicasa -n 9 path_to_casa/casa --nogui --log2term -c alma-m100-analysis-hpc-regression.py

Example 2. Start CASA as described in §10.3.2, for an interactive session, using 5 engines on the local machine.

        mpicasa -n 5 path_to_casa/casa --nogui --log2term

Run importasdm to create a “Multi-MS” and save the online flags to a file. The output will be automatically named uid__A002_X888a.ms, which is an MMS partitioned across spw and scan. The online flags are saved in the file uid__A002_X888a_cmd.txt.

CASA <2>: importasdm('uid__A002_X888a', createmms=True, savecmds=True)

List the contents of the MMS using listobs. In order to see how the MMS is partitioned, use listpartition.

CASA <3>: listobs('uid__A002_X888a.ms', listfile='uid__A002_X888a.listobs')
CASA <4>: listpartition('uid__A002_X888a.ms')

Apply the online flags produced by importasdm, using flagdata in list mode. Flagdata is parallelized, so each engine will work on a separate “Sub-MS” to apply the flags from the uid__A002_X888a_cmd.txt file. You will see messages in the terminal (also saved in the casapy-###.log file) containing the strings MPIServer-1, MPIServer-2, etc., for all the engines that process in parallel.

CASA <5>: flagdata('uid__A002_X888a.ms', mode='list', inpfile='uid__A002_X888a_cmd.txt')

Flag the auto-correlations and the high-Tsys antenna, also using list mode for optimization.

CASA <6>: flagdata('uid__A002_X888a.ms', mode='list',
                   inpfile=["autocorr=True","antenna='DA62'"])

Create all calibration tables in the same way as for a normal MS. Task gaincal is not parallelized, therefore it will work on the MMS as if it were a normal MS.

CASA <7>: gaincal('uid__A002_X888a.ms', caltable='cal-delay_uid__A002_X888a.K',
                  field='*Phase*', spw='1,3,5,7', solint='inf', combine='scan',
                  refant=therefant, gaintable='cal-antpos_uid__A002_X888a',
                  gaintype='K')

Apply all the calibrations to the MMS. Applycal will work in parallel on each “Sub-MS” using the available engines.

CASA <8>: applycal(vis='uid__A002_X888a.ms', field='0', spw='9,11,13,15',
                   gaintable=['uid__A002_X888a.tsys',
                              'uid__A002_X888a.wvr.smooth',
                              'uid__A002_X888a.antpos'],
                   gainfield=['0', '', ''], interp='linear,linear',
                   spwmap=[tsysmap,[],[]], calwt=True, flagbackup=False)

Split out the science spectral windows. Task split is also parallelized, therefore it will recognize that the input is an MMS and will process it in parallel, also creating an output MMS.

CASA <9>: split(vis='uid__A002_X888a.ms', outputvis='uid__A002_X888a.ms.split',
                 datacolumn='corrected', spw='9,11,13,15', keepflags=True)

Run clean or tclean normally to create your images.
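As a minimal sketch (the image name, cell size and other parameter values are illustrative and not taken from the regression script), the imaging step could look like:

CASA <10>: tclean(vis='uid__A002_X888a.ms.split', imagename='M100_cont',
                  specmode='mfs', deconvolver='hogbom', imsize=[256, 256],
                  cell='0.5arcsec', weighting='briggs', robust=0.5, niter=1000)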

