Version 1.9 Build 1556

News	FAQ
Search	Home

Next: Some further remarks Up: NOTE 199 - Table Query Language Previous: Selection from a table

Subsections

Expressions

An arbitrary expression can be used in the WHERE clause, as a sort key in the ORDERBY clause, in an element in the set in the GIVING clause, or as a value in the INSERT and UPDATE command. Note that the expression result must be a boolean scalar if used in the WHERE clause. If used in ORDERBY the result can also be a numeric or string scalar. If used in GIVING it can also be a DateTime scalar. If used as a new value in the UPDATE or INSERT command it can also be an array.
The expression in the clause can be as complex as one likes using the standard arithmetic, comparison, and logical operators. Parentheses can be used to group subexpressions.
The operands in an expression can be table columns, table keywords, constants, functions, sets and intervals, and subqueries.
The index operator can be used to take a single element or a subsection from an array expression.
E.g.

  column1 > 10
  column1 + arraycolumn[index] >= min (column2, column3)
  column1 IN [expr1 =:< expr2]

The last example shows a set with a continuous interval.

Data Types

Internally TaQL uses the following data types:

Bool
Double: which includes integers and times/positions
Complex: which includes single and double precision complex
String: Operator + can be used (concatenation).
Regex: which is formed by the functions regex, pattern, and sqlpattern or by a special constant (see discussion below).
DateTime: which represents a date/time. There are several functions acting on a date/time. Also operator + and - can be used.

Scalars and arbitrarily shaped arrays of these data types can be used. However, arrays of Regex are not possible.
If an operand or function argument with a non-matching data type is used, TaQL can do the following automatic conversions:
- from Double to Complex
- from String to DateTime

Externally TaQL supports any possible data type of a table column or keyword. In some commands (column list and CREATE TABLE) it is possible to specify the data type of a column. There are several synonyms. The possible types and values are:

   B          BOOL       BOOLEAN
   U1         UCHAR      BYTE
   I2         SHORT      SMALLINT
   U2   UI2   USHORT     USMALLINT
   I4         INT        INTEGER
   U4   UI4   UINT       UINTEGER
   R4   FLT   FLOAT
   R8   DBL   DOUBLE
   C4   FC    FCOMPLEX   COMPLEX
   C8   DC    DCOMPLEX
   S          STRING

Constants

Scalar constants of the various data types can be formed as follows (which is very similar to Glish):

A Bool constant is the value T or F.
A Double constant can be any integer or floating-point number.
Another way to define a Double constant is by means of a Time or Position. Such a constant is always converted to radians. It can be given in several ways:
- An integer or floating-point number immediately followed by a unit (thus without whitespace). E.g. 12.43deg
  Some valid units are deg, arcmin, arcsec (or as), rad. The units can be scaled by preceeding them with a letter (e.g. mrad is millirad).
- A time/position in hours in HMS format. E.g. 12h34m34.5 or 8h
- A position in degrees in DMS format. E.g. 12d34m34.5 or 8d
- A position in degrees in dot format. Note that the dots for degrees and minutes must always be present. E.g. 12.34.34.5 or 8..34.5
The imaginary part of a Complex constant is formed by a Double constant immediately followed by a lowercase i. A full Complex constant is formed by adding another Double constant as the real part. E.g.
```
       1.5 + 2i
       2i+1.5            is identical
```
Note that a full Complex constant has to be enclosed in parentheses if, say, a multiplication is performed on it. E.g.
```
       2 * (1.5+2i)
```

A String constant has to be enclosed in " or ' and can be concatenated (as in C++). E.g.

         "this is a string constant"
         'this is a string constant containing a "'
         "ab'cd"'ef"gh'
             which results in constant   ab'cdef"gh

A Regex constant can be given in three flavours:
- as a UNIX shell (file name) style pattern.
  p/pattern/ can be used after operator ~ and !~. Instead of a slash, the characters % and # can also be used as delimiter, as long as the same delimiter is used on both sides. The delimiter can not be part of the pattern itself.
  Function pattern can be used after operator = and !=.
  A pattern uses the special character ? for a single arbitrary character and * for zero or more arbitrary characters. Furthermore square brackets can be used for a choice from a set of single characters and braces can be used for a choice from a set of pattern parts.
```
		name~p/3c*/
		col=pattern('[_A-Za-z]*')
		name !~ p#/usr/*.{h,cc}#
```
- as an SQL style pattern.
  A string constant giving the pattern can be used after operator LIKE.
  Function sqlpattern can be used after operator = and !=.
  An SQL style pattern uses the special character _ for a single arbitrary character and % for zero or more arbitrary characters.
```
		name LIKE '3c%'
```
- as an extended regular expression.
  m/pattern/ can be used after operator ~ and !~. A slash, % and # can be used as delimiter.
  Function pattern can be used after operator = and !=. It uses many special characters. See class Regex for a full description.
```
		name ~ m/[a-zA-Z][a-zA-Z0-9]*/
		name = regex('[a-zA-Z][a-zA-Z0-9]')
```
Note that the functions and operator LIKE work on any string, thus they can be used with any string expression.
A DateTime constant can be formed in 2 ways:
1.
From a String constant using the datetime function. In this way all possible formats as explained in class MVTime are supported. E.g.
```
              datetime ("11-Dec-1972")
```
2.
A more convenient way is to specify it directly. Since this makes use of the delimiters - or /, it conflicts with the expression grammar as such. However, such conflicts can be solved by using whitespace in a expression and it is believed that in practice the convenience surpasses the possible conflicts.
A large subset of the MVTime formats is supported. A DateTime has to be specified as date/time or date-time, where the time part (including the / or - delimiter) is optional. The possible date formats are:
- YYYY/MM/DD or DD-MM-YYYY
- DD-MMMMMMMM-YY where the - is optional and MMMMMMM is the case-insensitive name of the month (at least 3 letters).
- YYYY//DDD or DDD-YYYY where DDD is the day number in the year.
In the DMY format, 2000 is added if year<50 and 1900 is added if 50< =year<100.
If MM>12, YYYY will be incremented accordingly.
The general time format in a DateTime constant is:
- hh:mm:ss.s
where the delimiter h or H can be used for the first colon and m or M for the second. Trailing parts can be omitted. E.g.
```
              10-2-97
              10-02-1997
              10-February-97
              10feb97
              1997/2/10          are all identical

              1May96/3:          : (or h) is mandatory
              1May96/3:0
              1May96/3:0:0
              1May96/3h          h (or :) is mandatory
              1May96/3H0
              1May96/3h0M
              1May96/3hm0.0
```
A DateTime constant with the current date/time can be made by using the function datetime without arguments.

Constant N-dimensional arrays cannot be formed directly. It is possible though to form a constant bounded discrete set. If needed such a set is automatically transformed to a 1-dimensional array. E.g.

     [1:10]

is equal to a 1D array of 10 elements with values 1..10.

Furthermore it is possible to use the array function to transform an array to another shape. This function is very similar to the same function in glish. E.g.

     array([1:10],10,4)

creates an array with shape [10,4]. Each row contains the values 1..10.
Please note that in TaQL arrays are stored in Fortran order, thus the first index varies most rapidly (which is opposite to C).

Similar to glish the value array is wrapped if the created array has more elements. Unlike glish the shape does not need to be defined by scalars, but it can also be defined by another set. Thus:

     array(F,shape(DATA))

is a boolean array filled with F with the same shape as the DATA array.

Operators

The following operators can be used (with their normal meaning and precedence) for scalars:

Unary + and -
Can only be used with double and complex operands. They do not have a higher precedence than the binary operators.
E.g. -3^2 results in -9.
Unary ! (or NOT)
Logical NOT operator. Can only be used with Bool operands.
Binary ^, *, /, %, +, and -
% is the modulo operator. E.g. 3%1.4 results in 0.2 and -10%3 results in -1.
^ is the power operator. Note that because of the precedence rules -3^2 results in -9.
All operators are left-associative, except ^ which is right-associative; thus 2^1^2 results in 2.
Operator % can only be used for double operands, while the others can be used for double and complex operands. Operator + can also be used for:
- Addition (i.e., concatenation) of 2 strings.
- Addition of a double (unit days) and a DateTime (resulting in a DateTime).
Operator - can also be used for:
- Subtraction of a double (unit days) from a DateTime (resulting in a DateTime).
- Subtraction of 2 DateTimes (resulting in a double with unit days).
==, ! =, >, > =, <, and < =
Can be used with any operand as long as their data types conform. Operator >, > =, <, and < = cannot be used for Regex. They use the norm for Complex values.
To be compliant with SQL = can be used for = = and < > can be used for ! =.
~, !~, and LIKE and NOT LIKE.
~ and !~ have to be used with regular expression constants.
LIKE exists for SQL compliancy. It operates on a string giving a pattern with special characters % (0 or more characters) and _ (any single character).
EXISTS and NOT EXISTS
Can be used to test if a subquery finds at least N matching rows. The value for N is taken from the LIMIT clause; if LIMIT is not given it defaults to 1. The subquery loop stops as soon as N matching rows are found. E.g.
```
    EXISTS(select from ::ANTENNA where NAME=''somename'' LIMIT 2)
```
results in T if at least 2 matching rows in the ANTENNA table were found.
IN and NOT IN
Can be used to test if a value is an element in a set of values, ranges, and/or intervals. (See the discussion of sets).
BETWEEN and NOT BETWEEN
Is a special form of IN with a single closed interval. It exists for SQL compliance. It can be used to test if a value is between two other values. E.g.
col BETWEEN 1 AND 3 and col in [1=:=3] are the same.
INCONE and NOT INCONE
Can be used to do a cone search. (See the discussion of cone search functions).
&& and | |
Logical AND and OR operator.
To be compliant with SQL the words AND and OR can be used instead of the symbols. These words are case insensitive.
Also & and | can be used.
These operators can only be used with Bool operands.

: The precedence order is:
^
unary +, -, !
*, /, %
+, -
==, !=, >, >=, <, <=, IN, INCONE, BETWEEN, EXISTS, LIKE, ~, !~
&&
||

All these operators can be used in the same way for arrays (also a mix of scalar and array).

Functions

Some functions have 2 names. One name is the AIPS++/Glish name, while the other is the name as used in SQL. In the following tables the function names are shown in uppercase, while the result and argument types are shown in lowercase. Note, however, that function names are case-insensitive.

Sets, and in particular subqueries, can result in a 1-dim array. This means that the functions accepting an array argument can also be used on a set or the result of a subquery. E.g.

  WHERE datecol IN date([11-Nov-97,16-Dec-97,14-Jan-98])

String functions

These functions can be used on a scalar or an array argument.

: double STRLENGTH(string), double LEN(string)
Returns the number of characters in a string (trailing whitespace is significant).
: string UPCASE(string), string UPPER(string)
Convert to uppercase.
: string DOWNCASE(string), string LOWER(string)
Convert to lowercase.
: string TRIM(string)
Removes trailing whitespace.

Regex functions

The syntax for forming a regex and pattern are explained in class Regex. These functions can only be used on a scalar argument.

: regex REGEX(string)
Handle the given string as a regular expression.
: regex PATTERN(string)
Handle the given string as a UNIX filename-like pattern and convert it to a regular expression.
: regex SQLPATTERN(string)
Handle the given string as an SQL-style pattern and convert it to a regular expression. An SQL-style pattern has two special character:
% for zero or more arbitrary characters
_ for one arbitrary character

A regex formed this way can only be used in a comparison == or !=. E.g.
upcase(object) == pattern('3C*')
to find all 3C objects in a catalogue. See also the discussion on constants for other ways of forming and testing regular expressions.

Date/time functions

These functions can be used on a scalar or an array argument.

: DateTime DATETIME(string), DateTime CTOD(string)
Parse the string and convert it to a date/time. The syntax of date/time is explained in class MVTime.
: DateTime MJDTODATE(double)
The double value, which has to be a MJD (ModifiedJulianDate), is converted to a DateTime.
: DateTime DATE(DateTime)
Get the date (i.e., remove the time part). This function is needed in
DATE(column) == 12Feb1997
if the column contains date/times with times>0.
: double MJD(DateTime)
Get the DateTime as a MJD (ModifiedJulianDate).
: double YEAR(DateTime)
Get the year (which includes the century).
: double MONTH(DateTime)
Get the month number (1-12).
: double DAY(DateTime)
Get the day number (1-31).
: double WEEKDAY(DateTime), double DOW(DateTime)
Get the weekday number (1=Monday, ..., 7=Sunday).
: string CMONTH(DateTime)
Get the name of the month (Jan ... Dec).
: string CWEEKDAY(DateTime), string CDOW(DateTime)
Get the name of the weekday (Mon .. Sun).
: double WEEK(DateTime)
Get the week number in the year (1 ... 53).
Note that week 1 is the week containing Jan 4th.
: double TIME(DateTime)
Get the time part of the day. It is converted to radians to be compatible with the internal representation of times/positions. In that way the function can easily be used as in:
TIME(date) > 12h

All functions can be used without an argument in which case the current date/time is used. E.g. DATE() results in the current date.
It is possible to give a string argument instead of a date. In this case the string is parsed and converted to a date (i.e., in fact the function DATETIME is used implicitly).

Comparison functions

Two functions make it possible to compare 2 double or complex values within a tolerance. They can be used on scalar and array arguments (and a mix of them). The tolerance must be a scalar though.

: bool NEAR(numeric val1, numeric val2, double tol)
Tests in a relative way if a value is near another. Relative means that the magnitude of the numbers is taken into account.
It returns abs(val2 - val1)/max(abs(val1),abs(val2)) < tol.
If tol<=0, it returns val1==val2. If either val is 0.0, it takes care of area around the minimum number that can be represented. The default tolerance is 1.0e-13.
: bool NEARABS(numeric val1, numeric val2, double tol)
Tests in an absolute way if a value is near another. Absolute means that the magnitude of the numbers is not taken into account.
It returns abs(val2 - val1) < tol. The default tolerance is 1.0e-13.
: bool ISNAN(numeric val)
Tests if a numeric value is a NaN (not-a-number).

Mathematical functions

Several functions can operate on double or complex arguments. The data types of such functions is given as 'numeric'. They can be used on scalar and array arguments (and a mix of them).

: double PI()
Returns the value of pi.
: double E()
Returns the value of e (is equal to EXP(1)).
: numeric SIN(numeric)
: numeric SINH(numeric)
: double ASIN(double)
: numeric COS(numeric)
: numeric COSH(numeric)
: double ACOS(double)
: double TAN(double)
: double TANH(double)
: double ATAN(double)
: double ATAN2(double y, double x)
Returns ATAN(y/x) in correct quadrant.
: numeric EXP(numeric)
: numeric LOG(numeric)
Natural logarithm.
: numeric LOG10(numeric)
: numeric POW(numeric, numeric)
The same as operator ^.
: numeric SQUARE(numeric), numeric SQR(numeric)
The same as ^2, but much faster.
: numeric SQRT(numeric)
: complex COMPLEX(double, double)
: numeric CONJ(numeric)
: double REAL(numeric)
Real part of a complex number.
: double IMAG(numeric)
Imaginary part of a complex number.
: double NORM(numeric)
: double ABS(numeric), double AMPLITUDE(numeric)
: double ARG(numeric), double PHASE(numeric)
: numeric MIN(numeric, numeric)
: numeric MAX(numeric, numeric)
: double SIGN(double)
Returns -1 for a negative value, 0 for zero, 1 for a positive value.
: double ROUND(double)
Rounds the absolute value of the number. E.g. ROUND(-1.6) = -2.
: double FLOOR(double)
Works towards negative infinity. E.g. FLOOR(-1.2) = -2
: double CEIL(double)
Works towards positive infinity.
: double FMOD(double, double)
The same as operator %.

Note that the trigonometric functions need their arguments in radians.

Array to scalar reduction functions

The following functions reduce an array to a scalar. They are meant for an array, but can also be used for a scalar.

: bool ANY(bool)
Is any element true?
: bool ALL(bool)
Are all elements true?
: double NTRUE(bool)
Number of true elements.
: double NFALSE(bool)
Number of false elements.
: numeric SUM(numeric)
Return sum of all elements.
: numeric SUMSQUARE(numeric), numeric SUMSQR(numeric)
Return sum of all squared elements.
: numeric PRODUCT(numeric)
Return product of all array elements.
: double MIN(double)
Return minimum of all array elements.
: double MAX(double)
Return maximum of all array elements.
: double MEAN(double), double AVG(double)
Return mean of all array elements.
: double VARIANCE(double)
Return variance (the sum of
(a(i) - mean(a))**2/(nelements(a) - 1).
: double STDDEV(double)
Return standard deviation (the square root of the variance).
: double AVDEV(double)
Return average deviation. (the sum of
abs(a(i) - mean(a))/nelements(a).
: double MEDIAN(double)
Return median (the middle element). If the array has an even number of elements, the mean of the two middle elements is returned.
: double FRACTILE(double, doublescalar fraction)
Return the value of the element at the given fraction. Fraction 0.5 is the same as the median.

Array to array reduction functions

These functions reduce an array to a smaller array by collapsing the given axes using the given function. The axes are the last argument(s). They can be given in two ways:
- As a single set argument; for example, maxs(ARRAY,[1:2])
- As individual scalar arguments; for example, maxs(ARRAY,1,2)
For example, using MINS(array,1,2) for a 3-dim array results in a 1-dim array where each value is the minimum of each plane in the cube.
Axes numbers are 1-relative, thus 1 is the first axis.
Axes numbers exceeding the dimensionality of the array are ignored. Thus, for example, maxs(ARRAY,[2:10]) works for arrays of virtually any dimensionality and results in a 1-dim array.
The function names are the 'plural' forms of the functions in the previous table. They can only be used for arrays, thus not for scalars.

: bool ANYS(bool)
Is any element true?
: bool ALLS(bool)
Are all elements true?
: double NTRUES(bool)
Number of true elements.
: double NFALSES(bool)
Number of false elements.
: numeric SUMS(numeric)
Return sum of elements.
: numeric SUMSQUARES(numeric), numeric SUMSQRS(numeric)
Return sum of squared elements.
: numeric PRODUCTS(numeric)
Return product of elements.
: double MINS(double)
Return minimum of elements.
: double MAXS(double)
Return maximum of elements.
: double MEANS(double), double AVG(double)
Return mean of elements.
: double VARIANCES(double)
Return variance (the sum of
(a(i) - mean(a))**2/(nelements(a) - 1).
: double STDDEVS(double)
Return standard deviation (the square root of the variance).
: double AVDEVS(double)
Return average deviation. (the sum of
abs(a(i) - mean(a))/nelements(a).
: double MEDIANS(double)
Return median (the middle element). If the array has an even number of elements, the mean of the two middle elements is returned.
: double FRACTILES(double, doublescalar fraction)
Return the value of the element at the given fraction. Fraction 0.5 is the same as the median.

Miscellaneous functions

: bool ISDEFINED(anytype)
Return false if the value in the current row is undefined. Is useful to test if a cell in a column with variable shaped arrays contains an array. It can also be used to test if a field in a record is defined.
: double NELEMENTS(anytype), double COUNT(anytype)
Return number of elements in an array (1 for a scalar).
: double NDIM(anytype)
Return dimensionality of an array (0 for a scalar).
: doublearray SHAPE(anytype)
Return shape of an array (returns an empty array for a scalar).
: double ROWNUMBER()
Return the row number being tested (first row is row number 1).
In combination with function RAND it can, for instance, be used to select arbitrary rows from a table.
: double ROWID()
Return the row number in the original table. This is especially useful for returning the result of a selection of a subtable of an AIPS++ measurement set (see also sections Subqueries and Examples).
: double RAND()
Return (per table row) a uniformly distributed random number between 0 and 1 using a Multiplicative Linear Congruential Generator. The seeds for the generator are deduced from the current date and time, so the results are different from run to run.
The function can, for instance, be used to select a random subset from a table.
: anytype IIF(cond,arg1,arg2)
This is a special funtion which operates like the ternary ?: operator in C++. If all arguments are scalars, the result is a scalar, otherwise an array. In the latter case possible scalar arguments are virtually expanded to arrays. IIF evaluates the condition for each element. If True, it takes the corresponding element of arg1, otherwise of arg2.
: anytypearray ARRAY(anytype,shape)
This function creates an array of the given type and shape. The shape is given in the last argument(s). It can be given in two ways:
- As a single set argument; for example, array(0,[3,4])
- As individual scalar arguments; for example, array(0,3,4)
The first argument gives the values the array is filled with. It can be a scalar or an array of any shape. To initialize the created array, the value array is flattened to a 1D array. Its successive values are stored in the created array. If the new array has more values than the value array, the value array is reset to its beginning and the process continues.

Cone search functions

Cone search functions make it possible to test if a source is within a given distance of a given sky position. The expression

     cos(0d1m) < sin(52deg) * sin(DEC) +
                 cos(52deg) * cos(DEC) * cos(3h30m - RA)

could be used to test if sources with their sky positions defined in columns RA and DEC are within 1 arcmin of the given sky position.
The cone search functions implement this expression making life much easier for the user. Because they can also operate on arrays of positions, searching in multiple cones can be done simultaneously. That makes it possible to find matching source positions in two catalogues as shown in an example at the end of this section.

The arguments in all functions are described below. All of them have to be given in radians. However, usually one does not need to bother because TaQL makes it possible to specify positions in many formats which are automatically converted to radians.

: SOURCES
is a set or array giving the positions of one or more sources (e.g. in equatorial coordinates) to be tested. Normally these are columns in a table. If argument name SOURCE is mentioned, only a single source can be used.
For example:
[RA,DEC] for scalar columns RA and DEC.
SKYPOS for a column SKYPOS containing 2-element vectors with RA and DEC.
: CONES
is a set or array giving the center positions and radii of one or more cones (e.g. as RA,DEC,radius). Usually this will be given by the user as constants.
For example:
[12:13:54, -5.3.34, 0d1m] for a single cone.
[12:13:54, -5.3.34, 0d1m, 1:2:3, 4.5.6, 0d1m] for two cones.
: CONEPOS
is a set or array giving the positions of one or more cone centers (e.g. as RA,DEC).
: RADII
is a scalar, set or array giving one or more radii. Each radius is applied to all positions in CONEPOS. Specifying a cone as CONEPOS,RADIUS is easier than specifying it as CONES if the same radius has to be used for multiple cones.
For example:
[12:13:54, -5.3.34, 1:2:3, 4.5.6], 0d1m is the same as the second CONES example above.

The following cone search functions are available.

: bool ANYCONE(SOURCE,CONES)
Return T if the source is contained in one of the cones. Operator INCONE is a synonym. So ANYCONE(SOURCE,CONES) is the same as SOURCE INCONE CONES.
: bool ANYCONE(SOURCE,CONEPOS,RADII)
It does the same as above.
: double FINDCONE(SOURCES,CONES)
Return the 1-based index of the first cone containing the source. If a single source is given, the result is a scalar. If multiple sources are given, the result is an array with the same shape as the source array.
: double FINDCONE(SOURCES,CONEPOS,RADII)
It does the same as above. Note that in this case each radius is applied to each cone, so the index is a combination of the two (with the radius as the most rapidly varying axis).
: bool CONES(SOURCES,CONES)
Return a 2-dim bool array. The length of the first axis is the number of cones. The length of the second axis is number of sources. An element (i,j) in the array is T if source j is contained in cone i.
: bool CONES(SOURCES,CONEPOS,RADII)
It does the same as above. However, the result is a 3-dim array with the radii as the first axis, cones as second and sources as third.

Please note that ANYCONE(SOURCE,CONES) does the same as any(CONES(SOURCE,CONES)), but faster.
Function CONES makes it possible to do catalogue matching. For example, to find sources matching other sources in the same catalogue (within a radius of 10 arcseconds):

  CALC FROM table.cat CALC CONES([RA,DEC],
       [SELECT FROM table.cat GIVING [RA,DEC]], 0d0m10)

Note that in this example the SELECT clause returns an array with positions which are used as the cone centers. So each source in the catalogue is tested against every source. It makes it an N-square operation, thus potentially very expensive. The result is a 4-dim boolean array with shape [1,nrow,1,nrow] which can be processed in Glish. Please note that the CONES function results for each row in a array with shape [1,nrow,1].
The query could be done with multiple radii, for example also with 1 arcsecond and 1 arcminute.

  CALC FROM table.cat CALC CONES([RA,DEC],
       [SELECT FROM table.cat GIVING [RA,DEC]], [0d0m1, 0d0m10, 0d1m])

resulting in an array with shape [3,nrow,1,nrow]. It would give a better indication how close sources are.

Table Columns

A column can contain a scalar or an array value. Note that only columns in the primary table can be handled directly. A column in another table can be used via a subquery. E.g.

  SELECT FROM tab WHERE col >
        mean([SELECT othercol FROM othertab])

An expression has to contain at least one column, since columns are the only variable part in it. That is, a row can only be selected or sorted by means of the column values in each row.

The name of a column can contain alphanumeric characters and underscores. It should start with an alphabetic character or underscore. A column name is case-sensitive.
It is possible to use other characters in the name by escaping them with a backslash. E.g. DATE\-OBS.
In the same way a numeric character can be used as the first character of the column name. E.g. \1stDay.
Because several words are used in the language, they cannot be used directly as column names. The reserved words are:

 ALL AND AS ASC BETWEEN CREATETABLE DELETE DESC DISTINCT
 EXISTS EXCEPT FROM GIVING GROUPBY HAVING
 IN INCONE INSERT INTERSECT INTO JOIN LIKE LIMIT
 MINUS NODUPLICATES NOT OFFSET ON OR ORDERBY
 SAVETO SELECT SET UNION UNIQUE UPDATE VALUES WHERE

They can, however, be used as a column name by escaping them with a backslash. E.g. \IN.
Note that in C++ and Glish a backslash itself has to be escaped by another backslash. E.g. in Glish: tab.query('DATE\\-OBS>10MAR1996').

If a column contains a record, one has to specify a field in it using the dot operator; e.g. col.fld means use field fld in the column. It is fully recursive, so col.fld.subfld can be used if field fld is a record in itself.
Alas records in columns are not really supported yet. One can specify fields, but thereafter an error message will be given.

Table Keywords

It is possible to use table or column keywords, which can have a scalar or an array value. A table keyword has to be specified as ::key. In an expression the :: part can be omitted if there is no column with such a name. A column keyword has to be specified as column::key.
Note that the :: syntax is chosen, because it is similar to the scope operator in C++.
As explained in the FROM clause of the syntax section, keywords from the primary table and from secondary tables can be used. If used from a secondary table, it has to be qualified with the (shorthand) name of the table. E.g.
sh.key or sh.::key
takes table keyword key from the table with the shorthand name sh.

If a keyword value is a record, it is possible to use a field in it using the dot operator. E.g. ::key.fld to use field fld. It is fully recursive, so if the field is a record in itself, a subfield can be used like col::key.fld.subfld

A keyword can be used in any expression. It is evaluated immediately and transformed to a constant value.

Array Index Operator

It is possible to take a subsection or a single element from an array column, keyword or expression using the index operator [index1,index2,...]. The rules for this are similar to those used in Glish. Taking a single element can be done as:

  array[1]
  array[1, some_expression]

Taking a subsection can be done as:

  array[start1:end1:incr1, start2:end2:incr2, ...]

If a start value is left out it defaults to the beginning of that axis. An end value defaults to the end of the axis and an increment defaults to one. If an entire axis is left out, it defaults to the entire axis.
E.g. an array with shape [10,15,20] can be subsectioned as:

  [,,3]               resulting in an array of shape [10,15,1]
  [2:4, ::3, 2:15:2]  resulting in an array of shape[3,5,7]

The examples show that an index can be a simple constant (as it will usually be). It can also be an expression which can be as complex as one likes. The expression has to result in a positive double value which will be truncated to an integer length. Note that as in Glish, array indices start at 1.
For fixed shaped arrays checking if array bounds are exceeded is done at parse time. For variable shaped arrays it can only be done per row. If array bounds are exceeded, an exception is thrown. In the future a special undefined value will be assigned if bounds of variable shaped arrays are exceeded to prevent the selection process from aborting due to the exception.

Note that the index operator will be applied directly to a column. This results in reading only the required part of the array from the table column on disk. It is, however, also possible to apply it to a subexpression (enclosed in parentheses) resulting in an array. E.g.

  arraycolumn[2,3,4] + 1
  (arraycolumn + 1)[2,3,4]

can both be used and have the same result. However, the first form is much faster, because only a single element is read (resulting in a scalar) and 1 is added to it. The second form results in reading the entire array. 1 is added to all elements and only then the requested element is taken.
From this example it should be clear that indexing an array expression has to be done with care.

Sets and intervals

As in SQL the operator IN can be used to do a selection based on an array or a set. E.g.

  SELECT FROM table WHERE column IN expr1

In this example expr1 is the array result of an expression. The result of operator IN is true if the column value matches one of the values in the array. It is also possible to use a scalar as the righthand of operator IN. So if expr1 is a scalar, operator IN gives the same result as operator ==.
It is also possible to define a set explicitly. E.g.

  SELECT FROM table WHERE column IN [expr1, expr2, expr3]

This example shows that (in its simplest form) a set consists of one or more values (which can be arbitrary expressions) separated by commas and enclosed in square brackets. The elements in a set have to be scalars and their data types have to be the same. The square brackets can be left out if the set consists of only one element. For SQL compliance parentheses can be used instead of square brackets.
An element in a set can, however, be more complicated and can define multiple values or an interval. The possible forms of a set element are:

1.

A single value as shown in the example above.

2.

start:end:incr. This is similar to the way an array index is specified. Incr defaults to 1. End defaults to an open end (i.e., no upper bound) and results in an unbounded set. Start and end can be a double or a datetime. Incr has to be a double. Some examples:

  1:10     means 1,2,...,10
  1:10:2   means 1,3,5,7,9
  1::2     means all odd numbers
  1:       means all positive integer numbers
  date('18Aug97')::2   means every other day from 18Aug97 on

These examples show constants only, but start, end, and incr can be any expression.
Note that :: used here can conflict with the :: in the keywords. E.g. a::b is scanned as a keyword specification. If the intention is start::incr it should be specified as a: :b. In practice this conflict will hardly ever occur.

3.

Continuous intervals can be specified for double, string, and datetime. The specification of an interval resembles the mathematical notation 1<x<5, where x is replaced by :. An open interval side is indicated by <, while a closed interval side is indicated by =.
Another way to specify intervals is using curly and/or angle brackets. A curly bracket is a closed side, the angle bracket is an open side. The following examples show how bounded and half-bounded, (half-)open and closed intervals can be specified.

  1=:=5   {1,5}     means 1<=x<=5   bounded closed
  1<:<5   <1,5>     means 1<x<5     bounded open
  1=:<5   {1,5>     means 1<=x<5    bounded right-open
  1<:=5   <1,5}     means 1<x<=5    bounded left-open
  1=:  {1,}  {1,>   means 1<=x      left-bounded closed
  1<:  <1,}  <1,>   means 1<x       left-bounded open
  :=5  {,5}  <,5}   means x<=5      right-bounded closed
  :<5  {,5>  <,5>   means x<5       right-bounded open

It is very important to note that the 2nd form results in discrete values, while the 3rd form results in a continuous interval.

Each element in a set can have its own form, i.e., one element can be a single value while another can be an interval. If a set consists of single or bounded discrete start:end:incr values only, the set will be expanded to an array. This makes it possible for array operators and functions (like mean) to be applied to such sets. E.g.

  WHERE column > mean([10,30:100:5])

Another form of constructing a set is using a subquery as shown below.

Subqueries

As in SQL it is possible to create a set from a subquery. A subquery has the same syntax as a main query, but has to be enclosed in square brackets or parentheses. Basically it looks like:

  SELECT FROM maintable WHERE time IN
      [SELECT time FROM othertable WHERE windspeed < 5]

The subquery on othertable results in a constant set containing the times for which the windspeed matches. Subsequently the main query is executed and selects all rows from the main table with times in that set. Note that like other bounded sets this set is transformed to a constant array, so it is possible to apply functions to it (e.g. min, mean).

  SELECT FROM maintable WHERE time IN
      [SELECT time FROM othertable WHERE windspeed <
           mean([SELECT windspeed FROM othertable])]

This contains another subquery to get all windspeeds and to take the mean of them. So the first subquery selects all times where the windspeed is less than the average windspeed.
A subquery result should contain only one column, otherwise an exception is thrown.

It may happen that a subquery has to be executed twice because 2 columns from the other table are needed. E.g.

  SELECT FROM maintable WHERE any(time >=
      [SELECT starttime FROM othertable WHERE windspeed < 5]
                               && time <=
      [SELECT endtime FROM othertable WHERE windspeed < 5])

In this case the other table contains the time range for each windspeed. For big tables it is expensive to execute the subquery twice. A better solution is to store the result of the subquery in a temporary table and reuse it.

  SELECT FROM othertable WHERE windspeed < 5 GIVING tmptab
  SELECT FROM maintable WHERE any(time >=
      [SELECT starttime FROM tmptab]
                               && time <=
      [SELECT endtime FROM tmptab])

However, this has the disadvantage that the table tmptab still exists after the query and has to be deleted explictly by the user. Below a better solution for this problem is shown.

TaQL has a few extensions to support tables better, in particular the AIPS++ measurement sets.

1.

The temporary problem above can be circumvented by using the ability to use a SELECT expression in the FROM clause. E.g.

  SELECT FROM maintable,
      [SELECT FROM othertable WHERE windspeed < 5] tmptab
      WHERE any(time >= [SELECT starttime FROM tmptab]
             && time <= [SELECT endtime FROM tmptab])

However, below a even nicer solution is given.

2.

The time range problem above can be solved elegantly by using a set as the result of the subquery. Instead of a table name, it is possible to give an expression in the GIVING clause (as mentioned in the syntax section). E.g.

  select from MY.MS where TIME in
      [select FROM OTHERTABLE where WINDSPEED < 5
           giving [TIME-INTERVAL/2 =:= TIME+INTERVAL/2]]

The set expression in the GIVING clause is filled with the results from the subquery and used in the main query. So if the subquery results in 5 rows, the resulting set contains 5 intervals. Thereafter the resulting intervals are sorted and combined where possible. In this way the minimum number of intervals have to be examined by the main query.

3.

In AIPS++ the other table will often be the name of a subtable, which is stored in a table or column keyword of the main table. The standard keyword syntax can be used to indicate that the other table is the table in the given keyword. Note that for a table keyword the :: part has to be given, otherwise the name is treated as an ordinary table name. E.g.

  select from MY.MS where TIME in
      [select TIME from ::WEATHER where WINDSPEED < 5]

In this example the other table is a subtable of table my.ms. Its name is stored in keyword WEATHER of my.ms.

4.

Often the result of a query on a subtable of a measurement set is used to select columns from the main table. However, several subtables do not have an explicit key, but use the row number as an implicit key. The function ROWID() can be used to return the row number as the subtable query result. E.g.

  select from MY.MS where DATA_DESC_ID in
      [select from ::DATA_DESCRIPTION where
         SPECTRAL_WINDOW_ID in [0,2,4] giving [ROWID()]]

Note that the function ROWNUMBER cannot be used here, because it will give the row number in the selection and not (as ROWID does) the row number in the original table. Furthermore, ROWID gives a 0-relative row number which is needed to be able to use it as a selection criterium on the 0-relative values in the measurement set.

Next: Some further remarks Up: NOTE 199 - Table Query Language Previous: Selection from a table Contents