ColumnsIndex.h

Classes

ColumnsIndex -- Index to one or more columns in a table. (full description)

class ColumnsIndex

Interface

Public Members
typedef Int Compare (const Block<void*>& fieldPtrs, const Block<void*>& dataPtrs, const Block<Int>& dataTypes, Int index)
ColumnsIndex (const Table&, const String& columnName, Compare* compareFunction = 0, Bool noSort = False)
ColumnsIndex (const Table&, const Vector<String>& columnNames, Compare* compareFunction = 0, Bool noSort = False)
ColumnsIndex (const ColumnsIndex& that)
~ColumnsIndex()
ColumnsIndex& operator= (const ColumnsIndex& that)
Bool isUnique() const
Vector<String> columnNames() const
const Table& table() const
void setChanged()
void setChanged (const String& columnName)
Record& accessKey()
Record& accessLowerKey()
Record& accessUpperKey()
uInt getRowNumber (Bool& found)
uInt getRowNumber (Bool& found, const Record& key)
Vector<uInt> getRowNumbers()
Vector<uInt> getRowNumbers (const Record& key)
Vector<uInt> getRowNumbers (Bool lowerInclusive, Bool upperInclusive)
Vector<uInt> getRowNumbers (const Record& lower, const Record& upper, Bool lowerInclusive, Bool upperInclusive)
Protected Members
void copy (const ColumnsIndex& that)
void deleteObjects()
void addColumnToDesc (RecordDesc& description, const ROTableColumn& column)
void create (const Table& table, const Vector<String>& columnNames, Compare* compareFunction, Bool noSort)
void makeObjects (const RecordDesc& description)
void readData()
uInt bsearch (Bool& found, const Block<void*>& fieldPtrs) const
static Int compare (const Block<void*>& fieldPtrs, const Block<void*>& dataPtrs, const Block<Int>& dataTypes, Int index)
void fillRowNumbers (Vector<uInt>& rows, uInt start, uInt end) const

Description

Review Status

Programs:
Tests:

Prerequisite

Synopsis

This class makes it possible to use transient indices on top of tables in order to speed up the process of finding rows based on a given key or key range. When constructing a ColumnsIndex object, one has to define which columns form the key for this index on the given table object. Only scalar columns are supported. The data in the given columns will be read, sorted (if needed), and stored in memory. When looking up a key or key range, the class will use a fast binary search on the data held in memory.

The ColumnsIndex object contains a Record object which can be used to define the key to be looked up. The record contains a field for each column in the index (with the same name and data type). The fastest way to fill the key is by creating a RecordFieldPtr object for each field in the record (see the example) and fill it as needed. However, one can also use the Record::define function, but that is slower.
A second record is available to define the upper key when a key range has to be looked up. The keys can be accessed using the various accessKey functions.

When a key is defined, the getRowNumbers function can be used to find the table rows containing the given key (range). Function getRowNumber can be used if all keys in the index are unique (which can be tested with the isUnique function).

Instead of using the internal records holding the keys, one can also pass its own Record object to getRowNumbers. However, it will be slower.

When constructing the object, the user can supply his own compare function. The default compare function compares each field of the key in the normal way. A user's compare function makes it possible to compare in a special way. E.g. one could use near instead of == on floating point fields. Another example (which is shown in one of the examples below) makes it possible to find a key in an index consisting of a time and width.

After an index is created, it is possible to change the data in the underlying columns. However, the ColumnsIndex can not detect if the column data have changed. It can only detect if the number of rows has changed. If the column data have changed, the user has to use the setChanged function to indicate that all columns or a particular column has changed.
If data have changed, the entire index will be recreated by rereading and optionally resorting the data. This will be deferred until the next key lookup.

Example

Suppose one has an antenna table with key ANTENNA.
    // Open the table and make an index for column ANTENNA.
    Table tab("antenna.tab")
    ColumnsIndex colInx(tab, "ANTENNA");
    // Make a RecordFieldPtr for the ANTENNA field in the index key record.
    // Its data type has to match the data type of the column.
    RecordFieldPtr<Int> antFld(colInx.accessKey(), "ANTENNA");
    // Now loop in some way and find the row for the antenna
    // involved in that loop.
    Bool found;
    while (...) {
        // Fill the key field and get the row number.
        // ANTENNA is a unique key, so only one row number matches.
        // Otherwise function getRowNumbers had to be used.
        *antFld = antenna;
        uInt antRownr = colInx.getRowNumber (found);
        if (!found) {
            cout << "Antenna " << antenna << " is unknown" << endl;
        } else {
            // antRownr can now be used to get data from that row in
            // the antenna table.
        }
    }
    

The following example shows how multiple keys can be used and how a search on a range can be done.

    Table tab("sometable")
    // Note that TIME is the main key.
    // Also note that stringToVector (in ArrayUtil.h) is a handy
    // way to convert a String to a Vector<String>.
    ColumnsIndex colInx(tab, stringToVector("TIME,ANTENNA"));
    // Make a RecordFieldPtr for the fields in lower and upper key records.
    RecordFieldPtr<Double> timeLow(colInx.accessLowerKey(), "TIME");
    RecordFieldPtr<Int> antLow(colInx.accessLowerKey(), "ANTENNA");
    RecordFieldPtr<Double> timeUpp(colInx.accessUpperKey(), "TIME");
    RecordFieldPtr<Int> antUpp(colInx.accessUpperKey(), "ANTENNA");
    while (...) {
        // Fill the key fields.
        *timeLow = ...;
        *antLow = ...;
        *timeUpp = ...;
        *antUpp = ...;
        // Find the row numbers for keys between low and upp (inclusive).
        Vector<uInt> rows = colInx.getRowNumbers (True, True);
    }
    

The following example shows how a specific compare function could look like. A function like this will actually be used in the calibration software.
The table for which the index is built, has rows with the TIME as its key. However, each row is valid for a given interval, where TIME gives the middle of the interval and WIDTH the length of the interval. This means that the compare function has to test whether the key is part of the interval.

    Int myCompare (const Block<void*>& fieldPtrs,
                   const Block<void*>& dataPtrs,
                   const Block<Int>& dataTypes,
                   Int index)
    {
      // Assert (for performance only in debug mode) that the correct
      // fields are used.
      DebugAssert (dataTypes.nelements() == 2, AipsError);
      DebugAssert (dataTypes[0] == TpDouble  &&  dataTypes[1] == TpDouble,
                   AipsError);
      // Now get the key to be looked up
      // (an awfully looking cast has to be used).
      const Double key = *(*(const RecordFieldPtr<Double>*)(fieldPtrs[0]));
      // Get the time and width of the entry to be compared.
      const Double time = ((const Double*)(dataPtrs[0]))[index];
      const Double width = ((const Double*)(dataPtrs[1]))[index];
      const Double start = time - width/2;
      const Double end = time + width/2;
      // Test if the key is before, after, or in the interval
      // (representing less, greater, equal).
      if (key < start) {
        return -1;
      } else if (key > end) {
        return 1;
      }
      return 0;
    }
    
    // Now use this compare function in an actual index.
    Table tab("sometable")
    ColumnsIndex colInx(tab, stringToVector("TIME,WIDTH"), myCompare);
    // Make a RecordFieldPtr for the TIME field in the key record.
    // Note that although the WIDTH is part of the index, it is
    // not an actual key. So it does not need to be filled in.
    RecordFieldPtr<Double> time(colInx.accessLowerKey(), "TIME");
    Bool found;
    while (...) {
        // Fill the key field.
        *time = ...;
        // Find the row number for this time.
        uInt rownr = colInx.getRowNumber (found);
    }
    

Motivation

The calibration software needs to lookup keys in calibration tables very frequently. This class makes that process much easier and faster.

Member Description

typedef Int Compare (const Block<void*>& fieldPtrs, const Block<void*>& dataPtrs, const Block<Int>& dataTypes, Int index)

Define the signature of a comparison function. The first block contains pointers to RecordFieldPtr<T> objects holding the key to be looked up. The second block contains pointers to the column data. The index argument gives the index in the column data. The third block contains data types of those blocks (TpBool, etc.). The function should return -1 if key is less than data, 0 if equal, 1 if greater.
An example above shows how a compare function can be used.

ColumnsIndex (const Table&, const String& columnName, Compare* compareFunction = 0, Bool noSort = False)

Create an index on the given table for the given column. The column has to be a scalar column. If noSort==True, the table is already in order of that column and the sort step will not be done. The default compare function is provided by this class. It simply compares each field in the key.

ColumnsIndex (const Table&, const Vector<String>& columnNames, Compare* compareFunction = 0, Bool noSort = False)

Create an index on the given table for the given columns, thus the key is formed by multiple columns. The columns have to be scalar columns. If noSort==True, the table is already in order of those columns and the sort step will not be done. The default compare function is provided by this class. It simply compares each field in the key.

ColumnsIndex (const ColumnsIndex& that)

Copy constructor (copy semantics).

~ColumnsIndex()

ColumnsIndex& operator= (const ColumnsIndex& that)

Assignment (copy semantics).

Bool isUnique() const

Are all keys in the index unique?

Vector<String> columnNames() const

Return the names of the columns forming the index.

const Table& table() const

Get the table for which this index is created.

void setChanged()
void setChanged (const String& columnName)

Something has changed in the table, so the index has to be recreated. The 2nd version indicates that a specific column has changed, so only that column is reread. If that column is not part of the index, nothing will be done.
Note that the class itself is keeping track if the number of rows in the table changes.

Record& accessKey()
Record& accessLowerKey()
Record& accessUpperKey()

Access the key values. These functions allow you to create RecordFieldPtr objects for each field in the key. In this way you can quickly fill in the key.
The records have a fixed type, so you cannot add or delete fields.

uInt getRowNumber (Bool& found)
uInt getRowNumber (Bool& found, const Record& key)

Find the row number matching the key. All keys have to be unique, otherwise an exception is thrown. If no match is found, found is set to False. The 2nd version makes it possible to pass in your own Record instead of using the internal record via the accessKey functions. Note that the given Record will be copied to the internal record, thus overwrites it.

Vector<uInt> getRowNumbers()
Vector<uInt> getRowNumbers (const Record& key)

Find the row numbers matching the key. It should be used instead of getRowNumber if the same key can exist multiple times. The 2nd version makes it possible to pass in your own Record instead of using the internal record via the accessKey functions. Note that the given Record will be copied to the internal record, thus overwrites it.

Vector<uInt> getRowNumbers (Bool lowerInclusive, Bool upperInclusive)
Vector<uInt> getRowNumbers (const Record& lower, const Record& upper, Bool lowerInclusive, Bool upperInclusive)

Find the row numbers matching the key range. The boolean arguments tell if the lower and upper key are part of the range. The 2nd version makes it possible to pass in your own Records instead of using the internal records via the accessLower/UpperKey functions. Note that the given Records will be copied to the internal records, thus overwrite them.

void copy (const ColumnsIndex& that)

Copy that object to this.

void deleteObjects()

Delete all data in the object.

void addColumnToDesc (RecordDesc& description, const ROTableColumn& column)

Add a column to the record description for the keys.

void create (const Table& table, const Vector<String>& columnNames, Compare* compareFunction, Bool noSort)

Create the various members in the object.

void makeObjects (const RecordDesc& description)

Make the various internal RecordFieldPtr objects.

void readData()

Read the data of the columns forming the index, sort them and form the index.

uInt bsearch (Bool& found, const Block<void*>& fieldPtrs) const

Do a binary search on itsUniqueIndex for the key in fieldPtrs. If the key is found, found is set to True and the index in itsUniqueIndex is returned. If not found, found is set to False and the index of the next higher key is returned.

static Int compare (const Block<void*>& fieldPtrs, const Block<void*>& dataPtrs, const Block<Int>& dataTypes, Int index)

Compare the key in fieldPtrs with the given index entry. -1 is returned when less, 0 when equal, 1 when greater.

void fillRowNumbers (Vector<uInt>& rows, uInt start, uInt end) const

Fill the row numbers vector for the given start till end in the itsUniqueIndex vector (end is not inclusive).