ColumnsIndexArray.h

Classes

ColumnsIndexArray -- Index to an array column in a table.

class ColumnsIndexArray

Interface

Public Members
ColumnsIndexArray (const Table&, const String& columnName)
ColumnsIndexArray (const ColumnsIndexArray& that)
~ColumnsIndexArray()
ColumnsIndexArray& operator= (const ColumnsIndexArray& that)
Bool isUnique() const
const String& columnName() const
const Table& table() const
void setChanged()
void setChanged (const String& columnName)
Record& accessKey()
Record& accessLowerKey()
Record& accessUpperKey()
uInt getRowNumber (Bool& found)
uInt getRowNumber (Bool& found, const Record& key)
Vector<uInt> getRowNumbers (Bool unique=False)
Vector<uInt> getRowNumbers (const Record& key, Bool unique=False)
Vector<uInt> getRowNumbers (Bool lowerInclusive, Bool upperInclusive, Bool unique=False)
Vector<uInt> getRowNumbers (const Record& lower, const Record& upper, Bool lowerInclusive, Bool upperInclusive, Bool unique=False)
Protected Members
void copy (const ColumnsIndexArray& that)
void deleteObjects()
void addColumnToDesc (RecordDesc& description, const ROTableColumn& column)
void makeObjects (const RecordDesc& description)
void readData()
uInt bsearch (Bool& found, void* fieldPtr) const
static Int compare (void* fieldPtr, void* dataPtr, Int dataType, Int index)
void fillRowNumbers (Vector<uInt>& rows, uInt start, uInt end, Bool unique) const
void getArray (Vector<uChar>& result, const String& name)
void getArray (Vector<Short>& result, const String& name)
void getArray (Vector<Int>& result, const String& name)
void getArray (Vector<uInt>& result, const String& name)
void getArray (Vector<String>& result, const String& name)
void fillRownrs (uInt npts, const Block<uInt>& nrel)

Description

Review Status

Reviewed By:
UNKNOWN
Date Reviewed:
before 2004/08/25
Programs:
Tests:

Prerequisite

Synopsis

This class makes it possible to use transient indices on top of an array column in a table in order to speed up the process of finding rows based on a given key or key range. It is similar to class ColumnsIndex, which is meant for one or more scalar columns.

When constructing a ColumnsIndexArray object, one has to define which column forms the key for this index on the given table object. Not every data type is supported; only uChar, Short, Int, uInt, and String array columns are supported. The column can contain arrays of any shape and it can also contain empty cells. The class will probably mostly be used for vectors, as they seem to be the most logical way to hold multiple keys.
The data in the given column will be read, sorted, and stored in memory. When looking up a key or key range, the class will use a fast binary search on the data held in memory.
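
The read, sort, and binary-search steps described above can be sketched with the standard library alone. This is a minimal illustration, not the class's actual implementation: the in-memory column (a vector of vectors of int), the Entry alias, and the functions buildIndex and lookup are all hypothetical names introduced for this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// One index entry: a key value taken from an array cell and the row it came from.
// (Hypothetical layout; the real class keeps its own internal arrays.)
using Entry = std::pair<int, std::size_t>;

// Flatten every cell of the in-memory column into (value, row) pairs and
// sort them by value, then by row number.
std::vector<Entry> buildIndex(const std::vector<std::vector<int>>& column) {
    std::vector<Entry> index;
    for (std::size_t row = 0; row < column.size(); ++row) {
        for (int value : column[row]) {  // an empty cell contributes nothing
            index.emplace_back(value, row);
        }
    }
    std::sort(index.begin(), index.end());
    return index;
}

// Return the rows whose cell contains the key, located by binary search.
std::vector<std::size_t> lookup(const std::vector<Entry>& index, int key) {
    auto first = std::lower_bound(index.begin(), index.end(), Entry(key, 0));
    std::vector<std::size_t> rows;
    for (auto it = first; it != index.end() && it->first == key; ++it) {
        rows.push_back(it->second);
    }
    return rows;
}
```

Because each array element becomes its own entry, a row whose cell holds several keys can be found through any of them, which is why vectors are a natural fit for this kind of index.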

The ColumnsIndexArray object contains a Record object which can be used to define the key to be looked up. The record contains a field for the column in the index (with the same name and data type). The fastest way to fill the key is to create a RecordFieldPtr object for the field in the record (see the example) and fill it as needed. One can also use the Record::define function, but that is slower.
A second record is available to define the upper key in case a key range has to be looked up. The keys can be accessed using the various accessKey functions.

When a key is defined, the getRowNumbers function can be used to find the table rows containing the given key (or key range). Function getRowNumber can be used to look up a single key if all keys in the index are unique (which can be tested with the isUnique function).

Instead of using the internal records holding the keys, one can also pass one's own Record object to getRowNumbers. However, that is slower.

After an index is created, it is possible to change the data in the underlying column. However, ColumnsIndexArray cannot detect that the column data have changed; it can only detect that the number of rows has changed. If the column data have changed, the user has to call the setChanged function to indicate this.
If data have changed, the entire index will be recreated by rereading and resorting the data. This will be deferred until the next key lookup.
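
The deferred rebuild can be pictured with the following sketch. It assumes a simplified, hypothetical LazyIndex class over a plain vector of int rather than a table column; only the behaviour described above (automatic detection of a changed row count, explicit setChanged for changed values, rebuild postponed until the next lookup) is modelled.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an index over a column of plain int values.
class LazyIndex {
public:
    explicit LazyIndex(const std::vector<int>& column)
        : column_(column), rowsSeen_(0), changed_(true), rebuilds_(0) {}

    // The caller signals that column values were modified.
    void setChanged() { changed_ = true; }

    // A lookup triggers the deferred rebuild when one is pending.
    bool contains(int key) {
        refreshIfNeeded();
        return std::binary_search(sorted_.begin(), sorted_.end(), key);
    }

    // Number of rebuilds so far (for illustration only).
    int rebuildCount() const { return rebuilds_; }

private:
    void refreshIfNeeded() {
        // A changed row count is noticed automatically; a changed value is not.
        if (changed_ || column_.size() != rowsSeen_) {
            sorted_ = column_;
            std::sort(sorted_.begin(), sorted_.end());
            rowsSeen_ = column_.size();
            changed_ = false;
            ++rebuilds_;
        }
    }

    const std::vector<int>& column_;  // the data being indexed (not owned)
    std::vector<int> sorted_;         // the in-memory sorted copy
    std::size_t rowsSeen_;
    bool changed_;
    int rebuilds_;
};
```

In this sketch, a value that is modified without a call to setChanged leaves the index stale, which mirrors the reason the user has to call setChanged explicitly.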

Example

Suppose one has a table with a column NAME containing vectors.
    // Open the table and make an index for the column.
    Table tab("my.tab");
    ColumnsIndexArray colInx(tab, "NAME");
    // Make a RecordFieldPtr for the NAME field in the index key record.
    // Its data type has to match the data type of the column.
    RecordFieldPtr<String> nameFld(colInx.accessKey(), "NAME");
    // Find the row for a given name.
    Bool found;
    // Fill the key field and get the row number.
    // NAME is a unique key, so only one row number matches.
    // Otherwise function getRowNumbers would have to be used.
    *nameFld = "MYNAME";
    uInt rownr = colInx.getRowNumber (found);
    if (!found) {
        cout << "Name MYNAME is unknown" << endl;
    }
    // Now get a range of names and return the row numbers in ascending order.
    // This uses the fact that the 'unique' argument also sorts the data.
    RecordFieldPtr<String> nameUpp(colInx.accessUpperKey(), "NAME");
    *nameFld = "LOWER";
    *nameUpp = "UPPER";
    Vector<uInt> rownrs = colInx.getRowNumbers (True, True, True);
    

Motivation

Bob Garwood needed such a class.

Member Description

ColumnsIndexArray (const Table&, const String& columnName)

Create an index on the given table for the given column. The column can be a scalar or an array column. Only String and integer columns are supported. The description also mentions a noSort switch (if noSort==True, the table is already in order of that column and the sort step is skipped), but the constructor as declared above takes no such argument.

ColumnsIndexArray (const ColumnsIndexArray& that)

Copy constructor (copy semantics).

~ColumnsIndexArray()

ColumnsIndexArray& operator= (const ColumnsIndexArray& that)

Assignment (copy semantics).

Bool isUnique() const

Are all keys in the index unique?

const String& columnName() const

Return the name of the column forming the index.

const Table& table() const

Get the table for which this index is created.

void setChanged()
void setChanged (const String& columnName)

Something has changed in the table, so the index has to be recreated. The 2nd version indicates that a specific column has changed, so only that column might need to be reread. If that column is not part of the index, nothing will be done.
Note that the class itself keeps track of whether the number of rows in the table changes.

Record& accessKey()
Record& accessLowerKey()
Record& accessUpperKey()

Access the key values. These functions allow you to create RecordFieldPtr objects for each field in the key. In this way you can quickly fill in the key.
The records have a fixed type, so you cannot add or delete fields.
Note that accessKey and accessLowerKey are synonyms; they return the same underlying record.

uInt getRowNumber (Bool& found)
uInt getRowNumber (Bool& found, const Record& key)

Find the row number matching the key. All keys have to be unique; otherwise an exception is thrown. If no match is found, found is set to False. The 2nd version makes it possible to pass in your own Record instead of using the internal record via the accessKey functions. Note that the given Record is copied to the internal record and thus overwrites it.

Vector<uInt> getRowNumbers (Bool unique=False)
Vector<uInt> getRowNumbers (const Record& key, Bool unique=False)

Find the row numbers matching the key. It should be used instead of getRowNumber if the same key can exist multiple times. The 2nd version makes it possible to pass in your own Record instead of using the internal record via the accessKey functions. Note that the given Record is copied to the internal record and thus overwrites it.
A row can contain multiple equal values. In such a case the same row number can occur multiple times in the output vector, unless unique is set to True. Note that making the row numbers unique implies a sort, so it can also be used to get the row numbers in ascending order.

Vector<uInt> getRowNumbers (Bool lowerInclusive, Bool upperInclusive, Bool unique=False)
Vector<uInt> getRowNumbers (const Record& lower, const Record& upper, Bool lowerInclusive, Bool upperInclusive, Bool unique=False)

Find the row numbers matching the key range. The boolean arguments tell whether the lower and upper keys are part of the range. The 2nd version makes it possible to pass in your own Records instead of using the internal records via the accessLowerKey and accessUpperKey functions. Note that the given Records are copied to the internal records and thus overwrite them.
A row can contain multiple matching values. In such a case the same row number can occur multiple times in the output vector, unless unique is set to True. Note that making the row numbers unique implies a sort, so it can also be used to get the row numbers in ascending order.
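
The effect of the inclusive flags and of the unique argument can be sketched as follows. This is an illustration under assumptions, not the class's code: the index is modelled as a vector of (key, row) pairs already sorted by key, and the Entry alias and rowsInRange function are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

using Entry = std::pair<int, std::size_t>;  // (key value, row number)

// 'index' must be sorted by key value.
std::vector<std::size_t> rowsInRange(const std::vector<Entry>& index,
                                     int lower, int upper,
                                     bool lowerInclusive, bool upperInclusive,
                                     bool unique = false) {
    auto keyLess = [](const Entry& e, int key) { return e.first < key; };
    auto keyGreater = [](int key, const Entry& e) { return key < e.first; };

    // An inclusive lower bound starts at the first entry >= lower,
    // an exclusive one at the first entry > lower.
    auto first = lowerInclusive
        ? std::lower_bound(index.begin(), index.end(), lower, keyLess)
        : std::upper_bound(index.begin(), index.end(), lower, keyGreater);
    // An inclusive upper bound ends after the last entry <= upper,
    // an exclusive one after the last entry < upper.
    auto last = upperInclusive
        ? std::upper_bound(index.begin(), index.end(), upper, keyGreater)
        : std::lower_bound(index.begin(), index.end(), upper, keyLess);

    std::vector<std::size_t> rows;
    for (auto it = first; it < last; ++it) {
        rows.push_back(it->second);
    }
    if (unique) {
        // Removing duplicates requires a sort, so the result is also ascending.
        std::sort(rows.begin(), rows.end());
        rows.erase(std::unique(rows.begin(), rows.end()), rows.end());
    }
    return rows;
}
```

Without unique, the rows come back in key order and a row appears once per matching value; with unique, each row appears once and the rows are in ascending order.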

void copy (const ColumnsIndexArray& that)

Copy that object to this.

void deleteObjects()

Delete all data in the object.

void addColumnToDesc (RecordDesc& description, const ROTableColumn& column)

Add a column to the record description for the keys. The description mentions a switch arrayPossible (if True, the column can be an array; otherwise it has to be a scalar), but the function as declared above takes no such argument.

void makeObjects (const RecordDesc& description)

Make the various internal RecordFieldPtr objects.

void readData()

Read the data of the column forming the index, sort them, and form the index.

uInt bsearch (Bool& found, void* fieldPtr) const

Do a binary search on itsUniqueIndexArray for the key in fieldPtr. If the key is found, found is set to True and the index in itsUniqueIndexArray is returned. If not found, found is set to False and the index of the next higher key is returned.
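
The return convention (index of the match, or index of the next higher key when there is no match) can be sketched as below. The function searchKey and its use of a sorted vector of unique int keys are assumptions made for illustration; the real function works on untyped field pointers.

```cpp
#include <cstddef>
#include <vector>

// Binary search over sorted, unique keys.
// Returns the index of 'key' when found; otherwise the index of the first
// key greater than it (keys.size() if every key is smaller).
std::size_t searchKey(const std::vector<int>& keys, int key, bool& found) {
    std::size_t lo = 0;
    std::size_t hi = keys.size();
    found = false;
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        if (keys[mid] < key) {
            lo = mid + 1;        // key lies to the right of mid
        } else if (key < keys[mid]) {
            hi = mid;            // key lies to the left of mid
        } else {
            found = true;
            return mid;
        }
    }
    return lo;
}
```

Returning the position of the next higher key on a miss is what makes the same search usable for range lookups: the returned index is a valid start or end of a range even when the boundary key itself is absent.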

static Int compare (void* fieldPtr, void* dataPtr, Int dataType, Int index)

Compare the key in fieldPtr with the given index entry. -1 is returned when less, 0 when equal, 1 when greater.

void fillRowNumbers (Vector<uInt>& rows, uInt start, uInt end, Bool unique) const

Fill the row numbers vector for the entries from start up to end in the itsUniqueIndexArray vector (end is not inclusive). If unique is True, the row numbers will be made unique.

void getArray (Vector<uChar>& result, const String& name)
void getArray (Vector<Short>& result, const String& name)
void getArray (Vector<Int>& result, const String& name)
void getArray (Vector<uInt>& result, const String& name)
void getArray (Vector<String>& result, const String& name)

Get the data if the column is an array.

void fillRownrs (uInt npts, const Block<uInt>& nrel)

Fill the rownrs belonging to each array value.
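
One plausible reading of this step, offered as an assumption rather than as the actual implementation, is that nrel holds the number of array elements in each row and that every flattened array value is tagged with the row it came from. The function rowNumbersPerValue below is a hypothetical sketch of that idea using standard containers instead of Block.

```cpp
#include <cstddef>
#include <vector>

// nrel[row] is assumed to be the number of array elements in that row's cell;
// npts is assumed to be the total number of elements over all rows.
// The result holds one row number per flattened array value.
std::vector<std::size_t> rowNumbersPerValue(std::size_t npts,
                                            const std::vector<std::size_t>& nrel) {
    std::vector<std::size_t> rownrs;
    rownrs.reserve(npts);
    for (std::size_t row = 0; row < nrel.size(); ++row) {
        for (std::size_t i = 0; i < nrel[row]; ++i) {
            rownrs.push_back(row);  // rows with empty cells add nothing
        }
    }
    return rownrs;
}
```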