casa
$Rev:20696$
Index to one or more columns in a table.
#include <ColumnsIndex.h>
Public Types | |
typedef Int | Compare (const Block< void * > &fieldPtrs, const Block< void * > &dataPtrs, const Block< Int > &dataTypes, Int index) |
Define the signature of a comparison function. | |
Public Member Functions | |
ColumnsIndex (const Table &, const String &columnName, Compare *compareFunction=0, Bool noSort=False) | |
Create an index on the given table for the given column. | |
ColumnsIndex (const Table &, const Vector< String > &columnNames, Compare *compareFunction=0, Bool noSort=False) | |
Create an index on the given table for the given columns, thus the key is formed by multiple columns. | |
ColumnsIndex (const ColumnsIndex &that) | |
Copy constructor (copy semantics). | |
~ColumnsIndex () | |
ColumnsIndex & | operator= (const ColumnsIndex &that) |
Assignment (copy semantics). | |
Bool | isUnique () const |
Are all keys in the index unique? | |
Vector< String > | columnNames () const |
Return the names of the columns forming the index. | |
const Table & | table () const |
Get the table for which this index is created. | |
void | setChanged () |
Something has changed in the table, so the index has to be recreated. | |
void | setChanged (const String &columnName) |
Record & | accessKey () |
Access the key values. | |
Record & | accessLowerKey () |
Record & | accessUpperKey () |
uInt | getRowNumber (Bool &found) |
Find the row number matching the key. | |
uInt | getRowNumber (Bool &found, const Record &key) |
Vector< uInt > | getRowNumbers () |
Find the row numbers matching the key. | |
Vector< uInt > | getRowNumbers (const Record &key) |
Vector< uInt > | getRowNumbers (Bool lowerInclusive, Bool upperInclusive) |
Find the row numbers matching the key range. | |
Vector< uInt > | getRowNumbers (const Record &lower, const Record &upper, Bool lowerInclusive, Bool upperInclusive) |
Static Public Member Functions | |
static void | copyKeyField (void *field, int dtype, const Record &key) |
Fill the internal key field from the corresponding external key. | |
Protected Member Functions | |
void | copy (const ColumnsIndex &that) |
Copy that object to this. | |
void | deleteObjects () |
Delete all data in the object. | |
void | addColumnToDesc (RecordDesc &description, const TableColumn &column) |
Add a column to the record description for the keys. | |
void | create (const Table &table, const Vector< String > &columnNames, Compare *compareFunction, Bool noSort) |
Create the various members in the object. | |
void | makeObjects (const RecordDesc &description) |
Make the various internal RecordFieldPtr objects. | |
void | readData () |
Read the data of the columns forming the index, sort them and form the index. | |
uInt | bsearch (Bool &found, const Block< void * > &fieldPtrs) const |
Do a binary search on itsUniqueIndex for the key in fieldPtrs. | |
void | fillRowNumbers (Vector< uInt > &rows, uInt start, uInt end) const |
Fill the row numbers vector for the given start till end in the itsUniqueIndex vector (end is not inclusive). | |
Static Protected Member Functions | |
static Int | compare (const Block< void * > &fieldPtrs, const Block< void * > &dataPtrs, const Block< Int > &dataTypes, Int index) |
Compare the key in fieldPtrs with the given index entry. | |
Private Member Functions | |
void | copyKey (Block< void * > fields, const Record &key) |
Fill the internal key fields from the corresponding external key. | |
Static Private Member Functions | |
template<typename T > | |
static void | copyKeyField (RecordFieldPtr< T > &field, const Record &key) |
Fill the internal key field from the corresponding external key. | |
Private Attributes | |
Table | itsTable |
uInt | itsNrrow |
Record * | itsLowerKeyPtr |
Record * | itsUpperKeyPtr |
Block< Int > | itsDataTypes |
Block< void * > | itsDataVectors |
Block< void * > | itsData |
Block< void * > | itsLowerFields |
Block< void * > | itsUpperFields |
Block< Bool > | itsColumnChanged |
Bool | itsChanged |
Bool | itsNoSort |
Compare * | itsCompare |
Vector< uInt > | itsDataIndex |
Vector< uInt > | itsUniqueIndex |
uInt * | itsDataInx |
uInt * | itsUniqueInx |
Index to one or more columns in a table.
Public interface
This class makes it possible to use transient indices on top of tables in order to speed up the process of finding rows based on a given key or key range. When constructing a ColumnsIndex object, one has to define which columns form the key for this index on the given table object. Only scalar columns are supported. The data in the given columns will be read, sorted (if needed), and stored in memory. When looking up a key or key range, the class will use a fast binary search on the data held in memory.
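The read, sort, and binary-search steps described above can be sketched with the standard library alone. This is only an illustration of the general technique, not the ColumnsIndex implementation; the names SortedIndex, buildIndex, and findRows are hypothetical, and a single integer column stands in for a table column.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for an index on one scalar column.
struct SortedIndex {
    std::vector<int> column;         // in-memory copy of the column values
    std::vector<std::size_t> order;  // row numbers, sorted by column value
};

// Copy the column and sort the row numbers by value.
// A stable sort keeps rows with equal keys in their original order.
SortedIndex buildIndex(const std::vector<int>& column) {
    SortedIndex inx{column, std::vector<std::size_t>(column.size())};
    std::iota(inx.order.begin(), inx.order.end(), std::size_t{0});
    std::stable_sort(inx.order.begin(), inx.order.end(),
                     [&column](std::size_t a, std::size_t b) {
                         return column[a] < column[b];
                     });
    return inx;
}

// Binary search for all rows whose value equals key.
std::vector<std::size_t> findRows(const SortedIndex& inx, int key) {
    assert(inx.order.size() == inx.column.size());
    auto lo = std::lower_bound(inx.order.begin(), inx.order.end(), key,
                               [&inx](std::size_t row, int k) {
                                   return inx.column[row] < k;
                               });
    auto hi = std::upper_bound(lo, inx.order.end(), key,
                               [&inx](int k, std::size_t row) {
                                   return k < inx.column[row];
                               });
    return std::vector<std::size_t>(lo, hi);
}
```

The table itself is never reordered in this sketch; only the vector of row numbers is sorted, so a lookup returns original row numbers.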
The ColumnsIndex object contains a Record object which can be used to define the key to be looked up. The record contains a field for each column in the index (with the same name and data type). The fastest way to fill the key is to create a RecordFieldPtr object for each field in the record (see the example) and fill it as needed. One can also use the Record::define function, but that is slower.
A second record is available to define the upper key when a key range has to be looked up. The keys can be accessed using the various accessKey functions.
When a key is defined, the getRowNumbers function can be used to find the table rows containing the given key (range). Function getRowNumber can be used if all keys in the index are unique (which can be tested with the isUnique function).
Instead of using the internal records holding the keys, one can also pass one's own Record object to getRowNumbers. However, that will be slower.
When constructing the object, the user can supply their own compare function. The default compare function compares each field of the key in the normal way. A user-supplied compare function makes it possible to compare in a special way; for example, one could use near instead of == on floating point fields. Another example (shown in one of the examples below) makes it possible to find a key in an index consisting of a time and width.
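As an illustration of the "near instead of ==" idea, a tolerant three-way comparison might look like the sketch below. The helper compareNear and its default relative tolerance are assumptions made for this example; it is not the library's own near function, and a real compare function would wrap this logic in the Compare signature.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: three-way comparison that treats two values as equal
// when they differ by no more than a relative tolerance.
// Returns -1 if key is less than data, 0 if (nearly) equal, 1 if greater.
int compareNear(double key, double data, double relTol = 1e-9) {
    assert(relTol >= 0.0);
    const double scale = std::max(std::fabs(key), std::fabs(data));
    if (std::fabs(key - data) <= relTol * scale) {
        return 0;
    }
    return key < data ? -1 : 1;
}
```

A tolerance that is too large can make neighbouring keys compare equal, in which case a binary search may land on either of them.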
After an index is created, it is possible to change the data in the underlying columns. However, the ColumnsIndex cannot detect whether the column data have changed. It can only detect whether the number of rows has changed. If the column data have changed, the user has to use the setChanged function to indicate that all columns or a particular column has changed.
If data have changed, the entire index will be recreated by rereading and optionally resorting the data. This will be deferred until the next key lookup.
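The deferred rebuild can be pictured as a dirty flag that is checked at lookup time. The sketch below is a simplified standard-library analogue, not the class's actual code; LazyIndex, contains, and rebuilds are hypothetical names, and a vector of integers stands in for the table column.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical index that rebuilds itself lazily.
class LazyIndex {
public:
    explicit LazyIndex(const std::vector<int>& column) : column_(&column) {}

    // The caller reports that column values were modified in place.
    void setChanged() { changed_ = true; }

    // A lookup rebuilds the sorted copy only if a change was reported
    // or the number of rows differs from the last build.
    bool contains(int key) {
        if (changed_ || sorted_.size() != column_->size()) {
            sorted_ = *column_;
            std::sort(sorted_.begin(), sorted_.end());
            changed_ = false;
            ++rebuilds_;
        }
        return std::binary_search(sorted_.begin(), sorted_.end(), key);
    }

    // Number of rebuilds so far (for illustration only).
    std::size_t rebuilds() const { return rebuilds_; }

private:
    const std::vector<int>* column_;  // the indexed data, owned by the caller
    std::vector<int> sorted_;         // sorted in-memory copy
    bool changed_ = true;             // forces the first build
    std::size_t rebuilds_ = 0;
};
```

In this sketch an in-place change that is not reported leaves the index stale, while a change in the number of rows is picked up automatically, mirroring the behaviour described above.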
Suppose one has an antenna table with key ANTENNA.
    // Open the table and make an index for column ANTENNA.
    Table tab("antenna.tab");
    ColumnsIndex colInx(tab, "ANTENNA");
    // Make a RecordFieldPtr for the ANTENNA field in the index key record.
    // Its data type has to match the data type of the column.
    RecordFieldPtr<Int> antFld(colInx.accessKey(), "ANTENNA");
    // Now loop in some way and find the row for the antenna
    // involved in that loop.
    Bool found;
    while (...) {
        // Fill the key field and get the row number.
        // ANTENNA is a unique key, so only one row number matches.
        // Otherwise function getRowNumbers would have to be used.
        *antFld = antenna;
        uInt antRownr = colInx.getRowNumber (found);
        if (!found) {
            cout << "Antenna " << antenna << " is unknown" << endl;
        } else {
            // antRownr can now be used to get data from that row in
            // the antenna table.
        }
    }
The following example shows how multiple keys can be used and how a search on a range can be done.
    Table tab("sometable");
    // Note that TIME is the main key.
    // Also note that stringToVector (in ArrayUtil.h) is a handy
    // way to convert a String to a Vector<String>.
    ColumnsIndex colInx(tab, stringToVector("TIME,ANTENNA"));
    // Make a RecordFieldPtr for the fields in lower and upper key records.
    RecordFieldPtr<Double> timeLow(colInx.accessLowerKey(), "TIME");
    RecordFieldPtr<Int> antLow(colInx.accessLowerKey(), "ANTENNA");
    RecordFieldPtr<Double> timeUpp(colInx.accessUpperKey(), "TIME");
    RecordFieldPtr<Int> antUpp(colInx.accessUpperKey(), "ANTENNA");
    while (...) {
        // Fill the key fields.
        *timeLow = ...;
        *antLow = ...;
        *timeUpp = ...;
        *antUpp = ...;
        // Find the row numbers for keys between low and upp (inclusive).
        Vector<uInt> rows = colInx.getRowNumbers (True, True);
    }
The following example shows what a specific compare function could look like. A function like this will actually be used in the calibration software.
The table for which the index is built has rows with TIME as their key. However, each row is valid for a given interval, where TIME gives the middle of the interval and WIDTH the length of the interval. This means that the compare function has to test whether the key lies within the interval.
    Int myCompare (const Block<void*>& fieldPtrs,
                   const Block<void*>& dataPtrs,
                   const Block<Int>& dataTypes,
                   Int index)
    {
        // Assert (for performance only in debug mode) that the correct
        // fields are used.
        DebugAssert (dataTypes.nelements() == 2, AipsError);
        DebugAssert (dataTypes[0] == TpDouble && dataTypes[1] == TpDouble,
                     AipsError);
        // Now get the key to be looked up
        // (an awkward-looking cast has to be used).
        const Double key = *(*(const RecordFieldPtr<Double>*)(fieldPtrs[0]));
        // Get the time and width of the entry to be compared.
        const Double time = ((const Double*)(dataPtrs[0]))[index];
        const Double width = ((const Double*)(dataPtrs[1]))[index];
        const Double start = time - width/2;
        const Double end = time + width/2;
        // Test if the key is before, after, or in the interval
        // (representing less, greater, equal).
        if (key < start) {
            return -1;
        } else if (key > end) {
            return 1;
        }
        return 0;
    }

    // Now use this compare function in an actual index.
    Table tab("sometable");
    ColumnsIndex colInx(tab, stringToVector("TIME,WIDTH"), myCompare);
    // Make a RecordFieldPtr for the TIME field in the key record.
    // Note that although the WIDTH is part of the index, it is
    // not an actual key. So it does not need to be filled in.
    RecordFieldPtr<Double> time(colInx.accessLowerKey(), "TIME");
    Bool found;
    while (...) {
        // Fill the key field.
        *time = ...;
        // Find the row number for this time.
        uInt rownr = colInx.getRowNumber (found);
    }
The calibration software needs to look up keys in calibration tables very frequently. This class makes that process much easier and faster.
Definition at line 226 of file ColumnsIndex.h.
typedef Int casa::ColumnsIndex::Compare(const Block< void * > &fieldPtrs, const Block< void * > &dataPtrs, const Block< Int > &dataTypes, Int index) |
Define the signature of a comparison function.
The first block contains pointers to RecordFieldPtr<T> objects holding the key to be looked up. The second block contains pointers to the column data. The index argument gives the index in the column data. The third block contains the data types of those blocks (TpBool, etc.). The function should return -1 if the key is less than the data, 0 if equal, and 1 if greater.
An example above shows how a compare function can be used.
Definition at line 238 of file ColumnsIndex.h.
casa::ColumnsIndex::ColumnsIndex | ( | const Table & | , |
const String & | columnName, | ||
Compare * | compareFunction = 0 , |
||
Bool | noSort = False |
||
) |
Create an index on the given table for the given column.
The column has to be a scalar column. If noSort==True, the table is assumed to be already in order of that column and the sort step will not be done. The default compare function is provided by this class. It simply compares each field in the key.
casa::ColumnsIndex::ColumnsIndex | ( | const Table & | , |
const Vector< String > & | columnNames, | ||
Compare * | compareFunction = 0 , |
||
Bool | noSort = False |
||
) |
Create an index on the given table for the given columns, thus the key is formed by multiple columns.
The columns have to be scalar columns. If noSort==True, the table is assumed to be already in order of those columns and the sort step will not be done. The default compare function is provided by this class. It simply compares each field in the key.
casa::ColumnsIndex::ColumnsIndex | ( | const ColumnsIndex & | that | ) |
Copy constructor (copy semantics).
Record & casa::ColumnsIndex::accessKey | ( | ) | [inline] |
Access the key values.
These functions allow you to create RecordFieldPtr<T> objects for each field in the key. In this way you can quickly fill in the key.
The records have a fixed type, so you cannot add or delete fields.
Definition at line 425 of file ColumnsIndex.h.
References itsLowerKeyPtr.
Record & casa::ColumnsIndex::accessLowerKey | ( | ) | [inline] |
Definition at line 429 of file ColumnsIndex.h.
References itsLowerKeyPtr.
Record & casa::ColumnsIndex::accessUpperKey | ( | ) | [inline] |
Definition at line 433 of file ColumnsIndex.h.
References itsUpperKeyPtr.
void casa::ColumnsIndex::addColumnToDesc | ( | RecordDesc & | description, |
const TableColumn & | column | ||
) | [protected] |
Add a column to the record description for the keys.
uInt casa::ColumnsIndex::bsearch | ( | Bool & | found, |
const Block< void * > & | fieldPtrs | ||
) | const [protected] |
Do a binary search on itsUniqueIndex for the key in fieldPtrs.
If the key is found, found is set to True and the index in itsUniqueIndex is returned. If not found, found is set to False and the index of the next higher key is returned.
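The return contract (a match position when found, otherwise the position of the next higher key) can be illustrated with a plain three-way binary search. This is a sketch over a sorted vector of integers; searchSorted is a hypothetical name and the code is not taken from the class.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical three-way binary search over a sorted vector.
// If key is present, found is set to true and the position of a match is
// returned. Otherwise found is set to false and the position of the first
// element greater than key is returned (sorted.size() if there is none).
std::size_t searchSorted(const std::vector<int>& sorted, int key, bool& found) {
    std::size_t lo = 0;
    std::size_t hi = sorted.size();
    found = false;
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        if (key == sorted[mid]) {
            found = true;
            return mid;
        }
        if (key < sorted[mid]) {
            hi = mid;      // continue in the lower half
        } else {
            lo = mid + 1;  // continue in the upper half
        }
    }
    return lo;
}
```

Returning the position of the next higher key on a miss is what makes the same search usable for range lookups, since that position is where the range would start.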
Vector<String> casa::ColumnsIndex::columnNames | ( | ) | const |
Return the names of the columns forming the index.
static Int casa::ColumnsIndex::compare | ( | const Block< void * > & | fieldPtrs, |
const Block< void * > & | dataPtrs, | ||
const Block< Int > & | dataTypes, | ||
Int | index | ||
) | [static, protected] |
Compare the key in fieldPtrs with the given index entry.
-1 is returned when less, 0 when equal, 1 when greater.
void casa::ColumnsIndex::copy | ( | const ColumnsIndex & | that | ) | [protected] |
Copy that object to this.
void casa::ColumnsIndex::copyKey | ( | Block< void * > | fields, |
const Record & | key | ||
) | [private] |
Fill the internal key fields from the corresponding external key.
static void casa::ColumnsIndex::copyKeyField | ( | void * | field, |
int | dtype, | ||
const Record & | key | ||
) | [static] |
Fill the internal key field from the corresponding external key.
The data type may differ.
static void casa::ColumnsIndex::copyKeyField | ( | RecordFieldPtr< T > & | field, |
const Record & | key | ||
) | [inline, static, private] |
Fill the internal key field from the corresponding external key.
The data type may differ.
Definition at line 389 of file ColumnsIndex.h.
References casa::RecordInterface::get(), and casa::RecordFieldPtr< T >::name().
void casa::ColumnsIndex::create | ( | const Table & | table, |
const Vector< String > & | columnNames, | ||
Compare * | compareFunction, | ||
Bool | noSort | ||
) | [protected] |
Create the various members in the object.
void casa::ColumnsIndex::deleteObjects | ( | ) | [protected] |
Delete all data in the object.
void casa::ColumnsIndex::fillRowNumbers | ( | Vector< uInt > & | rows, |
uInt | start, | ||
uInt | end | ||
) | const [protected] |
Fill the row numbers vector for the given start till end in the itsUniqueIndex vector (end is not inclusive).
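One plausible reading of the member names is a two-level layout: a data index holding row numbers in key order, and a unique index holding the position in the data index where each distinct key starts. Under that assumption (which this page does not confirm), turning a half-open range of distinct keys into row numbers could be sketched as follows; rowsForKeys is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// dataIndex:   row numbers in key order (assumed layout).
// uniqueIndex: positions in dataIndex where each distinct key starts
//              (assumed layout).
// Returns the row numbers for the distinct keys in [start, end);
// end is exclusive.
std::vector<std::size_t> rowsForKeys(const std::vector<std::size_t>& dataIndex,
                                     const std::vector<std::size_t>& uniqueIndex,
                                     std::size_t start, std::size_t end) {
    assert(start <= end && end <= uniqueIndex.size());
    if (start == end) {
        return {};
    }
    const std::size_t first = uniqueIndex[start];
    // The last distinct key runs to the end of the data index.
    const std::size_t last =
        end < uniqueIndex.size() ? uniqueIndex[end] : dataIndex.size();
    return std::vector<std::size_t>(
        dataIndex.begin() + static_cast<std::ptrdiff_t>(first),
        dataIndex.begin() + static_cast<std::ptrdiff_t>(last));
}
```

With such a layout, all keys are unique exactly when both vectors have the same length.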
uInt casa::ColumnsIndex::getRowNumber | ( | Bool & | found | ) |
Find the row number matching the key.
All keys have to be unique, otherwise an exception is thrown. If no match is found, found is set to False. The 2nd version makes it possible to pass in your own Record instead of using the internal record via the accessKey functions. Note that the given Record will be copied to the internal record and thus overwrites it.
uInt casa::ColumnsIndex::getRowNumber | ( | Bool & | found, |
const Record & | key | ||
) |
Vector<uInt> casa::ColumnsIndex::getRowNumbers | ( | ) |
Find the row numbers matching the key.
It should be used instead of getRowNumber if the same key can exist multiple times. The 2nd version makes it possible to pass in your own Record instead of using the internal record via the accessKey functions. Note that the given Record will be copied to the internal record and thus overwrites it.
Vector<uInt> casa::ColumnsIndex::getRowNumbers | ( | const Record & | key | ) |
Vector<uInt> casa::ColumnsIndex::getRowNumbers | ( | Bool | lowerInclusive, |
Bool | upperInclusive | ||
) |
Find the row numbers matching the key range.
The boolean arguments tell whether the lower and upper key are part of the range. The 2nd version makes it possible to pass in your own Records instead of using the internal records via the accessLower/UpperKey functions. Note that the given Records will be copied to the internal records and thus overwrite them.
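The effect of the two inclusive flags can be shown on a sorted vector with std::lower_bound and std::upper_bound. This is a sketch of the general technique under the assumption of a single integer key; rangePositions is a hypothetical name, not a member of the class.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Returns the half-open position range [first, last) of the sorted keys
// that fall between lower and upper, honouring the inclusive flags.
std::pair<std::size_t, std::size_t> rangePositions(const std::vector<int>& sorted,
                                                   int lower, int upper,
                                                   bool lowerInclusive,
                                                   bool upperInclusive) {
    // Inclusive lower bound: first key >= lower; exclusive: first key > lower.
    auto first = lowerInclusive
                     ? std::lower_bound(sorted.begin(), sorted.end(), lower)
                     : std::upper_bound(sorted.begin(), sorted.end(), lower);
    // Inclusive upper bound: one past last key <= upper; exclusive: < upper.
    auto last = upperInclusive
                    ? std::upper_bound(sorted.begin(), sorted.end(), upper)
                    : std::lower_bound(sorted.begin(), sorted.end(), upper);
    if (last < first) {
        last = first;  // empty range, for example when lower > upper
    }
    return {static_cast<std::size_t>(first - sorted.begin()),
            static_cast<std::size_t>(last - sorted.begin())};
}
```

Duplicate keys at either boundary are all included or all excluded together, depending on the corresponding flag.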
Vector<uInt> casa::ColumnsIndex::getRowNumbers | ( | const Record & | lower, |
const Record & | upper, | ||
Bool | lowerInclusive, | ||
Bool | upperInclusive | ||
) |
Bool casa::ColumnsIndex::isUnique | ( | ) | const [inline] |
Are all keys in the index unique?
Definition at line 417 of file ColumnsIndex.h.
References itsDataIndex, itsUniqueIndex, and casa::ArrayBase::nelements().
void casa::ColumnsIndex::makeObjects | ( | const RecordDesc & | description | ) | [protected] |
Make the various internal RecordFieldPtr objects.
ColumnsIndex& casa::ColumnsIndex::operator= | ( | const ColumnsIndex & | that | ) |
Assignment (copy semantics).
void casa::ColumnsIndex::readData | ( | ) | [protected] |
Read the data of the columns forming the index, sort them and form the index.
void casa::ColumnsIndex::setChanged | ( | ) |
Something has changed in the table, so the index has to be recreated.
The 2nd version indicates that a specific column has changed, so only that column is reread. If that column is not part of the index, nothing will be done.
Note that the class itself keeps track of whether the number of rows in the table changes.
void casa::ColumnsIndex::setChanged | ( | const String & | columnName | ) |
const Table & casa::ColumnsIndex::table | ( | ) | const [inline] |
Get the table for which this index is created.
Definition at line 421 of file ColumnsIndex.h.
References itsTable.
Bool casa::ColumnsIndex::itsChanged [private] |
Definition at line 406 of file ColumnsIndex.h.
Block<Bool> casa::ColumnsIndex::itsColumnChanged [private] |
Definition at line 405 of file ColumnsIndex.h.
Compare* casa::ColumnsIndex::itsCompare [private] |
Definition at line 408 of file ColumnsIndex.h.
Block<void*> casa::ColumnsIndex::itsData [private] |
Definition at line 400 of file ColumnsIndex.h.
Vector<uInt> casa::ColumnsIndex::itsDataIndex [private] |
Definition at line 409 of file ColumnsIndex.h.
Referenced by isUnique().
uInt* casa::ColumnsIndex::itsDataInx [private] |
Definition at line 412 of file ColumnsIndex.h.
Block<Int> casa::ColumnsIndex::itsDataTypes [private] |
Definition at line 398 of file ColumnsIndex.h.
Block<void*> casa::ColumnsIndex::itsDataVectors [private] |
Definition at line 399 of file ColumnsIndex.h.
Block<void*> casa::ColumnsIndex::itsLowerFields [private] |
Definition at line 403 of file ColumnsIndex.h.
Record* casa::ColumnsIndex::itsLowerKeyPtr [private] |
Definition at line 396 of file ColumnsIndex.h.
Referenced by accessKey(), and accessLowerKey().
Bool casa::ColumnsIndex::itsNoSort [private] |
Definition at line 407 of file ColumnsIndex.h.
uInt casa::ColumnsIndex::itsNrrow [private] |
Definition at line 395 of file ColumnsIndex.h.
Table casa::ColumnsIndex::itsTable [private] |
Definition at line 394 of file ColumnsIndex.h.
Referenced by table().
Vector<uInt> casa::ColumnsIndex::itsUniqueIndex [private] |
Definition at line 411 of file ColumnsIndex.h.
Referenced by isUnique().
uInt* casa::ColumnsIndex::itsUniqueInx [private] |
Definition at line 413 of file ColumnsIndex.h.
Block<void*> casa::ColumnsIndex::itsUpperFields [private] |
Definition at line 404 of file ColumnsIndex.h.
Record* casa::ColumnsIndex::itsUpperKeyPtr [private] |
Definition at line 397 of file ColumnsIndex.h.
Referenced by accessUpperKey().