casa
5.7.0-16
|
Sort on one or more keys, ascending and/or descending. More...
#include <Sort.h>
Public Types | |
enum | Option { DefaultSort, HeapSort, InsSort, QuickSort, ParSort, NoDuplicates } |
Enumerate the sort options: More... | |
enum | Order { Ascending, Descending } |
Enumerate the sort order: More... | |
Public Member Functions | |
Sort () | |
The default constructor can be used when the data is only passed in via function sortKey . More... | |
Sort (const void *data, uInt elementSize) | |
Construct a Sort object for the given data array with elements of elementSize bytes. More... | |
Sort (const Sort &) | |
Copy constructor (copy semantics). More... | |
~Sort () | |
Sort & | operator= (const Sort &) |
Assignment (copy semantics). More... | |
void | sortKey (const void *data, DataType, uInt increment=0, Order=Ascending) |
Define a sort key (the most significant key should be defined first). More... | |
void | sortKey (const void *data, const CountedPtr< BaseCompare > &, uInt increment, Order=Ascending) |
void | sortKey (uInt offset, DataType, Order=Ascending) |
void | sortKey (uInt offset, const CountedPtr< BaseCompare > &, Order=Ascending) |
uInt | sort (Vector< uInt > &indexVector, uInt nrrec, int options=DefaultSort, Bool tryGenSort=True) const |
Sort the data array of nrrec records. More... | |
uInt | unique (Vector< uInt > &uniqueVector, uInt nrrec) const |
Get all unique records in a sorted array. More... | |
uInt | unique (Vector< uInt > &uniqueVector, const Vector< uInt > &indexVector) const |
Private Member Functions | |
void | copy (const Sort &that) |
Copy that Sort object to this. More... | |
void | addKey (const void *data, DataType, uInt nr, int options) |
Add a sort key giving a data type or the sort key. More... | |
void | addKey (SortKey *) |
uInt | insSort (uInt nr, uInt *indices) const |
Do an insertion sort, optionally skipping duplicates. More... | |
uInt | insSortNoDup (uInt nr, uInt *indices) const |
uInt | parSort (int nthr, uInt nrrec, uInt *inx) const |
Do a merge sort, if possible in parallel using OpenMP. More... | |
void | merge (uInt *inx, uInt *tmp, uInt size, uInt *index, uInt nparts) const |
uInt | quickSort (uInt nr, uInt *indices) const |
Do a quicksort, optionally skipping duplicates (qkSort is the actual quicksort function). More... | |
uInt | quickSortNoDup (uInt nr, uInt *indices) const |
void | qkSort (Int nr, uInt *indices) const |
uInt | heapSort (uInt nr, uInt *indices) const |
Do a heapsort, optionally skipping duplicates. More... | |
uInt | heapSortNoDup (uInt nr, uInt *indices) const |
void | siftDown (Int low, Int up, uInt *indices) const |
Siftdown algorithm for heapsort. More... | |
int | compare (uInt index1, uInt index2) const |
Compare the keys of 2 records. More... | |
void | swap (Int index1, Int index2, uInt *indices) const |
Swap 2 indices. More... | |
Private Attributes | |
PtrBlock< SortKey * > | keys_p |
uInt | nrkey_p |
const void * | data_p |
uInt | size_p |
int | order_p |
Sort on one or more keys, ascending and/or descending.
Public interface
Sort
lets you sort data on one or more keys in a mix of Sort::ascending
and Sort::descending
order. Duplicates can be skipped by giving the option Sort::NoDuplicates
. Only in this case the number of output elements can be different from the number of input elements.
The unique
function offers another way of getting unique values.
Class Sort
does not sort the data themselves, but returns an index to them. This gives more flexibility and allows the sort to be stable; but it is slower.
Very fast sorting of the data themselves can be done with the functions in class GenSort. If sorting on a single key with a standard data type is done, Sort will use GenSortIndirect to speed up the sort.
Four sort algorithms are provided:
Sort::ParSort
Sort::InsSort
Sort::QuickSort
Sort::HeapSort
The default is to use QuickSort for small arrays or if only a single thread can be used. Otherwise ParSort is the default.
All sort algorithms are stable, which means that the original order is kept when keys are equal.
The sort is a four step process:
Sort
object. sortKey
must be called for each sort key (the most significant one first). The comparison object can be passed in directly, or a basic data type can be given. In the latter case the appropriate ObjCompare comparison object will be created. sort
returns an index array, which is allocated when needed. Sort
object (usually done automatically) and delete the index array. The data can be in a single array of structs, in separate arrays, or in a mix of those. Of course, all arrays must have the same length. The data can be passed to the Sort
constructor and/or to the sortKey
function. If passed to the Sort
constructor, the offset of the data item in the data array must be given to sortKey
.
In the first example we sort the data contained in two "parallel" arrays, idata
and ddata
, both of length nrdata
.
Now nr
contains the nr of records (=nrdata
) and inx
an array of (sorted) indices.
In the second example we sort the data stored in an array of structs on the double (ascending) and the string (descending). We can pass the data to the Sort
constructor, and the offsets of the struct elements to the sortKey
function.
Note that the first argument in function sortKey
gives the offset of the variable in the struct.
Alternatively, and probably slightly easier, we could pass the data to the sortKey
function and use an increment:
Finally, we could provide a comparison object for the struct.
casacore::Sort::Sort | ( | ) |
The default constructor can be used when the data is only passed in via function sortKey
.
casacore::Sort::Sort | ( | const void * | data, |
uInt | elementSize | ||
) |
Construct a Sort object for the given data array with elements of elementSize
bytes.
This data array will be used when an offset is given to the sortKey
functions. You can still pass additional data arrays to the sortKey
functions.
casacore::Sort::Sort | ( | const Sort & | ) |
Copy constructor (copy semantics).
casacore::Sort::~Sort | ( | ) |
|
private |
Add a sort key giving a data type or the sort key.
|
private |
Compare the keys of 2 records.
Do a heapsort, optionally skipping duplicates.
Do an insertion sort, optionally skipping duplicates.
|
private |
Do a merge sort, if possible in parallel using OpenMP.
Note that the env.var. OMP_NUM_TRHEADS sets the maximum nr of threads to use. It defaults to the number of cores.
Do a quicksort, optionally skipping duplicates (qkSort is the actual quicksort function).
Siftdown algorithm for heapsort.
uInt casacore::Sort::sort | ( | Vector< uInt > & | indexVector, |
uInt | nrrec, | ||
int | options = DefaultSort , |
||
Bool | tryGenSort = True |
||
) | const |
Sort the data array of nrrec
records.
The result is an array of indices giving the requested order. It returns the number of resulting records. The indices array is resized to that number.
By default it'll try if the faster GenSortIndirect can be used if a sort on a single key is used.
Referenced by casa::ScantableFieldIterator::ScantableFieldIterator(), casa::ScantableFrequenciesIterator::ScantableFrequenciesIterator(), and casa::ScantableSourceIterator::ScantableSourceIterator().
void casacore::Sort::sortKey | ( | const void * | data, |
DataType | , | ||
uInt | increment = 0 , |
||
Order | = Ascending |
||
) |
Define a sort key (the most significant key should be defined first).
The key contains:
comp
function. Ascending
(default) or Descending
; When the data array has been passed to the Sort constructor, the data pointer and the increment arguments can be replaced by a single argument: the offset of the key in each element of the array.
Referenced by casa::ScantableFieldIterator::ScantableFieldIterator(), casa::ScantableFrequenciesIterator::ScantableFrequenciesIterator(), and casa::ScantableSourceIterator::ScantableSourceIterator().
void casacore::Sort::sortKey | ( | const void * | data, |
const CountedPtr< BaseCompare > & | , | ||
uInt | increment, | ||
Order | = Ascending |
||
) |
void casacore::Sort::sortKey | ( | uInt | offset, |
const CountedPtr< BaseCompare > & | , | ||
Order | = Ascending |
||
) |
Get all unique records in a sorted array.
The array order is given in the indexVector (as possibly returned by the sort function). The default indexVector is 0..nrrec-1. The index of each first unique record is returned in the uniqueVector. They are indices in the supplied indexVector, so data[indexVector(uniqueVector(i))]
is giving the i-th unique record. Note that the records indexed by indexVector(uniqueVector(i))
till indexVector(uniqueVector(i+1))
are all the same.
It returns the number of unique records. The unique array is resized to that number.
uInt casacore::Sort::unique | ( | Vector< uInt > & | uniqueVector, |
const Vector< uInt > & | indexVector | ||
) | const |