Sort.h

Classes

SortKey -- Define a Sort key (full description)
Sort -- Sort on one or more keys, ascending and/or descending (full description)

class SortKey

Interface

Public Members
SortKey (const void* data, ObjCompareFunc*, uInt increment, int order)
SortKey (const SortKey&)
~SortKey()
SortKey& operator= (const SortKey&)
uInt tryGenSort (Vector<uInt>& indexVector, uInt nrrec, int opt) const

Description

Review Status

Reviewed By:
Friso Olnon
Date Reviewed:
1995/03/01
Programs:
Tests:

Synopsis

SortKey is a helper class for the Sort class. It holds the following information about a sort key:

Member Description

SortKey (const void* data, ObjCompareFunc*, uInt increment, int order)

Define a sort key in a given data array using the indicated comparison function, stride and sort order.

SortKey (const SortKey&)

Copy constructor (copy semantics).

~SortKey()

SortKey& operator= (const SortKey&)

Assignment (copy semantics).

uInt tryGenSort (Vector<uInt>& indexVector, uInt nrrec, int opt) const

Try if GenSort can be used for this single key. If it succeeds, it returns the resulting number of elements. Otherwise it returns 0.


class Sort

Types

enum Option

HeapSort = 1
InsSort = 2
use Heapsort algorithm
QuickSort = 4
use insertion sort algorithm
NoDuplicates = 8
use Quicksort algorithm

enum Order

Ascending = -1,Descending=1

Interface

Sort()
Sort (const void* data, uInt elementSize)
Sort (const Sort&)
~Sort()
Sort& operator= (const Sort&)
void sortKey (const void* data, DataType, uInt increment = 0, Order = Ascending)
void sortKey (const void* data, ObjCompareFunc*, uInt increment, Order = Ascending)
void sortKey (uInt offset, DataType, Order = Ascending)
void sortKey (uInt offset, ObjCompareFunc*, Order = Ascending)
uInt sort (Vector<uInt>& indexVector, uInt nrrec, int options = HeapSort) const
uInt unique (Vector<uInt>& uniqueVector, uInt nrrec) const
uInt unique (Vector<uInt>& uniqueVector, const Vector<uInt>& indexVector) const
Private Members
void copy (const Sort& that)
void addKey (const void* data, DataType, uInt nr, int options)
void addKey (const void* data, ObjCompareFunc*, uInt nr, int options)
uInt insSort (uInt nr, uInt* indices) const
uInt insSortNoDup (uInt nr, uInt* indices) const
uInt quickSort (uInt nr, uInt* indices) const
uInt quickSortNoDup (uInt nr, uInt* indices) const
void qkSort (Int nr, uInt* indices) const
uInt heapSort (uInt nr, uInt* indices) const
uInt heapSortNoDup (uInt nr, uInt* indices) const
void siftDown (Int low, Int up, uInt* indices) const
int compare (uInt index1, uInt index2) const
inline void swap (Int index1, Int index2, uInt* indices) const

Description

Review Status

Reviewed By:
Friso Olnon
Date Reviewed:
1995/03/01
Programs:
Tests:
  • tSort
  • tSort_1

Synopsis

Sort lets you sort data on one or more keys in a mix of Sort::ascending and Sort::descending order. Duplicates can be skipped by giving the option Sort::NoDuplicates. Only in this case the number of output elements can be different from the number of input elements.
The unique offers another way of getting unique values,

Class Sort does not sort the data themselves, but returns an index to them. This gives more flexibility and allows the sort to be stable; but it is slower. Very fast sorting of the data themselves can be done with the functions in class GenSort.
Three sort algorithms are provided:

Sort::InsSort
Insertion sort has O(n*n) behaviour, thus is very slow. It will only be very fast when the array is already (almost) in the right order.
Sort::QuickSort
Care has been taken to solve the well-known quicksort problems like "array already in order" or "many equal elements". The behaviour is O(n*log(n)) in all the cases tested, even in degenerated cases where the SUN Solaris qsort algorithm is O(n*n).
Sort::HeapSort
Heapsort has O(n*log(n)) behaviour. Its speed is higher than that of QuickSort, so it is the default algorithm.
All sort algorithms are stable, which means that the original order is kept when keys are equal.

The sort is a four step process:

  1. Construct the Sort object.
  2. Define the sort keys. The function sortKey must be called for each sort key (the most significant one first). The comparison function can be passed in directly, or a basic data type can be given. In the latter case the appropriate comparison function will be selected.
  3. Sort the data. The function sort returns an index array, which is allocated when needed.
  4. Destruct the Sort object (usually done automatically) and delete the index array.
The data can be in a single array of structs, in separate arrays, or in a mix of those. Of course, all arrays must have the same length. The data can be passed to the Sort constructor and/or to the sortKey function. If passed to the Sort constructor, the offset of the data item in the data array must be given to sortKey.

Example

In the first example we sort the data contained in two "parallel" arrays, idata and ddata, both of length nrdata.
    Sort sort;
    sort.sortKey (idata, TpInt);                       // define 1st sort key
    sort.sortKey (ddata, TpDouble,0,Sort::Descending); // define 2nd sort key
    uInt* inx=0;
    sort.sort (nrdata, inx);
    for (uInt i=0; i<nrdata; i++) {                    // show sorted data
        cout << idata[inx[i]] << " " << ddata[inx[i]] << endl;
    }
    delete inx;
Now nr contains the nr of records (=nrdata) and inx an array of (sorted) indices. The index array inx has to be deleted by the user.

In the second example we sort the data stored in an array of structs on the double (ascending) and the string (descending). We can pass the data to the Sort constructor, and the offsets of the struct elements to the sortKey function.

    struct Ts {
         String as;
         double ad;
    }
    uInt* inx=0;
    Sort sort (tsarr, sizeof(Ts));
    sort.sortKey ((char*)&tsarr[0].ad - (char*)tsarr, TpDouble);
    sort.sortKey ((char*)&tsarr[0].as - (char*)tsarr, TpString,
                                                       Sort::Descending);
    sort.sort (nrts, inx);
Note that the first argument in function sortKey gives the offset of the variable in the struct.

Alternatively, and probably slightly easier, we could pass the data to the sortKey function:

    struct Ts {
         String as;
         double ad;
    }
    uInt* inx=0;
    Sort sort;
    sort.sortKey (&tsarr[0].ad, TpDouble, sizeof(Ts));
    sort.sortKey (&tsarr[0].as, TpString, sizeof(Ts), Sort::Descending);
    sort.sort (nrts, inx);

Finally, we could provide a comparison function for the struct. (The function is shown inline, but should be implemented out-of-line.) This method is not recommended, because it means more work for the user. Moreover, the descending sort on the string has to be coded in the comparison function, which is not very flexible.

    struct Ts {
         String as;
         double ad;
    }
    int compareTs (const void* val1, const void* val2)
    {
        const Ts& t1 = *(Ts*)val1;
        const Ts& t2 = *(Ts*)val2;
        if (t1.ad < t2.ad) return -1;
        if (t1.ad > t2.ad) return 1;
        if (t1.as < t2.as) return 1;    // string must be descending
        if (t1.as > t2.as) return -1;
        return 0;
    }
    uInt* inx=0;
    Sort sort;
    sort.sortKey (tsarr, compareTs, sizeof(Ts)); 
    sort.sort (nrts, inx);

The last example illustrates the use of the Sort::NoDuplicates flag and an input index array. First we remove duplicate strings, and then sort the result.

    struct Ts {
         String as;
         double ad;
    }
    uInt* inxNoDup=0;
    Sort sort;
    sort.sortKey (&tsarr[0].as, TpString, sizeof(Ts));
    uInt nrout = sort.sort (nrts, inxNoDup, Sort::NoDuplicates);
    Sort sort2;
    sort2.sortKey (&tsarr[0].ad, TpDouble, sizeof(Ts));
    sort2.sortKey (&tsarr[0].as, TpString, sizeof(Ts), Sort::Descending);
    uInt* inx=0;
    sort2.sort (nrout, inxNoDup, inx);

Member Description

enum Option

Enumerate the sort options:

enum Order

Enumerate the sort order:

Sort()

The default constructor can be used when the data is only passed in via function sortKey.

Sort (const void* data, uInt elementSize)

Construct a Sort object for the given data array with elements of elementSize bytes. This data array will be used when an offset is given to the sortKey functions. You can still pass additional data arrays to the sortKey functions.

Sort (const Sort&)

Copy constructor (copy semantics).

~Sort()

Sort& operator= (const Sort&)

Assignment (copy semantics).

void sortKey (const void* data, DataType, uInt increment = 0, Order = Ascending)
void sortKey (const void* data, ObjCompareFunc*, uInt increment, Order = Ascending)
void sortKey (uInt offset, DataType, Order = Ascending)
void sortKey (uInt offset, ObjCompareFunc*, Order = Ascending)

Define a sort key (the most significant key should be defined first). The key contains:

When the data array has been passed to the Sort constructor, the data pointer and the increment arguments can be replaced by a single argument: the offset of the key in each element of the array.

uInt sort (Vector<uInt>& indexVector, uInt nrrec, int options = HeapSort) const

Sort the data array of nrrec records. The result is an array of indices giving the requested order. It returns the number of resulting records. The indices array is resized to that number.

uInt unique (Vector<uInt>& uniqueVector, uInt nrrec) const
uInt unique (Vector<uInt>& uniqueVector, const Vector<uInt>& indexVector) const

Get all unique records in a sorted array. The array order is given in the indexVector (as possibly returned by the sort function). The default indexVector is 0..nrrec-1. The index of each first unique record is returned in the uniqueVector. They are indices in the supplied indexVector, so /src>data[indexVector(uniqueVector(i))] is giving the i-th unique record. Note that the records indexed by indexVector(uniqueVector(i)) till indexVector(uniqueVector(i+1)) are all the same.
It returns the number of unique records. The unique array is resized to that number.

void copy (const Sort& that)

Copy that Sort object to this.

void addKey (const void* data, DataType, uInt nr, int options)
void addKey (const void* data, ObjCompareFunc*, uInt nr, int options)

Add a sort key giving a data type or a compare function.

uInt insSort (uInt nr, uInt* indices) const
uInt insSortNoDup (uInt nr, uInt* indices) const

Do an insertion sort, optionally skipping duplicates.

uInt quickSort (uInt nr, uInt* indices) const
uInt quickSortNoDup (uInt nr, uInt* indices) const
void qkSort (Int nr, uInt* indices) const

Do a quicksort, optionally skipping duplicates (qkSort is the actual quicksort function).

uInt heapSort (uInt nr, uInt* indices) const
uInt heapSortNoDup (uInt nr, uInt* indices) const

Do a heapsort, optionally skipping duplicates.

void siftDown (Int low, Int up, uInt* indices) const

Siftdown algorithm for heapsort.

int compare (uInt index1, uInt index2) const

Compare the keys of 2 records.

inline void swap (Int index1, Int index2, uInt* indices) const

Swap 2 indices.