casa  $Rev:20696$
 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Defines
Public Types | Public Member Functions | Private Member Functions | Private Attributes
casa::Sort Class Reference

Sort on one or more keys, ascending and/or descending. More...

#include <Sort.h>

List of all members.

Public Types

enum  Option {
  HeapSort,
  InsSort,
  QuickSort,
  NoDuplicates
}
 Enumerate the sort options: More...
enum  Order {
  Ascending,
  Descending
}
 Enumerate the sort order: More...

Public Member Functions

 Sort ()
 The default constructor can be used when the data is only passed in via function sortKey.
 Sort (const void *data, uInt elementSize)
 Construct a Sort object for the given data array with elements of elementSize bytes.
 Sort (const Sort &)
 Copy constructor (copy semantics).
 ~Sort ()
Sortoperator= (const Sort &)
 Assignment (copy semantics).
void sortKey (const void *data, DataType, uInt increment=0, Order=Ascending)
 Define a sort key (the most significant key should be defined first).
void sortKey (const void *data, const CountedPtr< BaseCompare > &, uInt increment, Order=Ascending)
void sortKey (uInt offset, DataType, Order=Ascending)
void sortKey (uInt offset, const CountedPtr< BaseCompare > &, Order=Ascending)
uInt sort (Vector< uInt > &indexVector, uInt nrrec, int options=QuickSort) const
 Sort the data array of nrrec records.
uInt unique (Vector< uInt > &uniqueVector, uInt nrrec) const
 Get all unique records in a sorted array.
uInt unique (Vector< uInt > &uniqueVector, const Vector< uInt > &indexVector) const

Private Member Functions

void copy (const Sort &that)
 
     

void addKey (const void *data, DataType, uInt nr, int options)
 Add a sort key giving a data type or the sort key.
void addKey (SortKey *)
uInt insSort (uInt nr, uInt *indices) const
 Do an insertion sort, optionally skipping duplicates.
uInt insSortNoDup (uInt nr, uInt *indices) const
uInt quickSort (uInt nr, uInt *indices) const
 Do a quicksort, optionally skipping duplicates (qkSort is the actual quicksort function).
uInt quickSortNoDup (uInt nr, uInt *indices) const
void qkSort (Int nr, uInt *indices) const
uInt heapSort (uInt nr, uInt *indices) const
 Do a heapsort, optionally skipping duplicates.
uInt heapSortNoDup (uInt nr, uInt *indices) const
void siftDown (Int low, Int up, uInt *indices) const
 Siftdown algorithm for heapsort.
int compare (uInt index1, uInt index2) const
 Compare the keys of 2 records.
void swap (Int index1, Int index2, uInt *indices) const
 Swap 2 indices.

Private Attributes

PtrBlock< SortKey * > keys_p
uInt nrkey_p
const void * data_p
uInt size_p

Detailed Description

Sort on one or more keys, ascending and/or descending.

Intended use:

Public interface

Review Status

Reviewed By:
Friso Olnon
Date Reviewed:
1995/03/01
Test programs:
tSort, tSort_1

Synopsis

Sort lets you sort data on one or more keys in a mix of Sort::ascending and Sort::descending order. Duplicates can be skipped by giving the option Sort::NoDuplicates. Only in this case the number of output elements can be different from the number of input elements.
The unique function offers another way of getting unique values.

Class Sort does not sort the data themselves, but returns an index to them. This gives more flexibility and allows the sort to be stable; but it is slower.
Very fast sorting of the data themselves can be done with the functions in class GenSort . If sorting on a single key with a standard data type is done, Sort will use GenSortIndirect to speed up the sort.
Three sort algorithms are provided:

Sort::InsSort
Insertion sort has O(n*n) behaviour, thus is very slow. It will only be very fast if the array is already (almost) in the right order.
Sort::QuickSort
Care has been taken to solve the well-known quicksort problems like "array already in order" or "many equal elements". The behaviour is O(n*log(n)) in all the cases tested, even in degenerated cases where the SUN Solaris qsort algorithm is O(n*n).
Sort::HeapSort
Heapsort has O(n*log(n)) behaviour. Its speed is lower than that of QuickSort, so QuickSort is the default algorithm.

All sort algorithms are stable, which means that the original order is kept when keys are equal.

The sort is a four step process:

  1. Construct the Sort object.
  2. Define the sort keys. The function sortKey must be called for each sort key (the most significant one first). The comparison object can be passed in directly, or a basic data type can be given. In the latter case the appropriate ObjCompare comparison object will be created.
  3. Sort the data. The function sort returns an index array, which is allocated when needed.
  4. Destruct the Sort object (usually done automatically) and delete the index array.

The data can be in a single array of structs, in separate arrays, or in a mix of those. Of course, all arrays must have the same length. The data can be passed to the Sort constructor and/or to the sortKey function. If passed to the Sort constructor, the offset of the data item in the data array must be given to sortKey.

Example

In the first example we sort the data contained in two "parallel" arrays, idata and ddata, both of length nrdata.

       Sort sort;
       sort.sortKey (idata, TpInt);                       // define 1st sort key
       sort.sortKey (ddata, TpDouble,0,Sort::Descending); // define 2nd sort key
       Vector<uInt> inx;
       sort.sort (inx, nrdata);
       for (uInt i=0; i<nrdata; i++) {                    // show sorted data
           cout << idata[inx[i]] << " " << ddata[inx[i]] << endl;
       }

Now nr contains the nr of records (=nrdata) and inx an array of (sorted) indices.

In the second example we sort the data stored in an array of structs on the double (ascending) and the string (descending). We can pass the data to the Sort constructor, and the offsets of the struct elements to the sortKey function.

       struct Ts {
            String as;
            double ad;
       }
       Vector<uInt> inx;
       Sort sort (tsarr, sizeof(Ts));
       sort.sortKey ((char*)&tsarr[0].ad - (char*)tsarr, TpDouble);
       sort.sortKey ((char*)&tsarr[0].as - (char*)tsarr, TpString,
                                                          Sort::Descending);
       sort.sort (inx, nrts);

Note that the first argument in function sortKey gives the offset of the variable in the struct.

Alternatively, and probably slightly easier, we could pass the data to the sortKey function and use an increment:

       struct Ts {
            String as;
            double ad;
       }
       Vector<uInt> inx;
       Sort sort;
       sort.sortKey (&tsarr[0].ad, TpDouble, sizeof(Ts));
       sort.sortKey (&tsarr[0].as, TpString, sizeof(Ts), Sort::Descending);
       sort.sort (inx, nrts);

Finally, we could provide a comparison object for the struct.

       struct Ts {
            String as;
            double ad;
       }
       class MyCompare: public BaseCompare {
         virtual int comp (const void* val1, const void* val2) const
         {
           const Ts& t1 = *(Ts*)val1;
           const Ts& t2 = *(Ts*)val2;
           if (t1.ad < t2.ad) return -1;
           if (t1.ad > t2.ad) return 1;
           if (t1.as < t2.as) return 1;    // string must be descending
           if (t1.as > t2.as) return -1;
           return 0;
         }
       };
       Vector<uInt> inx;
       Sort sort;
       sort.sortKey (tsarr, compareTs, sizeof(Ts)); 
       sort.sort (inx, nrts);

Definition at line 236 of file Sort.h.


Member Enumeration Documentation

Enumerate the sort options:

Enumerator:
HeapSort 
InsSort 
QuickSort 
NoDuplicates 

Definition at line 240 of file Sort.h.

Enumerate the sort order:

Enumerator:
Ascending 
Descending 

Definition at line 246 of file Sort.h.


Constructor & Destructor Documentation

The default constructor can be used when the data is only passed in via function sortKey.

casa::Sort::Sort ( const void *  data,
uInt  elementSize 
)

Construct a Sort object for the given data array with elements of elementSize bytes.

This data array will be used when an offset is given to the sortKey functions. You can still pass additional data arrays to the sortKey functions.

casa::Sort::Sort ( const Sort )

Copy constructor (copy semantics).


Member Function Documentation

void casa::Sort::addKey ( const void *  data,
DataType  ,
uInt  nr,
int  options 
) [private]

Add a sort key giving a data type or the sort key.

void casa::Sort::addKey ( SortKey ) [private]
int casa::Sort::compare ( uInt  index1,
uInt  index2 
) const [private]

Compare the keys of 2 records.

void casa::Sort::copy ( const Sort that) [private]

     

Copy that Sort object to this.

uInt casa::Sort::heapSort ( uInt  nr,
uInt indices 
) const [private]

Do a heapsort, optionally skipping duplicates.

uInt casa::Sort::heapSortNoDup ( uInt  nr,
uInt indices 
) const [private]
uInt casa::Sort::insSort ( uInt  nr,
uInt indices 
) const [private]

Do an insertion sort, optionally skipping duplicates.

uInt casa::Sort::insSortNoDup ( uInt  nr,
uInt indices 
) const [private]
Sort& casa::Sort::operator= ( const Sort )

Assignment (copy semantics).

void casa::Sort::qkSort ( Int  nr,
uInt indices 
) const [private]
uInt casa::Sort::quickSort ( uInt  nr,
uInt indices 
) const [private]

Do a quicksort, optionally skipping duplicates (qkSort is the actual quicksort function).

uInt casa::Sort::quickSortNoDup ( uInt  nr,
uInt indices 
) const [private]
void casa::Sort::siftDown ( Int  low,
Int  up,
uInt indices 
) const [private]

Siftdown algorithm for heapsort.

uInt casa::Sort::sort ( Vector< uInt > &  indexVector,
uInt  nrrec,
int  options = QuickSort 
) const

Sort the data array of nrrec records.

The result is an array of indices giving the requested order. It returns the number of resulting records. The indices array is resized to that number.

void casa::Sort::sortKey ( const void *  data,
DataType  ,
uInt  increment = 0,
Order  = Ascending 
)

Define a sort key (the most significant key should be defined first).

The key contains:

  • A pointer to the start of the data array. --- When structs are sorted on an element in the struct, the pointer must point to that element in the first struct.
  • A pointer to the comparison object to be used. --- The comparison object can be specified in two ways:
    • by giving a basic data type , in which case the appropriate comparison object will be created automatically, or
    • by a CountedPtr of a comparison object. You may want to use the templated comparison classes ObjCompare (), but you are free to use any other class derived from BaseCompare that implements the comp function.
  • The increment from one data element to the next. --- When structs are sorted on an element in the struct, the increment should be the size of the struct. If the comparison object is automatically created using the data type specified, the default increment is the size of the data type.
  • The sort order. --- Ascending (default) or Descending;

When the data array has been passed to the Sort constructor, the data pointer and the increment arguments can be replaced by a single argument: the offset of the key in each element of the array.

void casa::Sort::sortKey ( const void *  data,
const CountedPtr< BaseCompare > &  ,
uInt  increment,
Order  = Ascending 
)
void casa::Sort::sortKey ( uInt  offset,
DataType  ,
Order  = Ascending 
)
void casa::Sort::sortKey ( uInt  offset,
const CountedPtr< BaseCompare > &  ,
Order  = Ascending 
)
void casa::Sort::swap ( Int  index1,
Int  index2,
uInt indices 
) const [inline, private]

Swap 2 indices.

Definition at line 383 of file Sort.h.

uInt casa::Sort::unique ( Vector< uInt > &  uniqueVector,
uInt  nrrec 
) const

Get all unique records in a sorted array.

The array order is given in the indexVector (as possibly returned by the sort function). The default indexVector is 0..nrrec-1. The index of each first unique record is returned in the uniqueVector. They are indices in the supplied indexVector, so data[indexVector(uniqueVector(i))] is giving the i-th unique record. Note that the records indexed by indexVector(uniqueVector(i)) till indexVector(uniqueVector(i+1)) are all the same.
It returns the number of unique records. The unique array is resized to that number.

uInt casa::Sort::unique ( Vector< uInt > &  uniqueVector,
const Vector< uInt > &  indexVector 
) const

Member Data Documentation

const void* casa::Sort::data_p [private]

Definition at line 377 of file Sort.h.

Definition at line 375 of file Sort.h.

Definition at line 376 of file Sort.h.

Definition at line 378 of file Sort.h.


The documentation for this class was generated from the following file: