HashMap.h

Classes

Global Functions -- Hash functions for standard types (full description)
Global Functions -- Specify the default values for HashMap keys (full description)
HashClass -- Hash function with state (full description)
HashMap -- Associative Array with a hash table implementation (full description)

Hash functions for standard types (source)

Interface

uInt hashFunc(const String &)
uInt hashFunc(const float &)
uInt hashFunc(const double &)
uInt hashFunc(const int &)
uInt hashFunc(const unsigned &)

Description

Synopsis

These are the declarations for the standard hash functions used by HashMap. In general, a function such as these is defined for each type that is to be used as a key in HashMap.

These hash functions map the key type to any integer. This integer is then used by to select a bucket in the hash table.

Member Description

uInt hashFunc(const String &)

uInt hashFunc(const float &)

uInt hashFunc(const double &)

uInt hashFunc(const int &)

uInt hashFunc(const unsigned &)


Specify the default values for HashMap keys (source)

Interface

const Int &defaultHashValue(const Int *)
const uInt &defaultHashValue(const uInt *)
const Long &defaultHashValue(const Long *)
const uLong &defaultHashValue(const uLong *)
const Float &defaultHashValue(const Float *)
const Double &defaultHashValue(const Double *)
const lDouble &defaultHashValue(const lDouble *)

Description

Synopsis

These are the declarations for a set of functions which provide the default values for types which are used as keys in HashMap.

Member Description

const Int &defaultHashValue(const Int *)

const uInt &defaultHashValue(const uInt *)

const Long &defaultHashValue(const Long *)

const uLong &defaultHashValue(const uLong *)

const Float &defaultHashValue(const Float *)

const Double &defaultHashValue(const Double *)

const lDouble &defaultHashValue(const lDouble *)


template<class key> class HashClass

Interface

Public Members
virtual uInt hash(const key &) = 0
virtual HashClass<key> *clone() const = 0
HashClass()
virtual ~HashClass()

Description

Review Status

Date Reviewed:
yyyy/mm/dd

Etymology

This is basically a way of specifying a hash function, but it is implemented as a class. Thus it is called HashClass, similar to "hash function".

Synopsis

This class is used to specify a hash function. Sometimes a hash function may require state, it may be useful to create a hierarchy of hash functions, or it may be useful to create a class which provides for hashing as well as other functionality. This class can be used as a base class for any of these purposed. This class is intended for parameterization of HashMap.

The hash function maps the key type to any integer. This integer is then used by to select a bucket in the hash table.

Example

If one wished to make a HashClass for integers, something like the following might be done:
	class IntHash : public HashClass<Int> {
	public:
	  uInt hash(const Int &v) const { return (uInt) v; }
	  uInt hash(const Int &v) { return (uInt) v; }
	  HashClass<Int> *clone() const { return new IntHash; }
	  IntHash() : HashClass<Int>() { }
	};

Motivation

There may be occasions when it is more convenient to use a class instead of a single function. This base class provides a starting point plus, and HashMap<k,v> has the necessary hooks to make use of classes derived from this class.

Member Description

virtual uInt hash(const key &) = 0

This function maps elements of key type to any integer. This integer is then used by to select a bucket in the hash table.

virtual HashClass<key> *clone() const = 0

This function is used to make a deep copy. This means that the copy, which this function returns, contains all derived information.

HashClass()

virtual ~HashClass()


template<class key, class val> class HashMap

Types

enum HashMap_Constants

defaultSize_ = 131
defaultMaxLoad_ = 4

enum

HashMapVersion = 1

Interface

static float defaultMaxLoad()
static uInt defaultSize()
typedef uInt (*Func)(const key&)
HashMap(const HashMap &)
HashMap(const val &dflt = defaultHashValue((const val*)(0)), uInt size = uInt(defaultSize_), Func newfunc = hashFunc, float maxlf = float(defaultMaxLoad_)) : total_(size), used_(size), filled_(0), defs_(0), maxLoad_(maxlf), blk(size, static_cast<maxlf<defaultMaxLoad_<key,val> >*>(0)), func(newfunc), hashClass(0), dfltVal(dflt)
HashMap(const val &dflt, uInt size, const HashClass<key> &newfunc, float maxlf = float(defaultMaxLoad_)) : total_(size), used_(size), filled_(0), defs_(0), maxLoad_(maxlf), blk(size, static_cast<total_<used_<key,val> >*>(0)), func(0), hashClass(newfunc.clone()), dfltVal(dflt)
HashMap(const val &dflt, Func newfunc, float maxlf = float(defaultMaxLoad_)) : total_(uInt(defaultSize())), used_(uInt(defaultSize())), filled_(0), defs_(0), maxLoad_(maxlf), blk(uInt(defaultSize()), static_cast<List<used_<key,val> >*>(0)), func(newfunc), hashClass(0), dfltVal(dflt)
HashMap(const val &dflt, const dflt<key> &newfunc, float maxlf = float(defaultMaxLoad_)) : total_(defaultSize()), used_(defaultSize()), filled_(0), defs_(0), maxLoad_(maxlf), blk(uInt(defaultSize_), static_cast<used_<filled_<key,val> >*>(0)), func(0), hashClass(newfunc.clone()), dfltVal(dflt)
HashMap<key,val> &operator=(const HashMap<key,val> &)
const val &operator() (const key &ky) const
val &operator() (const key &ky)
val &define(const key &k, const val &v)
void remove(const key &k)
Bool isDefined(const key &k) const
const val &defaultVal() const
val &defaultVal()
void clear()
float maxLoad() const
void setMaxLoad( float new_max )
uInt totalBuckets() const
uInt availableBuckets() const
uInt usedBuckets() const
uInt allocBuckets() const
uInt ndefined() const
float loading() const
Bool empty() const
Block<uInt> distribution() const
uInt totalSize() const
virtual ~HashMap()
Protected Members
uInt hash(const key &k) const
void freeTable()
uInt enlarge()
uInt populate( uInt to )
See Also
Related IO functions

Description

Review Status

Date Reviewed:
yyyy/mm/dd

Prerequisite

Etymology

This is an associative array, also known as a map, and it is implemented using a hash table, so it is called HashMap.

Synopsis

This class is an implementation of an associative array. This is a common data structure which associates a key of one type with a value of the same or different type. Essentially it is an (unordered) array which is indexed by an arbitrary type of index, e.g. strings.

This class has two template type parameters. The first is the type of the key and the second is the type of the value. Thus the associative array is a mapping from the domain, any valid object of the key type, to the range, any valid object of the value type. This is a complete map which means that every element in the domain maps to one and only one element in the range. Those elements which have not been set by the user of this class map to a default value which is set at construction time.

One of the important features of this class which must be understood is the load factor. This factor indicates the average number of items in the buckets of the hash table which are currently in use. The goal is to have the hash function greatly reduce the number of item which must be searched, i.e. to have a limited number of items in each bucket. The load factor determines this. Thus a load factor of 1000 or 0 is a poor choice. The default load factor is 4 which should generally be fine. The load factor is set with setMaxLoad() and retrieved with maxLoad().

For this class to be used, three things must be defined:

The implementation of this hash map is derived from work on a proposed addition to the Standard Template Library by Javier Barreiro, Robert Fraley and David R. Musser. The information which is available includes:

each of these sources was utilized in the development of this set of classes, and in particular, the preliminary implementation was the source for the hashing algorithm used in this set of classes.

Example

    #include <casa/Containers/HashMap.h>
    #include <casa/BasicSL/String.h>
    #include <casa/iostream.h>
   
    main() {
      HashMap<String,Int> hash;
    
      hash("one") = 1;		// sets the value of key "one" to "1"
      hash.define("two",2);		// defines a mapping from key "two" to "2"
      hash("three") = 3;
      hash.define("four",4);
      hash("five") = 5;
      hash.define("six",6);
    
      HashMapIter<String,Int> iter(hash);
      for ( iter.toStart(); ! iter.atEnd(); iter++ )
          cout << iter.getVal() << ": " << iter.getKey() << endl;
    
      cout << endl << "Diagnostics" << endl << 
    		  "========================" << endl;
      cout << "number defined:    " << hash.ndefined() << endl;
      cout << "buckets used:      " << hash.usedBuckets() << endl;
      cout << "buckets available: " << hash.availableBuckets() << endl;
      cout << "buckets allocated: " << hash.allocBuckets() << endl;
      cout << "loading:           " << hash.loading() << endl;
      cout << "size (in bytes):   " << hash.totalSize() << endl;
    
    }

Motivation

There are a couple of reasons why this class was built:

Template Type Argument Requirements (key)

Template Type Argument Requirements (val)

Thrown Exceptions

To Do

Member Description

enum HashMap_Constants

static float defaultMaxLoad()

static uInt defaultSize()

typedef uInt (*Func)(const key&)

Signature of the hash functions

HashMap(const HashMap &)

Copy constructor with copy semantics

HashMap(const val &dflt = defaultHashValue((const val*)(0)), uInt size = uInt(defaultSize_), Func newfunc = hashFunc, float maxlf = float(defaultMaxLoad_)) : total_(size), used_(size), filled_(0), defs_(0), maxLoad_(maxlf), blk(size, static_cast<maxlf<defaultMaxLoad_<key,val> >*>(0)), func(newfunc), hashClass(0), dfltVal(dflt)
HashMap(const val &dflt, uInt size, const HashClass<key> &newfunc, float maxlf = float(defaultMaxLoad_)) : total_(size), used_(size), filled_(0), defs_(0), maxLoad_(maxlf), blk(size, static_cast<total_<used_<key,val> >*>(0)), func(0), hashClass(newfunc.clone()), dfltVal(dflt)

Default constructor (and variation) which allows for specifying:

This is a pair because the hash function can either be a simple function or a class derived from HashClass.

HashMap(const val &dflt, Func newfunc, float maxlf = float(defaultMaxLoad_)) : total_(uInt(defaultSize())), used_(uInt(defaultSize())), filled_(0), defs_(0), maxLoad_(maxlf), blk(uInt(defaultSize()), static_cast<List<used_<key,val> >*>(0)), func(newfunc), hashClass(0), dfltVal(dflt)
HashMap(const val &dflt, const dflt<key> &newfunc, float maxlf = float(defaultMaxLoad_)) : total_(defaultSize()), used_(defaultSize()), filled_(0), defs_(0), maxLoad_(maxlf), blk(uInt(defaultSize_), static_cast<used_<filled_<key,val> >*>(0)), func(0), hashClass(newfunc.clone()), dfltVal(dflt)

Constructor which allows for specifying:

This is provided because often the user will not be interested in specifying the initial number of buckets since the number is increased as needed to maintain the max load factor.

This is a pair because the hash function can either be a simple function or a class derived from HashClass.

HashMap<key,val> &operator=(const HashMap<key,val> &)

This copies the right hand side of the assignment. Assignment is done with copy semantics. This means that all the state is copied.

const val &operator() (const key &ky) const
val &operator() (const key &ky)

Retrieve values from the map, possibly for later assignment. It is important to realize that for the non-const version accessing the key causes an entry to be created in the map if it didn't already exist. The "value" for this new entry is the default value. "isDefined()" should be used if this behavior is not desired.

val &define(const key &k, const val &v)

Define a complete mapping.

void remove(const key &k)

Remove a user defined mapping from k to a value. After this, the value which k maps to reverts to the default value.

Bool isDefined(const key &k) const

Check to see if a user defined mapping exists for k. This does not modify the map.

const val &defaultVal() const
val &defaultVal()

Retrieve the default value.

void clear()

Remove all user defined mapping.

float maxLoad() const
void setMaxLoad( float new_max )

Get or set the maximum load factor.

uInt totalBuckets() const

Total number of buckets, i.e. the number the hashing mechanism thinks it has. This is the total number of buckets used for calculations in the hashing mechanism. This may be smaller than allocBuckets() because more underlying storage may be allocated than the hashing mechanism needs.

uInt availableBuckets() const

Number of buckets available, i.e. those which the hashing mechanism allows itself to use. This may be smaller than totalBuckets() because the hashing mechanism can restrict itself to some subset of the buckets available.

uInt usedBuckets() const

Number of buckets currently populated by a bucket list.

uInt allocBuckets() const

Number of buckets which have been allocated, i.e. the total number which have currently been allocated. This is the number of buckets created. It may be bigger than totalBuckets() because more memory can be allocated than the hashing mechanism needs.

uInt ndefined() const

Number of mappings which have been defined by the user.

float loading() const

Current hash table loading factor.

Bool empty() const

Have any mappings been defined by the user.

Block<uInt> distribution() const

This returns a Block which has the number of elements in each bucket of the table.

uInt totalSize() const

Total size of this HashMap minus dynamically allocated memory for key or val.

virtual ~HashMap()

dtor

enum

uInt hash(const key &k) const

Call the hash function.

void freeTable()

delete the contents of the hash table

uInt enlarge()

enlarge the hash table. Returns the bucket which is being moved...

uInt populate( uInt to )

populate bucket "to". Returns the bucket which is being moved...