SSMBase.h
Classes
- SSMBase -- Base class of the Standard Storage Manager
Interface
- Public Members
- explicit SSMBase (Int aBucketSize=0, uInt aCacheSize=1)
- explicit SSMBase (const String& aDataManName, Int aBucketSize=0, uInt aCacheSize=1)
- SSMBase (const String& aDataManName, const Record& spec)
- ~SSMBase()
- virtual DataManager* clone() const
- virtual String dataManagerType() const
- virtual String dataManagerName() const
- virtual Record dataManagerSpec() const
- uInt getVersion() const
- void setCacheSize (uInt aCacheSize)
- uInt getCacheSize() const
- void clearCache()
- void showCacheStatistics (ostream& anOs) const
- void showIndexStatistics (ostream & anOs) const
- void showBaseStatistics (ostream & anOs) const
- uInt getBucketSize() const
- uInt getNRow() const
- virtual Bool canAddRow() const
- virtual Bool canRemoveRow() const
- virtual Bool canAddColumn() const
- virtual Bool canRemoveColumn() const
- static DataManager* makeObject (const String& aDataManType, const Record& spec)
- SSMColumn& getColumn (uInt aColNr)
- SSMIndex& getIndex (uInt anIdxNr)
- void setBucketDirty()
- StManArrayFile* openArrayFile (ByteIO::OpenOption anOpt)
- char* find (uInt aRowNr, uInt aColNr, uInt& aStartRow, uInt& anEndRow)
- uInt getNewBucket()
- char* getBucket (uInt aBucketNr)
- void removeBucket (uInt aBucketNr)
- uInt getRowsPerBucket (uInt aColumn) const
- SSMStringHandler* getStringHandler()
- static char* readCallBack (void* anOwner, const char* aBucketStorage)
- static void writeCallBack (void* anOwner, char* aBucketStorage, const char* aBucket)
- static void deleteCallBack (void*, char* aBucket)
- static char* initCallBack (void* anOwner)
- Private Members
- SSMBase (const SSMBase& that)
- SSMBase& operator= (const SSMBase& that)
- void recreate()
- virtual Bool flush (AipsIO&, Bool doFsync)
- virtual void create (uInt aNrRows)
- virtual void open (uInt aRowNr, AipsIO&)
- virtual void resync (uInt aRowNr)
- virtual void reopenRW()
- virtual void deleteManager()
- void init()
- uInt setBucketSize()
- uInt getNrIndices() const
- virtual void addRow (uInt aNrRows)
- virtual void removeRow (uInt aRowNr)
- virtual void addColumn (DataManagerColumn*)
- virtual void removeColumn (DataManagerColumn*)
- virtual DataManagerColumn* makeScalarColumn (const String& aName, int aDataType, const String& aDataTypeID)
- virtual DataManagerColumn* makeDirArrColumn (const String& aName, int aDataType, const String& aDataTypeID)
- virtual DataManagerColumn* makeIndArrColumn (const String& aName, int aDataType, const String& aDataTypeID)
- BucketCache& getCache()
- void makeCache()
- void readHeader()
- void readIndexBuckets()
- void writeIndex()
Review Status
- Reviewed By:
- UNKNOWN
- Date Reviewed:
- before 2004/08/25
- Programs:
- Tests:
Prerequisite
Etymology
SSMBase is the base class of the Standard Storage Manager.
Synopsis
The global principles of this class are described in
StandardStMan.
The Standard Storage Manager divides the data file into equally sized
chunks called buckets. There are 3 types of buckets:
- Data buckets containing the fixed length data (scalars and
direct arrays of data type Int, Float, Bool, etc.).
For variable shaped data (strings and indirect arrays) they
contain references to the actual data position in the
string buckets or in an external file.
- String buckets containing strings and arrays of strings.
- Index buckets containing the index info for the data buckets.
Bucket access is handled by class
BucketCache.
It also keeps a list of free buckets. A bucket is freed when it is
not needed anymore (e.g. all data from it are deleted).
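The free-bucket bookkeeping can be pictured with a minimal sketch. The class ToyBucketAllocator and its members are hypothetical names chosen for illustration and are not BucketCache's actual interface. The sketch only shows the idea that a freed bucket number is reused before the file is extended.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical illustration; not the actual BucketCache interface.
class ToyBucketAllocator {
public:
    // Return a bucket number: reuse a freed bucket if there is one,
    // otherwise extend the file by one bucket.
    std::uint32_t getNewBucket() {
        if (!freeList_.empty()) {
            std::uint32_t nr = freeList_.back();
            freeList_.pop_back();
            return nr;
        }
        return nrBuckets_++;
    }

    // Mark a bucket as free so that a later getNewBucket() can reuse it.
    void removeBucket(std::uint32_t bucketNr) {
        assert(bucketNr < nrBuckets_);
        freeList_.push_back(bucketNr);
    }

    // Number of buckets the file has been extended to so far.
    std::uint32_t nrBucketsInFile() const { return nrBuckets_; }

private:
    std::uint32_t nrBuckets_ = 0;          // buckets allocated in the file
    std::vector<std::uint32_t> freeList_;  // bucket numbers available for reuse
};
```

In this sketch, allocating two buckets, freeing bucket 0, and allocating again hands out bucket 0 a second time, so the file grows only when the free list is empty.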
Data buckets form the main part of the SSM. The data can be viewed as
a few streams of buckets, where each stream contains the data of
a given number of columns. Each stream has an
SSMIndex object describing the
number of rows stored in each data bucket of the stream.
The SSM starts with a single bucket stream (holding all columns),
but when columns are added, new bucket streams might be created.
For example, take an SSM with a bucket size of 100 bytes.
There are 5 Int columns (A,B,C,D,E), each taking 4 bytes per row.
Columns A, B, C, and D are stored in bucket stream 1, while column
E is stored in bucket stream 2. So in stream 1 a row takes 16 bytes and
each bucket can hold 6 rows, while in stream 2 each bucket can hold 25 rows.
A 100-row table therefore needs 17+4 data buckets.
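The arithmetic of this example can be checked with a small sketch. The helper names rowsPerBucket and bucketsNeeded are hypothetical, and the sketch assumes, as the example does, that a bucket carries no per-bucket overhead.

```cpp
#include <cassert>
#include <cstdint>

// Rows that fit in one bucket of a stream whose rows take rowBytes bytes.
// Integer division is used because a partial row does not fit.
std::uint32_t rowsPerBucket(std::uint32_t bucketSize, std::uint32_t rowBytes) {
    assert(rowBytes > 0 && rowBytes <= bucketSize);
    return bucketSize / rowBytes;
}

// Buckets needed to hold nrRows rows, rounded up to whole buckets.
std::uint32_t bucketsNeeded(std::uint32_t nrRows, std::uint32_t rowsPerBkt) {
    assert(rowsPerBkt > 0);
    return (nrRows + rowsPerBkt - 1) / rowsPerBkt;
}
```

With a bucket size of 100, stream 1 (4 columns of 4 bytes) gives rowsPerBucket(100, 16) = 6 and bucketsNeeded(100, 6) = 17, while stream 2 gives rowsPerBucket(100, 4) = 25 and bucketsNeeded(100, 25) = 4.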
A few classes collaborate to make it work:
- Each bucket stream has an SSMIndex
object to map row number to bucket number.
Note that in principle each bucket in a stream contains the same
number of rows. However, when a row is deleted it is removed
from its bucket, shifting the remaining rows in that bucket to the left.
Data in the following buckets are not shifted, so that bucket now holds
one row fewer.
- For each column SSMBase knows to which bucket stream it belongs
and at which offset the column starts in a bucket.
Note that column data in a bucket are adjacent, which is done
to make it easier to use the
ColumnCache object in SSMColumn
and to be able to efficiently store Bool values as bits.
- Each column has an SSMColumn
object knowing how many bits each data cell takes in a bucket.
The SSMColumn objects handle all access to data in the columns
(using SSMBase and SSMIndex).
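The row-to-bucket mapping, and the effect of deleting a row, can be sketched as follows. ToyIndex is a hypothetical class and not the actual SSMIndex; it assumes that the index keeps the last row number of each bucket and returns the position of a bucket within its stream rather than a file bucket number.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration of a per-stream row index; not the actual SSMIndex.
class ToyIndex {
public:
    // Append a bucket holding nrRows rows to the end of the stream.
    void addBucket(std::uint32_t nrRows) {
        assert(nrRows > 0);
        std::uint32_t first = lastRow_.empty() ? 0 : lastRow_.back() + 1;
        lastRow_.push_back(first + nrRows - 1);
    }

    // Return the position of the bucket holding rowNr and fill in the
    // first and last row stored in that bucket (binary search).
    std::size_t find(std::uint32_t rowNr,
                     std::uint32_t& startRow, std::uint32_t& endRow) const {
        auto it = std::lower_bound(lastRow_.begin(), lastRow_.end(), rowNr);
        assert(it != lastRow_.end());
        std::size_t pos = static_cast<std::size_t>(it - lastRow_.begin());
        startRow = (pos == 0) ? 0 : lastRow_[pos - 1] + 1;
        endRow = *it;
        return pos;
    }

    // Remove one row: only its own bucket loses a row; the later buckets
    // keep their contents and merely get renumbered.
    void removeRow(std::uint32_t rowNr) {
        std::uint32_t startRow = 0, endRow = 0;
        std::size_t pos = find(rowNr, startRow, endRow);
        if (startRow == endRow) {
            // The bucket held a single row and is now empty.
            lastRow_.erase(lastRow_.begin() + static_cast<std::ptrdiff_t>(pos));
        }
        for (std::size_t i = pos; i < lastRow_.size(); ++i) {
            --lastRow_[i];
        }
    }

private:
    std::vector<std::uint32_t> lastRow_;  // last row number held by each bucket
};
```

With three buckets of 6 rows each, removing row 2 leaves the first bucket with rows 0 to 4 while the second bucket still holds 6 rows, now numbered 5 to 10.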
String buckets are used by class
SSMStringHandler to
store scalar strings and fixed and variable shaped arrays of strings.
The bucket number, offset, and length of such strings (or string arrays)
are stored in the data buckets.
Indirect arrays of other data types are also stored indirectly
and their offset is stored in the data buckets. Such arrays are
handled by class StIndArray
which uses an extra file to store the arrays.
Index buckets are used by SSMBase to make the SSMIndex data persistent.
It alternately uses 2 sets of index buckets. In that way there is
always an index available in case the system crashes.
If possible, the 2 halves of a single bucket are used alternately; otherwise
separate buckets are used.
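The idea of alternating between two index copies can be sketched as below. ToyIndexStore is hypothetical, and the explicit commit step that switches the valid copy is an assumption made for illustration; the text does not say how SSMBase records which set is current.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical illustration of alternating between two index copies.
class ToyIndexStore {
public:
    // Write the new index into the slot that is NOT currently valid.
    // The previously committed copy stays untouched.
    void write(const std::string& indexData) {
        slots_[1 - valid_] = indexData;
        written_ = true;
    }

    // Make the freshly written slot the valid one (assumed commit step).
    void commit() {
        assert(written_);
        valid_ = 1 - valid_;
        written_ = false;
    }

    // The last committed index, which survives an interrupted write.
    const std::string& read() const { return slots_[valid_]; }

private:
    std::array<std::string, 2> slots_;  // the two alternating index copies
    std::size_t valid_ = 0;             // slot holding the committed index
    bool written_ = false;              // true between write() and commit()
};
```

If a write is interrupted before the commit, read() still returns the previous index, which is the property the alternating scheme is meant to provide.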
Motivation
The public interface of SSMBase is quite large, because the other
internal SSM classes need these functions. To have a class with a
minimal interface for the normal user, class StandardStMan
is derived from it.
StandardStMan needs an isA- instead of hasA-relation to be
able to bind columns to it in class
SetupNewTable.
To Do
- Remove AipsIO argument from open and close.
- When only 1 bucket is in use, addColumn can check whether there is
enough room to fit the new column in the free row space
(and so rearrange the bucket).
Member Description
explicit SSMBase (Int aBucketSize=0, uInt aCacheSize=1)
Create a Standard storage manager with default name SSM.
explicit SSMBase (const String& aDataManName, Int aBucketSize=0, uInt aCacheSize=1)
Create a Standard storage manager with the given name.
SSMBase (const String& aDataManName, const Record& spec)
Create a Standard storage manager with the given name.
The specifications are part of the record (as created by dataManagerSpec).
Clone this object.
It does not clone SSMColumn objects possibly used.
The caller has to delete the newly created object.
Get the type name of the data manager (i.e. StandardStMan).
Get the name given to the storage manager (in the constructor).
Return a record containing the data manager specifications.
Get the version of the class.
Set the cache size (in buckets).
Get the current cache size (in buckets).
Clear the cache used by this storage manager.
It will flush the cache as needed and remove all buckets from it.
Show the statistics of all caches used.
Show the statistics of all indices used.
Show the statistics of the base (offsets, index, etc.).
Get the bucket size.
Get the number of rows in this storage manager.
virtual Bool canAddRow() const
The storage manager can add rows.
The storage manager can delete rows.
The storage manager can add columns.
The storage manager can delete columns.
Make the object from the type name string.
This function gets registered in the DataManager "constructor" map.
The caller has to delete the object.
Get access to the given column.
Get access to the given Index.
Make the current bucket in the cache dirty (i.e. something has been
changed in it and it needs to be written when removed from the cache).
(used by SSMColumn::putValue).
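The role of the dirty flag can be illustrated with a minimal write-back sketch. ToyCache is hypothetical and holds only one bucket at a time; it is not the actual BucketCache, whose size is set in buckets via setCacheSize.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical illustration of dirty-flag write-back; not the actual BucketCache.
class ToyCache {
public:
    // Make bucketNr the current bucket, evicting the previous one first.
    std::string& getBucket(std::uint32_t bucketNr) {
        if (!loaded_ || bucketNr != currentNr_) {
            evict();
            current_ = file_[bucketNr];  // "read" from the backing store
            currentNr_ = bucketNr;
            loaded_ = true;
        }
        return current_;
    }

    // Mark the current bucket as changed so that it is written on eviction.
    void setBucketDirty() {
        assert(loaded_);
        dirty_ = true;
    }

    // Flush the cache as needed and remove the bucket from it.
    void clearCache() { evict(); }

    // Number of buckets written back to the backing store so far.
    std::uint32_t nrWrites() const { return nrWrites_; }

private:
    // Write the current bucket back only if it was marked dirty.
    void evict() {
        if (loaded_ && dirty_) {
            file_[currentNr_] = current_;
            ++nrWrites_;
        }
        loaded_ = false;
        dirty_ = false;
    }

    std::map<std::uint32_t, std::string> file_;  // stand-in for the data file
    std::string current_;                        // the one cached bucket
    std::uint32_t currentNr_ = 0;
    std::uint32_t nrWrites_ = 0;
    bool loaded_ = false;
    bool dirty_ = false;
};
```

In this sketch a bucket that is changed but never marked dirty is silently dropped on eviction, which is why a put operation has to call setBucketDirty.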
Open (if needed) the file for indirect arrays with the given mode.
Return a pointer to the object.
char* find (uInt aRowNr, uInt aColNr, uInt& aStartRow, uInt& anEndRow)
Find the bucket containing the column and row and return the pointer
to the beginning of the column data in that bucket.
It also fills in the start and end row for the column data.
Add a new bucket and get its bucket number.
char* getBucket (uInt aBucketNr)
Read the bucket (if needed) and return the pointer to it.
Remove a bucket from the bucket cache.
Get rows per bucket for the given column.
Return a pointer to the (one and only) StringHandler object.
static char* readCallBack (void* anOwner, const char* aBucketStorage)
Callbacks for BucketCache access.
static void writeCallBack (void* anOwner, char* aBucketStorage, const char* aBucket)
static void deleteCallBack (void*, char* aBucket)
static char* initCallBack (void* anOwner)
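The four callbacks follow the usual C-style pattern in which a static function receives the owner as a void pointer and forwards to a member function. The sketch below mirrors the signatures listed above; the class ToyOwner, its private helpers, and the plain byte copy between storage and bucket are assumptions for illustration, not what SSMBase actually does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical owner class showing static callbacks that forward to members.
class ToyOwner {
public:
    explicit ToyOwner(std::size_t bucketSize) : bucketSize_(bucketSize) {}

    // Convert a bucket from its storage form to a newly allocated local bucket.
    static char* readCallBack(void* anOwner, const char* aBucketStorage) {
        assert(anOwner != nullptr);
        return static_cast<ToyOwner*>(anOwner)->readBucket(aBucketStorage);
    }

    // Convert a local bucket back to its storage form.
    static void writeCallBack(void* anOwner, char* aBucketStorage,
                              const char* aBucket) {
        assert(anOwner != nullptr);
        static_cast<ToyOwner*>(anOwner)->writeBucket(aBucketStorage, aBucket);
    }

    // Release a local bucket; the owner is not needed for this.
    static void deleteCallBack(void*, char* aBucket) { delete[] aBucket; }

    // Create a new, zero-filled local bucket.
    static char* initCallBack(void* anOwner) {
        assert(anOwner != nullptr);
        return static_cast<ToyOwner*>(anOwner)->initBucket();
    }

private:
    char* readBucket(const char* storage) const {
        char* bucket = new char[bucketSize_];
        std::memcpy(bucket, storage, bucketSize_);
        return bucket;
    }
    void writeBucket(char* storage, const char* bucket) const {
        std::memcpy(storage, bucket, bucketSize_);
    }
    char* initBucket() const {
        char* bucket = new char[bucketSize_];
        std::memset(bucket, 0, bucketSize_);
        return bucket;
    }

    std::size_t bucketSize_;  // size of one bucket in bytes
};
```

A cache written this way needs to know only the four function pointers and the opaque owner pointer, so it stays independent of the class that owns the buckets.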
SSMBase (const SSMBase& that)
Copy constructor (only meant for clone function).
SSMBase& operator= (const SSMBase& that)
Assignment cannot be used.
(Re)create the index, file, and cache object.
It is used when all rows are deleted from the table.
virtual Bool flush (AipsIO&, Bool doFsync)
Flush and optionally fsync the data.
It returns a True status if it had to flush (i.e. if data have changed).
virtual void create (uInt aNrRows)
Let the storage manager create files as needed for a new table.
This allows a column with an indirect array to create its file.
virtual void open (uInt aRowNr, AipsIO&)
Open the storage manager file for an existing table, read in
the data, and let the SSMColumn objects read their data.
virtual void resync (uInt aRowNr)
Resync the storage manager with the new file contents.
This is done by clearing the cache.
Reopen the storage manager files for read/write.
The data manager will be deleted (because all its columns are
requested to be deleted).
So clean up the things needed (e.g. delete files).
Let the storage manager initialize itself (upon creation).
It determines the bucket size and fills the index.
Determine and set the bucket size.
It returns the number of rows per bucket.
Get the number of indices in use.
virtual void addRow (uInt aNrRows)
Add rows to the storage manager.
It extends the number of rows for each column.
virtual void removeRow (uInt aRowNr)
Delete a row from all columns.
Do the final addition of a column.
Remove a column from the data file.
Create a column in the storage manager on behalf of a table column.
The caller has to delete the newly created object.
Create a scalar column.
Create a column in the storage manager on behalf of a table column.
The caller has to delete the newly created object.
Create a direct array column.
Create a column in the storage manager on behalf of a table column.
The caller has to delete the newly created object.
Create an indirect array column.
Get the cache object.
This will construct the cache object if not present yet.
The cache object will be deleted by the destructor.
Construct the cache object (if not constructed yet).
Read the header.
Read the index from its buckets.
Write the header and the indices.