casa
$Rev:20696$
The Incremental Storage Manager.
#include <IncrementalStMan.h>
Public Member Functions
IncrementalStMan (uInt bucketSize=0, Bool checkBucketSize=True, uInt cacheSize=1)
Create an incremental storage manager with the given name.
IncrementalStMan (const String &dataManagerName, uInt bucketSize=0, Bool checkBucketSize=True, uInt cacheSize=1)
~IncrementalStMan ()
Private Member Functions
IncrementalStMan (const IncrementalStMan &that)
Copy constructor cannot be used.
IncrementalStMan & operator= (const IncrementalStMan &that)
Assignment cannot be used.
The Incremental Storage Manager.
Public interface
IncrementalStMan is the data manager storing values in an incremental way (similar to an incremental backup). A value is only stored when it differs from the previous value.
IncrementalStMan stores a value only when it differs from the value in the previous row. This makes the storage manager very well suited for columns with slowly changing values, because the resulting file can be much smaller. It is not suited at all for columns with continuously changing data.
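The store-only-when-different idea can be illustrated with a small toy model. This is a minimal sketch and not the real storage format: the class IncrementalColumn and its member names are hypothetical, and it keeps everything in memory in a std::map instead of in buckets on disk.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>

// Toy model of "store only when different"; NOT the real ISM file format.
// IncrementalColumn and its members are hypothetical names.
class IncrementalColumn {
public:
    // Append the value for the next row; store it only if it differs
    // from the value of the previous row.
    void append(double value) {
        if (changes_.empty() || std::prev(changes_.end())->second != value) {
            changes_[nrows_] = value;
        }
        ++nrows_;
    }

    // Value of a row: the most recent stored value at or before that row.
    double get(std::size_t row) const {
        assert(row < nrows_);
        auto it = changes_.upper_bound(row);  // first entry with key > row
        return std::prev(it)->second;
    }

    std::size_t rows() const { return nrows_; }
    std::size_t storedValues() const { return changes_.size(); }

private:
    std::map<std::size_t, double> changes_;  // start row -> value
    std::size_t nrows_ = 0;
};
```

In this model, appending 50 rows whose value changes every 5 rows leaves only 10 stored values, while every row can still be read back.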
In general it can be advantageous to use this storage manager when a value changes at most once every 4 rows (although this depends on the length of the data values themselves). The following simple example shows the approximate savings that can be achieved when storing a column of double values that change every CH rows.
#rows    CH    normal length    ISM length    compress ratio
50000     5          4000000       1606000               2.5
50000    50          4000000        164000              24.5
50000   500          4000000         32800             122
There is a special test program, nISMBucket, in the Tables module doing a simple, but usually adequate, simulation of the amount of storage needed for a scenario.
IncrementalStMan stores the values (and associated indices) in fixed-length buckets. A BucketCache object is used to read/write the buckets. The default cache size is 1 bucket (which is fine for sequential access), but for random access it can make sense to increase the size of the cache. This can be done using the class ROIncrementalStManAccessor.
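Why a 1-bucket cache suffices for sequential access but not for random access can be seen in a small simulation. This is a sketch under stated assumptions: the function countBucketReads is hypothetical, every bucket is assumed to hold the same number of rows, and a least-recently-used replacement policy is assumed (the text does not say which policy BucketCache uses).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Toy LRU cache of bucket numbers; it only counts how many bucket reads
// a row access pattern would need. Hypothetical names throughout.
std::size_t countBucketReads(const std::vector<std::size_t>& rows,
                             std::size_t rowsPerBucket,
                             std::size_t cacheSize) {
    assert(rowsPerBucket > 0 && cacheSize > 0);
    std::list<std::size_t> cached;  // most recently used bucket at the front
    std::size_t reads = 0;
    for (std::size_t row : rows) {
        const std::size_t bucket = row / rowsPerBucket;
        auto it = std::find(cached.begin(), cached.end(), bucket);
        if (it != cached.end()) {
            cached.erase(it);  // hit: re-inserted at the front below
        } else {
            ++reads;  // miss: the bucket has to be read
            if (cached.size() == cacheSize) cached.pop_back();  // evict LRU
        }
        cached.push_front(bucket);
    }
    return reads;
}
```

With 4 rows per bucket, reading rows 0 to 7 in order needs 2 bucket reads with a cache of 1 bucket. Reading them in the order 0, 4, 1, 5, 2, 6, 3, 7 needs 8 reads with a cache of 1 bucket, but only 2 with a cache of 2 buckets.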
The IncrementalStMan can hold values of any standard data type (thus from Bool to String). It can handle scalars as well as direct and indirect arrays. It can support an arbitrary number of columns, and the values in each column can change at their own rate.
A bucket contains the values of several consecutive rows. At the beginning of a bucket the values of the starting row of all columns for this storage manager are repeated. In this way the value of a cell can always be found in the bucket and no references to previous buckets are needed.
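The effect of repeating the starting values in each bucket can be sketched for a single column as follows. The Bucket struct and the lookup function are hypothetical illustrations, not the on-disk layout used by the storage manager; the point is only that a lookup never has to look at an earlier bucket.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy single-column bucket; hypothetical layout, not the ISM on-disk format.
struct Bucket {
    std::size_t startRow;  // first row covered by this bucket
    std::size_t endRow;    // one past the last row covered
    // (row, value) pairs sorted by row; the first pair always has
    // row == startRow, i.e. the starting value is repeated in the bucket.
    std::vector<std::pair<std::size_t, double>> entries;
};

// Find the value of a row using only the bucket that covers it.
double lookup(const std::vector<Bucket>& buckets, std::size_t row) {
    for (const Bucket& b : buckets) {
        if (row >= b.startRow && row < b.endRow) {
            assert(!b.entries.empty() && b.entries.front().first == b.startRow);
            double value = b.entries.front().second;
            for (const auto& e : b.entries) {
                if (e.first > row) break;  // entries beyond the row do not apply
                value = e.second;
            }
            return value;
        }
    }
    assert(false && "row not covered by any bucket");
    return 0.0;
}
```

If the value 2.5 is set in row 60 of a first bucket covering rows 0 to 99 and is still valid in row 100, the second bucket starts with its own copy of 2.5, so row 120 resolves inside the second bucket alone.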
A bucket should be big enough to hold all starting values and a reasonable number of other values. As a rule of thumb it should be big enough to hold at least 100 values of each column. In general the default bucket size will do. The bucket size should be set explicitly only in special cases (e.g. when storing large variable length strings). Giving a zero bucket size means that a suitable default bucket size will be calculated.
When a table is filled sequentially each bucket can be filled as much as possible. When writing in a random way, buckets can contain some unused space, because a bucket in the middle of the file has to be split when a new value has to be put in it.
Each column in the IncrementalStMan has the following properties to achieve the "store-different-values-only" behaviour.
add 1 row; put value in row N; add M rows;
add M+1 rows; put value in row N;
The IncrementalStMan is optimized for sequential access to a table.
- A bucket is accessed only once, because a bucket contains consecutive rows.
- For each column a copy is kept of the last value read. So the value for the next rows (with that same value) is immediately available.
For random access the performance can be improved by setting the cache size using class ROIncrementalStManAccessor.
Note: This class contains many public functions which are only used by other ISM classes. The only useful function for the user is the constructor.
IncrementalStMan can save a lot of storage space. Unlike the old StManMirAIO it stores the values directly in the file to save on memory usage.
This example shows how to create a table and how to attach the storage manager to some columns.
    SetupNewTable newtab("name.data", tableDesc, Table::New);
    IncrementalStMan stman;                 // define storage manager
    newtab.bindColumn ("column1", stman);   // bind column to st.man.
    newtab.bindColumn ("column2", stman);   // bind column to st.man.
    Table tab(newtab);                      // actually create table
Definition at line 181 of file IncrementalStMan.h.
casa::IncrementalStMan::IncrementalStMan (uInt bucketSize = 0, Bool checkBucketSize = True, uInt cacheSize = 1) [explicit]
Create an incremental storage manager with the given name.
If no name is used, it is set to an empty string. The name can be used to construct a ROIncrementalStManAccessor object (e.g. to set the cache size).
The bucket size has to be given in bytes and the cache size in buckets. Bucket size 0 means that the storage manager will set the bucket size such that a bucket can contain about 100 rows (with a minimum size of 32768 bytes). However, if that results in a very large bucket size (>327680 bytes), it will make it smaller. Note that it assumes 32 bytes for the size of variable length strings, so this heuristic may fail when a column contains large strings. When checkBucketSize is set and the bucket size is > 0, the storage manager throws an exception when the size is too small to hold the values of at least 2 rows. For this check it uses 0 for the length of variable length strings.
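The sizing rules described above can be summarised in a short sketch. The exact algorithm is not given in this text, so this is only an approximation under assumptions: the function chooseBucketSize and the parameter bytesPerRow are hypothetical, clamping an oversized default to 327680 bytes is an assumption (the text only says the size is made smaller), and std::invalid_argument stands in for whatever exception the storage manager really throws.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Sketch of the sizing rules from the text; all names are hypothetical.
const std::size_t kMinBucketSize = 32768;   // minimum default size (from the text)
const std::size_t kMaxBucketSize = 327680;  // "very large" threshold (from the text)

std::size_t chooseBucketSize(std::size_t requested,
                             std::size_t bytesPerRow,
                             bool checkBucketSize) {
    assert(bytesPerRow > 0);
    if (requested == 0) {
        // Aim for about 100 rows, but not below the minimum size.
        std::size_t size = std::max(100 * bytesPerRow, kMinBucketSize);
        // Assumption: an oversized default is simply clamped to the limit.
        return std::min(size, kMaxBucketSize);
    }
    if (checkBucketSize && requested < 2 * bytesPerRow) {
        // An explicit size must hold the values of at least 2 rows.
        throw std::invalid_argument("bucket too small for 2 rows");
    }
    return requested;
}
```

Under these assumptions a row of 16 bytes yields the minimum default of 32768 bytes, a row of 1000 bytes yields 100000 bytes, and an explicit size of 10 bytes is rejected only when the check is enabled.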
casa::IncrementalStMan::IncrementalStMan (const String & dataManagerName, uInt bucketSize = 0, Bool checkBucketSize = True, uInt cacheSize = 1) [explicit]
casa::IncrementalStMan::IncrementalStMan (const IncrementalStMan & that) [private]
Copy constructor cannot be used.
IncrementalStMan& casa::IncrementalStMan::operator= (const IncrementalStMan & that) [private]
Assignment cannot be used.