The ToLocal callback function has to allocate a buffer of the correct size and copy/convert the canonical data in the input buffer to this buffer. A pointer to this newly allocated buffer has to be returned. The BucketCache class keeps this pointer in the cache block.
The FromLocal callback function has to copy/convert the data from the buffer in local format to the buffer in canonical format. It should NOT delete the buffer; that has to be done by the DeleteBuffer function.
The AddBuffer callback function has to create (and initialize) a buffer to be added to the file and cache. When the file gets extended, BucketCache only registers the new size, but does not write anything. When a bucket is read between the actual file size and the new file size, the AddBuffer callback function is called to create a buffer and possibly initialize it.
The DeleteBuffer callback function has to delete the buffer allocated by the ToLocal function.
The functions get a pointer to the owner object, which was provided
at construction time. The callback function has to cast this to the
correct type and can use it thereafter.
C++ supports pointers to member functions, but they are awkward to use.
Therefore pointers to static member functions are used (which are ordinary
pointers to functions).
A pointer to the owner object is also passed to let the static function
call the correct member function (when needed).
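The static-function-plus-owner-pointer pattern described above can be sketched as follows. This is only an illustration: the class TileStore, its members, and the typedef ToLocalFn are hypothetical names, and the callback signatures are assumed to match those used in the Example section.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical owner class; name and members are illustrative only.
class TileStore {
public:
    explicit TileStore(unsigned bucketSize) : bucketSize_(bucketSize) {}

    // Static function with the shape of a ToLocal callback. It receives
    // the owner as void*, casts it back, and forwards to a member function.
    static char* toLocalCallback(void* owner, const char* canonical) {
        assert(owner != nullptr);
        return static_cast<TileStore*>(owner)->toLocal(canonical);
    }

    // Static function with the shape of a DeleteBuffer callback.
    static void deleteCallback(void* owner, char* buffer) {
        assert(owner != nullptr);
        static_cast<TileStore*>(owner)->deleteBuffer(buffer);
    }

private:
    // Allocate a local buffer and copy the canonical data into it.
    char* toLocal(const char* canonical) {
        char* local = new char[bucketSize_];
        std::memcpy(local, canonical, bucketSize_);
        return local;
    }

    void deleteBuffer(char* buffer) { delete[] buffer; }

    unsigned bucketSize_;
};

// Plain function-pointer type a cache could store (assumed signature).
typedef char* (*ToLocalFn)(void* owner, const char* data);
```

Because a static member function is an ordinary function, its address fits in a plain function pointer such as ToLocalFn, while the void* argument carries the object on which the real work is done.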
The class BucketCache provides such a cache. It can be used on a consecutive part of a file as long as that part is not simultaneously accessed in another way (including another BucketCache object).
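BucketCache stores the data as given.
It uses callback functions to allocate and delete buffers and to convert the data to and from local format.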
When a new bucket is needed and all slots in the cache are used,
BucketCache will remove the least recently used bucket from the
cache. If the dirty flag of that bucket is set, the bucket is written first.
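The replacement policy described above can be illustrated with a small, self-contained sketch. It is not the BucketCache implementation: the names Slot and LruSlots are hypothetical, a use counter stands in for whatever LRU bookkeeping the class really keeps, and a write counter stands in for the actual file write.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified slot bookkeeping.
struct Slot {
    int bucketNr;           // bucket held in this slot, -1 when empty
    bool dirty;             // true when the bucket must be written on eviction
    unsigned long lastUse;  // value of the use counter at the last access
};

class LruSlots {
public:
    explicit LruSlots(std::size_t nslots)
        : slots_(nslots, Slot{-1, false, 0}), counter_(0), writes_(0) {
        assert(nslots > 0);
    }

    // Make bucketNr resident and return its slot index. Empty slots are
    // used first; otherwise the least recently used slot is evicted.
    std::size_t access(int bucketNr, bool markDirty) {
        std::size_t victim = 0;
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            if (slots_[i].bucketNr == bucketNr) {
                touch(i, markDirty);
                return i;
            }
            if (slots_[i].lastUse < slots_[victim].lastUse) victim = i;
        }
        if (slots_[victim].bucketNr >= 0 && slots_[victim].dirty) {
            ++writes_;  // stand-in for writing the dirty bucket to the file
        }
        slots_[victim].bucketNr = bucketNr;
        slots_[victim].dirty = false;
        touch(victim, markDirty);
        return victim;
    }

    bool holds(int bucketNr) const {
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            if (slots_[i].bucketNr == bucketNr) return true;
        }
        return false;
    }

    unsigned writes() const { return writes_; }

private:
    void touch(std::size_t i, bool markDirty) {
        slots_[i].lastUse = ++counter_;
        if (markDirty) slots_[i].dirty = true;
    }

    std::vector<Slot> slots_;
    unsigned long counter_;
    unsigned writes_;
};
```

In this sketch a clean bucket is evicted without any write, while evicting a dirty bucket increments the write counter exactly once.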
BucketCache maintains a list of free buckets. Initially this list is
empty. When a bucket is removed, it is added to the free list.
AddBucket will take buckets from the free list before extending the file.
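A minimal sketch of this free-list behaviour is given below. The class name BucketAllocator is hypothetical, and keeping the free list in a std::vector used as a stack is an assumption made for illustration; the text does not say how BucketCache stores the list.

```cpp
#include <cassert>
#include <vector>

// Hypothetical bucket-number allocator with a free list.
class BucketAllocator {
public:
    explicit BucketAllocator(unsigned nrBuckets) : nrBuckets_(nrBuckets) {}

    // Reuse the first free bucket if there is one; otherwise extend
    // the file by one bucket. Returns the bucket number.
    unsigned add() {
        if (!free_.empty()) {
            unsigned nr = free_.back();
            free_.pop_back();
            return nr;
        }
        return nrBuckets_++;
    }

    // Removing a bucket puts it at the beginning of the free list
    // (the back of the vector plays the role of the list head).
    void remove(unsigned bucketNr) {
        assert(bucketNr < nrBuckets_);
        free_.push_back(bucketNr);
    }

    // -1 means there are no free buckets.
    int firstFree() const {
        return free_.empty() ? -1 : static_cast<int>(free_.back());
    }

    unsigned nFree() const { return static_cast<unsigned>(free_.size()); }
    unsigned nBucket() const { return nrBuckets_; }

private:
    unsigned nrBuckets_;
    std::vector<unsigned> free_;
};
```

The file only grows once the free list is exhausted, so removed buckets are recycled before new space is claimed.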
Since it is possible to handle only a part of a file by a BucketCache
object, it is also possible to have multiple BucketCache objects on
the same file (as long as they access disjoint parts of the file).
Each BucketCache object can have its own bucket size. This can,
for example, be used to have tiled arrays with different tile shapes
in the same file.
Statistics are kept to show how efficiently the cache is working.
It is possible to initialize and show the statistics.
Flush the cache from the given slot on.
By default the entire cache is flushed.
When the entire cache is flushed, possible remaining uninitialized
buckets will be initialized first.
A True status is returned when buckets had to be written.
Clear the cache from the given slot on.
By default the entire cache is cleared.
It will remove the buckets in the cleared part.
If wanted and needed, the buckets are flushed to the file
before removing them.
It can be used to enforce rereading buckets from the file.
Resize the cache.
When the cache gets smaller, the buckets in the last slots are removed from the cache.
It does not take "least recently used" into account.
Resynchronize the object (after another process updated the file).
It clears the cache (so all data will be reread) and sets
the new sizes.
Get the current number of buckets in the file.
Get the current cache size (in buckets).
Set the dirty bit for the current bucket.
Make another bucket current.
When no more cache slots are available, the one least recently
used is flushed.
The data in the bucket is converted using the ToLocal callback
function. When the bucket does not exist yet in the file, it
gets added and initialized using the AddBuffer callback function.
A pointer to the data in converted format is returned.
Extend the file with the given number of buckets.
The buckets get initialized when they are acquired
(using getBucket) for the first time.
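The deferred initialization described here can be sketched as follows. The class LazyExtent and its members are hypothetical; incrementing a counter stands in for calling the AddBuffer callback and writing the bucket. The rule that acquiring a bucket also initializes all uninitialized buckets before it is taken from the description of the bucket initialization function.

```cpp
#include <cassert>

// Hypothetical model of "register the new size now, initialize later".
class LazyExtent {
public:
    explicit LazyExtent(unsigned nrBuckets)
        : initialized_(nrBuckets), registered_(nrBuckets) {}

    // Extending only registers the new size; nothing is initialized.
    void extend(unsigned nrBuckets) { registered_ += nrBuckets; }

    // Acquire a bucket. If it lies beyond the initialized part, it and
    // all uninitialized buckets before it are initialized first.
    // Returns the number of buckets initialized by this call.
    unsigned acquire(unsigned bucketNr) {
        assert(bucketNr < registered_);
        unsigned count = 0;
        while (initialized_ <= bucketNr) {
            ++initialized_;  // stand-in for AddBuffer plus writing the bucket
            ++count;
        }
        return count;
    }

    unsigned nBucket() const { return registered_; }
    unsigned nInitialized() const { return initialized_; }

private:
    unsigned initialized_;  // buckets that really exist
    unsigned registered_;   // buckets registered, including the extension
};
```

Extending is therefore cheap in this model; the cost of initialization is paid only when a bucket in the extension is first acquired.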
Add a bucket to the file and make it the current one.
When no more cache slots are available, the one least recently
used is flushed.
Remove the current bucket; i.e. add it to the beginning of the
free bucket list.
Get a part from the file outside the cached area.
It is checked if that part is indeed outside the cached file area.
Put a part from the file outside the cached area.
It is checked if that part is indeed outside the cached file area.
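One possible form of such a check is sketched below. The function name outsideCachedArea and the exact rule (the byte range may not overlap the cached range) are assumptions for illustration; the text does not specify how BucketCache performs the check.

```cpp
#include <cassert>

// Assumed rule: the byte range [offset, offset+length) must not overlap
// the cached range [startOffset, startOffset + bucketSize*nrBuckets).
bool outsideCachedArea(long long offset, unsigned length,
                       long long startOffset, unsigned bucketSize,
                       unsigned nrBuckets) {
    assert(offset >= 0);
    const long long cachedEnd =
        startOffset + static_cast<long long>(bucketSize) * nrBuckets;
    const long long partEnd = offset + length;
    return partEnd <= startOffset || offset >= cachedEnd;
}
```

With the values from the Example section (start offset 512, 1000 buckets of 32768 bytes), a 512-byte header at offset 0 passes this check, while a range that reaches one byte into the first bucket does not.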
Get the bucket number of the first free bucket.
-1 = no free buckets.
Get the number of free buckets.
(Re)initialize the cache statistics.
Show the statistics.
Copy constructor is not possible.
Assignment is not possible.
Set the LRU information for the current slot.
Get a cache slot for the bucket.
Write a bucket.
Read a bucket.
Initialize the bucket buffer.
The uninitialized buckets before this bucket are also initialized.
It returns a pointer to the buffer.
Check if the offset of a non-cached part is correct.
Motivation
A cache may reduce IO traffic considerably.
Furthermore it is more efficient to keep a cache in local format.
That way, conversions to and from local format only have to be done when
data is read or written. It also allows for precalculations.
Example
#include <cstring>   // for memcpy

// Define the callback function for reading a bucket.
char* bToLocal (void*, const char* data)
{
    char* ptr = new char[32768];
    memcpy (ptr, data, 32768);
    return ptr;
}
// Define the callback function for writing a bucket.
void bFromLocal (void*, char* data, const char* local)
{
    memcpy (data, local, 32768);
}
// Define the callback function for initializing a new bucket.
char* bAddBuffer (void*)
{
    char* ptr = new char[32768];
    // Zero-fill the new buffer.
    for (uInt i=0; i<32768; i++) {
        ptr[i] = 0;
    }
    return ptr;
}
// Define the callback function for deleting a bucket.
void bDeleteBuffer (void*, char* buffer)
{
    delete [] buffer;
}
void someFunc()
{
    // Open the filebuf.
    BucketFile file(...);
    file.open();
    uInt i;
    // Create a cache for the part of the file starting at offset 512
    // consisting of 1000 buckets. The cache consists of 10 buckets.
    // Each bucket is 32768 bytes.
    BucketCache cache (&file, 512, 32768, 1000, 10, 0,
                       bToLocal, bFromLocal, bAddBuffer, bDeleteBuffer);
    // Write all buckets into the file.
    for (i=0; i<100; i++) {
        char* buf = new char[32768];
        cache.addBucket (buf);
    }
    // Flush the cache to write all buckets in it.
    cache.flush();
    // Read all buckets from the file.
    for (i=0; i<1000; i++) {
        char* buf = cache.getBucket(i);
        ...
    }
    cout << cache.nBucket() << endl;
}
Member Description
BucketCache (BucketFile* file, Int64 startOffset, uInt bucketSize, uInt nrOfBuckets, uInt cacheSize, void* ownerObject, BucketCacheToLocal readCallBack, BucketCacheFromLocal writeCallBack, BucketCacheAddBuffer addCallBack, BucketCacheDeleteBuffer deleteCallBack)
Create the cache for (a part of) a file.
The file part used starts at startOffset. Its length is
bucketSize*nrOfBuckets bytes.
When the file is smaller, the remainder is indicated as an extension
similarly to the behaviour of function extend.
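The size arithmetic above can be made concrete with a small sketch. The helper names bucketOffset and cachedLength are hypothetical, and the assumption that bucket i starts at startOffset + i*bucketSize is an inference from the fixed bucket size, not something stated in the text.

```cpp
#include <cassert>

// Assumed file position of a bucket: buckets are laid out back to back
// from startOffset, each bucketSize bytes long.
long long bucketOffset(long long startOffset, unsigned bucketSize,
                       unsigned bucketNr) {
    assert(startOffset >= 0);
    return startOffset + static_cast<long long>(bucketSize) * bucketNr;
}

// Length in bytes of the file part handled by the cache.
long long cachedLength(unsigned bucketSize, unsigned nrOfBuckets) {
    return static_cast<long long>(bucketSize) * nrOfBuckets;
}
```

For the values used in the Example section (startOffset 512, bucketSize 32768, 1000 buckets), the handled part is 32768000 bytes long and, under the layout assumption, the last bucket would start at byte 32735744.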
~BucketCache()
Bool flush (uInt fromSlot = 0)
void clear (uInt fromSlot = 0, Bool doFlush = True)
void resize (uInt cacheSize)
void resync (uInt nrBucket, uInt nrOfFreeBucket, Int firstFreeBucket)
uInt nBucket() const
uInt cacheSize() const
void setDirty()
char* getBucket (uInt bucketNr)
void extend (uInt nrBucket)
uInt addBucket (char* data)
When no free buckets are available, the file will be
extended with one bucket. It returns the new bucket number.
The buffer must have been allocated on the heap.
It becomes part of the cache; its contents are not copied.
Thus the buffer should hereafter NOT be used for other purposes.
It will be deleted later via the DeleteBuffer callback function.
The data is copied into the bucket. A pointer to the data in
local format is returned.
void removeBucket()
void get (char* buf, uInt length, Int64 offset)
void put (const char* buf, uInt length, Int64 offset)
Int firstFreeBucket() const
uInt nFreeBucket() const
void initStatistics()
void showStatistics (ostream& os) const
BucketCache (const BucketCache&)
BucketCache& operator= (const BucketCache&)
void setLRU()
void getSlot (uInt bucketNr)
void writeBucket (uInt slotNr)
void readBucket (uInt slotNr)
void initializeBuckets (uInt bucketNr)
void checkOffset (uInt length, Int64 offset) const