When you construct a PagedArray you do not read any data into memory. Instead a disk file (i.e. a Table) is created, in a place you specify, to hold the data. This means you need enough disk space to hold the array. Constructing a PagedArray is equivalent to opening a file.
Because the data is stored on disk it can be saved after the program, function, or task that created the PagedArray has finished. This saved array can then be read again at a later stage.
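So there are two reasons for using a PagedArray: to work with arrays that are too big to hold in memory, and to save arrays on disk for later use.
To access the data in a PagedArray you can either use a LatticeIterator, use the getSlice and putSlice member functions, or read and write single pixels with the getAt and putAt member functions.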
In nearly all cases you access the PagedArray by reading a "slice" of the PagedArray into an AIPS++ Array. Because the slice is stored in memory it is important that the slice you read is not too big compared to the physical memory on your computer. Otherwise your computer will page excessively and performance will be poor.
To overcome this you may be tempted to access the PagedArray a pixel at a time. This will use little memory but the overhead of accessing a large data set by separately reading each pixel from disk will also lead to poor performance.
In general the best way to access the data in PagedArrays is to use a LatticeIterator with a cursor size that "fits" nicely into memory. Not only do the LatticeIterator classes provide a relatively simple way to read and write all the data, they also optimally set up the cache that is associated with each PagedArray.
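The idea of covering an array with cursor-sized chunks can be sketched in plain C++. This is only an illustration under stated assumptions, not how the LatticeIterator or LatticeStepper classes are implemented: `cursorOrigins` is a hypothetical helper, shapes are plain vectors rather than IPositions, and the first axis is assumed to vary fastest.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Shape = std::vector<std::size_t>;

// Hypothetical helper (not part of the Lattice classes): list the origin of
// every cursor-sized chunk needed to cover an array of the given shape.
// The first axis varies fastest. Every shape and cursor length must be >= 1.
std::vector<Shape> cursorOrigins(const Shape& shape, const Shape& cursor)
{
    assert(shape.size() == cursor.size());
    for (std::size_t i = 0; i < shape.size(); ++i) {
        assert(shape[i] > 0 && cursor[i] > 0);
    }
    std::vector<Shape> origins;
    Shape pos(shape.size(), 0);
    while (true) {
        origins.push_back(pos);
        // Advance like an odometer: step the first axis, carry on overflow.
        std::size_t axis = 0;
        while (axis < shape.size()) {
            pos[axis] += cursor[axis];
            if (pos[axis] < shape[axis]) break;
            pos[axis] = 0;
            ++axis;
        }
        if (axis == shape.size()) break;  // every axis wrapped: traversal done
    }
    return origins;
}
```

With a shape of [5,3] and a cursor of [2,2] the helper returns six origins; the chunks at the upper edges are partial, which a real iterator also has to handle.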
If the LatticeIterator classes do not access the data the way you want, you can use the getSlice and putSlice member functions. These functions do not set up the cache for you, and improved performance may be obtained by tweaking the cache using the setCacheSizeFromPath member function.
More Details
In order to utilise PagedArrays fully and understand many of the member
functions and data access methods in this class, you need to be familiar
with some of the concepts involved in the implementation of PagedArrays.
Each PagedArray is stored in one cell of a Table as an indirect Array (see the documentation for the Tables module for more information). This means that multiple PagedArrays can be stored in one Table. To specify which PagedArray you are referring to in a given Table you need to specify the cell using its column name and row number during construction. If a cell is not specified the default column name (as given by the defaultColumn function) and row number (as given by the defaultRow function) are used. This ability to store multiple PagedArrays is used in the PagedImage class, where the image is stored in one cell and a mask is optionally stored in another column in the same row.
There are currently a number of limitations when storing multiple PagedArrays in the same Table.
Each PagedArray is stored on disk using the tiled cell storage manager
(TiledCellStMan). This stores the
data in tiles which are regular subsections of the PagedArray. For
example a PagedArray of shape [1024,1024,4,128] may have a tile shape of
[32,16,4,16]. The data in each tile is stored as a unit on the disk. This
means that there is no preferred axis when accessing multi-dimensional
data.
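The tiling arithmetic for the example above can be sketched as follows. This is a minimal illustration in plain C++, not the TiledCellStMan implementation; `tilesPerAxis` and `tileOfPixel` are hypothetical helpers and shapes are plain vectors rather than IPositions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Shape = std::vector<std::size_t>;

// Number of tiles along each axis. A partial tile at the edge counts as one.
Shape tilesPerAxis(const Shape& arrayShape, const Shape& tileShape)
{
    assert(arrayShape.size() == tileShape.size());
    Shape n(arrayShape.size());
    for (std::size_t i = 0; i < n.size(); ++i) {
        assert(tileShape[i] > 0);
        n[i] = (arrayShape[i] + tileShape[i] - 1) / tileShape[i];
    }
    return n;
}

// Tile coordinates of the tile that contains the given pixel.
Shape tileOfPixel(const Shape& pixel, const Shape& tileShape)
{
    assert(pixel.size() == tileShape.size());
    Shape t(pixel.size());
    for (std::size_t i = 0; i < t.size(); ++i) {
        assert(tileShape[i] > 0);
        t[i] = pixel[i] / tileShape[i];
    }
    return t;
}
```

For the shape [1024,1024,4,128] and tile shape [32,16,4,16] this gives [32,64,1,8] tiles per axis, and the pixel at [100,100,2,40] falls in tile [3,6,0,2].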
The tile shape can be specified when constructing a new PagedArray but
not when reading an old one as it is intrinsic to the way the data is
stored on disk. It is NOT recommended that you specify the tile shape
unless you can control the lifetime of the PagedArray (this includes the
time it spends on disk), or can guarantee the access pattern. For example
if you know that a PagedArray of shape [512,512,4,32] will always be
sliced plane by plane you may prefer to specify a tile shape of
[512,64,1,1] rather than the default of [32,16,4,16].
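To see why the tile shape matters for plane-by-plane access, the following sketch counts how many tiles a single slice overlaps and how many pixels would be read if whole tiles are always fetched. `tilesTouched` and `pixelsRead` are hypothetical helpers, not PagedArray members, and the whole-tile assumption is a simplification of what the storage manager does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Shape = std::vector<std::size_t>;

// Number of tiles overlapped by a slice that starts at `start` and has
// shape `sliceShape`, for the given tile shape.
std::size_t tilesTouched(const Shape& start, const Shape& sliceShape,
                         const Shape& tileShape)
{
    assert(start.size() == sliceShape.size());
    assert(start.size() == tileShape.size());
    std::size_t count = 1;
    for (std::size_t i = 0; i < start.size(); ++i) {
        assert(sliceShape[i] > 0 && tileShape[i] > 0);
        const std::size_t first = start[i] / tileShape[i];
        const std::size_t last = (start[i] + sliceShape[i] - 1) / tileShape[i];
        count *= last - first + 1;
    }
    return count;
}

// Pixels fetched from disk, assuming every touched tile is read in full.
std::size_t pixelsRead(const Shape& start, const Shape& sliceShape,
                       const Shape& tileShape)
{
    std::size_t tilePixels = 1;
    for (std::size_t length : tileShape) tilePixels *= length;
    return tilesTouched(start, sliceShape, tileShape) * tilePixels;
}
```

For one [512,512,1,1] plane, the [512,64,1,1] tiling touches 8 tiles and reads exactly the 262144 pixels of the plane. The default [32,16,4,16] tiling touches 512 tiles and reads 16777216 pixels, 64 times more than the plane contains.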
Tiles can be cached by the tile storage manager so that it does not need
to read the data from disk every time you are accessing a pixel in a
different tile. In order to cache the correct tiles you should tell the
storage manager what section of the PagedArray you will be
accessing. This is done using the setCacheSizeFromPath member
function. Alternatively you can set the size of the cache using the
setCacheSizeInTiles member function.
By default there is no limit on how much memory the tile cache can
consume. This can be changed using the setMaximumCacheSize member
function. The tiled storage manager always tries to cache enough tiles to
ensure that each tile is read from disk only once, so setting the maximum
cache size will trade off memory usage for disk I/O. Setting the cache
size is illustrated in example 5 below.
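The memory used by a tile cache follows directly from the tile shape. The sketch below is plain arithmetic under stated assumptions: `tilePixels` and `cacheBytes` are hypothetical helpers, not PagedArray members, and the cache is assumed to hold whole tiles with no extra overhead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Shape = std::vector<std::size_t>;

// Number of pixels in one tile.
std::size_t tilePixels(const Shape& tileShape)
{
    std::size_t pixels = 1;
    for (std::size_t length : tileShape) {
        assert(length > 0);
        pixels *= length;
    }
    return pixels;
}

// Bytes needed to cache `nTiles` whole tiles of the given shape.
std::size_t cacheBytes(const Shape& tileShape, std::size_t nTiles,
                       std::size_t bytesPerPixel)
{
    assert(bytesPerPixel > 0);
    return nTiles * tilePixels(tileShape) * bytesPerPixel;
}
```

For a tile shape of [32,16,4,16] and 4-byte pixels, one tile needs 128 kBytes, four tiles need 512 kBytes, and 32 tiles need 4 MBytes.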
The showCacheStatistics member function is provided to allow you to
evaluate the performance of the tile cache.
const IPosition arrayShape(4,1024,1024,4,256);
const String filename("myData_tmp.array");
PagedArray<Float> diskArray(arrayShape, filename);
cout << "Created a PagedArray of shape " << diskArray.shape()
     << " (" << diskArray.shape().product()/1024/1024*sizeof(Float)
     << " MBytes)" << endl
     << "in the table called " << diskArray.tableName() << endl;
diskArray.set(0.0f);
// Using the set function is an efficient way to initialize the PagedArray
// as it uses a PagedArrIter internally. Note that the set function is
// defined in the Lattice class that PagedArray is derived from.
PagedArray<Float> diskArray("myData_tmp.array");
IPosition shape = diskArray.shape();
// Construct a Gaussian Profile to be 10 channels wide and centred on
// channel 16. Its height is 1.0.
Gaussian1D<Float> g(1.0f, 16.0f, 10.0f);
// Create a vector to cache a sampled version of this profile.
Vector<Float> profile(shape(3));
indgen(profile);
profile.apply(g);
// Now put this profile into every spectral channel in the paged array. This
// is best done using an iterator.
LatticeIterator<Float> iter(diskArray,
                            TiledLineStepper(shape, diskArray.tileShape(), 3));
for (iter.reset(); !iter.atEnd(); iter++) {
  iter.woCursor() = profile;
}
Table t("myData_tmp.array", Table::Update);
PagedArray<Float> da(t);
const IPosition latticeShape = da.shape();
const Int nx = latticeShape(0);
const Int ny = latticeShape(1);
const Int npol = latticeShape(2);
const Int nchan = latticeShape(3);
// Use a cursor that covers one polarization at a time.
IPosition cursorShape = da.niceCursorShape();
cursorShape(2) = 1;
LatticeStepper step(latticeShape, cursorShape);
// Restrict the iteration to the first polarization.
step.subSection(IPosition(4,0), IPosition(4,nx-1,ny-1,0,nchan-1));
LatticeIterator<Float> iter(da, step);
for (iter.reset(); !iter.atEnd(); iter++) {
  iter.rwCursor() *= 10.0f;
}
SetupNewTable maskSetup("mask_tmp.array", TableDesc(), Table::New);
Table maskTable(maskSetup);
PagedArray<Bool> maskArray(IPosition(4,1024,1024,4,256), maskTable);
maskArray.set(False);
COWPtr<Array<Bool> > maskPtr;
maskArray.getSlice(maskPtr, IPosition(4,240,240,3,0),
                   IPosition(4,32,32,1,1), IPosition(4,1));
maskPtr.rwRef() = True;
maskArray.putSlice(*maskPtr, IPosition(4,240,240,3,1));
PagedArray<Float> pa(IPosition(4,128,128,4,32));
const IPosition latticeShape = pa.shape();
cout << "The tile shape is:" << pa.tileShape() << endl;
// The tile shape is:[32, 16, 4, 16]

// Setup to access the PagedArray a row at a time
const IPosition sliceShape(4,latticeShape(0), 1, 1, 1);
const IPosition stride(4,1);
Array<Float> row(sliceShape);
IPosition start(4, 0);

// Set the cache size to enough pixels for one tile only. This uses
// 128 kBytes of cache memory and takes 125 secs.
pa.setCacheSizeInTiles (1);
Timer clock;
for (start(3) = 0; start(3) < latticeShape(3); start(3)++) {
  for (start(2) = 0; start(2) < latticeShape(2); start(2)++) {
    for (start(1) = 0; start(1) < latticeShape(1); start(1)++) {
      pa.getSlice(row, start, sliceShape, stride);
    }
  }
}
clock.show();
pa.showCacheStatistics(cout);
pa.clearCache();

// Set the cache size to enough pixels for one row of tiles (ie. 4).
// This uses 512 kBytes of cache memory and takes 10 secs.
pa.setCacheSizeInTiles (4);
clock.mark();
for (start(3) = 0; start(3) < latticeShape(3); start(3)++) {
  for (start(2) = 0; start(2) < latticeShape(2); start(2)++) {
    for (start(1) = 0; start(1) < latticeShape(1); start(1)++) {
      pa.getSlice(row, start, sliceShape, stride);
    }
  }
}
clock.show();
pa.showCacheStatistics(cout);
pa.clearCache();

// Set the cache size to enough pixels for one plane of tiles
// (ie. 4*8). This uses 4 MBytes of cache memory and takes 2 secs.
pa.setCacheSizeInTiles (4*8);
clock.mark();
for (start(3) = 0; start(3) < latticeShape(3); start(3)++) {
  for (start(2) = 0; start(2) < latticeShape(2); start(2)++) {
    for (start(1) = 0; start(1) < latticeShape(1); start(1)++) {
      pa.getSlice(row, start, sliceShape, stride);
    }
  }
}
clock.show();
pa.showCacheStatistics(cout);
pa.clearCache();
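The effect of the three cache sizes used in the row-by-row example can be estimated by counting tile reads in a small simulation. This is a sketch under stated assumptions, not the storage manager's code: `tileReadsForRowAccess` is a hypothetical function, the array shape [128,128,4,32] and tile shape [32,16,4,16] are hard-coded, and the cache is modelled as a plain first in first out queue of tile numbers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>

// Count the tile reads needed to traverse the array one row at a time
// (y fastest, then polarization, then channel) with a first in first out
// cache that holds `cacheTiles` tiles.
std::size_t tileReadsForRowAccess(std::size_t cacheTiles)
{
    assert(cacheTiles > 0);
    const std::size_t nx = 128, ny = 128, npol = 4, nchan = 32;
    const std::size_t tx = 32, ty = 16, tchan = 16;  // a tile spans all 4 polarizations
    const std::size_t tilesX = nx / tx;              // 4 tiles along a row
    const std::size_t tilesY = ny / ty;              // 8 tiles along y

    std::deque<std::size_t> cache;  // front holds the oldest tile
    std::size_t reads = 0;
    for (std::size_t chan = 0; chan < nchan; ++chan) {
        for (std::size_t pol = 0; pol < npol; ++pol) {
            for (std::size_t y = 0; y < ny; ++y) {
                // One row overlaps every tile along the x axis.
                for (std::size_t xt = 0; xt < tilesX; ++xt) {
                    const std::size_t id =
                        xt + tilesX * ((y / ty) + tilesY * (chan / tchan));
                    if (std::find(cache.begin(), cache.end(), id) != cache.end()) {
                        continue;  // cache hit: no disk read
                    }
                    ++reads;       // cache miss: read the tile from disk
                    if (cache.size() == cacheTiles) cache.pop_front();
                    cache.push_back(id);
                }
            }
        }
    }
    return reads;
}
```

Under these assumptions a cache of 1 tile needs 65536 tile reads, a cache of 4 tiles needs 4096, and a cache of 32 tiles needs 64, which is one read for each of the 64 tiles in the array. The ordering agrees with the timings quoted in the example comments, although the simulation says nothing about actual run times.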
Construct a new PagedArray with the specified shape. A new Table with the specified filename is constructed to hold the array. The Table will remain on disk after the PagedArray goes out of scope or is deleted.
Construct a new PagedArray with the specified shape. A scratch Table is created in the current working directory to hold the array. This Table will be deleted automatically when the PagedArray goes out of scope or is deleted.
Construct a new PagedArray, with the specified shape, in the default row and column of the supplied Table.
Construct a new PagedArray, with the specified shape, in the specified row and column of the supplied Table.
Reconstruct from a pre-existing PagedArray in the default row and column of the supplied Table with the supplied filename.
Reconstruct from a pre-existing PagedArray in the default row and column of the supplied Table.
Reconstruct from a pre-existing PagedArray in the specified row and column of the supplied Table.
The copy constructor which uses reference semantics. Copying by value doesn't make sense, because it would require the creation of a temporary (but possibly huge) file on disk.
The destructor flushes the PagedArray's contents to disk.
The assignment operator with reference semantics. As with the copy constructor assigning by value does not make sense.
Make a copy of the object (reference semantics).
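Reference semantics means that a copy refers to the same underlying data instead of duplicating it. The sketch below illustrates the idea with a small in-memory class; `SharedArray` is a hypothetical stand-in that uses std::shared_ptr, and it is not how PagedArray itself is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for an array class whose copies share storage.
class SharedArray {
public:
    explicit SharedArray(std::size_t n)
        : data_(std::make_shared<std::vector<float>>(n, 0.0f)) {}

    // The compiler-generated copy constructor and assignment operator copy
    // only the shared_ptr, so every copy refers to the same storage.

    float get(std::size_t i) const
    {
        assert(i < data_->size());
        return (*data_)[i];
    }

    void put(std::size_t i, float value)
    {
        assert(i < data_->size());
        (*data_)[i] = value;
    }

    bool sharesStorageWith(const SharedArray& other) const
    {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<std::vector<float>> data_;
};
```

A value written through a copy is visible through the original, because both objects point at the same storage. No data is duplicated, which is the property that matters when the storage is a possibly huge file on disk.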
A PagedArray is always persistent.
A PagedArray is always paged to disk.
Is the PagedArray writable?
Returns the shape of the PagedArray.
Return the current Table name. By default this includes the full path. The path preceding the file name can be stripped off on request.
Functions to resize the PagedArray. The old contents are lost. Usage of
this function is NOT currently recommended (see the More Details section above).
Returns the current table name (ie. filename) of this PagedArray.
Return the current table object.
Returns the current Table column name of this PagedArray.
Returns the default TableColumn name for a PagedArray.
Returns an accessor to the tiled storage manager.
Returns the current row number of this PagedArray.
Returns the default row number for a PagedArray.
Returns the current tile shape for this PagedArray.
Returns the maximum recommended number of pixels for a cursor. This is
the number of pixels in a tile.
Set the maximum allowed cache size for all Arrays in this column of the
Table. The actual value used may be smaller. A value of zero means
that there is no maximum.
Return the maximum allowed cache size (in pixels) for all Arrays in
this column of the Table. The actual cache size may be smaller. A
value of zero means that no maximum is currently defined.
Set the actual cache size for this Array to be big enough for the
indicated number of tiles. This cache is not shared with PagedArrays
in other rows and is always clipped to be less than the maximum value
set using the setMaximumCacheSize member function.
Tiles are cached using a first in first out algorithm.
Set the actual cache size for this Array to "fit" the indicated
path. This cache is not shared with PagedArrays in other rows and is
always less than the maximum value. The sliceShape is the cursor or
slice that you will be requiring (with each call to
{get,put}Slice). The windowStart and windowLength delimit the range of
pixels that will ultimately be accessed. The AxisPath is described in
the documentation for the LatticeStepper class.
Clears and frees up the tile cache. The maximum allowed cache size is
unchanged from when setMaximumCacheSize was last called.
Generate a report on how the cache is doing. This is reset every
time clearCache is called.
Return the value of the single element located at the argument
IPosition.
Note that Lattice::operator() can also be used.
Put the value of a single element.
A function which checks for internal consistency. Returns False if
something nasty has happened to the PagedArray. In that case
it also throws an exception.
This function is used by the LatticeIterator class to generate an
iterator of the correct type for a specified Lattice. Not recommended
for general use.
Do the actual getting of an array of values.
Do the actual putting of an array of values.
Get the best cursor shape.
Handle the (un)locking.
Resynchronize the PagedArray object with the lattice file.
This function is only useful if no read-locking is used, ie.
if the table lock option is UserNoReadLocking or AutoNoReadLocking.
In that case the table system does not acquire a read-lock, thus
does not synchronize itself automatically.
Flush the data (but do not unlock).
Temporarily close the lattice.
It will be reopened automatically on the next access.
Explicitly reopen the temporarily closed lattice.
const String& tableName() const
Table& table()
const Table& table() const
const String& columnName() const
static String defaultColumn()
const ROTiledStManAccessor& accessor() const
uInt rowNumber() const
static uInt defaultRow()
IPosition tileShape() const
virtual uInt advisedMaxPixels() const
virtual void setMaximumCacheSize (uInt howManyPixels)
virtual uInt maximumCacheSize() const
virtual void setCacheSizeInTiles (uInt howManyTiles)
virtual void setCacheSizeFromPath (const IPosition& sliceShape, const IPosition& windowStart, const IPosition& windowLength, const IPosition& axisPath)
virtual void clearCache()
virtual void showCacheStatistics (ostream& os) const
virtual T getAt (const IPosition& where) const
virtual void putAt (const T& value, const IPosition& where)
virtual Bool ok() const
virtual LatticeIterInterface<T>* makeIter (const LatticeNavigator& navigator, Bool useRef) const
virtual Bool doGetSlice (Array<T>& buffer, const Slicer& section)
virtual void doPutSlice (const Array<T>& sourceBuffer, const IPosition& where, const IPosition& stride)
virtual IPosition doNiceCursorShape (uInt maxPixels) const
virtual Bool lock (FileLocker::LockType, uInt nattempts)
virtual void unlock()
virtual Bool hasLock (FileLocker::LockType) const
virtual void resync()
virtual void flush()
virtual void tempClose()
virtual void reopen()
void setTableType()
Set the data in the TableInfo file
void makeArray (const TiledShape& shape)
Make the ArrayColumn.
void makeTable (const String& filename, Table::TableOption option)
Make a Table to hold this PagedArray
static String defaultComment()
The default comment for PagedArray Columns.
ArrayColumn<T>& getRWArray()
Get the writable ArrayColumn object. It is created when needed.
void makeRWArray()
Create the writable ArrayColumn object.
It reopens the table for write when needed.
void doReopen() const
Do the reopen of the table (if not open already).
void tempReopen() const