Sizing Files and Indexes
Files
When you first create a Reality file, you specify a modulo that defines the number of groups (frames) allocated to the file. On a properly sized file, the hashing algorithm allows any item in a group to be accessed with a single disk read.
However, as the file grows, some items will be moved into overflow frames. The more a group overflows, the more disk reads it takes to access items within the group, and the less efficient access becomes. Also, if deleting or updating an item causes a group to shrink, any unused frames outside the primary space are returned to the pool of available space, which can result in inefficient allocation of disk space.
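The effect of the modulo on access cost can be illustrated with a small simulation. This is only a sketch under stated assumptions: the hash function (CRC-32), the fixed number of items per frame, and the names group_of, frames_per_group and average_reads are all hypothetical and are not Reality's own algorithm or commands.

```python
import zlib
from collections import Counter


def group_of(item_id: str, modulo: int) -> int:
    """Map an item-id to a group number (stand-in hash, not Reality's own)."""
    return zlib.crc32(item_id.encode("utf-8")) % modulo


def frames_per_group(item_ids, modulo: int, items_per_frame: int):
    """Frames used by each group: one primary frame plus any overflow frames."""
    counts = Counter(group_of(i, modulo) for i in item_ids)
    # Ceiling division; an empty group still owns its primary frame.
    return [max(1, -(-counts.get(g, 0) // items_per_frame)) for g in range(modulo)]


def average_reads(item_ids, modulo: int, items_per_frame: int) -> float:
    """Average frames read per item, assuming a group's frames are read in order."""
    item_ids = list(item_ids)
    if not item_ids:
        return 0.0
    counts = Counter(group_of(i, modulo) for i in item_ids)
    total = 0
    for n in counts.values():
        for position in range(n):
            # Items in the primary frame cost 1 read, the first overflow frame 2, and so on.
            total += position // items_per_frame + 1
    return total / len(item_ids)


if __name__ == "__main__":
    ids = [f"ITEM{n}" for n in range(100)]
    for modulo in (1, 7, 101):
        print(modulo, average_reads(ids, modulo, items_per_frame=10))
```

In this model, a modulo that is too small forces most items into overflow frames, so the average number of reads rises, while a larger modulo keeps most items in the primary frame at the cost of more allocated frames.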
There are two ways of dealing with this:
- Use automatic file sizing. This allows files to expand by splitting each group in half, one at a time. When all the groups have been split, the modulo is doubled and the splitting process starts again.
  Note that splitting normally occurs only when the file is updated, but a file can be forced to expand if required. Nevertheless, it can take a long time for a badly sized file to reach the best size, so you should choose a suitable initial modulo before creating a new file or converting an existing file to automatic sizing (for existing files, the initial modulo must be set manually as described below).
  Files do not contract automatically; a TCL command is available to do this.
- Use fixed modulos and manually resize when necessary. For normal files this can be done in three ways:
  - You can create a new file with the required modulo and copy the data into it. This is the best method if you need to resize individual files.
  - You can set the file's reallocation parameters and then save and restore the file using a logical save and restore procedure. Note, however, that this is a time-consuming operation and the data is not available during the file restore. This is the best method to use if you want to resize one or more complete accounts or a whole database.
  - On a partition database, you can resize files while they are in use. However, this method has severe restrictions.
A fourth method is available for resizing system files.
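The expansion behaviour described for automatic file sizing (split one group at a time, then double the modulo once every group has been split) resembles the general technique known as linear hashing. The following is a minimal sketch of that technique, not a description of Reality's internals: the class AutoSizedFile, its attributes, and the use of plain integers as pre-computed hash values are all assumptions made for illustration.

```python
class AutoSizedFile:
    """Toy model of a file that grows by splitting one group at a time."""

    def __init__(self, modulo: int):
        self.base = modulo      # modulo at the start of the current splitting round
        self.next_split = 0     # index of the next group to be split
        self.groups = [[] for _ in range(modulo)]

    @property
    def modulo(self) -> int:
        """Current number of groups."""
        return len(self.groups)

    def _group_of(self, key_hash: int) -> int:
        g = key_hash % self.base
        if g < self.next_split:
            # This group has already been split in the current round,
            # so its items are spread over two groups.
            g = key_hash % (2 * self.base)
        return g

    def insert(self, key_hash: int) -> None:
        self.groups[self._group_of(key_hash)].append(key_hash)

    def split_one(self) -> None:
        """Split the next group in half, doubling the base modulo at the end of a round."""
        old = self.next_split
        items = self.groups[old]
        self.groups[old] = []
        self.groups.append([])  # the new group has index base + old
        self.next_split += 1
        if self.next_split == self.base:
            # Every group has been split: double the modulo and start again.
            self.base *= 2
            self.next_split = 0
        for h in items:
            self.groups[self._group_of(h)].append(h)


if __name__ == "__main__":
    f = AutoSizedFile(3)
    for h in range(30):
        f.insert(h)
    for _ in range(3):
        f.split_one()
        print(f.modulo, [len(g) for g in f.groups])
```

Because only one group is redistributed per split, each step is cheap, but a file that starts with a modulo far below its ideal size needs many steps to catch up. This is consistent with the advice to choose a suitable initial modulo.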
When deciding which of these methods to use, you should consider the following:
- The best performance will be obtained from a correctly sized file with a fixed modulo.
- An automatically sized file gives better performance than a badly sized file with a fixed modulo.
- If you simply convert a badly sized file to automatic sizing, performance will improve, but it may take a long time to reach its best performance. It is therefore best to set the initial modulo to an appropriate value when converting a file to automatic sizing.
- When creating new files, you should choose the best modulo based on the type of data and the expected file size, even if you intend to use automatic sizing.
- If you expect your file to grow, automatic sizing is the best choice.
- Automatically sized files do not automatically contract.
- A modulo should ideally be a prime number.
- An automatically sized file cannot be contracted if it has an odd modulo, so if you think that you might need to contract a file to below its initial size, you should give it an even initial modulo.
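The last two considerations can be applied once you have an estimate of the number of groups you need. The sketch below rounds such an estimate up to a prime number, or to an even number for an automatically sized file that may later need to be contracted. The helper names is_prime, next_prime and next_even are hypothetical, and how the estimate itself is obtained is outside the scope of this example.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for modulo-sized numbers."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def next_prime(n: int) -> int:
    """Smallest prime greater than or equal to n."""
    candidate = max(n, 2)
    while not is_prime(candidate):
        candidate += 1
    return candidate


def next_even(n: int) -> int:
    """Smallest even number greater than or equal to n."""
    return n if n % 2 == 0 else n + 1


if __name__ == "__main__":
    estimated_groups = 100  # assumed estimate, for illustration only
    print("fixed modulo candidate:", next_prime(estimated_groups))
    print("contractible automatic sizing candidate:", next_even(estimated_groups))
```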
Indexes
The effectiveness of an index depends on selecting the optimum modulo (number of groups). If the modulo is too small, more disk reads will be necessary when using the index, while if it is too large, the index will occupy an excessive amount of disk space.
Because of this, it is recommended that indexes use automatic file sizing. By default, new indexes (created with CREATE-INDEX) are automatically sized; an existing index can be converted with the AFS-ENABLE command. When creating a new index, the modulo will be optimised as the index is populated. Note, however, that creating an index will be much quicker if the initial modulo is close to the optimum value. Selecting the Modulo for an Index describes how to calculate the best initial modulo.