RealityV15.1 Online Documentation (MoTW) Revision 7

Selecting the Modulo for a File

There are two ways to calculate the best modulo for a file data section:

Calculating the Modulo Manually

Note: Before determining the modulo for a data section, see File Sizing Considerations.

To calculate a modulo that facilitates efficient storage and retrieval, do the following:

  1. Calculate the average size of an item-id (Aid).

  2. Calculate the average size of the item data (Adata).

  3. Use the file modulo calculator to calculate an approximate modulo. You must supply the frame size, the values that you estimated in steps 1 and 2, and the number of items in the data section.

  4. Pass the result to the SHOW-MODULI TCL command to obtain the next prime number.

Note: A modulo of 11 is not recommended.
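Steps 1 and 2 can be estimated from a representative sample of items. The following is a minimal Python sketch, not a Reality utility: the function name average_sizes and the sample records are hypothetical, and it assumes that sizes are measured in bytes with one byte per character.

```python
def average_sizes(items):
    """Return (Aid, Adata): average byte length of item-ids and of item data.

    `items` is a list of (item_id, data) string pairs taken as a sample.
    Assumption: each character occupies one byte, so ASCII encoding is used.
    """
    if not items:
        raise ValueError("need at least one sample item")
    total_id = sum(len(item_id.encode("ascii")) for item_id, _ in items)
    total_data = sum(len(data.encode("ascii")) for _, data in items)
    return total_id / len(items), total_data / len(items)


# Hypothetical sample records, for illustration only.
sample = [
    ("CUST1001", "SMITH,12 HIGH ST,LONDON"),
    ("CUST1002", "JONES,4 MILL LANE,LEEDS"),
]
aid, adata = average_sizes(sample)
print(f"Aid = {aid:.1f}, Adata = {adata:.1f}")
```

The larger and more representative the sample, the closer the estimates are to the true averages for the data section.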

The file modulo calculator uses the following algorithm:

  1. Use the following formula to calculate the average size (S) of an in-group item part.

    If Adata > (FS / 2), then S = Aid + 16

    Otherwise, S = Aid + 16 + Adata

    where FS is the frame size of the database.

  2. Determine the number of items per group (I). This can be determined from the average item size, as follows:

    If S > (FS / 3), then I = 2

    If (FS / 5) < S <= (FS / 3), then I = 3

    Otherwise, I = INT[((FS - 16) - (S * 1.5)) / S], where INT[ ] means 'Integer part of'.

  3. Calculate the modulo, as follows:

    modulo = next-prime[N / I]

    where,

    N = Number of items in file,

    next-prime[ ] means 'Next higher prime number after'.
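The three steps of the algorithm can be sketched in Python as follows. This is an illustration of the formulas above, not the calculator's actual implementation. The function names and the OVERHEAD constant are hypothetical, and rounding N / I to the nearest integer before taking the next prime is an assumption.

```python
OVERHEAD = 16  # the constant 16 that appears in the formulas above


def item_part_size(aid, adata, fs):
    """Step 1: average size S of an in-group item part."""
    if adata > fs / 2:
        return aid + OVERHEAD
    return aid + OVERHEAD + adata


def items_per_group(s, fs):
    """Step 2: number of items per group I."""
    if s > fs / 3:
        return 2
    if s > fs / 5:
        return 3
    return int(((fs - OVERHEAD) - s * 1.5) / s)


def is_prime(n):
    """Trial-division primality test, adequate for modulo-sized numbers."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def next_prime(n):
    """Smallest prime strictly greater than n."""
    candidate = n + 1
    while not is_prime(candidate):
        candidate += 1
    return candidate


def approximate_modulo(n_items, aid, adata, fs):
    """Step 3: modulo = next-prime[N / I]."""
    s = item_part_size(aid, adata, fs)
    i = items_per_group(s, fs)
    # Assumption: N / I is rounded to the nearest integer first.
    return next_prime(round(n_items / i))


print(approximate_modulo(3000, 10, 323, 1024))  # 1511
print(approximate_modulo(3000, 10, 892, 4096))  # 1009
print(approximate_modulo(3000, 10, 23, 1024))   # 163
```

The three calls use a frame size of 1024 or 4096 bytes with 3000 items, and cover each of the three branches of step 2.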

Example 1

FS = 1 KB (1024 bytes); (FS / 3) < S <= (FS / 2)

N = 3000
Aid = 10
Adata = 323
S = 349
I = 2
N/I = 3000/2 = 1500
modulo = next-prime[1500] = 1511

Example 2

FS = 4 KB (4096 bytes); (FS / 5) < S <= (FS / 3)

N = 3000
Aid = 10
Adata = 892
S = 918
I = 3
N/I = 3000/3 = 1000
modulo = next-prime[1000] = 1009

Example 3

FS = 1 KB (1024 bytes); S <= (FS / 5)

N = 3000
Aid = 10
Adata = 23
S = 49
I = INT[((1024 - 16) - (49 * 1.5)) / 49]
  = INT[(1008 - 73.5) / 49]
  = INT[934.5 / 49]
  = INT[19.07]
  = 19
N/I = 3000/19 ≈ 158
modulo = next-prime[158] = 163

File Sizing Goals

There are two goals in selecting the best modulo:

  1. To optimize access time.
  2. To use disk space most efficiently.

These two goals conflict with each other. If you do not give a data section enough primary space, much of its data is stored in overflow (linked) frames. The more linked frames there are, the more disk reads the system must perform to find an item, which can be very expensive in terms of response time. On the other hand, a file with too much primary space leaves disk space unused, which is wasteful.
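The trade-off can be illustrated with a rough Python model. This is a sketch under simplifying assumptions, not a description of how Reality allocates frames. It assumes that items hash perfectly evenly across groups, that each frame offers FS - 16 usable bytes, and that items pack into frames without waste. The function name estimate_layout is hypothetical.

```python
from math import ceil


def estimate_layout(n_items, s, fs, modulo, overhead=16):
    """Estimate (overflow frames per group, total frames, fill ratio).

    Assumptions: perfectly even hashing, fs - overhead usable bytes per
    frame, and no space lost when items are packed into frames.
    """
    usable = fs - overhead
    bytes_per_group = n_items / modulo * s
    frames_per_group = max(1, ceil(bytes_per_group / usable))
    overflow_per_group = frames_per_group - 1
    total_frames = modulo * frames_per_group
    fill = n_items * s / (total_frames * usable)
    return overflow_per_group, total_frames, fill


# 3000 items with an average in-group size of 49 bytes in 1024-byte frames.
for modulo in (53, 163, 1511):
    overflow, frames, fill = estimate_layout(3000, 49, 1024, modulo)
    print(f"modulo {modulo}: {overflow} overflow frame(s) per group, "
          f"{frames} frames in total, {fill:.0%} full")
```

Under these assumptions, a modulo of 53 leaves every group with two overflow frames, 163 keeps each group within its primary frame at roughly 89% full, and 1511 avoids overflow but leaves the frames less than 10% full.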

File Sizing Considerations

There are two circumstances under which you might consider a modification to the way you calculate the modulo of a data section:

File Statistics

A file statistics report tells you how a data section's items are distributed into groups. A report is generated each time you:

For a discussion of file statistics, refer to File Statistics Reports.

You can use the ISTAT and HASH-TEST commands to display file hashing statistics and an optional histogram.

Separation Parameter

Separation is the number of contiguous frames allocated to a group. For compatibility with proprietary versions of Reality, the CREATE-FILE command accepts an optional separation parameter. On current versions of Reality, the supplied value is ignored and the separation is always 1.
