ISTAT

Produces hashing statistics and, optionally, a histogram for items in a selected file.

Note

HASH-TEST is a similar command that provides hashing statistics for a specified file based on a user-specified test modulo.

Command class

English command.

Syntax

ISTAT file-specifier {item-list} {selection-criteria} {(options}

Syntax elements

file-specifier The file for which you want statistics.

item-list A list of items for which statistics are required (see Item Lists). This can be provided by a list-generating command executed immediately before ISTAT.

selection-criteria Item selection criteria. These can limit the items for which statistics are provided.

Restrictions

Hyper files cannot be used with this command; the underlying physical data files need to be analysed separately.

Options

H Displays a histogram.

J Displays the hashing statistics for an index as a report (see Example 2). item-list and selection-criteria are ignored. You will be prompted to enter the name of the index and, at the end of the report, to enter one of the commands listed in Commands.

N No automatic paging of output to terminal.

P Sends output to printer.

R Suppresses 'Item not on file' messages.

U Displays the hashing statistics as a report, as shown in Example 1. item-list and selection-criteria are ignored. At the end of the report you will be prompted to enter one of the commands listed in Commands.

Default display

When used without the H, J or U options, ISTAT displays the following information:

Field

Description

File

The name of the file.

Groups

The number of groups in the file.

Type

The type of hashing algorithm used for the file; one of the following:

1 Pre-V9.0 (MultiValue compatible) hashing algorithm.

2 V9.0 hashing algorithm (supports long item-ids) without auto sizing.

12 Case-insensitive hashing algorithm without auto sizing.

21 Pre-V9.0 (MultiValue compatible) hashing algorithm with auto sizing.

22 V9.0 hashing algorithm (supports long item-ids) with auto sizing.

32 Case-insensitive hashing algorithm with auto sizing.

IIDs

Whether the item-ids in the file are case-sensitive or case-insensitive.

Item count

The total number of items in the file.

byte count

The size of the file (bytes).

avg. bytes/item

The average size of an item (bytes).

avg. items/group

The average number of items in each group.

std. deviation

The standard deviation of the number of items in each group.

avg. bytes/group

The average amount of data in each group (bytes).

frames in use

The number of frames used by the file.

sec. frames

The number of secondary (overflow) frames used by the file.

empty groups

The number of unused groups in the file.

ind. items

The number of indirect (out of group) items in the file.

ind. frames

The number of additional frames allocated to hold the indirect items.

For example:

:ISTAT SYSPL
File: SYSPL Groups: 19 Type: 02 IIDs: Sensitive     12:37:38 30 Jun 2009
Item count= 18, byte count= 2244, avg. bytes/item= 124.6
avg. items/group= .9, std. deviation= 1.0, avg. bytes/group= 118.1
frames in use= 20, sec. frames= 0, empty groups= 7
ind. items= 10, ind. frames= 37

Refer to the topic File Structure for descriptions of groups, frames, etc.
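
The averages in the default display can be related to the raw counts with simple arithmetic. The following Python sketch uses the figures from the SYSPL example above and assumes that each average is a plain quotient of two displayed counts; this topic does not state the formulas, so treat it as an illustration only. The displayed values appear to be truncated rather than rounded (124.66... is shown as 124.6).

```python
# Raw counts copied from the ISTAT SYSPL example.
groups = 19
item_count = 18
byte_count = 2244

# Assumption: each average is a simple quotient of the displayed counts.
avg_bytes_per_item = byte_count / item_count    # displayed as 124.6
avg_items_per_group = item_count / groups       # displayed as .9
avg_bytes_per_group = byte_count / groups       # displayed as 118.1

print(f"avg. bytes/item  = {avg_bytes_per_item:.2f}")
print(f"avg. items/group = {avg_items_per_group:.2f}")
print(f"avg. bytes/group = {avg_bytes_per_group:.2f}")
```

The standard deviation cannot be reproduced this way because it depends on the item count of each individual group, which the default display does not list.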

Histogram

The histogram display (H option) is similar to the default display, but includes additional information about the sizes of the items in the file. For example:

:ISTAT ACCPAY (H

File: ACCPAY Groups: 19 Type: 02 IIDs: Sensitive       12:50:16  30 Jun 2009
  BYTES ITMS
   1281   17 *>>>>>>>>>>>>>>>>>
   1865   20 *>>>>>>>>>>>>>>>>>>>>
    505   12 *>>>>>>>>>>>>
   1353   22 *>>>>>>>>>>>>>>>>>>>>>>
    417   13 *>>>>>>>>>>>>>
   1361   15 *>>>>>>>>>>>>>>>
   1353   13 *>>>>>>>>>>>>>
Item count=         112, byte count=      8128, avg. bytes/item=     72.5
avg. items/group=   6.5, std. deviation=   3.8, avg. bytes/group=  1161.1
frames in use=       12, sec. frames=        1, empty groups=           0
ind. items=          12, ind. frames=       14

Alternative reports using U and J options

When you use ISTAT with the U or J option, it displays the hashing statistics in an alternative way and allows you to interactively change the report displayed by entering commands at the Command: prompt. Commands are available for generating different hashing statistics for different moduli and for changing the number of rows in the report. See Commands below.

An example of the alternative report, displayed by ISTAT with the U option, is shown in Example 1. The report displayed for an index using the J option is similar (see Example 2). The report comprises four columns, each with n rows, where n can be set at the Command: prompt using the Nn command. By default, rows 0 to 11 are displayed.

Header

For normal hashed data sections and indexes, the header at the top of the report displays the following:

Field

Description

Modulus

The number of groups.

Frame-size

The number of bytes available for data in each frame (the frame size is set when the database is created - see Frame Size).

Hash-type

The same as the Type field on the default display.

For automatically sized data sections and indexes, the header at the top of the report displays the following:

Column

Description

Num-groups

The current number of groups.

Frame-size

The number of bytes available for data in each frame (set when the database is created - see Frame Size).

Hash-type

As above.

Dynamic-mod

The current modulo.

Next-split

The number of the group that will be split when the file or index is next expanded.

Reserved

The number of frames allocated for expansion.

Slots-free

The number of times the primary space can be expanded.

Main display

The main part of the display consists of the following columns:

Column

Description

N

The value of N for each row.

Number of groups with N items.

This column shows how Items are distributed within the file's groups. So row 0 shows how many groups are empty; row 1, how many groups contain 1 Item; row 2, how many groups contain 2 Items, and so on. The last row shows how many groups contain N or more items.

Number of groups with N used IG frames.

This column shows how file space is distributed among the file's groups. Each group is allocated one frame when a file is created. As files grow, additional 'overflow' frames are allocated and linked onto the ends of the groups as needed. So row 0 shows how many groups are empty; row 1, how many groups use 1 frame; row 2, how many groups use 2 frames (that is, 1 overflow frame); row 3, how many groups use 3 frames and so on. The last row shows how many groups use N or more frames.

Number of groups with N % usage of IG frames

This column shows the percentage of utilised space for in-group frames; that is, how the in-group frames are utilised. Row 0 shows how many groups are empty (0%); row 1, how many groups are 1% used, etc. Row 10 therefore shows how many groups are 10% full; row 50, how many are 50% full; and row 100, how many are full.

If you have chosen to display fewer than 100 rows, the last row shows the number of groups that are N% full or greater.

Note

To simplify the report, lines where all group values are zero are suppressed.
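
The way a column of the main display is built up can be sketched in Python. This is not how ISTAT is implemented; it only illustrates the counting rule described above, where each row N counts the groups with exactly N items and the last row collects every group with N or more. The per-group item counts and the function name tally are hypothetical.

```python
from collections import Counter

def tally(values, max_row):
    """Count groups per value of N, folding values >= max_row into the last row."""
    counts = Counter(min(v, max_row) for v in values)
    return [counts.get(n, 0) for n in range(max_row + 1)]

# Hypothetical item counts for each group of a 10-group file.
items_per_group = [0, 3, 1, 2, 2, 5, 0, 1, 7, 2]

rows = tally(items_per_group, max_row=4)
for n, group_count in enumerate(rows):
    if group_count == 0:
        continue  # in this single-column sketch, zero rows are suppressed
    label = f"{n}+" if n == len(rows) - 1 else str(n)
    print(f"{label:>4} - {group_count:5d}")
```

The same rule applies to the frame and percentage columns, with the number of frames used or the percentage of space used taking the place of the item count.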

Summary

The summary at the bottom of the display contains the following information:

Field

Description

Total items

The number of items or index node structures stored in the hashed file allocation.

Total bytes

The number of bytes used within the file (includes base allocation, overflow and out-of-group frames).

Total frames

The number of frames used, including out-of-group and overflow frames.

IG bytes

The number of in-group bytes used.

Overflow frames

The number of overflow frames used.

Unused bytes

The number of bytes still free in all frames allocated to the file.

OG items

The number of out-of-group items. That is, items that are larger than half a frame and for which only a header is stored within the in-group space. The remaining data is stored in an out-of-group frame or linked set of frames (see Item Structure for more details).

OG frames

The number of frames used to hold out-of-group items (see above).

Groups used

The number of file groups that contain items.

Refer to the sections File Structure and Item Structure for details of the structures of Reality files and file items, and the meanings of the terms 'in-group' and 'out-of-group'.

Commands

When using the J and U options, the following commands can be entered at the Command: prompt:

H{n} Use hash type n to generate hashing statistics. If n is omitted the original hash type is restored.

Mn Use modulo n to generate hashing statistics. 0 restores original.

Nn Enter new maximum value n of row number (N) to generate hashing statistics.

P Calculate the optimum modulo for best performance. This may result in a larger file, but fewer overflow frames and consequently fewer disk reads.

Q Quit.

S Calculate the modulo that will produce the smallest possible file. This may result in more overflow frames being used.
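
The effect of trying a different modulo, as the Mn command does, can be illustrated with a small Python simulation. The hash function below (CRC-32 of the item-id) is a stand-in chosen for illustration and is not the hashing algorithm used by Reality; the item-ids and the function names group_of and distribution are also hypothetical. The sketch only shows the principle that changing the modulo changes how items spread across groups.

```python
import zlib
from collections import Counter

def group_of(item_id: str, modulo: int) -> int:
    # Stand-in hash: CRC-32 of the item-id. NOT the Reality hashing algorithm.
    return zlib.crc32(item_id.encode("utf-8")) % modulo

def distribution(item_ids, modulo):
    """Return the number of items that land in each of the `modulo` groups."""
    counts = Counter(group_of(i, modulo) for i in item_ids)
    return [counts.get(g, 0) for g in range(modulo)]

# Hypothetical item-ids.
item_ids = [f"CUST{n:05d}" for n in range(500)]

for modulo in (19, 23, 31):
    per_group = distribution(item_ids, modulo)
    print(f"modulo={modulo:3d}  "
          f"max items/group={max(per_group):3d}  "
          f"empty groups={per_group.count(0)}")
```

A larger modulo generally gives fewer items per group at the cost of more groups, which corresponds to the trade-off between the P and S commands described above.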

Comments

ISTAT can be used to determine the distribution of data within a file. This information is useful when you are trying to resize files.

Example 1

:ISTAT ERRMSG (U
File='ERRMSG'
Modulus=119 Frame-size=1008 Hash-type=2 (*11)
        |       Number of groups with :-
     N  | N items  N used IG frames  N % usage of IG frames
     0  -     0             0          0        (No empty groups)
     1  -     0           113          0
     2  -     0             6          0        (6 groups have one overflow frame)
     3  -     0             0          0
     4  -     0             0          0
     5  -     4             0          0
     6  -     4             0          0
     7  -     7             0          0
     8  -    16             0          0        (16 groups have 8 items)
     9  -    18             0          0
    10  -    12             0          0
    11+ -    58             0        119        (All groups are at least 11% full)
Total items=       1265, Total bytes=     86597, Total frames=       141
IG bytes=         76924, Overflow frames=     6, Unused bytes=     49076 (38%)
OG items=            10, OG frames=          16, Groups used=        119
Command:
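
The summary figures in this example can be cross-checked against one another. The Python sketch below assumes, without confirmation from this topic, that Total frames is the sum of the primary, overflow and out-of-group frames, and that the percentage after Unused bytes is taken over the space in the primary and overflow frames. The sample output is consistent with both assumptions, but they are observations on this one report, not documented formulas.

```python
# Figures copied from the ISTAT ERRMSG (U report.
modulus = 119           # one primary frame per group
frame_size = 1008
overflow_frames = 6
og_frames = 16
ig_bytes = 76924
unused_bytes = 49076

# Assumption: total frames = primary + overflow + out-of-group frames.
total_frames = modulus + overflow_frames + og_frames

# Assumption: in-group capacity covers primary and overflow frames only.
ig_capacity = (modulus + overflow_frames) * frame_size
unused_pct = 100 * unused_bytes / ig_capacity

print(f"total frames = {total_frames}")           # report shows 141
print(f"IG capacity  = {ig_capacity} bytes")
print(f"IG + unused  = {ig_bytes + unused_bytes} bytes")
print(f"unused       = {unused_pct:.1f}%")        # report shows 38%
```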

Example 2

This example shows the screen output when using the following command:

:ISTAT EMPLOYEE (J
Index name:NAME

followed by:

Command: N100
File='EMPLOYEE' Index='NAME'
Num-groups=127 Frame-size=1008 Hash-type=22 (AFS, *11)
AFS details: Dynamic-mod 112 Next-split 16 Reserved 97 Slots-free 114
 
        |       Number of groups with :-
     N  | N items  N used IG frames  N % usage of IG frames
 
     0  -     13            13            13
     1  -     44           114             0
     2  -     70             0             0
    49  -      0             0            24
    50  -      0             0            17
    53  -      0             0             1
    72  -      0             0             1
    93  -      0             0             1
    97  -      0             0             7
    98  -      0             0            29
    99  -      0             0            27
   100+ -      0             0             4
 
Total items=        184, Total bytes=     89724, Total frames=       127
IG bytes=         89724, Overflow frames=     0, Unused bytes=     38292 (29%)
OG items=             0, OG frames=           0, Groups used=        114
 
Command:

This example shows the following:

Header This index uses automatic file sizing. Its size is currently 127 groups, based on a modulo of 112. 97 more groups are currently available for expansion; when these have been used, more disk space will have to be allocated. The space allocated for the index can be expanded 114 more times.

Column 2: N items
Out of a total of 127 groups (Num-groups is 127), the index contains 13 empty groups, 44 groups with 1 index node structure and 70 groups with 2 index node structures. About 10% of the groups are therefore empty, and no group contains more than 2 node structures.

Column 3: N Used IG frames
The index contains 13 empty groups, and 114 groups that use a single 1 KB frame. No groups have used any overflow frames.

Column 4: N % usage of IG frames
13 frames are empty (about 10%), 41 (24 + 17) frames (about 32%) are about 50% full and 64 (1+7+29+27) frames (about 50%) are over 90% full. 4 frames (about 3%) are full and overflow frames might therefore be needed if the index increases in size.
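
The percentages quoted for column 4 can be reproduced from the rows of the report. The following Python sketch copies the fourth column of the example into a dictionary and sums the rows named in the text; the function name share is hypothetical.

```python
num_groups = 127

# Row number (N % usage) mapped to the group count in column 4 of the report.
usage = {0: 13, 49: 24, 50: 17, 53: 1, 72: 1,
         93: 1, 97: 7, 98: 29, 99: 27, 100: 4}

def share(rows):
    """Return (count, percentage of all groups) for the given rows."""
    total = sum(usage[r] for r in rows)
    return total, 100 * total / num_groups

empty = share([0])
about_half = share([49, 50])
over_90 = share([93, 97, 98, 99])
full = share([100])

for label, (count, pct) in [("empty", empty), ("about 50% full", about_half),
                            ("over 90% full", over_90), ("full", full)]:
    print(f"{label:<15} {count:3d} ({pct:.1f}%)")
```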

Refer to Indexing for a description of indexing and of how to use this information to size your index.