Application Index Design
This topic discusses the use of item-ids and indexes to optimise the performance of SQL queries, including the rules for optimisation, and makes recommendations for designing indexes to facilitate the optimisation process.
Difference between SQL and Reality Indexes
There is no fundamental difference between a Reality index and an SQL index. Both are really "Reality indexes" and can, potentially, be used equally well by English, SQL, DataBasic and TCL commands such as SELECT-INDEX.
However, there are two different mechanisms for creating indexes. In Reality, an index is created using the TCL commands DEFINE-INDEX and CREATE-INDEX, whereas in SQL, an index is created using the SQL command CREATE INDEX.
Two-Step Index Creation Using TCL
Creating an index using TCL commands is a two-step process:
-
Define the Index
Firstly an index definition is constructed using the DEFINE-INDEX command.
DEFINE-INDEX uses the data definition items stored in the file's dictionary to construct an index definition item. This item is saved in the file's dictionary under the index name. The data definitions are compiled into an internal format (the "compiled string") in the index definition.
-
Create the Index
Secondly, and possibly much later, an actual index is created using the CREATE-INDEX command.
CREATE-INDEX uses the index definition item from the file's dictionary to construct an actual index and the index definition (the "compiled string") is saved inside the index. All further reference to this index (inquiries and updates) use the index definition saved inside the index.
Changes to Data Definitions
Changes to Data Dictionary items are not automatically reflected either in the index definition item in the file's dictionary, or in the index definition saved inside the index.
Updating an Index Definition in a Dictionary
The index definition item in the dictionary can be updated by executing a subsequent DEFINE-INDEX.
Updating the Index Definition String within an Index
The compiled index definition string within an index cannot be updated directly. To change this, the index must be deleted and re-created using an updated index definition item.
Incompatibility between Data Definitions, Index Definitions and Indexes
Thus, it is entirely possible for data definition items, the index definition item in the dictionary and the index itself all to say different things. (This is an extreme example, but it can occur.) The significance of this will become clear later when we see how English and SQL attempt to choose indexes to optimise queries.
One-Step Index Creation Using SQL
Creating an index using SQL is a one-step process. The SQL command CREATE INDEX does everything in one go. It creates an index definition item using the SQL column definitions and saves it to the file's dictionary. It then goes straight on to use this definition item to create the underlying Reality index. It also saves some information about this index in the SQL catalog files.
Differences Between TCL and SQL Creation Mechanisms
There are two significant differences between the TCL and SQL mechanisms:
- The SQL command uses column definitions from the SQL catalog to build the index definition item, whereas the TCL commands use the English data definition items from the file's dictionary.
- The SQL command records information about the index in the SQL catalog. This is the same information that can be added manually by using SQLM.
The importance of the first difference will become clearer when optimisation is discussed. The second difference is fairly unimportant. It means that the index will be visible to any client application that inquires about indexes. In practice, few applications make this inquiry and it is unclear that they could make much use of the information returned anyway. Although ODBC and JDBC both provide a mechanism for client applications to ask the question, there is no standard way to make explicit use of it.
Processing a Query
When English and SQL process a query, they both start by compiling the query into an internal form. English compiles information from the data definition items in the file's dictionary into this internal representation. SQL compiles information from the Table and Column definitions in the SQL catalog into its internal representation. Both internal representations contain similar information about attribute numbers, left/right alignment, size, conversion codes nad so on.
Optimising a Query using Indexes
To provide a simple explanation of the optimisation of a query using indexes, only single table SQL queries and optimisation of the WHERE clauses are discussed.
To optimise a query, both English and SQL inspect the indexes associated with the file being queried to see if they can find one that "matches" the query.
Note
Selective indexes, that is, indexes that index only part of the file, are ignored.