===================================================================== Indexes in 4.5 ===================================================================== PRODUCT: R:BASE VERSION: 4.5 ===================================================================== AREA: SQL CATALOG: DB DESIGN, STRUCTURE, INTEGRITY ===================================================================== R:BASE's new indexing scheme provides speed, flexibility and improved data integrity. One of the bigger changes is the way indexes on TEXT datatype columns are handled. Indexes on TEXT datatype columns are no longer automatically converted to a hashed value, now they can be stored as the actual data value. All non-TEXT indexes are stored as their actual data values, indexes on DOUBLE and CURRENCY data types are no longer hashed. Multi-column indexes can be defined for up to eight columns and any combination of data types. R:BASE 4.5 also has constraints. Constraints are similar to rules in that they indicate valid values or conditions that must be met in a table or column before data can be added or changed. Some constraints automatically define indexes when they are created; they use the index to quickly check the condition. For example, a primary key is a constraint but an index is created on the primary key column(s). A constraint may use an index, but it is not an index, constraints and indexes are created and listed separately. See the article "Using Constraints in 4.5" for information on defining and using constraints. Note that the terms index and key are no longer interchangeable. In previous versions of R:BASE, the term key was also used to refer to an indexed column. In 4.5, key refers specifically to a defined primary key or foreign key constraint. Index characteristics There are two basic types of indexes: * single column * multi column (maximum of 8 columns) A single column index as it's name implies is defined referencing a single column only. A multi-column index specifies up to eight different columns. A multi-column index is sorted in the order the columns are specified. When retrieving data from a multi-column index, the first column must always be included. A multi-column index is handy when you are always retrieving a set of the same columns from a table. Define a multi-column index for the group of columns, then when you select the data R:BASE will be able to do an index-only retrieval, much faster than having to actually retrieve the data from file 2. An index-only retrieval is when R:BASE retrieves the actual data from file 3 where it is stored as an index value rather than from file 2. Index-only retrieval is valid only for full-text indexes and indexes on non-text data types. In addition to single column and multi-column, indexes can have the following characteristics: * full-text (maximum of 200 bytes) The actual data value is stored in the index file. On large text columns, this can result in a very large file 3. In effect you are storing the data value twice, once in file 2 and once in file 3. Full-text indexes can be used for index-only retrieval. It may be slower, however, to do a text comparison on a large full-text index than to find the 4-byte hashed value. There are tradeoffs to be considered with full-text indexes. * partial text You specify the number of unhashed characters to retain at the beginning of the text value. All data past the specified number of characters is hashed. For example, specify a size of 20 on a text 50 column and the first 20 characters are stored with their actual value, the next 30 are hashed. Partial text indexes cannot be used for index-only retrieval. * hashed If the size is specified as 0 for a text data type index, the entire value is converted to a hashed value, similar to the text indexing in 3.x and 4.0. Hashed indexes cannot be used for index-only retrieval. Hashed indexes require less storage space than full or partial text indexes, they take up only 4 bytes of space in file 3 for each index value regardless of the text length. * sorted The option to order the index values is available for full-text indexes and non-text indexes. The sorted order will be used when "walking" the index, for example, selecting a range of values with the BETWEEN operator. Index-only retrieval will display the data in the order the index is sorted. Partial text and hashed indexes are not sorted. A sort order is specified on all indexes, but is only used on full-text and non-text. The sort order is either ascending or descending. * unique The UNIQUE characteristic is available on full-text and non-text indexes only. You can't create a unique index from the menus, only from the R> prompt using the CREATE INDEX command. A unique index requires that data entered into the column be unique, duplicate values are not allowed. A uniqueness rule can be replaced by a unique index definition. The unique index will check faster than the uniqueness rule. * case sensitive Unique text indexes can also be defined as case sensitive. Only a unique index can be case sensitive. The case sensitivity option is added at the time a unique index is defined with the CREATE INDEX command. Case sensitivity is defined by the collating table which is stored with the database. Listing indexes All indexes in 4.5 are named. The LIST INDEXES command displays the index name, the referenced table and the column that index is built on or MULTI-COLUMN if the index is defined using more than one column. LIST TABLE tblname will display SINGLE-COLUMN or MULTI-COLUMN for indexed columns. LIST INDEXES will not indicate if index is full-text or partial. Nor does it show sort order. Use LIST INDEX indexname or LIST INDEXES FOR tblname to see all the index characteristics. LIST INDEXES All indexes in the database are listed. The indexes with numbers for names (15, 16, 19 etc.) are converted indexes. Number of Indexes in Database CONCOMP is 26. Index Name Table Name Column Name ------------------ ------------------ ------------------ 15 component compid 16 contact custid l_name employee emplname f_name employee empfname empid employee empid empid customer empid 19 transdetail model 18 transdetail transid 21 compused compid 20 compused model 22 prodlocation model 23 product model date_empid transmaster MULTI-COLUMN LIST INDEXES FOR employee Indexes for the specified table only are listed. In addition to the column name, the datatype of the column, the sort order and the index size are listed. The heading of Index Name and Table Name is repeated for each index in the table. The size is blank for non-text and full-text indexes. Index Name: l_name Table Name: employee Column Name Datatype Order Size ------------------ --------- ----- ---- emplname TEXT ASC 10 Index Name: f_name Table Name: employee Column Name Datatype Order Size ------------------ --------- ----- ---- empfname TEXT ASC Index Name: empid Table Name: employee Column Name Datatype Order Size ------------------ --------- ----- ---- empid INTEGER ASC LIST INDEX FOR date_empid The characteristics for a specified index only are listed. For a multi-column index, all column names, sizes and sort orders are listed. The columns are listed in the order they are defined in the index. Index Name: date_empid Table Name: transmaster Column Name Datatype Order Size ------------------ --------- ----- ---- transdate DATE DESC empid INTEGER ASC Index-only retrieval If it can, R:BASE will do index-only retrieval. That is the fastest method of retrieving data. When the columns selected for display are limited to the column or columns in the index used in the WHERE clause, R:BASE will retrieve the data as it reads the index information from file 3. It will not need to look at the data stored in file 2. Index-only retrieval is done only when the columns to be retrieved are all included in the index. If you routinely select the same set of columns (up to eight) consider defining a multi-column index. In addition to selecting only the column defined in the index, only full-text indexes and non-text indexes can be used for index-only retrieval. Only full-text indexes and non-text indexes store the actual data values in file 3. Multiple Indexes There is an order of precedence where indexes and constraints are involved. A primary key or foreign key index is used first. You can define a primary or foreign key constraint on a column that already has an index defined. The primary or foreign key index is used in preference to the other index. LIST TABLE tblname will show both an index and a constraint on the same column. This is not recommended, however. Multiple indexes on the same column can degrade performance. All indexes need to be updated when data is added or modified. The primary key index will be used when retrieving data from the table with a WHERE clause referencing the primary key column. A separate index on the column is not needed. Case folding and collating tables Case folding and collating tables are now stored with the database for use with case sensitive and sorted indexes. The case folding and collating tables are not new to R:BASE, in 3.x and 4.0 they were included in the RBASE.CFG file. Most people have no need to change the tables from the defaults. Only when working with international characters might you need to change the tables. Changes are made to the tables in the RBASE.CFG file using RBEDIT or an ASCII text editor. The changes are stored in the database by using the PACK WITH USER CASE option. The case folding table is used to correlate upper and lower case characters when CASE is set OFF. It equates A and a, for example, so they will test as equal when CASE is OFF. When CASE is ON, these are different characters. The case folding table is used for equality testing. The case folding table is made up of pairs of ASCII character codes. The first character listed is mapped to the second. So CASEP 97 65 indicates that lower case a is equivalent to upper case A. The collating table is used for sorting and inequality testing. Like the case folding table, each line has pairs of ASCII character codes. The first character listed is considered to be the equivalent of the second character for sorting sequence. So COLLATE 97 65 indicates that lower case a and upper case A are not differentiated for sorting.