=====================================================================
Indexes in 4.5
=====================================================================
PRODUCT: R:BASE VERSION: 4.5
=====================================================================
AREA: SQL CATALOG: DB DESIGN, STRUCTURE, INTEGRITY
=====================================================================
R:BASE's new indexing scheme provides speed, flexibility and improved
data integrity. One of the bigger changes is the way indexes on TEXT
datatype columns are handled. Indexes on TEXT datatype columns are no
longer automatically converted to a hashed value, now they can be
stored as the actual data value. All non-TEXT indexes are stored as
their actual data values, indexes on DOUBLE and CURRENCY data types
are no longer hashed. Multi-column indexes can be defined for up to
eight columns and any combination of data types.
R:BASE 4.5 also has constraints. Constraints are similar to rules in
that they indicate valid values or conditions that must be met in a
table or column before data can be added or changed. Some constraints
automatically define indexes when they are created; they use the index
to quickly check the condition. For example, a primary key is a
constraint but an index is created on the primary key column(s). A
constraint may use an index, but it is not an index, constraints and
indexes are created and listed separately. See the article "Using
Constraints in 4.5" for information on defining and using constraints.
Note that the terms index and key are no longer interchangeable. In
previous versions of R:BASE, the term key was also used to refer to an
indexed column. In 4.5, key refers specifically to a defined primary
key or foreign key constraint.
Index characteristics
There are two basic types of indexes:
* single column
* multi column (maximum of 8 columns)
A single column index as it's name implies is defined referencing a
single column only. A multi-column index specifies up to eight
different columns. A multi-column index is sorted in the order the
columns are specified. When retrieving data from a multi-column index,
the first column must always be included. A multi-column index is
handy when you are always retrieving a set of the same columns from a
table. Define a multi-column index for the group of columns, then when
you select the data R:BASE will be able to do an index-only retrieval,
much faster than having to actually retrieve the data from file 2. An
index-only retrieval is when R:BASE retrieves the actual data from
file 3 where it is stored as an index value rather than from file 2.
Index-only retrieval is valid only for full-text indexes and indexes
on non-text data types.
In addition to single column and multi-column, indexes can have the
following characteristics:
* full-text (maximum of 200 bytes)
The actual data value is stored in the index file. On large text
columns, this can result in a very large file 3. In effect you are
storing the data value twice, once in file 2 and once in file 3.
Full-text indexes can be used for index-only retrieval. It may be
slower, however, to do a text comparison on a large full-text index
than to find the 4-byte hashed value. There are tradeoffs to be
considered with full-text indexes.
* partial text
You specify the number of unhashed characters to retain at the
beginning of the text value. All data past the specified number of
characters is hashed. For example, specify a size of 20 on a text 50
column and the first 20 characters are stored with their actual value,
the next 30 are hashed. Partial text indexes cannot be used for
index-only retrieval.
* hashed
If the size is specified as 0 for a text data type index, the entire
value is converted to a hashed value, similar to the text indexing in
3.x and 4.0. Hashed indexes cannot be used for index-only retrieval.
Hashed indexes require less storage space than full or partial text
indexes, they take up only 4 bytes of space in file 3 for each index
value regardless of the text length.
* sorted
The option to order the index values is available for full-text
indexes and non-text indexes. The sorted order will be used when
"walking" the index, for example, selecting a range of values with
the BETWEEN operator. Index-only retrieval will display the data in
the order the index is sorted. Partial text and hashed indexes are
not sorted. A sort order is specified on all indexes, but is only
used on full-text and non-text. The sort order is either ascending
or descending.
* unique
The UNIQUE characteristic is available on full-text and non-text
indexes only. You can't create a unique index from the menus, only
from the R> prompt using the CREATE INDEX command. A unique index
requires that data entered into the column be unique, duplicate
values are not allowed. A uniqueness rule can be replaced by a unique
index definition. The unique index will check faster than the
uniqueness rule.
* case sensitive
Unique text indexes can also be defined as case sensitive. Only a
unique index can be case sensitive. The case sensitivity option is
added at the time a unique index is defined with the CREATE INDEX
command. Case sensitivity is defined by the collating table which is
stored with the database.
Listing indexes
All indexes in 4.5 are named. The LIST INDEXES command displays the
index name, the referenced table and the column that index is built on
or MULTI-COLUMN if the index is defined using more than one column.
LIST TABLE tblname will display SINGLE-COLUMN or MULTI-COLUMN for
indexed columns. LIST INDEXES will not indicate if index is full-text
or partial. Nor does it show sort order. Use LIST INDEX indexname or
LIST INDEXES FOR tblname to see all the index characteristics.
LIST INDEXES
All indexes in the database are listed. The indexes with numbers for
names (15, 16, 19 etc.) are converted indexes.
Number of Indexes in Database CONCOMP is 26.
Index Name Table Name Column Name
------------------ ------------------ ------------------
15 component compid
16 contact custid
l_name employee emplname
f_name employee empfname
empid employee empid
empid customer empid
19 transdetail model
18 transdetail transid
21 compused compid
20 compused model
22 prodlocation model
23 product model
date_empid transmaster MULTI-COLUMN
LIST INDEXES FOR employee
Indexes for the specified table only are listed. In addition to the
column name, the datatype of the column, the sort order and the index
size are listed. The heading of Index Name and Table Name is repeated
for each index in the table. The size is blank for non-text and
full-text indexes.
Index Name: l_name
Table Name: employee
Column Name Datatype Order Size
------------------ --------- ----- ----
emplname TEXT ASC 10
Index Name: f_name
Table Name: employee
Column Name Datatype Order Size
------------------ --------- ----- ----
empfname TEXT ASC
Index Name: empid
Table Name: employee
Column Name Datatype Order Size
------------------ --------- ----- ----
empid INTEGER ASC
LIST INDEX FOR date_empid
The characteristics for a specified index only are listed. For a
multi-column index, all column names, sizes and sort orders are
listed. The columns are listed in the order they are defined in the
index.
Index Name: date_empid
Table Name: transmaster
Column Name Datatype Order Size
------------------ --------- ----- ----
transdate DATE DESC
empid INTEGER ASC
Index-only retrieval
If it can, R:BASE will do index-only retrieval. That is the fastest
method of retrieving data. When the columns selected for display are
limited to the column or columns in the index used in the WHERE clause,
R:BASE will retrieve the data as it reads the index information from
file 3. It will not need to look at the data stored in file 2.
Index-only retrieval is done only when the columns to be retrieved are
all included in the index. If you routinely select the same set of
columns (up to eight) consider defining a multi-column index. In
addition to selecting only the column defined in the index, only
full-text indexes and non-text indexes can be used for index-only
retrieval. Only full-text indexes and non-text indexes store the
actual data values in file 3.
Multiple Indexes
There is an order of precedence where indexes and constraints are
involved. A primary key or foreign key index is used first. You can
define a primary or foreign key constraint on a column that already
has an index defined. The primary or foreign key index is used in
preference to the other index. LIST TABLE tblname will show both an
index and a constraint on the same column. This is not recommended,
however. Multiple indexes on the same column can degrade performance.
All indexes need to be updated when data is added or modified. The
primary key index will be used when retrieving data from the table
with a WHERE clause referencing the primary key column. A separate
index on the column is not needed.
Case folding and collating tables
Case folding and collating tables are now stored with the database for
use with case sensitive and sorted indexes. The case folding and
collating tables are not new to R:BASE, in 3.x and 4.0 they were
included in the RBASE.CFG file. Most people have no need to change the
tables from the defaults. Only when working with international
characters might you need to change the tables. Changes are made to
the tables in the RBASE.CFG file using RBEDIT or an ASCII text editor.
The changes are stored in the database by using the PACK WITH USER
CASE option.
The case folding table is used to correlate upper and lower case
characters when CASE is set OFF. It equates A and a, for example, so
they will test as equal when CASE is OFF. When CASE is ON, these are
different characters. The case folding table is used for equality
testing. The case folding table is made up of pairs of ASCII
character codes. The first character listed is mapped to the second.
So CASEP 97 65 indicates that lower case a is equivalent to upper
case A.
The collating table is used for sorting and inequality testing. Like
the case folding table, each line has pairs of ASCII character codes.
The first character listed is considered to be the equivalent of the
second character for sorting sequence. So COLLATE 97 65 indicates that
lower case a and upper case A are not differentiated for sorting.