DOCUMENT #712
=======================================================================
UNDERSTANDING DATA POINTERS
=======================================================================
PRODUCT: R:BASE VERSION : 3.1 or Higher
=======================================================================
CATALOG: Data Integrity AREA : Pointers
=======================================================================
Each R:BASE database is made up of three files regardless of the
number of tables. File 1 (dbname1.RBF) holds the database structure
(table and column names), the location of the data in file 2 and the
location of the indexes in file 3. File 2 (dbname2.RBF) holds the data.
File 3 (dbname3.RBF) holds the indexes. All three files are a unit,
a single database.
Note: R:BASE 4.5 names the three database files dbname.rb1, dbname.rb2
and dbname.rb3.
What's a pointer?
===================
A pointer is simply an address to a row. R:BASE finds a row of data by
using an address, just as you use an address to find a house. Each row
in the database resides at a specific address in file 2. File 1
contains the starting row address for each table. Each row in file 2
contains the address of the next row in the table. All the rows in a
table are linked together by these addresses or pointers.
Each row stored in file 2 has two pointers - the "next row pointer"
(zero if it's the last row) and the "previous row pointer" (zero if
it's the first row). A next row pointer is the address of the next
row - it points to the next row in the table. A previous row pointer
is the address of the previous row - it points to the previous row in
the table. R:BASE follows the chain of next row pointers to find all
the rows in a table.
Indexes store values & addresses
====================================
For each row in a table that contains an indexed column, R:BASE stores
the index value and corresponding row address in file 3. An index on
a column works like an index in a book. R:BASE looks in the index
(file 3) to get the row address for the specified data value. Then,
R:BASE goes directly to that row address in file 2 instead of
following the pointer chain through all the rows.
Whether or not a table has an indexed column is up to you. Usually
database designers index one or more of the columns in a table for
fast access to rows in that table. Typically, you index columns
that uniquely identify rows in the table or that link that table to
other tables.
Searching for data
=====================
R:BASE uses either the pointer chain or indexes to locate rows of
data in a table. These can be thought of as the forward pointing
method and the direct indexing method.
Forward pointing method
-----------------------
File 1 points at (holds the address of) the first row of data in
each table. That first row stores the address for the next row,
which in turn stores the address for the next, and so on. File 2
holds all the data - including the data you entered, and the
definitions for rules, forms, reports, labels, and views.
Many people think of their data as being in nice, neat, separate
files or tables in file 2. But actually, R:BASE adds data to file 2
as you enter it. Therefore, the rows of data for any given table are
not always contiguous (right next to each other) in file 2 or on the
hard disk. One row may point to a next row that is in fact separated
by a lot of the other tables. Therefore, the addresses for any one
table can often leap-frog data from other tables.
Note: R:BASE 4.5 stores data for a table in 8K blocks. Each 8K block
holds rows from one table only, and the rows are contiguous within
that 8K block. Tables will have many 8K blocks in file 2. The 8K blocks
are not always contiguous.
Each row in a table points to the next row in sequence for that table,
so it's called a "forward pointing address method." This is how R:BASE
searches for data unless indexed columns are used.
Direct indexing method
----------------------
In addition to the forward pointing address method, R:BASE can locate
data by using the "direct indexing method." As mentioned earlier, file
3 holds each row's index value and file 2's address for an indexed
column in a table. R:BASE uses that address to go directly to the row
in file 2. The pointer chain is not used to find the rows of data.
When an index is initially created, R:BASE follows the pointer chain
through all the data in file 2 for that table collecting the addresses
from the next row pointers and recording the addresses and index values
in file 3. When you add a new row to a table, R:BASE automatically
updates both the pointer chain and the index.
Broken pointers kink the chain
==================================
A "broken pointer" is the term used to indicate that the chain of next
row pointers is damaged. Maybe only a single row is damaged (broken),
but even one broken pointer causes R:BASE to lose its place in the
chain that links all the rows in the table. R:BASE doesn't know where
to look for the next row because the next row pointer is invalid.
R:BASE can only follow the chain and find rows up to the point of the
break, the rest of the rows in the table are "lost."
Sometimes, you can find the lost rows by using indexes. R:BASE goes
directly to a row address by using indexes, it doesn't need to use
the pointer chain.
Using indexes to recover data
=================================
You can find out how many rows are lost by comparing the number of
rows R:BASE can count with the number of rows stored in file 1. Both
RBCHECK and R:SCOPE will report the two numbers when checking a table.
The difference is the number of rows that are lost. In R:BASE, find
these numbers by using the command COMPUTE ALL (1) FROM tblname.
The value for rows is the number stored in file1, count is the number
of rows R:BASE could find by following the pointer chain. When these
numbers are different there is a broken pointer.
If the table has an indexed column, you may be able to jump over the
damage in the pointer chain and find the good rows on the other side.
This is because R:BASE stores row addresses in file 3 with corres-
ponding index values. By searching for the data using an index value,
R:BASE does not need to follow the broken pointer chain.
You can't add an indexed column to a table with a broken pointer and
use it to find data. Remember that when you initially create an index,
R:BASE uses the pointer chain to find the row addresses.
An indexed four-byte column (INTEGER, TEXT 4, REAL, DATE, or TIME)
stores the actual data value along with the row address. Indexed
columns of other data types (TEXT > 4, DOUBLE, CURRENCY) store a
hashed representation of the actual data value. Indexes on four-byte
columns are called "walkable" indexes, R:BASE can walk the index file
and find all the row addresses in much the way it follows the pointer
chain in file 2.
If you can find the lost data by using a four-byte indexed column,
try using the following set of commands to recover all the missing
data from the table. The procedure makes a copy of the table and
populates the copy with data found by using an indexed search. Before
doing this, or any other data recovery procedure, make sure you have
a copy of the database.
RENAME TABLE tblname TO xx NOCHECK
PROJECT tblname FROM xx USING ALL WHERE index_column +
BETWEEN 0 AND 10000
INSERT INTO tblname SELECT * FROM xx WHERE index_column IS NULL
DROP TABLE xx
CREATE INDEX ON tblname index_column
The RENAME...NOCHECK command preserves all your form and report data.
The PROJECT command uses the addresses stored in file 3 with each
index value to find all the rows because its WHERE clause uses indexes.
Make the range large enough to encompass all possible data values. It's
okay if the range is larger than the actual data values. The INSERT
command adds any rows where the indexed column is null. You need it
because null values won't be found with the PROJECT. The DROP TABLE
command erases the broken table (XX). Then the CREATE INDEX command
reindexes the index_column in the "new" table. Any additional indexed
columns will also need to be recreated.
If you're going to use an indexed column to recover data, it must be a
four-byte column; that is it must be defined as an indexed INTEGER,
REAL, DATE, TIME, or TEXT 4 column. Indexes on larger TEXT columns
like COMPANY or LASTNAME are useful for speed reasons, but they won't
help you recover data.
Using R:SCOPE to recover data
=================================
If you don't have an appropriate indexed column in a table with a
broken pointer, you must either restore your backup copy of the
database, or use PACK to rebuild the pointers up to the break and
then reenter the missing data. Or you can use R:SCOPE to try and
correct the broken pointer. You may be able to use R:SCOPE's Autofix
feature to quickly repair the broken pointer. R:SCOPE lets you see
the raw data and pointers stored in file 2 and has a manual fix option
also. R:SCOPE can be used to fix broken pointers and other database
errors that the index column method fails on.
Programs such as The Norton Utilities or PC Tools can also see raw
R:BASE data, but they have no mechanism for making sense of it, and
can't help you fix broken pointers.
You can purchase R:SCOPE by calling Microrim at 800-628-6990. The
cost of purchasing R:SCOPE to fix a table will often be less than
the cost of reentering the data.
Causes for broken pointers
==============================
Broken pointers are primarily caused by an interruption of some kind
while R:BASE writes or modifies data in file 2. The source of the
interruption could be a power fluctuation, network connection problems,
interference from memory-resident programs, turning the computer off
while in R:BASE, or some electromagnetic interference such as a sudden
static discharge or radio frequencies coming from speakers placed too
close to your PC. Often it's impossible to know exactly what caused the
broken pointers.
Maintaining database health
===============================
Databases need to be regularly checked and backed up. Procedures
should be in place for both. A broken pointer can be easily fixed if
it's found right away. If you usually use indexes to retrieve data you
may not notice a broken pointer during database operation. A regular
database check would find the problem. If you don't check the database
before backing it up, it's possible to backup a "broken" database. The
longer a broken pointer is left in a database the more difficult it can
be to recover data. One simple broken pointer can evolve into other,
more complex, database problems if left in place.
Use one of the following methods to regularly check your database for
broken pointers and other errors before doing a backup. The article
"Checking Database Integrity" in this Exchange has additional
information on checking your database.
- COMPUTE ALL (1) FROM tblname
- RBCHECK
- AUTOCHK
- R:SCOPE
Back Up, Back Up, Back Up
============================
Back up your database regularly. First, check database integrity. Then
back it up by using the DOS COPY or BACKUP command, or a third-party
backup utility. The few minutes it takes to make a backup may save you
hours, even months, of work. Often the quickest way to recover from a
broken pointer or other database error is to restore yesterday's backup
and reenter today's data, but not if it's last month's backup. Regular,
valid backups are a necessity. Make sure your backup system works.