DOCUMENT #712
     =======================================================================
     UNDERSTANDING DATA POINTERS
     =======================================================================
     PRODUCT:  R:BASE                   VERSION :  3.1 or Higher
     =======================================================================
     CATALOG:  Data Integrity           AREA    :  Pointers       
     =======================================================================
 
 
     Each R:BASE database is made up of three files regardless of the 
     number of tables. File 1 (dbname1.RBF) holds the database structure 
     (table and column names), the location of the data in file 2 and the 
     location of the indexes in file 3. File 2 (dbname2.RBF) holds the data. 
     File 3 (dbname3.RBF) holds the indexes. All three files are a unit, 
     a single database.
 
     Note: R:BASE 4.5 names the three database files dbname.rb1, dbname.rb2 
     and dbname.rb3.
 
     What's a pointer?
     ===================
     A pointer is simply an address to a row. R:BASE finds a row of data by 
     using an address, just as you use an address to find a house. Each row 
     in the database resides at a specific address in file 2. File 1 
     contains the starting row address for each table. Each row in file 2 
     contains the address of the next row in the table. All the rows in a 
     table are linked together by these addresses or pointers.
 
     Each row stored in file 2 has two pointers - the "next row pointer"
     (zero if it's the last row) and the "previous row pointer" (zero if 
     it's the first row). A next row pointer is the address of the next 
     row - it points to the next row in the table. A previous row pointer 
     is the address of the previous row - it points to the previous row in 
     the table. R:BASE follows the chain of next row pointers to find all 
     the rows in a table.
 
     Indexes store values & addresses
     ====================================
     For each row in a table that contains an indexed column, R:BASE stores 
     the index value and corresponding row address in file 3. An index on 
     a column works like an index in a book. R:BASE looks in the index 
     (file 3) to get the row address for the specified data value. Then, 
     R:BASE goes directly to that row address in file 2 instead of 
     following the pointer chain through all the rows.
 
     Whether or not a table has an indexed column is up to you. Usually 
     database designers index one or more of the columns in a table for 
     fast access to rows in that table. Typically, you index columns 
     that uniquely identify rows in the table or that link that table to 
     other tables.
 
     Searching for data
     =====================
     R:BASE uses either the pointer chain or indexes to locate rows of 
     data in a table. These can be thought of as the forward pointing 
     method and the direct indexing method.
 
     Forward pointing method
     -----------------------
     File 1 points at (holds the address of) the first row of data in 
     each table. That first row stores the address for the next row, 
     which in turn stores the address for the next, and so on. File 2 
     holds all the data - including the data you entered, and the 
     definitions for rules, forms, reports, labels, and views.
 
     Many people think of their data as being in nice, neat, separate 
     files or tables in file 2. But actually, R:BASE adds data to file 2 
     as you enter it. Therefore, the rows of data for any given table are 
     not always contiguous (right next to each other) in file 2 or on the 
     hard disk. One row may point to a next row that is in fact separated 
     by a lot of the other tables. Therefore, the addresses for any one 
     table can often leap-frog data from other tables.
 
     Note: R:BASE 4.5 stores data for a table in 8K blocks. Each 8K block 
     holds rows from one table only, and the rows are contiguous within 
     that 8K block. Tables will have many 8K blocks in file 2. The 8K blocks 
     are not always contiguous.
 
     Each row in a table points to the next row in sequence for that table, 
     so it's called a "forward pointing address method." This is how R:BASE 
     searches for data unless indexed columns are used.
 
     Direct indexing method
     ----------------------  
     In addition to the forward pointing address method, R:BASE can locate 
     data by using the "direct indexing method." As mentioned earlier, file 
     3 holds each row's index value and file 2's address for an indexed 
     column in a table. R:BASE uses that address to go directly to the row 
     in file 2. The pointer chain is not used to find the rows of data.
 
     When an index is initially created, R:BASE follows the pointer chain 
     through all the data in file 2 for that table collecting the addresses 
     from the next row pointers and recording the addresses and index values 
     in file 3. When you add a new row to a table, R:BASE automatically 
     updates both the pointer chain and the index.
 
     Broken pointers kink the chain
     ==================================
     A "broken pointer" is the term used to indicate that the chain of next 
     row pointers is damaged. Maybe only a single row is damaged (broken), 
     but even one broken pointer causes R:BASE to lose its place in the 
     chain that links all the rows in the table. R:BASE doesn't know where 
     to look for the next row because the next row pointer is invalid. 
     R:BASE can only follow the chain and find rows up to the point of the 
     break, the rest of the rows in the table are "lost."
 
     Sometimes, you can find the lost rows by using indexes. R:BASE goes 
     directly to a row address by using indexes, it doesn't need to use 
     the pointer chain. 
 
     Using indexes to recover data
     =================================
     You can find out how many rows are lost by comparing the number of 
     rows R:BASE can count with the number of rows stored in file 1. Both 
     RBCHECK and R:SCOPE will report the two numbers when checking a table. 
     The difference is the number of rows that are lost. In R:BASE, find 
     these numbers by using the command COMPUTE ALL (1) FROM tblname. 
     The value for rows is the number stored in file1, count is the number 
     of rows R:BASE could find by following the pointer chain. When these 
     numbers are different there is a broken pointer.
 
     If the table has an indexed column, you may be able to jump over the 
     damage in the pointer chain and find the good rows on the other side. 
     This is because R:BASE stores row addresses in file 3 with corres-
     ponding index values. By searching for the data using an index value, 
     R:BASE does not need to follow the broken pointer chain. 
 
     You can't add an indexed column to a table with a broken pointer and 
     use it to find data. Remember that when you initially create an index, 
     R:BASE uses the pointer chain to find the row addresses. 
 
     An indexed four-byte column (INTEGER, TEXT 4, REAL, DATE, or TIME) 
     stores the actual data value along with the row address. Indexed 
     columns of other data types (TEXT > 4, DOUBLE, CURRENCY) store a 
     hashed representation of the actual data value. Indexes on four-byte 
     columns are called "walkable" indexes, R:BASE can walk the index file 
     and find all the row addresses in much the way it follows the pointer 
     chain in file 2.
 
     If you can find the lost data by using a four-byte indexed column, 
     try using the following set of commands to recover all the missing 
     data from the table. The procedure makes a copy of the table and 
     populates the copy with data found by using an indexed search. Before 
     doing this, or any other data recovery procedure, make sure you have 
     a copy of the database.
 
       RENAME TABLE tblname TO xx NOCHECK
       PROJECT tblname FROM xx USING ALL WHERE index_column +
        BETWEEN 0 AND 10000
       INSERT INTO tblname SELECT * FROM xx WHERE index_column IS NULL
       DROP TABLE  xx
       CREATE INDEX ON tblname index_column
 
     The RENAME...NOCHECK command preserves all your form and report data. 
     The PROJECT command uses the addresses stored in file 3 with each 
     index value to find all the rows because its WHERE clause uses indexes. 
     Make the range large enough to encompass all possible data values. It's 
     okay if the range is larger than the actual data values. The INSERT 
     command adds any rows where the indexed column is null. You need it 
     because null values won't be found with the PROJECT. The DROP TABLE 
     command erases the broken table (XX). Then the CREATE INDEX command 
     reindexes the index_column in the "new" table. Any additional indexed 
     columns will also need to be recreated.
 
     If you're going to use an indexed column to recover data, it must be a 
     four-byte column; that is it must be defined as an indexed INTEGER, 
     REAL, DATE, TIME, or TEXT 4 column. Indexes on larger TEXT columns 
     like COMPANY or LASTNAME are useful for speed reasons, but they won't 
     help you recover data.
 
     Using R:SCOPE to recover data
     =================================
     If you don't have an appropriate indexed column in a table with a 
     broken pointer, you must either restore your backup copy of the 
     database, or use PACK to rebuild the pointers up to the break and 
     then reenter the missing data. Or you can use R:SCOPE to try and 
     correct the broken pointer. You may be able to use R:SCOPE's Autofix 
     feature to quickly repair the broken pointer. R:SCOPE lets you see 
     the raw data and pointers stored in file 2 and has a manual fix option 
     also. R:SCOPE can be used to fix broken pointers and other database 
     errors that the index column method fails on.
 
     Programs such as The Norton Utilities or PC Tools can also see raw 
     R:BASE data, but they have no mechanism for making sense of it, and 
     can't help you fix broken pointers.
 
     You can purchase R:SCOPE by calling Microrim at 800-628-6990. The 
     cost of purchasing R:SCOPE to fix a table will often be less than 
     the cost of reentering the data.
 
     Causes for broken pointers
     ==============================
     Broken pointers are primarily caused by an interruption of some kind 
     while R:BASE writes or modifies data in file 2. The source of the 
     interruption could be a power fluctuation, network connection problems,
     interference from memory-resident programs, turning the computer off 
     while in R:BASE, or some electromagnetic interference such as a sudden 
     static discharge or radio frequencies coming from speakers placed too 
     close to your PC. Often it's impossible to know exactly what caused the 
     broken pointers.
 
     Maintaining database health
     ===============================
     Databases need to be regularly checked and backed up. Procedures 
     should be in place for both. A broken pointer can be easily fixed if 
     it's found right away. If you usually use indexes to retrieve data you 
     may not notice a broken pointer during database operation. A regular 
     database check would find the problem. If you don't check the database 
     before backing it up, it's possible to backup a "broken" database. The 
     longer a broken pointer is left in a database the more difficult it can 
     be to recover data. One simple broken pointer can evolve into other, 
     more complex, database problems if left in place. 
 
     Use one of the following methods to regularly check your database for 
     broken pointers and other errors before doing a backup. The article 
     "Checking Database Integrity" in this Exchange has additional 
     information on checking your database.
 
      - COMPUTE ALL (1) FROM tblname
      - RBCHECK
      - AUTOCHK
      - R:SCOPE
 
     Back Up, Back Up, Back Up
     ============================
     Back up your database regularly. First, check database integrity. Then 
     back it up by using the DOS COPY or BACKUP command, or a third-party 
     backup utility. The few minutes it takes to make a backup may save you 
     hours, even months, of work. Often the quickest way to recover from a 
     broken pointer or other database error is to restore yesterday's backup 
     and reenter today's data, but not if it's last month's backup. Regular, 
     valid backups are a necessity. Make sure your backup system works.