DOCUMENT #712 ======================================================================= UNDERSTANDING DATA POINTERS ======================================================================= PRODUCT: R:BASE VERSION : 3.1 or Higher ======================================================================= CATALOG: Data Integrity AREA : Pointers ======================================================================= Each R:BASE database is made up of three files regardless of the number of tables. File 1 (dbname1.RBF) holds the database structure (table and column names), the location of the data in file 2 and the location of the indexes in file 3. File 2 (dbname2.RBF) holds the data. File 3 (dbname3.RBF) holds the indexes. All three files are a unit, a single database. Note: R:BASE 4.5 names the three database files dbname.rb1, dbname.rb2 and dbname.rb3. What's a pointer? =================== A pointer is simply an address to a row. R:BASE finds a row of data by using an address, just as you use an address to find a house. Each row in the database resides at a specific address in file 2. File 1 contains the starting row address for each table. Each row in file 2 contains the address of the next row in the table. All the rows in a table are linked together by these addresses or pointers. Each row stored in file 2 has two pointers - the "next row pointer" (zero if it's the last row) and the "previous row pointer" (zero if it's the first row). A next row pointer is the address of the next row - it points to the next row in the table. A previous row pointer is the address of the previous row - it points to the previous row in the table. R:BASE follows the chain of next row pointers to find all the rows in a table. Indexes store values & addresses ==================================== For each row in a table that contains an indexed column, R:BASE stores the index value and corresponding row address in file 3. An index on a column works like an index in a book. R:BASE looks in the index (file 3) to get the row address for the specified data value. Then, R:BASE goes directly to that row address in file 2 instead of following the pointer chain through all the rows. Whether or not a table has an indexed column is up to you. Usually database designers index one or more of the columns in a table for fast access to rows in that table. Typically, you index columns that uniquely identify rows in the table or that link that table to other tables. Searching for data ===================== R:BASE uses either the pointer chain or indexes to locate rows of data in a table. These can be thought of as the forward pointing method and the direct indexing method. Forward pointing method ----------------------- File 1 points at (holds the address of) the first row of data in each table. That first row stores the address for the next row, which in turn stores the address for the next, and so on. File 2 holds all the data - including the data you entered, and the definitions for rules, forms, reports, labels, and views. Many people think of their data as being in nice, neat, separate files or tables in file 2. But actually, R:BASE adds data to file 2 as you enter it. Therefore, the rows of data for any given table are not always contiguous (right next to each other) in file 2 or on the hard disk. One row may point to a next row that is in fact separated by a lot of the other tables. Therefore, the addresses for any one table can often leap-frog data from other tables. Note: R:BASE 4.5 stores data for a table in 8K blocks. Each 8K block holds rows from one table only, and the rows are contiguous within that 8K block. Tables will have many 8K blocks in file 2. The 8K blocks are not always contiguous. Each row in a table points to the next row in sequence for that table, so it's called a "forward pointing address method." This is how R:BASE searches for data unless indexed columns are used. Direct indexing method ---------------------- In addition to the forward pointing address method, R:BASE can locate data by using the "direct indexing method." As mentioned earlier, file 3 holds each row's index value and file 2's address for an indexed column in a table. R:BASE uses that address to go directly to the row in file 2. The pointer chain is not used to find the rows of data. When an index is initially created, R:BASE follows the pointer chain through all the data in file 2 for that table collecting the addresses from the next row pointers and recording the addresses and index values in file 3. When you add a new row to a table, R:BASE automatically updates both the pointer chain and the index. Broken pointers kink the chain ================================== A "broken pointer" is the term used to indicate that the chain of next row pointers is damaged. Maybe only a single row is damaged (broken), but even one broken pointer causes R:BASE to lose its place in the chain that links all the rows in the table. R:BASE doesn't know where to look for the next row because the next row pointer is invalid. R:BASE can only follow the chain and find rows up to the point of the break, the rest of the rows in the table are "lost." Sometimes, you can find the lost rows by using indexes. R:BASE goes directly to a row address by using indexes, it doesn't need to use the pointer chain. Using indexes to recover data ================================= You can find out how many rows are lost by comparing the number of rows R:BASE can count with the number of rows stored in file 1. Both RBCHECK and R:SCOPE will report the two numbers when checking a table. The difference is the number of rows that are lost. In R:BASE, find these numbers by using the command COMPUTE ALL (1) FROM tblname. The value for rows is the number stored in file1, count is the number of rows R:BASE could find by following the pointer chain. When these numbers are different there is a broken pointer. If the table has an indexed column, you may be able to jump over the damage in the pointer chain and find the good rows on the other side. This is because R:BASE stores row addresses in file 3 with corres- ponding index values. By searching for the data using an index value, R:BASE does not need to follow the broken pointer chain. You can't add an indexed column to a table with a broken pointer and use it to find data. Remember that when you initially create an index, R:BASE uses the pointer chain to find the row addresses. An indexed four-byte column (INTEGER, TEXT 4, REAL, DATE, or TIME) stores the actual data value along with the row address. Indexed columns of other data types (TEXT > 4, DOUBLE, CURRENCY) store a hashed representation of the actual data value. Indexes on four-byte columns are called "walkable" indexes, R:BASE can walk the index file and find all the row addresses in much the way it follows the pointer chain in file 2. If you can find the lost data by using a four-byte indexed column, try using the following set of commands to recover all the missing data from the table. The procedure makes a copy of the table and populates the copy with data found by using an indexed search. Before doing this, or any other data recovery procedure, make sure you have a copy of the database. RENAME TABLE tblname TO xx NOCHECK PROJECT tblname FROM xx USING ALL WHERE index_column + BETWEEN 0 AND 10000 INSERT INTO tblname SELECT * FROM xx WHERE index_column IS NULL DROP TABLE xx CREATE INDEX ON tblname index_column The RENAME...NOCHECK command preserves all your form and report data. The PROJECT command uses the addresses stored in file 3 with each index value to find all the rows because its WHERE clause uses indexes. Make the range large enough to encompass all possible data values. It's okay if the range is larger than the actual data values. The INSERT command adds any rows where the indexed column is null. You need it because null values won't be found with the PROJECT. The DROP TABLE command erases the broken table (XX). Then the CREATE INDEX command reindexes the index_column in the "new" table. Any additional indexed columns will also need to be recreated. If you're going to use an indexed column to recover data, it must be a four-byte column; that is it must be defined as an indexed INTEGER, REAL, DATE, TIME, or TEXT 4 column. Indexes on larger TEXT columns like COMPANY or LASTNAME are useful for speed reasons, but they won't help you recover data. Using R:SCOPE to recover data ================================= If you don't have an appropriate indexed column in a table with a broken pointer, you must either restore your backup copy of the database, or use PACK to rebuild the pointers up to the break and then reenter the missing data. Or you can use R:SCOPE to try and correct the broken pointer. You may be able to use R:SCOPE's Autofix feature to quickly repair the broken pointer. R:SCOPE lets you see the raw data and pointers stored in file 2 and has a manual fix option also. R:SCOPE can be used to fix broken pointers and other database errors that the index column method fails on. Programs such as The Norton Utilities or PC Tools can also see raw R:BASE data, but they have no mechanism for making sense of it, and can't help you fix broken pointers. You can purchase R:SCOPE by calling Microrim at 800-628-6990. The cost of purchasing R:SCOPE to fix a table will often be less than the cost of reentering the data. Causes for broken pointers ============================== Broken pointers are primarily caused by an interruption of some kind while R:BASE writes or modifies data in file 2. The source of the interruption could be a power fluctuation, network connection problems, interference from memory-resident programs, turning the computer off while in R:BASE, or some electromagnetic interference such as a sudden static discharge or radio frequencies coming from speakers placed too close to your PC. Often it's impossible to know exactly what caused the broken pointers. Maintaining database health =============================== Databases need to be regularly checked and backed up. Procedures should be in place for both. A broken pointer can be easily fixed if it's found right away. If you usually use indexes to retrieve data you may not notice a broken pointer during database operation. A regular database check would find the problem. If you don't check the database before backing it up, it's possible to backup a "broken" database. The longer a broken pointer is left in a database the more difficult it can be to recover data. One simple broken pointer can evolve into other, more complex, database problems if left in place. Use one of the following methods to regularly check your database for broken pointers and other errors before doing a backup. The article "Checking Database Integrity" in this Exchange has additional information on checking your database. - COMPUTE ALL (1) FROM tblname - RBCHECK - AUTOCHK - R:SCOPE Back Up, Back Up, Back Up ============================ Back up your database regularly. First, check database integrity. Then back it up by using the DOS COPY or BACKUP command, or a third-party backup utility. The few minutes it takes to make a backup may save you hours, even months, of work. Often the quickest way to recover from a broken pointer or other database error is to restore yesterday's backup and reenter today's data, but not if it's last month's backup. Regular, valid backups are a necessity. Make sure your backup system works.