==================================================================
     OPTIMIZING APPLICATION CODE FOR SPEED
     ==================================================================
     PRODUCT :  R:BASE                  VERSION    :  3.1 AND HIGHER
     CATEGORY:  PROGRAMMING             SUBCATEGORY:  PERFORMANCE
     ==================================================================
 
     Because the most obvious or logical way to approach a programming task
     is not necessarily the fastest, optimizing application code is a
     necessary skill. This article gives some specific examples of different
     ways to accomplish the same tasks and shows the speed differences between
     them. It also provides other general techniques for improving the perfor-
     mance of your applications.
 
     How can you tell what the fastest way to accomplish a programming task
     is?  The only way to tell for sure is to try each command or set of
     commands, time it, and compare the times. Often a sequence of commands
     will be fast on one database but slow on another because the structure
     (tables and columns) and data of the databases are different. Speed can
     vary from database to database and from machine to machine.
 
     The times given below for different programming techniques are for
     comparison purposes only. Actual times that you will see using the same
     methods will vary on the basis of your hardware and your database.
 
 
     OPTIMIZATION TECHNIQUES
 
 
     Use the most current version of R:BASE 3.1
     ==========================================
     Microrim is continually making speed improvements, and each new release
     is faster than the previous one. For example, building indexes in R:BASE
     3.1C is significantly faster than in previous versions. PACKing a data-
     base in 3.1C took just 2 1/2 hours; R:BASE 3.1B took almost six hours to
     pack the same database.
 
 
     Think globally instead of row by row
     ====================================
     This is one of the best ways to achieve some significant speed improve-
     ments. Instead of doing row-by-row processing using a DECLARE CURSOR,
     try using a single UPDATE or INSERT command, which will take advantage
     of the SQL select capabilities of 3.1C.
 
     <> For example, suppose you want to create a table to hold information
        about how many of each product a customer has purchased. The most
        logical approach is to set up a cursor to go through the detail table
        and count the rows for each customer/product combination and then
        insert that data into the new table. On a table with 3500 rows,
        processing all the rows took 1:15:00 (one hour, 15 minutes). By
        replacing the DECLARE CURSOR loop with a single insert command, the
        time required to process all the rows was cut to 0:09:00 (nine
        minutes).
 
     Compare the two pieces of code:
 
     DECLARE c1 CURSOR FOR SELECT cust#,partno FROM cproddet +
      GROUP BY cust#,partno
     OPEN c1
     FETCH c1 INTO vcust ind1,vpartno ind2
     WHILE SQLCODE < > 100 THEN
       SELECT COUNT(*) INTO vcount FROM cproddet WHERE +
        cust#=.vcust AND partno=.vpartno
       INSERT INTO custprod (cust#,partno,copies) VALUES +
       (.vcust,.vpartno,.vcount)
       FETCH c1 INTO vcust ind1,vpartno ind2
     ENDW
     DROP CURSOR c1
 
     vs.
 
     INSERT INTO custprod (cust#,partno,copies) SELECT cust#,+
      partno,count(*) FROM cproddet GROUP BY cust#,partno
 
 
     Be Creative
     ===========
     R:BASE almost always offers two or more ways of doing something. Look
     for an alternative method for accomplishing a task - it might just be
     considerably faster.
 
     <> For example, following are three ways to delete 600 rows from a
        table of 8,000 rows. Notice the time differential among the three
        methods. We can cut the time in half (or more) by being creative.
 
     First, a straightforward use of the DELETE ROWS command, shown below,
     took 06:40; SET CLEAR OFF reduced the time slightly to 06:29.
 
     DELETE ROWS FROM transx WHERE invid=10
 
     Second, using the PROJECT command to create a new table with the rows to
     keep took 06:00; SET CLEAR OFF significantly reduced the time to just
     01:02. This method is shown below:
 
     PROJECT newtrans FROM transx USI ALL WHERE invid < > 10
     DROP TAB transx
     REN TABLE newtrans TO transx NOCHECK
     CREATE INDEX ON transx invid
     CREATE INDEX ON transx prodid
 
     Finally, UNLOADing the data to an ASCII file, then reloading the table
     with that data using the LOAD AS ASCII feature of 3.1B took only 03:56;
     SET CLEAR OFF cut the time about in half to 02:01. This method is shown
     below:
 
     OUTPUT transx.dat
     UNLOAD DATA FROM transx AS ASCII WHERE invid < > 10
     OUTPUT SCREEN
     DEL ROW FROM transx
     SET RULE OFF
     DROP INDEX invid IN transx
     DROP INDEX prodid IN transx
     LOAD TRANSX FROM transx.dat AS ASCII
     CREATE INDEX ON transx invid
     CREATE INDEX ON transx prodid
     DEL transx.dat
 
     <> Here's another example of creative programming. Use the SELECT command
        to write a file of R:BASE commands, then RUN that file. This is a
        faster method of posting (updating a column in one table with values
        from another table) than using a DECLARE CURSOR command.
 
     Use the SELECT command to create a file of UPDATE commands, then execute
     the UPDATE commands by running the file. This technique, shown below,
     took just 00:40 (40 seconds). See the January/February 1992 and November/
     December 1990 issues of the R:BASE Exchange for other examples.
 
     SET HEADINGS OFF
     OUTPUT updinv.dat
     SELECT 'UPD INVOICE SET INVTOTAL = '=28,sum(tprice), +
      'WHERE invid = '=16,invid FROM transx GROUP BY invid
     OUTPUT SCREEN
     SET HEADINGS ON
     RUN updinv.dat
 
     In contrast, doing a DECLARE CURSOR loop took 01:54, about three times
     as long. This code is shown below:
 
     DECLARE C1 CURSOR FOR SELECT invid,sum(tprice) FROM transx +
      GROUP BY invid
     OPEN c1
     FETCH c1 INTO vinvid ind1,vprice ind2
     WHILE sqlcode < > 100 THEN
       UPDATE invoice SET invtotal=.vprice WHERE invid=.vinvid
       FETCH c1 INTO vinvid ind1,vprice ind2
     ENDW
 
 
     Change the environment
     ======================
     The R:BASE environment settings that have the most effect on performance
     are SET RULES, SET CLEAR, and SET MESSAGES.
 
     <> If you can do a process in a single-user environment, you'll improve
        performance by using SET CLEAR OFF before executing commands that
        change or add rows to the database. SET CLEAR OFF sets up a 5K buffer
        to hold changes. Changes are  written to disk only when you SET CLEAR
        ON, when the buffer is full or is needed for the next page of data,
        or when you disconnect the database or exit from R:BASE.
 
     <> If you know that the data being loaded or changed is correct, SET
        RULES OFF. You can see a small performance increase even if you don't
        have any rules. If many rules have to be checked you will see a larger
        performance increase.
 
     <> SET MESSAGES OFF eliminates the time required for screen display of
        messages. This is particularly noticeable if you are doing repetitive
        UPDATE or INSERT commands.
 
 
     Reduce Number of Expressions in Reports
     =======================================
     Because all report expressions are processed for each row of data
     read from the reports driving table or view, reducing the number of
     expressions in a report increases the printing speed of a report.
 
     <> If you have many lookups in a report, basing the report on a view
        instead of on a table will usually be faster. A view combines all the
        data just once; a lookup needs to find the data for each row in the
        driving table, even if the lookup is printing only on a header line.
 
     A sample report, based on a table with 8000 rows and containing 14 lookup
     expressions (24 total expressions), took 37:00 to print.
 
     As an alternative, create a four-table view and base the report on the
     view. Remove the lookup expressions from the report, leaving only 10
     expressions to be processed each row. Printing the report by using a view
     instead of lookup expressions takes only 23:00, almost twice as fast.
 
     <> Because R:BASE 3.1 no longer requires a global variable to be equated
        to a report variable, you can locate global variables on the report or
        reference them directly in an expression. For example, you don't need
        to define RPTDTE=. #DATE and then locate RPTDATE; just locate #DATE
        directly. Set constant variables outside the report; they'll be
        evaluated only once instead of for every row.
 
 
     Abbreviate commands to four characters
     ======================================
     R:BASE parses command lines into 4-byte tokens - the fewer tokens to
     read, the faster the command executes.
 
 
     Combine commands whenever possible
     ==================================
     Many R:BASE commands allow access to more than one column or variable
     and don't require using separate commands for each. This reduces the
     number of times R:BASE must parse and read a command. The fewer commands
     to read, the faster R:BASE executes.
 
     <> The most frequently used commands that fit into this category are SET
        VAR, UPDATE, SELECT, COMPUTE, and WRITE.
 
     For example, in 3.1 you can assign data types variables and set the
     variables to values in one command, like this:
 
     SET VAR v1 TEXT = 'abcd', v2 INTEGER=100, v3 DATE = .#date
 
     The COMPUTE command was modified in R:BASE 3.1A to allow multiple
     computations in one command, as was the WRITE command. See the Upgrade
     Express booklet for syntax.
 
 
     Make cursors faster
     ===================
     Sometimes a task can be accomplished only by stepping through each row
     in a table. If you need to use a DECLARE CURSOR routine, make it as fast
     as possible by using the following tips:
 
     <> Include all columns and computations in the DECLARE CURSOR statement
        so you can FETCH the values once instead of doing SET VAR commands
        repeatedly in the WHILE loop. Take advantage of SQL SELECT when doing
        your DECLARE CURSOR. This can reduce the number of times the commands
        in the WHILE loop must be repeated, thus improving performance.
 
     The following example took about  three minutes on an 8,000-row trans-
     action table:
 
     DECLARE C1 CURSOR FOR SELECT invid,sum(tprice) FROM transx +
       GROUP BY invid
     OPEN c1
     FETCH c1 INTO vinvid ind1,vprice ind2
     WHILE sqlcode < > 100 THEN
       UPDATE invoice SET invtotal=.vprice WHERE invid=.vinvid
       FETCH c1 INTO vinvid ind1,vprice ind2
     ENDW
 
     This example, by contrast, took about 12 minutes on an 8,000-row trans-
     action table:
 
     SET VAR vtotprice CURR=0
     SET VAR voldinv=invid IN transx WHERE LIMIT=1
     DECLARE C1 CURSOR FOR SELECT invid,tprice FROM transx ORDER BY invid
     OPEN c1
     FETCH c1 INTO vinvid ind1,vprice ind2
     WHILE sqlcode < > 100 THEN
       SET VAR vtotprice=(.vtotprice+.vprice)
       IF vinvid < > voldinv THEN
         UPDATE invoice SET invtotal=.vtotprice WHERE invid=.voldinv
         SET VAR vtotprice=0
       ENDIF
       SET VAR voldinv=.vinvid
       FETCH c1 INTO vinvid ind1,vprice ind2
     ENDW
 
     Fetch the index columns into variables and then use WHERE colname=
     .varname instead of WHERE CURRENT OF. Using an indexed where clause is
     slightly faster.
 
     <> When nesting cursors, do all the DECLAREs at the top of the file.
        Nest the OPEN and FETCH commands, not the DECLARE CURSOR. A good
        example of nesting cursors is presented in Supercharging R:BASE.
 
     <> When using DECLARE CURSOR routines, SET CLEAR to OFF. Doing so
        typically improves performance by at least half.
 
 
     Use correlated sub-selects or multi-table selects instead of sub-selects
     ========================================================================
     A multi-table select uses indexes to join the tables. A correlated sub-
     select uses indexes to restrict the list of values to be compared within
     the sub-select. A plain sub-select will not use indexes. Each row of data
     from the main query must be compared to every row selected by the sub-
     select.
 
     To illustrate, compare the results obtained by using the different
     techniques on two tables, each with 1,000 unique rows (each row in table1
     has only one match in table2).
 
     (1) A multi-table select, shown below, took 00:05.
 
         SELECT collist FROM table1,table2 WHERE table1.linkcol=table2.linkcol
 
     (2) A correlated sub-select, which looked at every row in table1 but
         needed to look at only one row in table2 for each row in table1, took
         00:10. The example is shown below:
 
         SELECT collist FROM table1 WHERE linkcol IN +
          (SELECT linkcol FROM table2 WHERE table2.linkcol=table1.linkcol)
 
     (3) A plain sub-select, which had to look for a match in every row in
         table2 for each of the 1,000 rows of table1, took 3:00 (three minutes)
 
         to complete. This example, by far the slowest, is shown below:
 
         SELECT collist FROM table1 WHERE linkcol IN +
          (SELECT linkcol FROM table2)
 
     SELECT is a command in which your data really affects the performance.
     The results you see and the method you find best will depend on the
     number of rows and distribution of data in the tables you are combining.
 
 
     Try using manual optimization
     =============================
     The multi-table selects discussed above were done with MANOPT set ON to
     show time differences based on table order. With MANOPT set ON, R:BASE
     joins tables in the order they are listed in the FROM clause of the
     SELECT command. The R:BASE optimizer generally picks the most efficient
     way to join the tables. You have the option in R:BASE 3.1C of manually
     specifying the order by using the MANOPT setting. Doing so might improve
     the performance of your multi-table selects and views. See your Upgrade
     Express booklet for complete information on using MANOPT.
 
 
     Practice smart indexing
     =======================
     In a WHERE clause, R:BASE uses only one index (except when joining
     tables). If more than one indexed column is referenced in a WHERE
     clause, R:BASE looks for the one that places the greatest restriction
     on the number of rows returned. It first looks for an indexed column
     using =, then checks for IS NULL, and finally looks at BETWEEN. It uses
     the first index it finds with the most restrictive operator.
 
     <> The following example uses the index on indexcol2 because it uses =
        and R:BASE assumes that that will restrict the query to fewer rows
        than the BETWEEN:
 
        WHERE indexcol1 BETWEEN .var1 AND + .var2 AND indexcol2 = .var3
 
     The next example uses the index on indexcol1 because both conditions use
     = and indexcol1 is first in the WHERE clause:
 
     WHERE indexcol1 =.var1 AND indexcol2 = .var2
 
     The third example won't use indexes because the conditions are joined
     with OR (R:BASE won't use indexes if conditions are joined with OR or if
     the indexed column is being compared to an expression):
 
     WHERE indexcol1 =.var1 OR indexcol2 = .var2
 
     By knowing which indexed column R:BASE will use in a WHERE clause, you
     can structure your conditions so that R:BASE uses the most unique (and
     thus most helpful) index. This is where knowing your data helps you to
     modify commands and gain speed. Check your WHERE clauses to make sure the
     most unique index is the one being used. See INDEXES entry in the Command
     Dictionary section of the Reference Manual for more information on how
     R:BASE chooses indexed columns.
 
     <> Version 3.1A added the IHASH function to R:BASE. You can use IHASH
        instead of text indexes for faster access to data. See your Upgrade
        Express booklet for complete information on using IHASH.
 
     <> If you are loading many rows of data at one time, it is considerably
        faster to drop the indexes, load the data, and then rebuild the
        indexes. It is much faster for R:BASE to build the indexes in one
        chunk rather than having them updated with each row as it is loaded.
        And 3.1C has improved the speed of the CREATE INDEX command by about
        half over previous versions. The four-minute process shown in the
        delete row example would have taken 18 minutes had the indexes not
        been dropped.
 
 
     Use UDFs
     ========
     R:BASE 3.1C includes the UDF function, which allows you to call C
     routines from within your R:BASE appli- cation, form, report, or table.
     A C routine can be faster than the same function written in the R:BASE
     program- ming language. See the article "UDFs Now Available" in this
     issue of the R:BASE Exchange for a listing of ready-made UDFs available
     for purchase from Microrim.
 
 
     Review your code
     ================
     Did you write that piece of code three months ago, six months ago, or
     perhaps longer ago still? As you learn more about R:BASE, about your
     data, and about how people are using the database and your application,
     you'll easily recognize areas in your code you can modify for performance
     improvements. You'll want to incorporate new techniques and new features
     from the current version of R:BASE.
 
     <> For example, the January/February 1991 R:BASE Exchange included an
        article showing how to create a Yes/No dialog box. It's about 70 lines
        of code. That same option was added to the DIALOG command in R:BASE
        3.1B. You could replace 70 lines of code with a single line and thus
        speed up that section of the application<197>but you have to review
        your code to find such opportunities.
 
     <> Also, have someone else review your code. Others can see areas for
        improvement and suggest other techniques that you miss because you
        are so familiar with your work. It is true that two heads are better
        than one.
 
 
     Other Sources
     =============
     These are just some of the techniques you can use to optimize your code to
 
     achieve the maximum speed potential of R:BASE 3.1.  Some of these
     techniques will provide dramatic improvements in speed; others will
     provide smaller and less noticeable speed improvements.  Taken together,
     these techniques will significantly increase the overall speed of your
     application.  Remember that the times given here are relative - the
     actual times you see will vary on the basis of your database (structure
     and data) and your hardware.
 
     You can find other information about optimization techniques in SUPER-
     CHARGING R:BASE (available from the Microrim Sales Department, 1-425-649-9500) 
     and in various issues of the R:BASE EXCHANGE.