814.TXT ===================================================================== Selecting Random Data ===================================================================== PRODUCT: R:BASE VERSION: ALL ===================================================================== CATALOG: General Information AREA : Data Manipulation ===================================================================== The easiest way to select a group of random records from a table is to use the MOD function. When records are selected using the MOD function in a WHERE clause, you select a truly random set of data from throughout the data table. Both old and new records are automatically included and you can easily generate different numbers of random records. The MOD function calculates a remainder. To use the MOD function for selecting random data, you need a unique INTEGER column in the table, such as an ID number. If the ID number column is not an INTEGER data type, add an INTEGER autonumber column to the table. To select random rows of data, you select rows where the ID number is evenly divisible by another number_the MOD is 0. The smaller the number you divide by, the more records you select. For example, the following command selects rows where the ID number is evenly divisible by 3: SELECT * FROM CUSTOMER WHERE (MOD(custid,3)) = 0 The most records that can be selected using the MOD function is when two is used as the divisor. The expression (MOD(custid,2)) = 0 returns approximately half the rows of data from a sequentially numbered column. To select fewer records, increase the number. For example: SELECT * FROM CUSTOMER WHERE (MOD(custid,9)) = 0 To determine the number of records selected, use the aggregate function COUNT: SELECT COUNT(*) FROM CUSTOMER + WHERE (MOD(custid,9)) = 0 The MOD function can be used with any command that accepts a WHERE clause; you can select data, print a report, browse a view, or perform calculations on the selected set of data. You can even add additional WHERE clause conditions to the query. For example, to send a mailing to a random group of customers in a particular state use the following command: LBLPRINT mlabel4 + WHERE (MOD(custid,3)) = 0 AND custstate = 'CA' When you use the MOD function in a WHERE clause, the data is selected, but there is no record of which ID numbers were chosen. Use the same WHERE clause on an INSERT command to store the ID number of the randomly selected records in a table for future reference. An Example Let's look at how using the MOD function to select random records can be applied to a situation where promotional mailings are sent to groups of customers_every month a mailing is sent to customers who haven't ordered anything in at least 30 days. Two tables are needed to store information about the mailings: mailers_information about the mailer, and mailed_information about when and to whom the mailer was sent. The mailers table has the following columns: Column Data Description Name Type MailID INTEGER An autonumber column uniquely identifying each promotional mailer. Mail_Code TEXT 8 An internal company code used to identify the mailer. This code is printed on the address label and links with the orders table. Descr NOTE A description of the mailer. Number_Sen INTEGER The number of customers sent this t particular mailing. One row of data is entered in the mailers table for each promotional mailer. Depending on the type of mailings, additional columns can be added to the mailers table, such as columns for storing a promotion start and end date. The mailed table has the following columns: Column Data Description Name Type MailID INTEGER Identifies the mailer that was sent, links with the mailers table. Custid INTEGER Identifies the customer the mailer was sent to, links with the customer table. Mail_Date DATE The date the mailer was sent. Many rows of data are entered in the mailed table for each mailer; one row corresponds to each customer who was sent the mailer. Using this table, you can determine the effectiveness of your mailers. When a mailer is prepared, the mailers table is loaded with a row of data. The Number_Sent column is left empty until the mailer is actually sent. When a mailer is sent, labels are printed using the MOD function in the WHERE clause to identify a random group of customers, the customer ID numbers are inserted into the mailed table, and the mailers table is updated with the number sent. The same WHERE clause is used on each command. *(mailprt.cmd) -- print the labels for a random group of -- customers who haven't purchased anything in -- at least 30 days LBLPRINT mlabel + WHERE (MOD(custid,3)) = 0 AND custid IN + (SELECT custid FROM orders WHERE (#DATE-30) > + (SELECT MAX(orderdate) FROM orders T1 WHERE + T1.custid = orders.custid) ) -- insert rows into the mailed table INSERT INTO mailed + SELECT custid, .#DATE, 'L1201-3' FROM customer + WHERE (MOD(custid,3)) = 0 AND custid IN + (SELECT custid FROM orders WHERE (#DATE-30) > + (SELECT MAX(orderdate) FROM orders T1 WHERE + T1.custid = orders.custid) ) -- count the number of customers selected SELECT COUNT(*) INTO vnum i1 FROM CUSTOMER + WHERE (MOD(custid,3)) = 0 AND custid IN + (SELECT custid FROM orders WHERE (#DATE-30) > + (SELECT MAX(orderdate) FROM orders T1 WHERE + T1.custid = orders.custid) ) -- update the mailers table with the number sent UPDATE mailers SET number sent = .vnum + WHERE mail_code = 'L1201-3' To look for customers who have received more than one mailing, use the following command: BROWSE * FROM customer + WHERE custid IN + (SELECT custid FROM mailed + GROUP BY custid HAVING COUNT(*) > 1) To see if customers who received mailings then placed an order, check the data in the orders table using the following command. BROWSE * FROM orders + WHERE custid IN (SELECT custid FROM mailed)