DOCUMENT #690
     =======================================================================
     SPORTS STATISTICS
     =======================================================================
     PRODUCT:  R:BASE                   VERSION :  3.1B or Higher
     =======================================================================
     CATALOG:  Programming In R:BASE    AREA    :  Statistical Analyst
     =======================================================================
 
 
     R:BASE quickly and easily stores data about team or individual sports 
     and computes statistics. Use forms to enter the data and then views and 
     reports to display statistical information. Following are examples for 
     both team (baseball) and individual (bicycling) sports.
 
     
     Baseball
     ========
     A common team statistic is won/lost percentages. With team sports, you 
     also want to compare players. Some of the individual player statistics 
     are batting average and on-base percentage. All of these can be quickly 
     calculated using views.
 
     Formulas to compute these statistics are as follows:
 
     won/lost percentage     =       (total games won)
                                      ---------------  
                                     (total games played)
 
     batting average =               (total hits)
                                      ----------
                                     (total at bats)
 
     on-base percentage      =       (total hits + total bases on balls)
                                      ---------------------------------
                                     (total at bats + total bases on balls)
 
     The word total in the above formulas indicates where to do a sum. The 
     rest is simple addition and division.
 
     Create two tables, one to hold team data, one to hold the individual 
     player data. You'll have other tables to hold more information about 
     the players and teams. For example, league, division, player's nickname, 
     age, right or left handed, position etc. The team_stats table is loaded 
     with one row for each team per game. The Won and Lost columns are auto-
     matically computed from the score. The player_stats table is loaded 
     with one row of information per player per game.
 
        Table: team_stats
 
        Column definitions
        # Name               Type            Index   Expression
        1 Team_name          TEXT   20
        2 Team_score         INTEGER
        3 Opponent           TEXT   20
        4 Opp_score          INTEGER
        5 Won                INTEGER        (IFGT(team_score,opp_score,1,0))
        6 Lost               INTEGER        (IFLT(team_score,opp_score,1,0))
        8 Game_date          DATE
        
    Sample data from team_stats
 
    Team_name        Team_score Opponent         Opp_score Won Lost Game_Date
    ---------------- ---------- ---------------- --------- --- ---- ---------
    Oakland A's      12         NY Yankees       7         1   0    04/12/1993
    NY Yankees       7          Oakland A's      12        0   1    04/12/1993
    Oakland A's      2          Seattle Mariners 3         0   1    04/20/1993
    Seattle Mariners 3          Oakland A's      2         1   0    04/20/1993
    NY Yankees       2          Seattle Mariners 9         0   1    04/21/1993
    Seattle Mariners 9          NY Yankees       2         1   0    04/21/1993
    Texas Rangers    0          Seattle Mariners 1         0   1    05/01/1993
    Seattle Mariners 1          Texas Rangers    0         1   0    05/01/1993
    NY Yankees       2          Texas Rangers    3         0   1    04/15/1993
    Texas Rangers    3          NY Yankees       2         1   0    04/15/1993
    Texas Rangers    2          Oakland A's      0         1   0    04/17/1993
    Oakland A's      0          Texas Rangers    2         0   1    04/17/1993
  
 
       Table: player_stats
 
       Column definitions
       # Name                Type            Index   Expression
       1 Player_Name         TEXT    20
       2 AtBats             INTEGER
       3 Hits                INTEGER
       4 HomeRuns           INTEGER
       5 Triples             INTEGER
       6 Doubles             INTEGER
       7 Singles             INTEGER
       8 BaseOnBalls       INTEGER
       9 Gamedate           DATE
 
     Sample data from player_stats
 
Player_Name     AtBats Hits HomeRuns Triples Doubles Singles BaseOnBalls Gamedt
 
--------------  ------ ---- -------- ------- ------- ------- ----------- ------
 
Ken Griffey Jr. 5      3    0        1       1       1       1         04/15/93
 
Mickey Mantle   3      1    0        0       0       1       0         04/15/93
 
Bobby Bonds     5      4    2        0       0       2       1         04/15/93
 
Willie Mays     3      3    0        1       2       0       2         04/17/93
 
Henry Aaron     5      4    2        1       1       0       1         04/17/93
 
Joe DiMaggio    4      3    0        0       0       3       2         04/17/93
 
Ken Griffey Jr. 4      3    2        1       0       0       2         04/21/93
 
Mickey Mantle   2      0    0        0       0       0       1         04/21/93
 
Bobby Bonds     3      1    0        0       0       1       0         04/21/93
 
Willie Mays     3      1    0        0       0       1       1         05/01/93
 
Henry Aaron     4      2    0        1       0       1       0         05/01/93
 
Joe DiMaggio    3      3    0        1       0       2       1         05/01/93
 
 
 
     The statistics are calculated using functions available with the SELECT 
     command and stored as views. The FORMAT function is used to format 
     output for display, but can't have computations as arguments. So two 
     views are used; one to do the computations, one to format the output. 
 
     Use the following two views to look at the team statistics. After 
     creating the views, SELECT * FROM teams_2 displays the team standings.
 
     CREATE VIEW teams_1 +
     (team,won,lost,pct) AS SELECT +
     team_name,SUM(won),SUM(lost), +
     (SUM(won)/(COUNT(*))) FROM team_stats +
     GROUP BY team_name ORDER BY 4 DESC
 
     CREATE VIEW teams_2 +
     (team,won,lost,pct) AS SELECT +
     team,won,lost,(FORMAT(pct,'.900')) +
     FROM teams_1
 
      team                   won      lost   pct
      ----------------       ---      ----   ----- 
      Seattle Mariners       3        0      1.000
      Texas Rangers          2        1      0.666
      Oakland A's            1        2      0.333
      NY Yankees             0        3      0.000
     
     Use the next two views to look at the player statistics. After creating 
     the views, SELECT or BROWSE from player_2 to display data about all the 
     players.
 
     CREATE VIEW player_1 +
     (PlayerName, BatAvg, OnBase, AtBats, +
     HR, Triples,Doubles, Singles, Walks) +
     AS SELECT +
     player_name,(SUM(hits)/SUM(at_bats)), +
     ((SUM(hits)+SUM(base_on_balls))/ +
     (SUM(at_bats)+SUM(base_on_balls))), +
     SUM (at_bats),SUM (home_runs), +
     SUM (triples),SUM (doubles), +
     SUM(singles),SUM (base_on_balls) +
     FROM player_stats GROUP BY player_name
 
     
     CREATE VIEW player_2 +
     (PlayerName, BatAvg, OnBase, AtBats, +
     HR, Triples,Doubles, Singles, Walks) +
     AS SELECT +
     PlayerName,(FORMAT(BatAvg,'.900')),+
     (FORMAT(OnBase,'.900')), AtBats,HR, +
     Triples,Doubles,Singles,Walks +
     FROM player_1 ORDER BY 2 DESC
 
     PlayerName      BatAvg OnBase AtBats HR Triples Doubles Singles Walks 
     --------------- ------ ------ ------ -- ------- ------- ------- -----
     Willie Mays     .727   .800   11     1  2       3       2       4
     Joe DiMaggio    .692   .764   13     0  1       2       6       4
     Bobby Bonds     .666   .700   18     3  0       2       7       2
     Ken Griffey Jr. .666   .736   15     2  3       3       2       4
     Henry Aaron     .666   .687   15     3  3       2       2       1
     Mickey Mantle   .363   .500   11     0  0       2       2       3
 
 
     The view Player_2 automatically sorts by batting average. The player 
     with the highest batting average is listed first. To see the players 
     ordered by on base percentage use the following command:
 
     SELECT PlayerName,BatAvg,OnBase FROM player_2 ORDER BY OnBase DESC
 
      PlayerName       BatAvg  OnBase
      ---------------  ------  ------
      Willie Mays      .727    .800
      Joe DiMaggio     .692    .764
      Ken Griffey Jr.  .666    .736
      Bobby Bonds      .666    .700
      Henry Aaron      .666    .687
      Mickey Mantle    .363    .500
 
 
     Bicycling
     ---------
     Statistics for individual sports are different from team sports. 
     Instead of comparing player to player, the comparison is yourself 
     against the clock, and how well did you do this week compared to last 
     week. By tracking the distance traveled and the time it took, R:BASE 
     reports can compute your speed. The reports are printed for a specified 
     time period to track progress and performance. 
 
     Reports are used instead of views because a sequence of expressions are 
     needed to do the computations. The time must be converted to seconds 
     and then hours in order to calculate the speed (miles per hour). This 
     is easier to do in a report than in a SELECT command in a view. A report 
     also lets you do multiple levels of totals.
 
     Create a table to hold information about each outing. Record the 
     distance traveled, the time it took and the date. Add columns to hold 
     other relevant information as desired, for example, route, weather etc.
 
        Table: CYCLE_LOG
 
        Column definitions
        # Name               Type            Index   Expression
        1 Ride_Date          DATE            *
        2 Mileage            INTEGER
        2 Ride_Time          TIME
        4 Route              TEXT    20
        5 Ride_Notes         NOTE
        
     Create a report with the following layout and expressions:
 
     RH
     PH
     PH      DATE            ROUTE          MILEAGE      TIME       SPEED
     PH      ------- ---------------------  ---------    ------     -------
     D       ride_date       route          mileage      ride_time  vspeed MpH
     PF
     RF      TOTALS: vmileage                vavg_speed Mph
 
       1: INTEGER    vsec = (RIDE_TIME - 00:00:00)
       2: DOUBLE     vhours = (.vsec / 3600)
       3: DOUBLE     vspeed = (MILEAGE / .vhours)
       4: INTEGER    vmileage = SUM OF MILEAGE
       5: INTEGER    vtotal_sec = (.vtotal_sec + .vsec)
       6: DOUBLE     vtotal_hours = (.vtotal_sec / 3600)
       7: DOUBLE     vavg_speed = (.vmileage / .vtotal_hours)
     
     The variable vsec converts the time to seconds, then the seconds are 
     converted to hours (vhours) to compute the speed in miles per hour 
     (vspeed). Variables vmileage, vtotal_sec, vtotal_hours are used to 
     compute the average speed, vavg_speed, for all rides.
 
     Output from the report looks like this:
 
      DATE           ROUTE               MILEAGE TIME      SPEED
      --------   ----------------------  ------- -------   --------  
      01/11/92   Vashon Island           25      1:22:00   18.2 MpH
      03/01/92   Bainbridge Island       26      1:31:00   17.1 MpH
      03/29/92   Vashon Island           25      1:17:22   19.3 MpH
      06/06/92   Whidbey Island          126     6:55:00   18.2 MpH
      06/20/92   Vashon Island           25      1:14:30   20.1 MpH
      06/27/92   STP                     195     11:45:00  16.5 MpH
      07/03/92   Southworth              60      3:06:00   9.3 MpH
      07/15/92   Vashon Island           25      1:13:42   20.3 MpH
      07/23/92   Vashon Island           25      1:15:26   19.8 MpH
      07/30/92   RAMROD                  154     9:23:00   16.4 MpH
      08/02/92   Vashon Island           25      1:16:50   19.5 MpH
      08/25/92   Vashon Island           25      1:08:47   21.8 MpH
      09/13/92   Vashon Island           25      1:15:23   19.8 MpH
      10/03/92   West Seattle/I-90       30      3:00:00   10.0 MpH
      10/04/92   West Seattle/I-90       15      1:30:00   10.0 MpH
     
     TOTALS: 806               17.0 Mph
     
     To do a summary report by month, add an expression to find the month, 
     vmonth = (TMON(ride_date)), and break on the variable. Change the Detail 
     line to a Break Footer and reset the accumulator variables, vmileage and 
     vtotal_sec, at the break. Add a second group of expressions to do report 
     totals. The report generates summary data to track monthly performance.
 
     Expressions for summary report:
 
       1: INTEGER    vsec = (RIDE_TIME - 00:00:00)
       2: INTEGER    vtotal_sec = (.vtotal_sec + .vsec)
       3: DOUBLE     vhours = (.vtotal_sec / 3600)
       4: INTEGER    vmileage = SUM OF MILEAGE
       5: DOUBLE     vspeed = (.vmileage / .vhours)
       6: TEXT               vmonth = (TMON(ride_date))
       7: INTEGER    vtotal_miles = SUM OF MILEAGE 
       8: INTEGER    vrtotal_sec = (.vrtotal_sec + .vsec)
       9: DOUBLE     vtotal_hours = (.vrtotal_sec / 3600)
      10: DOUBLE     vavg_speed = (.vtotal_miles/.vtotal_hours)
 
     When printing the monthly summary report, order by the date so it will 
     break correctly on month.
  
     R>PRINT mthcycle ORDER BY Ride_Date
 
      MONTH           MILEAGE         SPEED
      ---------       ---------       -------
      January         25              18.2 MpH
      March           51              18.1 MpH
      June            346             17.3 MpH
      July            264             17.6 MpH
      August          50              20.6 MpH
      September       25           19.8 MpH
      October         45              10.0 MpH
 
     Totals:         806             17.0 Mph
 
     This structure can easily be adapted to other individual sports such as 
     running, swimming, kyaking etc.