CPP 526: Foundations of Data Science I


This lab introduces the R Shiny package, which allows you to create dynamic graphics that ask for input from the user and change based upon the selection.

This lab builds on the previous lab, but you will now add the drop-down menu similar to the original NYT version, and highlight season statistics from that team (the yellow line).



Dynamic Graphics

Shiny allows you to build interactive components in graphics or dashboards. The terms "dynamic" or "interactive" mean specifically that a set of widgets are used to gather user input, and that input changes the analysis or visualization accordingly.

You can try it out using the Shiny Demo file or the example from the Shiny tutorial:

When you include "runtime: shiny" in the header, you do not need to load the shiny package in your document setup steps.

Note that running shiny creates an object called input which stores the values from your input widgets. Whatever you assign for the input widget ID will be the address for the value, which you can then access at input$your_id.

selectInput( 
             inputId="my_widget_id",
             label="My Shiny Widget",
             choices=c("A","B","C"),
             selected="B"   
           )

Shiny automatically creates an object called "input" for you. When you create a widget, shiny will perform the following assignment implicitly in the background:

input$my_widget_id <- "B"  # user selected value

Before the user selects a value from the dropdown menu, the "selected=" argument specifies the default value. In this case, input$my_widget_id would store the value "B" by default. When the user makes a new selection, that value is updated with whatever option the user selected.

In the rest of your code, you can then reference the user input value at:

input$my_widget_id


Adding a Team Trend to the Graph

The solutions for last week's lab include an example of how you can add a specific team to the graph. It generalizes the team by using a generic team.name variable that stores the value.

Note that in the subsequent code instead of using the specific team name for data steps and graphing functions it references the placeholder variable team.name. This enables you to change all of the team name references in one place, simplifying your code.

## TEAM NAME HARD-CODED: APPEARS MULTIPLE TIMES IN CODE

# adding season stats for the team:
dat.one.team <- filter( Teams, name == "Chicago Cubs" )
points( x=dat.one.team$yearID, y=dat.one.team$ave.so, 
        pch=20,  cex=1.2, col="goldenrod1", type="b" )

# adding the team name to the legend:
segments( x0=1900, y0=8.3, x1=1904, y1=8.32, col="goldenrod1", lwd = 1.5 )
points( x=c(1900,1904), y=c(8.3,8.3), col="goldenrod1", pch=20, cex=1.2 )
text ( x=1912, y=8.3, col="goldenrod1", labels="Chicago Cubs", cex=0.8 )


## GENERALIZED TEAM NAME:

team.name <- "Chicago Cubs"   # only place it appears

# adding season stats for the team:
dat.one.team <- filter( Teams, name == team.name )
points( x=dat.one.team$yearID, y=dat.one.team$ave.so, 
        pch=20,  cex=1.2, col="goldenrod1", type="b" )

# adding the team name to the legend:
segments( x0=1900, y0=8.3, x1=1904, y1=8.32, col="goldenrod1", lwd = 1.5 )
points( x=c(1900,1904), y=c(8.3,8.3), col="goldenrod1", pch=20, cex=1.2 )
text ( x=1912, y=8.3, col="goldenrod1", labels=team.name, cex=0.8 )

After you build your widget you can make the team selection dynamic by linking the user selection to your generic team.name variable.

team.name <- input$my_widget_id  # input widget value



Lab Instructions

This lab requires the following packages:

install.packages( "shiny" )
install.packages( "flexdashboard" )

Note that you need to install packages before starting your lab. You cannot include an install commands inside of your RMD file.

You will likely use these packages as well:

library( dplyr )
library( pander )
library( Lahman )
data( Teams )

To get started, download the Lab-04 RMD template:



Step (1)

In the provided template, place your code from last week into the code chunk with the renderPlot() function:

renderPlot({
  
# your code here
  
})

Step (2)

Create a user-input widget on the left side bar by adding the proper values to these arguments:

selectInput( inputId=,   # creates slot in input to hold values: input$your_inputId
             label=,     # title of the input widget
             choices=,   # options for the user
             selected=   # set the default option
           )

Note that "choices=" will determine the teams visible in the drop-down box. You will be selecting from baseball team names.

  • Use the variable "name" in the dataset instead of teamID since it is more user-friendly.
  • Find a reasonable way to select specific teams so that you are only using 10-25 team names in the drop-down menu (see below for hints).
  • Combine your preferred teams into a character vector: choices=c("team1","team2",...,"teamN").

Step (3)

Similar to the NYT graphic, add the team trend line to the graphic in orange.

Note that you would add a single team to the graphic something like this:

Teams$ave.so <- Teams$SO / Teams$G  # to make subsetable, add back to data frame
one.team <- filter( Teams, name == "Cincinnati Reds" )
points( x=one.team$yearID, y=one.team$ave.so, type="b", pch=19, col="darkorange4" )

This code will also go inside the renderPlot() function in Step 1. You first need to replace the specific team name with the generalized input value from your selectInput() widget.



Notes on identifying team unique names.

Note there are 139 unique team names in the dataset:

unique( Teams$name ) %>% head()
## [1] "Boston Red Stockings"    "Chicago White Stockings"
## [3] "Cleveland Forest Citys"  "Fort Wayne Kekiongas"   
## [5] "New York Mutuals"        "Philadelphia Athletics"
unique( Teams$name ) %>% length()
## [1] 139

You do NOT want to use "choices=Teams$name" for the argument because each team name is repeated multiple times. This drop-down box would then have several thousand choices.

Instead select 10-25 teams, and include their full name in the choices list only once.

You might favor teams that have longer trend lines. The Cincinnati Reds and Pittsburgh Pirates have been in the dataset from the beginning, and the rest of the teams have been added along the way.

This table reports the total number of seasons they have each appeared:

Teams %>%
  count( name ) %>%
  arrange( - n ) %>% 
  head( 10 ) %>%
  pander()
name n
Cincinnati Reds 130
Pittsburgh Pirates 130
Philadelphia Phillies 129
St. Louis Cardinals 121
Chicago White Sox 120
Detroit Tigers 120
Chicago Cubs 118
Boston Red Sox 113
New York Yankees 108
Cleveland Indians 106

Some of these professional baseball teams you might not recognize because they only lasted for a short time:

Teams %>%
  count( name ) %>%
  arrange( - n ) %>% 
  tail( 10 ) %>%
  pander()
  name n
130 Rockford Forest Citys 1
131 Seattle Pilots 1
132 St. Louis Perfectos 1
133 St. Louis Red Stockings 1
134 St. Paul White Caps 1
135 Toledo Blue Stockings 1
136 Toledo Maumees 1
137 Washington Blue Legs 1
138 Washington Statesmen 1
139 Wilmington Quicksteps 1



How to Submit

Use the following instructions to submit your assignment, which may vary depending on your course's platform.


Knitting to HTML

When you have completed your assignment, click the "Run App" button to ensure your dashboard functions.


Special Instructions

Perform the following depending on your course's platform:

  • Canvas: Upload your .RMD file to the appropriate link
  • Blackboard or iCollege: Compress your .RMD file in a .ZIP file and upload to the appropriate link


Before You Submit

Remember to ensure the following before submitting your assignment.

  1. Name your files using this format: Lab-##-LastName.rmd and Lab-##-LastName.html
  2. Follow appropriate styling conventions, e.g. spaces after commas, etc.
  3. Above all, ensure that your conventions are consistent

See Google's R Style Guide for examples of common conventions.



Common Knitting Issues

.RMD files are knit into .HTML and other formats procedural, or line-by-line.

  • An error in code when knitting will halt the process; error messages will tell you the specific line with the error
  • Certain functions like install.packages() or setwd() are bound to cause errors in knitting
  • Altering a dataset or variable in one chunk will affect their use in all later chunks
  • If an object is "not found", make sure it was created or loaded with library() in a previous chunk