Tag Archives: data visualization

My First (R) Shiny App: An Annotated Tutorial

Image Credit: Doug Buckley of http://hyperactive.to

Image Credit: Doug Buckley of http://hyperactive.to

I’ve been meaning to learn Shiny for 2 years now… and thanks to a fortuitous email from @ImADataGuy this morning and a burst of wild coding energy about 5 hours ago, I am happy to report that I have completely fallen in love again. The purpose of this post is to share how I got my first Shiny app up and running tonight on localhost, how I deployed it to the http://shinyapps.io service, and how you can create a “Hello World” style program of your own that actually works on data that’s meaningful to you.

If you want to create a “Hello World!” app with Shiny (and your own data!) just follow these steps:

0. Install R 3.2.0+ first! This will save you time.
1. I signed up for an account at http://shinyapps.io.
2. Then I clicked the link in the email they sent me.
3. That allowed me to set up my https://radziwill.shinyapps.io location.
4. Then I followed the instructions at https://www.shinyapps.io/admin/#/dashboard
(This page has SPECIAL SECRET INFO CUSTOMIZED JUST FOR YOU ON IT!!) I had lots 
of problems with devtools::install_github('rstudio/shinyapps') - Had to go 
into my R directory, manually delete RCurl and digest, then 
reinstall both RCurl and digest... then installing shinyapps worked.
Note: this last command they tell you to do WILL NOT WORK because you do not have an app yet! 
If you try it, this is what you'll see:
> shinyapps::deployApp('path/to/your/app')
Error in shinyapps::deployApp("path/to/your/app") : 
C:\Users\Nicole\Documents\path\to\your\app does not exist
5. Then I went to http://shiny.rstudio.com/articles/shinyapps.html and installed rsconnect.
6. I clicked on my name and gravatar in the upper right hand corner of the 
https://www.shinyapps.io/admin/#/dashboard window I had opened, and then clicked 
"tokens". I realized I'd already done this part, so I skipped down to read 
"A Demo App" on http://shiny.rstudio.com/articles/shinyapps.html
7. Then, I re-installed ggplot2 and shiny using this command:
install.packages(c('ggplot2', 'shiny'))
8. I created a new directory (C:/Users/Nicole/Documents/shinyapps) and used
setwd to get to it.
9. I pasted the code at http://shiny.rstudio.com/articles/shinyapps.html to create two files, 
server.R and ui.R, which I put into my new shinyapps directory 
under a subdirectory called demo. The subdirectory name IS your app name.
10. I typed runApp("demo") into my R console, and voila! The GUI appeared in 
my browser window on my localhost.
-- Don't just try to close the browser window to get the Shiny app 
to stop. R will hang. To get out of this, I had to use Task Manager and kill R.
--- Use the main menu, and do Misc -> Stop Current Computation
11. I did the same with the "Hello Shiny" code at http://shiny.rstudio.com/articles/shinyapps.html. 
But what I REALLY want is to deploy a hello world app with MY OWN data. You know, something that's 
meaningful to me. You probably want to do a test app with data that is meaningful to you... here's 
how you can do that.
12. A quick search shows that I need jennybc's (Github) googlesheets package to get 
data from Google Drive viewable in my new Shiny app.
13. So I tried to get the googlesheets package with this command:
devtools::install_github('jennybc/googlesheets')
but then found out it requires R version 3.2.0. I you already have 3.2.0 you can skip 
to step 16 now.
14. So I reinstalled R using the installr package (highly advised if you want to 
overcome the agony of upgrading on windows). 
See http://www.r-statistics.com/2013/03/updating-r-from-r-on-windows-using-the-installr-package/
for info -- all it requires is that you type installR() -- really!
15. After installing R I restarted my machine. This is probably the first time in a month that 
I've shut all my browser windows, documents, spreadsheets, PDFs, and R sessions. I got the feeling 
that this made my computer happy.
16. Then, I created a Google Sheet with my data. While viewing that document, I went to 
File -> "Publish to the Web". I also discovered that my DOCUMENT KEY is that 
looooong string in the middle of the address, so I copied it for later:
1Bs0OH6F-Pdw5BG8yVo2t_VS9Wq1F7vb_VovOmnDSNf4
17. Then I created a new directory in C:/Users/Nicole/Documents/shinyapps to test out 
jennybc's googlesheets package, and called it jennybc
18. I copied and pasted the code in her server.R file and ui.R file
from https://github.com/jennybc/googlesheets/tree/master/inst/shiny-examples/01_read-public-sheet 
into files with the same names in my jennybc directory
19. I went into my R console, used getwd() to make sure I was in the
C:/Users/Nicole/Documents/shinyapps directory, and then typed
 runApp("jennybc")
20. A browser window popped up on localhost with her test Shiny app! I played with it, and then 
closed that browser tab.
21. When I went back into the R console, it was still hanging, so I went to the menu bar 
to Misc -> Stop Current Computation. This brought my R prompt back.
22. Now it was time to write my own app. I went to http://shiny.rstudio.com/gallery/ and
found a layout I liked (http://shiny.rstudio.com/gallery/tabsets.html), then copied the 
server.R and ui.R code into C:/Users/Nicole/Documents/shinyapps/my-hello -- 
and finally, tweaked the code and engaged in about 100 iterations of: 1) edit the two files, 
2) type runApp("my-hello") in the R console, 3) test my Shiny app in the 
browser window, 4) kill browser window, 5) do Misc -> Stop Current Computation 
in R. ALL of the computation happens in server.R, and all the display happens in ui.R:

server.R:

library(shiny)
library(googlesheets)
library(DT)

my_key <- "1Bs0OH6F-Pdw5BG8yVo2t_VS9Wq1F7vb_VovOmnDSNf4"
my_ss <- gs_key(my_key)
my_data <- gs_read(my_ss)

shinyServer(function(input, output, session) {
 output$plot <- renderPlot({
 my_data$type <- ordered(my_data$type,levels=c("PRE","POST"))
 boxplot(my_data$score~my_data$type,ylim=c(0,100),boxwex=0.6)
 })
 output$summary <- renderPrint({
 aggregate(score~type,data=my_data, summary)
 })
 output$the_data <- renderDataTable({
 datatable(my_data)
 })

})

ui.R:

library(shiny)
library(shinythemes)
library(googlesheets)

shinyUI(fluidPage(
 
 # Application title
 titlePanel("Nicole's First Shiny App"),
 
 # Sidebar with controls to select the random distribution type
 # and number of observations to generate. Note the use of the
 # br() element to introduce extra vertical spacing
 sidebarLayout(
 sidebarPanel(
     helpText("This is my first Shiny app!! It grabs some of my data 
from a Google Spreadsheet, and displays it here. I      
also used lots of examples from"),
     h6(a("http://shiny.rstudio.com/gallery/", 
href="http://shiny.rstudio.com/gallery/", target="_blank")),
     br(),
     h6(a("Click Here for a Tutorial on How It Was Made", 
href="http://qualityandinnovation.com/2015/12/08/my-first-shin     
y-app-an-annotated-tutorial/",
      target="_blank")),
      br()
 ),
 
 # Show a tabset that includes a plot, summary, and table view
 # of the generated distribution
 mainPanel(
    tabsetPanel(type = "tabs", 
    tabPanel("Plot", plotOutput("plot")), 
    tabPanel("Summary", verbatimTextOutput("summary")), 
    tabPanel("Table", DT::dataTableOutput("the_data"))
 )
 )
 )
))


23. Once I decided my app was good enough for my practice round, it was time to 
deploy it to the cloud.
24. This part of the process requires the shinyapps and dplyr 
packages, so be sure to install them:

devtools::install_github('hadley/dplyr')
library(dplyr)
devtools::install_github('rstudio/shinyapps')
library(shinyapps)
25. To deploy, all I did was this: setwd("C:/Users/Nicole/Documents/shinyapps/my-hello/")
deployApp()

CHECK OUT MY SHINY APP!!

Contingency Tables with gmodels in R

crosstable-outputContingency tables provide a way to display the frequencies and relative frequencies of observations, which are classified according to two categorical variables. The elements of one category are displayed across the columns; the elements of the other category are displayed over the rows. 

For many semesters now, I’ve asked my students to prepare contingency tables that include row percentages and column percentages. Oh, and also the marginal distributions… you know, the totals on the right margin and the bottom margin. There’s an easy way to do this using Minitab, but I’m not a fan of proprietary software… I prefer open source whenever possible, and I wasn’t aware of a way to do this in R. As a result, I let them build their contingency tables by hand, and type them up in Microsoft Word. (Yeah, not efficient at all.)

Then I found gmodels. After installing gmodels and using:

library(gmodels)

to bring the package into active memory, I was able to create a contingency table SO easily that I can’t bear to think about all the hours I spent doing this sort of thing manually. First, I loaded some data describing the colors and defects associated with over 1200 M&M candies that my students observed. This data set has four variables: student (who collected the data), id (the number of the M&M that the student observed, in order of when they encountered that particular M&M), color (whether the candy was Blue, Red, BRown, Green, Orange, or Yellow), and whether there were defects observed (Letter incomplete or missing, Chipped or Cracked, Multiple defects, or No defects):

> mnms <- read.csv("mnm-clean.csv",header=T)
> head(mnms)
 student id color defect
1 wilburld 1 B L
2 wilburld 2 B N
3 wilburld 3 B N
4 wilburld 4 B N
5 wilburld 5 B N
6 wilburld 6 B C

Then, I constructed a really fancy contingency table IN JUST ONE LINE!!! This was very exciting.

> CrossTable(mnms$color, mnms$defect, prop.t=TRUE, prop.r=TRUE, prop.c=TRUE)

You can control whether row percentages (prop.r), column percentages (prop.c), or table percentages (prop.t) show up by making them TRUE in your call to CrossTable. Here’s what it looked like:

crosstable-output

You can also do a full Chi-square test of independence WHILE you’re displaying your contingency table… all you need to do is specify the chisq=TRUE argument to CrossTable. Here’s what I got for that:

Statistics for All Table Factors
Pearson's Chi-squared test 
------------------------------------------------------------
Chi^2 = 14.47214 d.f. = 15 p = 0.4900641

So, with a p-value that high, the color of M&M and whether it has a defect are INDEPENDENT. That makes sense. If they were not independent, then maybe there’s a problem with the production process.

I also found this fantastic paper that describes how one researcher is exploring alternative (and hopefully better!) ways to visualize categorical data. In addition to being an interesting read, it demonstrates alternatives like the mosaic.

Postscript: I just put a copy of the M&M data on a GitHub repository. I think I’ve started a new habit… this is fantastic. I can actually pull my data directly into R from GitHub. It’s like magic!! Here is the incantation:

library(RCurl)
url <- "https://raw.githubusercontent.com/NicoleRadziwill/Data-"
url <- paste(url,"for-R-Examples/master/mnm-clean.csv",sep="")
x <- getURL(url,ssl.verifypeer=FALSE)
mnms <- read.csv(text = x)