Plotting Graphs with ggvis

Grammar of Graphics

In linguistics, grammar is the set of structural rules governing the composition of clauses, phrases, and words in any given natural language. (https://en.wikipedia.org/wiki/Grammar)

The grammar of graphics applies the same concept to charts: instead of building sentences that form paragraphs and, eventually, works of literature, we build graphs from a small set of components.

One grammar-of-graphics tool is ggvis, a data visualization package for R.

The grammar for ggvis is

graph = data + coordinate system + properties + mark

[pre]

<data>  %>% 
  ggvis(~<x property>,~<y property>, 
        fill = ~<fill property>, size=~<size property>) %>% 
  layer_<marks>()

[/pre]
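
As a rough illustration, here is one way the template above could be filled in with the built-in mtcars data set. Mapping cyl to fill and hp to size is my own assumption for the example, not something the grammar requires.

[pre]
library(ggvis)

# data = mtcars, x = wt, y = mpg, mark = points.
# Using cyl for fill and hp for size is an illustrative choice only.
p <- mtcars %>%
  ggvis(~wt, ~mpg,
        fill = ~factor(cyl), size = ~hp) %>%
  layer_points()

p  # printing the object draws the plot
[/pre]

Swapping layer_points() for another layer_ function changes the mark while the data and properties stay the same.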

This tutorial shows three common charts:

  • Bar Charts
  • Line Charts
  • Scatter Plots

Bar Charts

Bar charts are used to compare means or percentages across 8 or more different groups.

[pre]

library(ggvis)
mtcars %>% ggvis(~wt, ~mpg) %>% layer_bars()

[/pre]

[Figure: mtcars_bar]
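
To compare a mean across groups, the data can be summarised first. The sketch below assumes we want the mean mpg for each cylinder count in mtcars; the grouping variable and the name mean_mpg are my own choices for illustration.

[pre]
library(ggvis)

# Mean mpg per cylinder count, computed with base R.
mean_mpg <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# factor(cyl) makes the x axis categorical, so each group gets one bar.
mean_mpg %>%
  ggvis(~factor(cyl), ~mpg) %>%
  layer_bars()
[/pre]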

Line Charts

Line charts are used to illustrate trends over time.

[pre]

mtcars %>% ggvis(~wt, ~mpg) %>% layer_lines()

[/pre]

[Figure: mtcars_lines.png]
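
Since mtcars has no time variable, a trend over time is easier to see with a time series. This sketch assumes the built-in Nile data set (yearly river flow) as a stand-in; the data frame name nile and its column names are hypothetical.

[pre]
library(ggvis)

# Nile is a built-in yearly time series; turn it into a data frame
# with one row per year so ggvis can use it.
nile <- data.frame(year = as.numeric(time(Nile)),
                   flow = as.numeric(Nile))

nile %>%
  ggvis(~year, ~flow) %>%
  layer_lines()
[/pre]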

Scatter Plots

Scatter plots depict how individual observations are distributed across 2 to 3 dimensions, for example how they cluster around a mean. This allows quick and easy comparisons between variables. A scatter plot shows how strongly one variable is related to another.

[pre]

mtcars %>% ggvis(~wt, ~mpg) %>% layer_points()

[/pre]
[Figure: mpg_points.png]
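
One way to put a number on the relationship in the scatter plot is the correlation coefficient, and a fitted line can be drawn on top of the points. This is a sketch only; choosing a linear model here is my assumption, not part of the original example.

[pre]
library(ggvis)

# Pearson correlation between weight and fuel economy.
r <- cor(mtcars$wt, mtcars$mpg)
round(r, 2)  # about -0.87: heavier cars tend to have lower mpg

# Scatter plot with a linear model fit layered on top.
mtcars %>%
  ggvis(~wt, ~mpg) %>%
  layer_points() %>%
  layer_model_predictions(model = "lm")
[/pre]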

First, I exported the data from the Basketball-Reference site. For this example I am going to use
Jimmy Butler's statistics from 2015-2015 and plot Butler's game score for each game.
Game score is a statistic invented by John Hollinger to provide a rough measure of a player's
performance in a given game. The scale on which game score is based is
the same as points scored: a game score of 40 is amazing,
while a game score of 10 is average. (http://www.sportingcharts.com/dictionary/nba/game-score-statistic.aspx)

Install the ggvis package and load it with library() in order to use it.
[pre]
install.packages("ggvis")
library(ggvis)
[/pre]
Import the data using the read.csv function. Make sure you set the optional stringsAsFactors parameter to FALSE.

[pre]
butler <- read.csv("jimmy_butler.csv", stringsAsFactors = FALSE)
[/pre]

Explore the data. In this instance I am looking at the column that holds Jimmy Butler's game score.

[pre]
butler$GmSc 
[/pre]
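
Before plotting, it can help to check how a column was read in. The sketch below uses a small made-up data frame called demo rather than the real file; its values, including the "Did Not Play" row, are hypothetical and only show what a text column with a non-numeric entry looks like.

[pre]
# Hypothetical stand-in for the imported file (not real game data).
demo <- data.frame(G = c("1", "2", "", "3"),
                   GmSc = c("18.3", "9.7", "Did Not Play", "24.1"),
                   stringsAsFactors = FALSE)

str(demo)  # shows the type of each column

# Entries that are not numbers become NA when coerced.
gmsc_num <- suppressWarnings(as.numeric(demo$GmSc))
sum(is.na(gmsc_num))  # how many rows could not be converted
[/pre]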

Subset the observations, keeping the first 65 rows.
[pre]
butler2 <- butler[1:65, ]
[/pre]

Attach the data frame to the search path.
The attach() function in R makes the columns of a data frame accessible by name, with fewer keystrokes.
When I was coercing the Game Score data to numeric, ggvis gave me an "invalid subscript type integer" error.
After searching through the pages of Stack Overflow, I learned that the dplyr package doesn't like the use of '$'. Refer to the column by its bare name (or use '[') instead, e.g.:
[pre]
attach(butler2)
# as.numeric() coerces the game score column to numeric inside the formula
butler2 %>% ggvis(~G, ~as.numeric(GmSc)) %>% layer_points()
# Figure: butler_points
butler2 %>% ggvis(~G, ~as.numeric(GmSc)) %>% layer_bars()
# Figure: butler_bar
butler2 %>% ggvis(~G, ~as.numeric(GmSc)) %>% layer_lines()
# Figure: butler_line
[/pre]
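
An alternative that may be worth trying is to convert the column once and then use bare column names in the formula, which avoids both '$' and attach(). This is only a sketch: butler_demo and its values are made up so the block runs without the CSV file.

[pre]
library(ggvis)

# Hypothetical stand-in for butler2; the real values come from the CSV.
butler_demo <- data.frame(G = 1:5,
                          GmSc = c("18.3", "9.7", "24.1", "12.0", "30.5"),
                          stringsAsFactors = FALSE)

# Convert once, so the formula can use the bare column name.
butler_demo$GmSc <- as.numeric(butler_demo$GmSc)

butler_demo %>%
  ggvis(~G, ~GmSc) %>%
  layer_points()
[/pre]

Because the data frame is piped into ggvis(), the formulas look up G and GmSc in that data frame directly.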