A short introduction to dplyr
Juan Natera
Los Angeles R Meetup
09/04/2014
A bit about me
• Software Engineer
• Interested in R and its use for gaining
insights about data
• Open Source enthusiast
• Baseball fanatic
About dplyr
• Developed by Hadley Wickham, Chief
Scientist @ RStudio.
• Part of a suite of packages meant to
facilitate working on the “data pipeline”.
Why?
• People spend a lot of time getting data
ready for analysis
• Almost no learning curve (just 5 verbs to
learn: filter, select, arrange, mutate,
summarise)
• Improves readability
• It's FAST (key pieces are written in C++)
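To make the readability point concrete, here is a minimal sketch that chains the five verbs on the built-in mtcars dataset. The dataset, the column choices, and the use of the %>% pipe (which dplyr re-exports) are assumptions made for illustration; they are not taken from the slides.

```r
library(dplyr)

# mtcars ships with base R; wt is recorded in units of 1000 lbs
result <- mtcars %>%
  filter(cyl == 4) %>%                    # keep only 4-cylinder cars
  select(mpg, wt, gear) %>%               # keep three columns
  mutate(wt_kg = wt * 1000 * 0.4536) %>%  # add weight in kilograms
  arrange(desc(mpg))                      # most fuel-efficient first

# summarise() collapses the rows into a single summary row
avg <- summarise(result, mean_mpg = mean(mpg))

print(head(result))
print(avg)
```

Each step reads top to bottom in the order it happens, which is arguably easier to follow than the equivalent nested calls.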
Almost no learning curve: how?
• First parameter is always a data.frame
• Other parameters describe what you want
to do with it.
• Always returns a new data.frame
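The three rules above can be sketched as follows. This is only an illustration: the mtcars dataset and the variable names (cars, light, ranked) are assumptions, not part of the talk.

```r
library(dplyr)

cars <- mtcars  # a plain data.frame that ships with base R

# First parameter: the data.frame. Remaining parameters: what to do with it.
light  <- filter(cars, wt < 2.5)    # rows where weight is under 2500 lbs
ranked <- arrange(light, desc(mpg)) # sort those rows by mpg, descending

# Each call returned a new data.frame; the input was not modified.
print(is.data.frame(light))
print(is.data.frame(ranked))
print(identical(cars, mtcars))
```

Because every verb takes a data.frame first and hands one back, the output of one call can be passed straight into the next.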