Slide 1

Slide 1 text

My “Best of” List for the useR! 2014 Conference Daniel D. Gutierrez, Data Scientist AMULET Analytics November 11, 2014

Slide 2

Slide 2 text

Most Honored

The Most Honored award must go to the main keynote address for the conference, “Interfaces, Efficiency and Big Data,” by the father of R, John Chambers. He identified three promising projects in the R community: Rcpp; LLVM for R, a compiling toolkit for R; and h2o, an interface to Java-based computations for big data. He summarized his talk with the simple points outlined in the above image.

Slide 3

Slide 3 text

Most Impressive (Genius Award)

Of all the talks I attended, the Most Impressive designation goes to “Adaptive Resampling in a Parallel World” by Dr. Max Kuhn of Pfizer Global R&D Nonclinical Statistics. Dr. Kuhn made a very cogent presentation, fast-paced and clearly detailed. I valued the insight he provided about adaptive resampling, as summarized here:

• If the training set size is big enough, adaptive resampling can generate quality models.
• If the computational complexity is large, it can also generate significant speed-ups.
• Parallel processing does not obviate the gains generated from adaptive resampling.

Very impressive talk. You can download his slides HERE and his preprint paper HERE.

Interview with DataScience.LA -

Slide 4

Slide 4 text

Most Interesting

In light of my previous life in astrophysics research, I found the Most Interesting talk to be “R in the Midst of Exploding Stars: Distributed, Time-Domain Transient Classification,” presented by JPL’s Thomas J. Fuchs. He described a novel framework for time-domain astronomy that uses R and machine learning algorithms for iterative, dynamic classification of astronomical transient events such as supernovae. I had a brief chat with Thomas after the talk and found out he is not an astronomer, which lends credence to something I’ve known for a while: data scientists can contribute in meaningful ways to many diverse problem domains.

Slide 5

Slide 5 text

Most Anticipated

My Most Anticipated sessions were the tutorial (access the materials for “Data Manipulation dplyr” HERE) and the short talk on dplyr by Hadley Wickham: “dplyr: a grammar of data manipulation.” As one of the main contributors to the R environment, Wickham is a powerhouse unto himself, albeit with much modesty (see Tweet below).

Interview with DataScience.LA -

Slide 6

Slide 6 text

Most Congenial

My Most Congenial award goes to Hilary Parker of Etsy, who presented a poster about a new R package her group created called “testdat” (available on GitHub) for unit testing of tabular data. I found Hilary to be the most pleasant person at the conference to speak to, very welcoming and informative.

Slide 7

Slide 7 text

Most Inspirational

Most Inspirational was an easy choice! I was very inspired by the talk “Practical use of R by blind people,” by A. Jonathan R. Godfrey, Ph.D. (who is blind), a lecturer in statistics at Massey University in New Zealand. This was the first time I ever saw a blind person give a PowerPoint presentation, and he did it with such ease that there was little difference from one delivered by a sighted person. He used a high-speed audio segment before each slide to jog his memory. You had to see it to appreciate it; this guy is a real superstar! Dr. Godfrey is a co-author of the BrailleR package.

Slide 8

Slide 8 text

Most Enthusiastic

Another award goes to a poster presenter, Gergely Daroczi (from Budapest), who did a lot of work on his project to track attendance at the useR! conference over the years. His poster included a cool plot showing the overall number of attendees for all useR! conferences in the last 10 years. He really seemed to love what he was working on, and handed out some very high-quality hard copy reproductions of his poster, a nice touch. I really enjoyed hearing Gergely describe his work with such enthusiasm. He also designed the conference t-shirt containing the useR! logo and R code to display it.

Slide 9

Slide 9 text

Most Useful

The Most Useful award goes to several presenters with useful technologies I personally plan to use:

• “10 R packages to win Kaggle competitions,” by Xavier Conort of DataRobot
• “RForcecom: an R package which provides a connection to and,” by Takekatsu Hiramura
• “Deploying R into Business Intelligence and Real-time Applications,” by Louis Bajuk-Yorgan of TIBCO
• “data.table: fast and flexible data manipulation,” by Matt Dowle

Slide 10

Slide 10 text

Most Fun

Another poster presentation, and the one I found Most Fun to learn about, was “Package ATPR for Statistical Analyses of Men’s Professional Tennis,” by Stephanie Kovalchik, Ph.D., a statistician from RAND Corporation. I found this research particularly interesting since I had been watching a lot of Wimbledon tournament matches in the last couple of weeks. Here are some slides describing Stephanie’s research in the area of tennis. Stephanie is a regular at the Los Angeles R User Group, so it was great to see her work firsthand.

Slide 11

Slide 11 text

Most Appreciated Random Meeting

I was very pleased to run into Norm Matloff, author of The Art of R Programming, my favorite R text. He was presenting “An R Package for Parallel Matrix Powers.” I got the opportunity to tell Norm how much I liked his book, and that it is the one I usually recommend to people wishing to get up to speed with R (since I am a TA for Coursera, students often ask for a good R text). He was very gracious, and I urged him to come out with a second edition.

Slide 12

Slide 12 text

Thank you! Follow me: @AMULETAnalytics Contact me: