Championing Analytic Infrastructure

kellobri
April 27, 2019

An Introduction to R Administration

Transcript

  1. Championing Analytic Infrastructure: An R User’s Journey into Linux

  2. Advocating for Analytic Infrastructure - Why it matters - Ideas for exploring the space and developing the skill set

  3. Solutions Engineer

  4. What happens when... Industrial Research, Business Management, Human Resources, Government Work, Regulated Environments, Big Data Applications, Cloud Infrastructure, R in Production. What is there to learn? What are the needs? What can we build? The R for Data Science Workflow Drops in Solutions Engineers!

  5. Things work out well when... Ideal: Someone with Linux experience and the vision to understand what data scientists need and how to enable them. Commonly: This person does not exist.

  6. Why it matters. “It” = Analytic Infrastructure. Meaning… all the How, Where and with What that goes into your daily data science work.

  7. Lower the Cost of Turning Ideas into Realities. If you have something of value and a PayPal account - you can: Start an online business in minutes

  8. Lower the Cost of Turning Ideas into Realities. If you have data analysis of value and the R data product toolchain - you can: Share that analysis with everyone you know in minutes

  9. Lessons from the world of DevOps. Tactical (dismissible) metric: code deployment lead time. How long does it take you to get from raw materials (data) to some kind of finished product? How many teams do you have to traverse to make a real impact with the product of your work?

  10. Lessons from the world of DevOps. 1. Architecture is what enables teams to deliver value through decreasing code deployment lead time. 2. Architecture dominates how daily work is performed. “The improvement of daily work is more important than daily work itself.” Possibly from the DevOps handbook?? - Gene Kim @RealGeneKim

  11. “R Admin” - Analytic Administrator Role. A data scientist who: onboards new tools, deploys solutions, supports existing standards; works closely with IT to maintain, upgrade and scale analytic environments; influences others in the organization to be more effective; is passionate about making R a legitimate analytic standard within the organization. Check out Nathan Stephens on the RViews Blog - Analytics Administration for R

  12. My personal journey through platforms and tools. Undergrad: R Terminal. Grad School: RStudio IDE (local) + shinyapps.io (free account). My first “real” engineering job: AWS Cloud $, Open Source RStudio and Shiny Server (free!). Solutions Engineer: All the clouds, all the products - limited by imagination (privileged)

  13. Exploring the space & Developing the skills

  14. The R Admin Goals. RViews: Making R a legitimate part of your organization - Nathan Stephens

  15. Lay of the Land. RStudio - view of the world: We build tools that you can use to design an excellent platform for data scientists - Three core products - Run on your Linux servers - Open source or professional

  16. The R Admin Playbook: Build a Sandbox (Proof of Concept), Invest in Learning, Develop Best Practices, Extend your Domain, Integrate and Interoperate

  17. Start with a sandbox! RStudio RViews Blog

  18. Learning Environments for Zero Dollars rstudio.com/products/quickstart

  19. Sandboxes - Learning Resources: github.com/sol-eng/data-science-lab • A modern Linux operating system • An internet connection • Sudo access. Step-by-Step Guide: Instance + RStudio + Integration. Sean Kross

  20. Learning Environments for Zero Dollars Virtual Machines vs. Containers

  21. Vagrant: a command line utility for managing the lifecycle of virtual machines

  22. github.com/sol-eng/vagrant-ansible-sandbox Graduate from Sandboxes to Data Science Labs - Start translating your experience into recipes - Recipes are scripts for teaching other people what you know - Configuration management tools are a powerful way to communicate with IT through recipes

  23. The improvement of daily work is more important than daily work itself. THIS IS A JOURNEY.

  24. If it doesn’t bring you joy, don’t force it. Please don’t force yourself to become an R Admin if the work is tedious to you. But also don’t ignore it. Getting the right tools matters. Seek out an analytic administrator or encourage that growth in someone around you.

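
The lead-time question on slide 9 ("How long does it take you to get from raw materials (data) to some kind of finished product?") can be made concrete with a small calculation. The Python sketch below is not from the talk: the project names and dates are hypothetical, and it assumes lead time is simply the number of days between starting work on the raw data and sharing the finished product.

```python
from datetime import datetime
from statistics import median

DATE_FORMAT = "%Y-%m-%d"


def lead_time_days(started: str, deployed: str) -> float:
    """Days between starting on the raw data and sharing the finished product."""
    delta = datetime.strptime(deployed, DATE_FORMAT) - datetime.strptime(started, DATE_FORMAT)
    return delta.total_seconds() / 86400


# Hypothetical project log: (name, date work began, date the result was shared).
projects = [
    ("churn-report", "2019-01-07", "2019-01-21"),
    ("sales-dashboard", "2019-02-04", "2019-02-08"),
    ("forecast-api", "2019-03-01", "2019-04-12"),
]

# Lead time per project, plus a single summary number to track over time.
lead_times = {name: lead_time_days(start, end) for name, start, end in projects}
median_lead_time = median(lead_times.values())

print(lead_times)
print(f"median lead time: {median_lead_time} days")
```

Tracking a number like this before and after an infrastructure change is one possible way to show whether the change lowered the cost of sharing work.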