Slide 1

Slide 1 text

Championing Analytic Infrastructure: An R User’s Journey into Linux

Slide 2

Slide 2 text

Advocating for Analytic Infrastructure
- Why it matters
- Ideas for exploring the space and developing the skill set

Slide 3

Slide 3 text

Solutions Engineer

Slide 4

Slide 4 text

What happens when...
- Industrial Research
- Business Management
- Human Resources
- Government Work
- Regulated Environments
- Big Data Applications
- Cloud Infrastructure
- R in Production
What is there to learn? What are the needs? What can we build?
The R for Data Science Workflow
Drops in
Solutions Engineers!

Slide 5

Slide 5 text

Things work out well when...
- Ideal: Someone with Linux experience and the vision to understand what data scientists need and how to enable them
- Commonly: This person does not exist

Slide 6

Slide 6 text

Why it matters
“It” = Analytic Infrastructure
Meaning… all the how, where, and with what that goes into your daily data science work.

Slide 7

Slide 7 text

Lower the Cost of Turning Ideas into Realities
If you have something of value and a PayPal account, you can start an online business in minutes.

Slide 8

Slide 8 text

Lower the Cost of Turning Ideas into Realities
If you have data analysis of value and the R data product toolchain, you can share that analysis with everyone you know in minutes.

Slide 9

Slide 9 text

Lessons from the world of DevOps
Tactical (dismissible) metric: code deployment lead time
- How long does it take you to get from raw materials (data) to some kind of finished product?
- How many teams do you have to traverse to make a real impact with the product of your work?
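
A minimal sketch of how the lead-time question could be turned into a number, assuming you log two dates per project: when the raw data arrived and when the finished product was shared. The project names and dates are hypothetical, chosen only for illustration.

```python
from datetime import datetime


def lead_time_days(started: str, delivered: str) -> float:
    """Elapsed days between receiving the raw data and sharing the product."""
    fmt = "%Y-%m-%d"
    delta = datetime.strptime(delivered, fmt) - datetime.strptime(started, fmt)
    return delta.total_seconds() / 86400


# Hypothetical projects: (name, date data arrived, date result was shared)
projects = [
    ("churn-report", "2019-01-07", "2019-01-21"),
    ("sales-dashboard", "2019-02-01", "2019-02-05"),
]

# Lead time per project, plus a simple average across projects
lead_times = {name: lead_time_days(start, end) for name, start, end in projects}
average_lead_time = sum(lead_times.values()) / len(lead_times)

for name, days in lead_times.items():
    print(f"{name}: {days:.0f} days")
print(f"average: {average_lead_time:.1f} days")
```

Tracking the same two dates over time is one way to see whether changes to the analytic infrastructure are shortening the path from data to product.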

Slide 10

Slide 10 text

Lessons from the world of DevOps
1. Architecture is what enables teams to deliver value through decreasing code deployment lead time
2. Architecture dominates how daily work is performed
The improvement of daily work is more important than daily work itself
- Gene Kim (@RealGeneKim), possibly from The DevOps Handbook

Slide 11

Slide 11 text

“R Admin” - Analytic Administrator Role
A data scientist who:
- Onboards new tools, deploys solutions, supports existing standards
- Works closely with IT to maintain, upgrade and scale analytic environments
- Influences others in the organization to be more effective
- Is passionate about making R a legitimate analytic standard within the organization
Check out Nathan Stephens on the RViews Blog: Analytics Administration for R

Slide 12

Slide 12 text

My personal journey through platforms and tools
- Undergrad: R Terminal
- Grad School: RStudio IDE (local) + shinyapps.io (free account)
- My first “real” engineering job: AWS Cloud ($), Open Source RStudio and Shiny Server (free!)
- Solutions Engineer: All the clouds, all the products, limited by imagination (privileged)

Slide 13

Slide 13 text

Exploring the space & Developing the skills

Slide 14

Slide 14 text

The R Admin Goals
RViews: Making R a legitimate part of your organization - Nathan Stephens

Slide 15

Slide 15 text

Lay of the Land
RStudio's view of the world: We build tools that you can use to design an excellent platform for data scientists
- Three core products
- Run on your Linux servers
- Open source or professional

Slide 16

Slide 16 text

The R Admin Playbook
- Build a Sandbox (Proof of Concept)
- Invest in Learning
- Develop Best Practices
- Extend your Domain
- Integrate and Interoperate

Slide 17

Slide 17 text

Start with a sandbox! RStudio RViews Blog

Slide 18

Slide 18 text

Learning Environments for Zero Dollars
rstudio.com/products/quickstart

Slide 19

Slide 19 text

Sandboxes - Learning Resources
github.com/sol-eng/data-science-lab
- A modern Linux operating system
- An internet connection
- Sudo access
Step-by-Step Guide: Instance + RStudio + Integration
Sean Kross

Slide 20

Slide 20 text

Learning Environments for Zero Dollars
Virtual Machines vs. Containers

Slide 21

Slide 21 text

Vagrant: a command line utility for managing the lifecycle of virtual machines

Slide 22

Slide 22 text

github.com/sol-eng/vagrant-ansible-sandbox
Graduate from Sandboxes to Data Science Labs
- Start translating your experience into recipes
- Recipes are scripts for teaching other people what you know
- Configuration management tools are a powerful way to communicate with IT through recipes
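
A minimal sketch of the "recipe" idea, not taken from the linked repository: each step pairs a plain-language description with the command that performs it, and the recipe is rendered as a commented shell script that can be reviewed with IT before anyone runs it. The Step class, the render_recipe helper, and the two Debian/Ubuntu commands are illustrative assumptions. A configuration management tool would express the same steps in its own format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One recipe step: what it does, and the command that does it."""
    description: str
    command: str


def render_recipe(title, steps):
    """Render the steps as a commented shell script (text only, nothing is executed)."""
    lines = ["#!/usr/bin/env bash", "set -euo pipefail", f"# Recipe: {title}"]
    for number, step in enumerate(steps, start=1):
        lines.append("")
        lines.append(f"# Step {number}: {step.description}")
        lines.append(step.command)
    return "\n".join(lines) + "\n"


# Hypothetical sandbox recipe, assuming a Debian/Ubuntu machine with sudo access
sandbox_steps = [
    Step("Refresh the package index", "sudo apt-get update"),
    Step("Install R from the distribution repositories", "sudo apt-get install -y r-base"),
]

script = render_recipe("R sandbox (illustrative)", sandbox_steps)
print(script)
```

Keeping the description next to each command is what makes the recipe a teaching tool: the reader sees why a step exists, not only what it runs.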

Slide 23

Slide 23 text

The improvement of daily work is more important than daily work itself
THIS IS A JOURNEY.

Slide 24

Slide 24 text

If it doesn’t bring you joy, don’t force it
Please don’t force yourself to become an R Admin if the work is tedious to you. But also don’t ignore it. Getting the right tools matters. Seek out an analytic administrator or encourage that growth in someone around you.
