Slide 1

Slide 1 text

Reflections on a year spent talking to Data Scientists about DevOps

Slide 2

Slide 2 text

Solutions Engineering isn’t Dev and it isn’t Ops...
Industrial Research Business Management Human Resources Government Work Regulated Environments Big Data Applications Cloud Infrastructure R in Production
What is there to learn? What are the needs? What are the problems?
Solutions Engineers!

Slide 3

Slide 3 text

What are the problems?
1. Legitimacy
How do you get R recognized as an analytic standard?
How do you make R a legitimate part of your organization and get the resources you need to support it?
“In many organizations, R enters through the back door when analysts download the free software and install it on their local workstations… Some organizations struggle to standardize on R due to a lack of management and governance around open source software. At the same time, organizations may neglect R on user workstations, thereby increasing security, legal, and operational risks.”
- Nathan Stephens, R Views 2016

Slide 4

Slide 4 text

What are the problems?
1. Legitimacy

Slide 5

Slide 5 text

(super-quick) Introduction to DevOps

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

1. DevOps is a philosophy / set of practices
2. Which create new processes for collaboration between Dev and Ops teams
3. There’s nothing new in DevOps
A framework for making sense out of common sense

Slide 8

Slide 8 text

Vicious cycle of mutual resentment and distrust
Dev Silo
IT/Ops Silo
THE FEAR
“Hey - could you just put this thing in production real quick?”
“Uh.. I just deployed this little change, and something might be broken”

Slide 9

Slide 9 text

Strategies for Managing Code Handoffs
Steal Existing & Define Shared Goals

Slide 10

Slide 10 text

SUPER-vicious cycle of mutual resentment and distrust
Data Science Silo
IT/Ops Silo
THE FEAR
“Hey - I wrote this code using a bunch of open source packages some random person from the internet created … Also, I built a Web App - is that cool?”

Slide 11

Slide 11 text

Challenges for the R User
Organizational
● Legitimizing R
● Working with IT
Technical
● Experience
● Education
● Exposure

Slide 12

Slide 12 text

Shiny in Production Journey
Code Profiling Version Control Testing Deployment/Release Access/Security Performance Tuning
Shared Goal: Shorten the distance between development and production
Shared Goal: The improvement of daily work
Shared Goal: Reduce the risk of deploying a breaking change

Slide 13

Slide 13 text

Code Quality and Performance
The “Hour-Long-Talk” of Data Products
- Rambling, Cluttered
- Parts that work well
- Parts that work not so well
Local Development: EDA, Prototyping, Iteration
The “Lightning-Talk” of Data Products
- Targeted
- Elegant
- Streamlined
- Optimized
Production Development

Slide 14

Slide 14 text

Turn a Prototype into a Production Application
Performance Workflow
1. Use shinyloadtest to see if app is fast enough
2. If not, use profvis to see what’s making it slow
3. Optimize
   a. Move work out of shiny (very often)
   b. Make code faster (very often)
   c. Use caching (sometimes)
   d. Use async (occasionally)
4. Repeat!
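The "Use caching" step of the workflow above can be sketched in base R. This is a minimal illustration, not the approach from the talk: `slow_summary` and `make_cached` are hypothetical names, `Sys.sleep()` stands in for an expensive query or model fit, and `system.time()` is used here in place of the profvis and shinyloadtest tooling the slide names.

```r
# Hypothetical slow computation standing in for work done inside a Shiny app.
slow_summary <- function(n) {
  Sys.sleep(0.2)  # simulate an expensive query or model fit
  sum(seq_len(n))
}

# Wrap a one-argument function with a simple in-memory cache keyed by its input.
make_cached <- function(f) {
  cache <- new.env(parent = emptyenv())
  function(x) {
    key <- as.character(x)
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, f(x), envir = cache)  # compute once, then remember
    }
    get(key, envir = cache, inherits = FALSE)
  }
}

cached_summary <- make_cached(slow_summary)

# The first call pays the full cost; the second is served from the cache.
first  <- system.time(r1 <- cached_summary(1000))[["elapsed"]]
second <- system.time(r2 <- cached_summary(1000))[["elapsed"]]
```

The cache lives only for the lifetime of the R process and is keyed on a single argument, so it suits results that many users of the same app would request repeatedly.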

Slide 15

Slide 15 text

Testing: Why Test Shiny Apps?
● You’ve developed a nice app
● You want to be confident that it will keep running in the future
Things that can change/break a Shiny application
● Modifying code
● Upgrading the shiny package
● Upgrading other packages
● Upgrading R
● External data source changes or fails
Shared Goal: Reduce the risk of deploying a breaking change

Slide 16

Slide 16 text

Automation!
● I don’t want to remember to run this testing procedure
● I don’t want to have to assure someone from IT that I ran it
● I certainly don’t want to hand the job off to them
GIVE IT TO THE MACHINES
Shared Goal: The improvement of daily work

Slide 17

Slide 17 text

Shared Goal: Shorten the distance between development and production
ADVOCATE FOR A SANDBOX PUBLISHING ENVIRONMENT
A. Automated Snapshot Testing
B. User Acceptance Testing
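The idea behind automated snapshot testing can be sketched in base R: record an output once, then compare every later run against the stored copy. This is an assumption-laden illustration rather than the tooling used in the talk; `render_table` and `matches_snapshot` are hypothetical names, and the built-in `mtcars` data stands in for real application data.

```r
# Hypothetical piece of app logic whose output should stay stable over time.
render_table <- function(df) {
  aggregate(mpg ~ cyl, data = df, FUN = mean)
}

# Record the value on the first run; afterwards report whether it still matches.
matches_snapshot <- function(value, path) {
  if (!file.exists(path)) {
    saveRDS(value, path)
    return(TRUE)
  }
  isTRUE(all.equal(value, readRDS(path)))
}

snapshot_path <- file.path(tempdir(), "render_table.rds")
unlink(snapshot_path)  # start from a clean state for this demonstration

recorded  <- matches_snapshot(render_table(mtcars), snapshot_path)           # writes the snapshot
unchanged <- matches_snapshot(render_table(mtcars), snapshot_path)           # same output as before
changed   <- matches_snapshot(render_table(head(mtcars, 5)), snapshot_path)  # data source changed
```

Run from a scheduled job or a continuous integration step, a check like this catches the breakages listed earlier (package upgrades, R upgrades, data source changes) without anyone having to remember to run it.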

Slide 18

Slide 18 text

● Deployment is any push of code to an environment (test, prod)
● Release is when that code (feature) is made available to users
Application-based release patterns vs. Environment-based release patterns
DevOps Learning: Decouple deployment from release
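One application-based way to decouple deployment from release is a feature toggle: the new code is deployed but stays hidden until a switch is flipped. The sketch below assumes the switch is an environment variable; the variable name `NEW_PLOT_ENABLED` and the functions `feature_enabled` and `choose_plot` are hypothetical.

```r
# Hypothetical toggle: the feature is deployed with the app but only released
# when the environment variable is set to "true".
feature_enabled <- function(name) {
  identical(tolower(Sys.getenv(name, unset = "false")), "true")
}

# Stand-in for app logic that picks between the old and the new behavior.
choose_plot <- function() {
  if (feature_enabled("NEW_PLOT_ENABLED")) "new plot" else "old plot"
}

Sys.unsetenv("NEW_PLOT_ENABLED")
before <- choose_plot()  # deployed, not yet released

Sys.setenv(NEW_PLOT_ENABLED = "true")
after <- choose_plot()   # released by flipping the toggle, no redeploy needed
```

Because the switch lives outside the code, the same deployed application can run with the feature off in production and on in a test environment.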

Slide 19

Slide 19 text

The DevOps Handbook
Three principles form the underpinnings of DevOps:
1. Accelerate Flow
- Make work visible
- Limit Work in Progress (WIP)
- Reduce Batch Sizes
- Reduce the number of handoffs
- Continually identify and elevate constraints
- Eliminate hardships and waste
2. Utilize Feedback
- See problems as they occur
- Swarm to solve problems and build new knowledge
- Keep pushing quality closer to the source
- Enable optimizing for downstream work centers
3. Learn and Experiment
- Enable organizational learning and a safety culture
- Institutionalize the improvement of daily work
- Transform local discoveries into global improvements
- Inject resilience patterns into daily work

Slide 20

Slide 20 text

Start by answering some questions…
- What is a Shiny Application?
- Who is the audience?
- What is your service level agreement definition? (SLA)
- What does your analytic architecture look like today?
- What are your goals for evolving this architecture?
- How will monitoring be handled?
- Who is responsible for maintenance?
Make work visible, Define shared goals, Build a checklist, Iterate
Empathetic Communication is Challenging

Slide 21

Slide 21 text

Production is...
CUSTOMER/USER FACING
- Ready to use
- Software that end users are using
- An app that is live and available to the end user
- Apps on our production server are available to our clients
- Client facing
Credibility
AT SCALE
- Scaled to a larger audience
- Bulletproof, scalable, fails predictably
- Live to 1000s of users with production vehicle data
SERVICE LEVEL AGREEMENTS
- Required for mission-critical operations; downtime affects the ability to serve customers
- Deployed for end users to have continual access without performance issues
ENVIRONMENTAL REQUIREMENTS
- An area where validated applications are deployed in a locked down environment
- The main part of a company that handles all process
- Application or system operates effectively without much maintaining effects
- A server or environment that runs the “final” applications that your ultimate end-users (often external customers) use to get stuff done
DOCUMENTATION - TESTING & MONITORING
- Creating apps that can reach a wider audience and are deployed/tested in a consistent manner
- Running in a way that is stable to use, documented and monitored

Slide 22

Slide 22 text

January
1. Shiny in Production Workshop
2. Configuration Management Tools for the R Admin
April
3. Championing Analytic Infrastructure
July
4. Art of the Feature Toggle
5. Environmental Release Patterns
August
6. Shiny in Production: Building bridges from data science to IT
September
7. Data Product Delivery: The R user’s journey toward improving daily work
8. The R in Production Handoff: Building bridges from data science to IT
October
9. Interactivity in Production
10. Is there a Future for DevOps?
speakerdeck.com/kellobri
solutions.rstudio.com
community.rstudio.com
#radmins