DevOps Toolkit for R - Tinkering in the Cloud

Presented at UseR! 2018 in Brisbane Australia

kellobri

July 13, 2018

Transcript

  1. Evolution of the Data Scientist

    Statisticians - Subject Matter Expertise - Classical Statistics
    Learned to Code - R/Python
    Data Scientists - Charts - Modeling - Bigger Data - Reproducibility
    Learned to Build Data Products - Visualizations - Animations - Shiny Applications - Dashboards - Packages - APIs
    New Skills! New Skills! New Jobs!
  2. Evolution of the Data Scientist

    Who Inherits Data Product Stewardship?
    - Access Control
    - Infrastructure Management
    - Maintenance
    - Scheduling
  3. Evolution of the Data Scientist

    Data Product Stewards
    - Access Control
    - Infrastructure Management
    - Maintenance
    - Scheduling
    Who Becomes the Analytic Administrator?
    - Invest time in learning best practices around data-ops
    - Spend less time doing data science
    Support / Resources
    - R in Production Material
    - Webinars for the Analytic Admin
    - Communication with IT
  4. “When developers begin to think of infrastructure as part of their application, stability and performance become normative.”

    - Jeff Geerling, “Ansible for DevOps”
  5. Resources for a Data Lab with RStudio Products

    github.com/sol-eng/data-science-lab
    • A modern Linux operating system
    • An internet connection
    • Sudo access
    Step-by-Step Guide: Instance + RStudio + Integration
    Sean Kross
  6. Orchestrate and Burn Down

    Proofs of concept take time and energy, but most of the time it’s best not to get too attached. When it comes time for the real deal, use what you learned to start fresh.
    The cleansing power of BURNING IT DOWN
  7. Configuration Management

    Automation for Everyone.
    Simple, powerful, agentless automation language to describe your complete IT infrastructure.
    Design your perfect playbook, or one that’s easily customizable.
  8. Ansible Resources

    - Webinars available on demand
    - Good Community on Twitter
    Use in combination with other tools!
    - Boto3 library for AWS
    - Vagrant + VirtualBox for local VM quick launch
    - Modules available for tons of other integrations!
  9. Ideally...

    - You’re no longer new to Linux
    - You’ve practiced setting up Data Lab Infrastructure
    - You can customize, stand up and burn down infrastructure On Demand
  10. Google App Engine

    Fully managed serverless application platform
    Bring your code, have GCP manage all the infrastructure, pay for what you use
    Famously (Infamously?) used by Snapchat
    2008 - 2018
    Now with Custom Runtimes!!
  11. Google App Engine - R not Included

    Can it be used to deploy Docker + R + plumber?
    “Popular Crowd”
  12. You don’t always need to Docker all the things.

    But in this particular instance we do.
  13. Two Custom App Engine Projects

    Two Directories, each containing:
    - Dockerfile
    - app.yaml
    - Assets
    Contents of app.yaml:
        runtime: custom
        env: flex
  14. Docker Prep - Plumber

        FROM trestletech/plumber
        RUN R -e 'install.packages(c("ggplot2"))'
        COPY [".", "./"]
        EXPOSE 8080
        ENTRYPOINT ["R", "-e", "pr <- plumber::plumb(commandArgs()[4]); pr$run(host='0.0.0.0', port=8080)"]
        CMD ["plumber.R"]

    Jeff Allen is giving an RStudio Webinar on Plumber on July 25th. Catch the recording later if you’re in this timezone.
  15. Docker Prep - NGINX

        FROM nginx
        COPY nginx.conf /etc/nginx/nginx.conf
        RUN mkdir -p /var/log/app_engine
        # Add the static content
        ADD www/ /usr/share/nginx/www/
        RUN chmod -R a+r /usr/share/nginx/www

    The NGINX website has introductory webinars available on demand, plus loads of documentation.
  16. Run App Engine!

    1. Prep your Dockerfile
    2. Create an app.yaml file
    3. Gather any asset files

        $ gcloud app create
        $ gcloud app deploy
  17. *Not an instantaneous deployment process

    Leave yourself at least 15 minutes for gcloud app deploy to run.
    (App Links - Now Defunct)
    If you need something with fast push button deploy... (I Work for RStudio)
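
The steps on slides 13 to 16 can be pulled together into one shell script. What follows is a minimal sketch, not the speaker's actual project. The `app.yaml` settings and the Dockerfile are copied from the slides. The directory name `plumber-app`, the placeholder `plumber.R` endpoint, the `preflight` helper, and the `RUN_DEPLOY` switch are all assumptions added for illustration. The script only calls `gcloud app deploy` when `RUN_DEPLOY=1` is set and the `gcloud` CLI is installed. The one-time `gcloud app create` step from slide 16 is left to run by hand.

```shell
#!/bin/sh
set -eu

# Hypothetical project directory; the slides do not name it.
APP_DIR="${APP_DIR:-plumber-app}"
mkdir -p "$APP_DIR"

# app.yaml: the two settings shown on slide 13.
cat > "$APP_DIR/app.yaml" <<'EOF'
runtime: custom
env: flex
EOF

# Dockerfile: as shown on slide 14. The quoted heredoc delimiter keeps
# the shell from expanding $run inside the R expression.
cat > "$APP_DIR/Dockerfile" <<'EOF'
FROM trestletech/plumber
RUN R -e 'install.packages(c("ggplot2"))'
COPY [".", "./"]
EXPOSE 8080
ENTRYPOINT ["R", "-e", "pr <- plumber::plumb(commandArgs()[4]); pr$run(host='0.0.0.0', port=8080)"]
CMD ["plumber.R"]
EOF

# plumber.R: a placeholder endpoint (assumption, not from the talk) so that
# the CMD in the Dockerfile has a file to load.
cat > "$APP_DIR/plumber.R" <<'EOF'
#* Echo a message back
#* @get /echo
function(msg = "") {
  list(msg = msg)
}
EOF

# preflight (hypothetical helper): succeed only if the directory holds the
# two files that a custom-runtime deployment needs.
preflight() {
  for f in Dockerfile app.yaml; do
    if [ ! -f "$1/$f" ]; then
      echo "missing $1/$f" >&2
      return 1
    fi
  done
}

preflight "$APP_DIR"

# Deploy only on explicit request; otherwise print the command to run.
if [ "${RUN_DEPLOY:-0}" = "1" ] && command -v gcloud >/dev/null 2>&1; then
  (cd "$APP_DIR" && gcloud app deploy)
else
  echo "Ready. To deploy: cd $APP_DIR && gcloud app deploy"
fi
```

Running the script with no variables set only writes the three files and prints the deploy command, so it is safe to try before a Google Cloud project exists. The NGINX project from slide 15 would follow the same layout with its own Dockerfile and a `www/` assets directory.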