
Anaconda for R users

Anaconda is a popular open-source Python distribution that includes more than 200 packages for scientific computing and data science. Recently, the Anaconda team released the “R Essentials” bundle with the IRKernel, which allows users to run R directly from a Jupyter notebook, and over 80 of the most used R packages for data science, including dplyr, shiny, ggplot2, tidyr, caret and nnet.

Anaconda includes Bokeh (http://bokeh.pydata.org/en/latest/), a visualization library that provides a flexible and powerful declarative framework for creating web-based plots. Bokeh renders plots using HTML canvas and provides many mechanisms for interactivity. Bokeh has interfaces in Python, Scala, Julia, and R; the R interface is included in the "R Essentials" bundle as rbokeh.

In this talk we will present how to get "R Essentials", use conda for package and environment management, run Jupyter notebooks with the IRKernel, and build interactive visualizations with rbokeh (http://hafen.github.io/rbokeh).

Christine Doig

November 03, 2015

Transcript

  1. Anaconda for R Users Christine Doig Oct 2015

  2. 2 • Industrial Engineer, UPC, Barcelona • Master Thesis -

    Voltage Stability Analysis, Aachen, Germany • Process Engineer - Operations Research, P&G • Business Analyst/Consultant, “La Caixa” • Quantitative Techniques for Financial Markets, FME, UPC • Master in Data Mining and BI, FIB-UPC, Barcelona • Data Scientist, Continuum Analytics, Austin, Texas My background… Matlab C SAP SAS Excel VB Matlab SQL R Python R
  3. 3 • DARPA Memex - Human Trafficking • Python Advocate

    and Conference Speaker: PyCon Montreal, PyData Berlin, PyData Dallas, SciPy Austin, Europython Bilbao… • Blaze PM: http://blaze.pydata.org/ • Python Trainings • Blogs: • "Conda for Data Science" (https://www.continuum.io/content/conda-data-science) • "Jupyter and Conda for R" (https://www.continuum.io/blog/developer/jupyter-and-conda-r). at Continuum Analytics…
  4. 4 • Twitter: ch_doig • Github: chdoig • Site: chdoig.github.io

    • Email: cdoig@continuum.io Keep in touch!
  5. 5 Continuum Analytics… …distributes Anaconda, an open source Python distribution

    that includes more than 300 packages for scientific computing and data science …provides enterprise ready products for data scientists through the Anaconda Platform …delivers Python trainings and offers consulting services …supports the development of open source technology: conda, blaze, dask, bokeh, numba… …sponsors Python conferences PyData, SciPy, PyCon, Europython, PySS… Learn more: https://www.continuum.io/
  6. Agenda 6 • Anaconda and “R Essentials” • Conda: package

    and environment manager • Jupyter: collaborative notebooks • Bokeh: interactive data visualizations
  7. CONDA, ANACONDA & R ESSENTIALS Introduction

  8. 8 Conda • Package and environment manager • Language agnostic

    (Python, R, Java…) • Cross-platform (Windows, OS X, Linux)
    $ conda install python=2.7
    $ conda install pandas
    $ conda install -c r r
    $ conda install mongodb
  9. 9 Conda • Open source: BSD license • Docs: http://conda.pydata.org/docs/

    • GH: https://github.com/conda
  10. 10 Conda vs Anaconda vs Miniconda vs R Essentials •

    Conda: package manager • Anaconda: Python + Conda + packages • Miniconda: Python + Conda • R Essentials: R + R packages
  11. 11 Conda and Pip
    • Conda: language agnostic; handles environments natively; installs binaries; general-purpose envs
    • Pip: Python packages; environments via virtualenv; compiles from source; Python envs
  12. 12 Conda + pip
    $ conda install pip
    $ pip install foo
    Conda skeleton pypi
    $ conda skeleton pypi foo
    $ conda build foo/
  13. 13 Why Conda? • Python with compiled, platform-dependent C, C++,

    or Fortran code • Seen this message too many times: “Storing debug log for failure in /.pip/pip.log” • Multi-language Data Science Projects
  14. 14 Anaconda Cloud ~ Github for binary packages

  15. 15 Anaconda Cloud
    $ conda build conda.recipe/
    $ conda server upload my_foo_pkg
    $ conda install -c chdoig my_foo_pkg
  16. 16 Anaconda for R users
    $ conda install -c r r-foo
  17. 17 Mirror CRAN packages
    $ conda skeleton cran ldavis
    $ conda build r-ldavis/
    $ conda server upload my_r_pkg
    $ conda install -c chdoig my_r_pkg
  18. 18 Conda environments
    environment.yml:
      name: myenv
      channels:
        - chdoig
        - r
        - foo
      dependencies:
        - python=2.7
        - r
        - r-ldavis
        - pandas
        - mongodb
        - spark=1.5
        - pip
        - pip:
          - flask-migrate
          - bar==1.4
    Create and activate:
      $ conda env create
      $ source activate myenv
    Freeze versions:
      $ conda env export -n myenv > freeze.yml
    Upload to anaconda.org:
      $ conda server upload my_foo_env.yml
      $ conda env create chdoig/my_foo_env.yml
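The environment-file workflow on this slide can be sketched end to end as a small script. This is a minimal sketch under stated assumptions: the environment name `r-notebook` and the `RUN_CONDA` switch are made up for illustration, and the package list is borrowed from the custom bundle shown later in the deck. The conda calls are opt-in because they download packages.

```shell
#!/bin/sh
# Sketch only: "r-notebook" is a hypothetical environment name.
cat > environment.yml <<'EOF'
name: r-notebook
channels:
  - r
dependencies:
  - r-irkernel
  - jupyter
  - r-ggplot2
  - r-dplyr
EOF

# RUN_CONDA is an assumption of this sketch, not a conda setting.
# Creating the environment downloads packages, so it only runs on request.
if [ "${RUN_CONDA:-0}" = "1" ] && command -v conda >/dev/null 2>&1; then
    conda env create -f environment.yml
    # Record the exact versions that were installed.
    conda env export -n r-notebook > freeze.yml
else
    echo "wrote environment.yml; set RUN_CONDA=1 to create the environment"
fi
```

Keeping the file under version control next to the project lets collaborators recreate the same environment with a single `conda env create -f environment.yml`.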
  19. 19 Conda auto env
    cdoig:~$ cd pygotham-topic-modeling/
    discarding /anaconda/bin from PATH
    prepending /anaconda/envs/pygotham-topic/bin to PATH
    (pygotham-topic)cdoig:~/pygotham-topic-modeling$
    https://github.com/chdoig/conda-auto-env
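One way to picture the auto-env behaviour is a helper that reads the environment name out of a project's environment.yml, so a shell hook could activate it on `cd`. The function below is a hypothetical illustration, not the conda-auto-env source; the name `env_name_from_file` and the sample file contents are assumptions, with the directory and environment names taken from the slide.

```shell
#!/bin/sh
# Hypothetical helper, NOT the actual conda-auto-env implementation.
# Prints the environment name declared in an environment.yml file.
env_name_from_file() {
    # Use the first top-level "name:" line and strip the key and spaces after it.
    sed -n 's/^name:[[:space:]]*//p' "$1" | head -n 1
}

# Sample project directory, created here so the sketch is self-contained.
mkdir -p pygotham-topic-modeling
cat > pygotham-topic-modeling/environment.yml <<'EOF'
name: pygotham-topic
dependencies:
  - python=2.7
EOF

env_name=$(env_name_from_file pygotham-topic-modeling/environment.yml)
echo "would activate: $env_name"
```

A real hook would call `source activate "$env_name"` at this point; the sketch only prints the name so it can run without conda installed.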
  20. 20 R Essentials
    "R essentials" comes with IRKernel and over 80 of the most used R packages for data science like dplyr, shiny, ggplot2, tidyr, caret and nnet.
    $ conda install -c r r-essentials
    or
    $ conda config --add channels r
    $ conda install r-essentials
  21. 21 R Essentials environment
    Or, alternatively, create an environment to isolate your "R essentials" packages from others:
    $ conda create -n r-essentials -c r r-essentials
  22. 22 Custom metapackage to share
    $ conda metapackage custom-r-bundle 0.1.0 --dependencies r-irkernel jupyter r-ggplot2 r-dplyr --summary "My custom R bundle"
  23. JUPYTER Introduction

  24. 24 http://jupyter.org/ https://try.jupyter.org/ The Jupyter Notebook is a web application

    that allows you to create and share documents that contain live code, equations, visualizations and explanatory text.
  25. 25 IPython IPython notebook nbviewer tmpnb binder Jupyter https://try.jupyter.org/ http://mybinder.org/

  26. 26 To start Jupyter notebooks, simply run the following command:
    $ jupyter notebook
    http://nbviewer.ipython.org/github/chdoig/conda-jupyter-irkernel/blob/master/Jupyter%20and%20conda%20for%20R.ipynb
  27. 27 Jupyter and conda for R notebook demo

  28. 28 Slideshow

  29. 29

  30. 30 $ jupyter nbconvert my_r_notebook.ipynb --to slides --post serve
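For the slideshow export to produce separate slides, each notebook cell needs a slide type in its metadata. The sketch below writes a tiny notebook by hand to show where that metadata lives; the cell contents and the `RUN_NBCONVERT` switch are assumptions for illustration, and in practice the slide types are set from the notebook's cell toolbar rather than by editing JSON.

```shell
#!/bin/sh
# Sketch only: a hand-written two-cell notebook using the IRKernel ("ir").
cat > my_r_notebook.ipynb <<'EOF'
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {"slideshow": {"slide_type": "slide"}},
   "source": ["# Anaconda for R users"]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {"slideshow": {"slide_type": "subslide"}},
   "outputs": [],
   "source": ["summary(iris)"]
  }
 ],
 "metadata": {
  "kernelspec": {"display_name": "R", "language": "R", "name": "ir"}
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
EOF

# RUN_NBCONVERT is an assumption of this sketch, not a Jupyter setting.
# Conversion is opt-in so the script also runs where Jupyter is not installed.
if [ "${RUN_NBCONVERT:-0}" = "1" ] && command -v jupyter >/dev/null 2>&1; then
    jupyter nbconvert my_r_notebook.ipynb --to slides
else
    echo "wrote my_r_notebook.ipynb; set RUN_NBCONVERT=1 to convert it"
fi
```

Adding `--post serve`, as on the slide, would also start a local web server for the generated slides; it is left out here because it keeps running until interrupted.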

  31. 31

  32. BOKEH Introduction

  33. 33 Custom visualizations Dashboards Streaming/Animations Charts Tools Widgets

    Maps Hover Bokeh
  34. 34

  35. 35 http://hafen.github.io/rbokeh/ rbokeh

  36. 36 rbokeh demo

  37. 37 • Twitter: ch_doig • Github: chdoig • Site: chdoig.github.io

    • Email: cdoig@continuum.io Keep in touch!
  38. 38 Q & A Thanks!