Slide 1

Slide 1 text

Open Infrastructure for Open Science: How Binder Powers an Open Stack in the Cloud
Chris Holdgraf, UC Berkeley and Project Jupyter
@choldgraf

Slide 2

Slide 2 text

you???

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

A bit about me then...

Slide 5

Slide 5 text

A bit about me now...

Slide 6

Slide 6 text

a community of people and an ecosystem of open tools and standards for interactive computing

Slide 7

Slide 7 text

create things that are language-agnostic and modular. Empower people to use other open tools.

Slide 8

Slide 8 text

Aside: Jupyter and the last mile problem

Slide 9

Slide 9 text

You San Jose Coffee!

Slide 10

Slide 10 text

You San Jose Coffee!

Slide 11

Slide 11 text

You San Jose Coffee!

Slide 12

Slide 12 text

BART Bus You San Jose Coffee!

Slide 13

Slide 13 text

BART Bus The “last mile” You San Jose Coffee!

Slide 14

Slide 14 text

Public infrastructure gets us closer to our goal. It makes the last mile shorter.

Slide 15

Slide 15 text

How does Jupyter fit into this?

Slide 16

Slide 16 text

You Your awesome report

Slide 17

Slide 17 text

You Your Awesome report

Slide 18

Slide 18 text

You Your awesome report
Jupyter shortens the last mile by creating and leveraging public infrastructure
server .ipynb package ecosystem
● Notebook document specification
● Jupyter server protocol
● Interactive kernels
● Notebook interfaces

Slide 19

Slide 19 text

Back to our talk...

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

Our mission for this talk Us .ipynb Open, reproducible, sharable environments

Slide 22

Slide 22 text

Part 1: from your laptop to the cloud with JupyterHub

Slide 23

Slide 23 text

(some) data science should be taught to everyone (no, really)

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

How can we connect people with computation?

Slide 26

Slide 26 text

What is JupyterHub? Host pre-configured data science environments on shared infrastructure jupyter.org/hub

Slide 27

Slide 27 text

myhub.org My fancy machine in the cloud

Slide 28

Slide 28 text

myhub.org

Slide 29

Slide 29 text

myhub.org environments

Slide 30

Slide 30 text

myhub.org interfaces environments

Slide 31

Slide 31 text

myhub.org interfaces environments documents

Slide 32

Slide 32 text

compute data

Slide 33

Slide 33 text

AUTHENTICATION

Slide 34

Slide 34 text

Chris Is Trying A Live Demo Hopefully he doesn’t embarrass himself too badly. Textbook link Interact link

Slide 35

Slide 35 text

EXAMPLE + shout-out to Open Humans: notebooks.openhumans.org

Slide 36

Slide 36 text

Us .ipynb Open, reproducible, sharable environments

Slide 37

Slide 37 text

How can users package+share their work?

Slide 38

Slide 38 text

Part 2: packaging and sharing your environment with repo2docker

Slide 39

Slide 39 text

What is repo2docker? Convert a repository into a Docker image that runs the code inside. repo2docker.readthedocs.io

Slide 40

Slide 40 text

github.com/minrk/ligo-binder

Slide 41

Slide 41 text

No content

Slide 42

Slide 42 text

repo2docker: what does it do?

    $ jupyter repo2docker \
    > https://github.com/minrk/ligo-binder
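The same invocation can be scripted. The sketch below only assembles the argument list for the `jupyter-repo2docker` executable; the helper name `repo2docker_command` and the image tag `ligo-binder` are illustrative assumptions, not part of the talk. It relies on the `--ref`, `--image-name`, and `--no-run` options of the repo2docker command line.

```python
import shlex


def repo2docker_command(repo_url, ref=None, image_name=None, run=True):
    """Return the argv list for a repo2docker invocation (hypothetical helper)."""
    argv = ["jupyter-repo2docker"]
    if ref is not None:
        argv += ["--ref", ref]                # branch, tag, or commit to build
    if image_name is not None:
        argv += ["--image-name", image_name]  # name for the resulting image
    if not run:
        argv.append("--no-run")               # build only, do not start a container
    argv.append(repo_url)
    return argv


cmd = repo2docker_command(
    "https://github.com/minrk/ligo-binder",
    image_name="ligo-binder",  # assumed tag, chosen for illustration
    run=False,
)
# Print a copy-pasteable shell line; nothing is executed here.
print(" ".join(shlex.quote(part) for part in cmd))
```

Passing `cmd` to `subprocess.run` would start the actual build, which needs repo2docker and a running Docker daemon.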

Slide 43

Slide 43 text

repo2docker: what does it do?

    $ jupyter repo2docker \
    > https://github.com/minrk/ligo-binder
    Cloning into '/var/folders/.../T/repo2dockermu6z66sd'...
    Using CondaBuildPack builder
    Step 1/31 : FROM buildpack-deps:bionic
     ---> 29f4eef41002
    Step 2/31 : ENV DEBIAN_FRONTEND=noninteractive
     ---> Using cache
     ---> ee1ba7c4f5f4
    Step 3/31 : RUN apt-get update && apt-get install --yes --no-install-recommends locales && apt-get purge && apt-get clean && rm -rf /var/lib/apt/lists/*

Slide 44

Slide 44 text

repo2docker: what does it do?

Step 1: get the repo

    git clone https://github.com/me/myproject

Slide 45

Slide 45 text

repo2docker: what does it do?

Step 2: Identify requirements

Slide 46

Slide 46 text

repo2docker: what does it do?

Step 2: Identify requirements

Slide 47

Slide 47 text

repo2docker: what does it do?

Step 2: Identify requirements
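As a rough illustration of Step 2, the sketch below scans the top level of a checked-out repository for the configuration file names listed later in this talk. It is a simplification written for this transcript, not repo2docker's actual detection logic: the real tool also looks in other locations and applies precedence rules that are not modeled here. The helper name `find_config_files` is hypothetical.

```python
import tempfile
from pathlib import Path

# File names taken from the talk's list of supported configuration files.
KNOWN_CONFIG_FILES = [
    "environment.yml", "requirements.txt", "REQUIRE", "install.R",
    "apt.txt", "setup.py", "postBuild", "runtime.txt", "Dockerfile",
]


def find_config_files(repo_dir):
    """Return the known configuration files present at the top of repo_dir."""
    repo = Path(repo_dir)
    return [name for name in KNOWN_CONFIG_FILES if (repo / name).is_file()]


# Demo on a throwaway directory standing in for a cloned repository.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "requirements.txt").write_text("numpy\n")
    (Path(tmp) / "apt.txt").write_text("graphviz\n")
    (Path(tmp) / "README.md").write_text("# demo\n")  # ignored: not a config file
    found = find_config_files(tmp)

print(found)
```

Each file found this way tells the builder something about the environment: Python packages, system packages, and so on.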

Slide 48

Slide 48 text

repo2docker: what does it do?

Step 3: generate Dockerfile

    ...
    COPY conda/install-miniconda.bash /tmp/install-miniconda.bash
    COPY conda/environment.py-3.6.frozen.yml /tmp/environment.yml
    RUN bash /tmp/install-miniconda.bash && \
        rm /tmp/install-miniconda.bash /tmp/environment.yml
    ...

Slide 49

Slide 49 text

repo2docker: what does it do?

Step 3: generate Dockerfile

    ...
    # Copy and chown stuff.
    COPY src/ ${HOME}
    RUN chown -R ${NB_USER}:${NB_USER} ${HOME}

    # Run assemble scripts! These will actually build the spec
    # in the repository into the image.
    USER ${NB_USER}
    RUN ${KERNEL_PYTHON_PREFIX}/bin/pip install --no-cache-dir \
        -r "requirements.txt"
    ...
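To make the idea behind Step 3 concrete, here is a minimal sketch of how detected configuration files could be turned into Dockerfile lines. The function `assemble_lines` is hypothetical and only reproduces the fragment shown on the slide; repo2docker's real buildpacks generate far more (31 steps in the build log shown earlier in the talk).

```python
def assemble_lines(config_files):
    """Return Dockerfile lines for a simplified, hypothetical assemble step."""
    lines = [
        "COPY src/ ${HOME}",
        "RUN chown -R ${NB_USER}:${NB_USER} ${HOME}",
        "USER ${NB_USER}",
    ]
    # Only add the pip step when the repository declares Python requirements.
    if "requirements.txt" in config_files:
        lines.append(
            "RUN ${KERNEL_PYTHON_PREFIX}/bin/pip install --no-cache-dir "
            '-r "requirements.txt"'
        )
    return lines


dockerfile_fragment = "\n".join(assemble_lines(["requirements.txt"]))
print(dockerfile_fragment)
```

The design point is that the repository only declares what it needs; the Dockerfile is derived from those declarations rather than written by hand.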

Slide 50

Slide 50 text

repo2docker: what does it do?

Step 4: build (& push) image

    docker build -t myimage .
    docker push myimage

Slide 51

Slide 51 text

repo2docker: what does it do?

    $ jupyter repo2docker \
    > https://github.com/minrk/ligo-binder
    Cloning into '/var/folders/.../T/repo2dockermu6z66sd'...
    Using CondaBuildPack builder
    Step 1/31 : FROM buildpack-deps:bionic
     ---> 29f4eef41002
    Step 2/31 : ENV DEBIAN_FRONTEND=noninteractive
     ---> Using cache
     ---> ee1ba7c4f5f4
    Step 3/31 : RUN apt-get update && apt-get install --yes --no-install-recommends locales && apt-get purge && apt-get clean && rm -rf /var/lib/apt/lists/*

Slide 52

Slide 52 text

Some supported configuration files
● environment.yml
● requirements.txt
● REQUIRE
● install.R
● apt.txt
● setup.py
● postBuild
● runtime.txt
● Dockerfile
● yournewbuildpack.txt

Slide 53

Slide 53 text

Guiding principles of repo2docker
● Repos should be human and machine readable
● Use existing specifications and standards
● Support many languages and interfaces
● Be lightweight and tightly-scoped, but extendable

Slide 54

Slide 54 text

Us .ipynb Open, reproducible, sharable environments

Slide 55

Slide 55 text

Part 3: tying this together with BinderHub

Slide 56

Slide 56 text

What is BinderHub? One-click sharable, interactive, reproducible environments from your public git repository mybinder.org binderhub.readthedocs.io
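The "one click" is an ordinary link. As a sketch, a launch URL for a GitHub repository on a BinderHub deployment can be assembled as below. The helper `binder_url` is hypothetical; the `/v2/gh/<org>/<repo>/<ref>` path layout and the `urlpath` query parameter are the ones I understand mybinder.org to accept, so treat them as assumptions to check against the BinderHub documentation.

```python
from urllib.parse import quote, urlencode


def binder_url(org, repo, ref="HEAD", urlpath=None, host="https://mybinder.org"):
    """Build a launch URL for a GitHub repo on a BinderHub (hypothetical helper)."""
    url = f"{host}/v2/gh/{quote(org)}/{quote(repo)}/{quote(ref, safe='')}"
    if urlpath:
        # Optionally open a specific interface or path after launch.
        url += "?" + urlencode({"urlpath": urlpath})
    return url


# The example repository used in the live demo later in this talk.
launch = binder_url("binder-examples", "requirements")
print(launch)
```

Anyone who follows such a link gets a freshly built, running copy of the repository's environment in their browser.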

Slide 57

Slide 57 text

No content

Slide 58

Slide 58 text

BinderHub is open tech...

Slide 59

Slide 59 text

One example: mybinder.org

Slide 60

Slide 60 text

Chris Is Trying A Live Demo Hopefully he doesn’t embarrass himself too badly. mybinder.org binder-examples/requirements

Slide 61

Slide 61 text

No content

Slide 62

Slide 62 text

mybinder.org is an open service...

Slide 63

Slide 63 text

mybinder.org weekly sessions, last ~year

Slide 64

Slide 64 text

mybinder.org sessions, last month

Slide 65

Slide 65 text

No content

Slide 66

Slide 66 text

Some cool new projects (or, stuff you can help out with)

Slide 67

Slide 67 text

Interactive books with Jupyter-Book github.com/jupyter/jupyter-book

Slide 68

Slide 68 text

Publishable documents (with pandoc?) https://github.com/simpeg-research/heagy-2018-aem

Slide 69

Slide 69 text

HTML dashboards with voila github.com/QuantStack/voila

Slide 70

Slide 70 text

In summary Jupyter makes the “last-mile” problem as small as possible by building modular, open tools. Us .ipynb Open, reproducible, sharable environments jupyter.org

Slide 71

Slide 71 text

In summary JupyterHub lets you create a shared, interactive analytics environment Us .ipynb Open, reproducible, sharable environments mybinder.org

Slide 72

Slide 72 text

In summary repo2docker creates reproducible Docker images from a repository Us .ipynb Open, reproducible, sharable environments mybinder.org

Slide 73

Slide 73 text

In summary BinderHub is an open web application to create shareable, reproducible coding environments Us .ipynb Open, reproducible, sharable environments mybinder.org

Slide 74

Slide 74 text

Get involved with Jupyter
● All of these projects are open source, run by open communities
● Jupyter is a place where *anybody* can participate
● If you’d like to get involved: jupyterhub-team-compass.readthedocs.io, discourse.jupyter.org
@choldgraf