Binder 2.0 - The Next Generation of Reproducible Scientific Environments with repo2docker and BinderHub

This talk covers recent developments in the Binder Project and describes the architecture and design of BinderHub, an open-source tool that runs on Kubernetes and lets your users create shareable, interactive, reproducible data analytics environments. It was originally given at SciPy 2018 (video: https://www.youtube.com/watch?v=KcC0W5LP9GM).

Chris Holdgraf

July 14, 2018

Transcript

  1. $ jupyter repo2docker \
    > https://github.com/minrk/ligo-binder
    Cloning into '/var/folders/.../T/repo2dockermu6z66sd'...
    Using CondaBuildPack builder
    Step 1/31 : FROM buildpack-deps:bionic
     ---> 29f4eef41002
    Step 2/31 : ENV DEBIAN_FRONTEND=noninteractive
     ---> Using cache
     ---> ee1ba7c4f5f4
    Step 3/31 : RUN apt-get update && apt-get install --yes --no-install-recommends locales && apt-get purge && apt-get clean && rm -rf /var/lib/apt/lists/*
  2. ...
    # Copy and chown stuff.
    COPY src/ ${HOME}
    RUN chown -R ${NB_USER}:${NB_USER} ${HOME}

    # Run assemble scripts! These will actually build the spec
    # in the repository into the image.
    USER ${NB_USER}
    RUN ${KERNEL_PYTHON_PREFIX}/bin/pip install --no-cache-dir \
        -r "requirements.txt"
    ...
  3. $ jupyter repo2docker \
    > https://github.com/minrk/ligo-binder
    Cloning into '/var/folders/.../T/repo2dockermu6z66sd'...
    Using CondaBuildPack builder
    Step 1/31 : FROM buildpack-deps:bionic
     ---> 29f4eef41002
    Step 2/31 : ENV DEBIAN_FRONTEND=noninteractive
     ---> Using cache
     ---> ee1ba7c4f5f4
    Step 3/31 : RUN apt-get update && apt-get install --yes --no-install-recommends locales && apt-get purge && apt-get clean && rm -rf /var/lib/apt/lists/*
  4. • Dockerfile
    • environment.yml
    • requirements.txt
    • REQUIRE
    • install.R
    • apt.txt
    • setup.py
    • postBuild
    • runtime.txt
  5. import json
    import sys
    import webbrowser

    import requests

    # Placeholder values (assumptions): repo is the one from the earlier
    # slide, ref is an assumed branch name.
    repo = "minrk/ligo-binder"
    ref = "master"

    # URL to build a given version of a repo
    url = f"https://mybinder.org/build/gh/{repo}/{ref}"
    r = requests.get(url, stream=True)
    for line in r.iter_lines():
        line = line.decode("utf8", "replace")
        # EventStream messages
        if line.startswith("data:"):
            evt = json.loads(line.split(":", 1)[1])
            # echo each message
            if "message" in evt:
                sys.stdout.write(evt["message"])
            if evt.get("phase") == "ready":
                # open in a browser when it's ready
                url = evt["url"]
                token = evt["token"]
                webbrowser.open(f"{url}?token={token}")
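
The loop on the last slide mixes network I/O with event parsing. As a minimal sketch, the parsing step can be pulled into small helpers that are easy to check offline. The helper names `parse_event` and `launch_url` are hypothetical and not part of BinderHub, and the sample lines are made up for illustration. Only the `data:` prefix and the `phase`, `message`, `url`, and `token` fields come from the slide.

```python
import json


def parse_event(line):
    """Return the JSON payload of an EventStream 'data:' line, else None."""
    if not line.startswith("data:"):
        return None
    return json.loads(line.split(":", 1)[1])


def launch_url(evt):
    """Build the notebook URL from a 'ready' event, else return None."""
    if evt is None or evt.get("phase") != "ready":
        return None
    return f"{evt['url']}?token={evt['token']}"


# Made-up sample lines, shaped like the events the slide's loop handles.
sample = [
    ": keepalive",
    'data: {"phase": "building", "message": "Step 1/31\\n"}',
    'data: {"phase": "ready", "url": "https://example.org/user/abc/", "token": "t0k3n"}',
]
urls = [launch_url(parse_event(line)) for line in sample]
print(urls[-1])
```

Keeping the parsing separate from `requests.get` means the same helpers could be fed decoded lines from `r.iter_lines()` without changes.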