How to avoid 'death by Jupyter notebooks'? Towards more effective and educational notebooks

Using Jupyter notebooks is easy; developing notebooks that are effective and educational for others is not. This talk guides you through a set of principles to make Jupyter notebooks more educational and enjoyable for others. It covers modularisation, leveraging HTML features, and gathering content in an index notebook.

Julia Wagemann

November 04, 2020

Transcript

  1. #JupyterCon2020 How to avoid ‘death by Jupyter notebooks’? Towards more effective and educational notebooks
    Julia Wagemann
  2. About me
    • Independent consultant for geospatial data and PhD candidate at Philipps University of Marburg
    • My work is at the intersection of data providers and data users, with a focus on climate | meteorological | atmospheric data
    • Been using Jupyter notebooks since 2014
    • Since 2019, I have been developing Jupyter-based training on atmospheric composition data for the European Organisation for the Exploitation of Meteorological Satellites (EUMETSAT): https://gitlab.eumetsat.int/eumetlab/atmosphere/atmosphere
    • GitHub: jwagemann | Twitter: @JuliaWagemann
  3. Motivation
    Over 25k repositories and over 708k code contributions tagged with Jupyter notebooks on GitHub.
    It is ‘trendy’ to share code examples as Jupyter notebooks. Which is good.
    BUT: Reality: on the verge of ‘death by Jupyter notebooks’
  4. Reality ≠ reproducibility
    • Long code blocks without explanation
    • Unorganized - e.g. libraries are loaded in the middle of a workflow
    • Messy - no naming convention followed
  5. Overview - Key concepts
    • Get organised!
      ◦ Index notebook
      ◦ Follow a naming convention
      ◦ Navigation pane
    • Layout and outline are key!
      ◦ Header and Footer
      ◦ Table of Contents / Notebook Outline
      ◦ Alert boxes
      ◦ Other styling options
    • Outsource and modularize!
      ◦ Example: functions
  6. Get organized - Index notebook → 00_index.ipynb
    • Gives an overview of what the notebooks are about
    • Provides an outline and links to other notebooks with Markdown
    • Links to additional (external) content
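The outline of links in an index notebook could be generated rather than typed by hand. The following is a minimal sketch, assuming notebooks follow the two-digit-prefix naming seen in the file names on these slides; the function name `build_index_markdown` and the title formatting are hypothetical and not part of the talk.

```python
from pathlib import Path


def build_index_markdown(filenames):
    """Return Markdown bullet links for notebooks, ordered by file name.

    Assumes (hypothetically) names like '01_some_topic.ipynb', so that
    plain alphabetical sorting matches the intended reading order.
    """
    lines = []
    for name in sorted(filenames):
        stem = Path(name).stem                    # '01_how_to_not_do_it'
        prefix, _, rest = stem.partition("_")     # '01', 'how_to_not_do_it'
        title = rest.replace("_", " ").capitalize()
        lines.append(f"* [{prefix} - {title}](./{name})")
    return "\n".join(lines)


notebooks = [
    "03_functions_example.ipynb",
    "01_how_to_not_do_it.ipynb",
    "02_jupyter_notebooks_key_concepts.ipynb",
]
index_md = build_index_markdown(notebooks)
print(index_md)
```

The resulting string can be pasted into a Markdown cell of the index notebook, or rendered from a code cell with `IPython.display.Markdown`.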
  7. Get organized - Follow a naming convention
  8. Get organized - Navigation pane
    html:
    <a href="./01_how_to_not_do_it.ipynb"><< 01 - Example - How to NOT do it</a>
    <span style="float:right;"><a href="./03_functions_example.ipynb">03 - Example - Functions >></a></span>
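To keep the navigation pane consistent across notebooks, the HTML could be produced by a small helper. This is a sketch only: the function `navigation_pane` and its parameters are hypothetical, and the markup simply mirrors the snippet on the slide.

```python
def navigation_pane(prev_file=None, prev_label=None,
                    next_file=None, next_label=None):
    """Build a previous/next navigation bar as an HTML string.

    Either side may be omitted, e.g. for the first or last notebook.
    """
    parts = []
    if prev_file:
        # Left-aligned link back to the previous notebook
        parts.append(f'<a href="./{prev_file}"><< {prev_label}</a>')
    if next_file:
        # Right-floated link forward to the next notebook
        parts.append(
            f'<span style="float:right;">'
            f'<a href="./{next_file}">{next_label} >></a></span>'
        )
    return "".join(parts)


nav = navigation_pane(
    prev_file="01_how_to_not_do_it.ipynb",
    prev_label="01 - Example - How to NOT do it",
    next_file="03_functions_example.ipynb",
    next_label="03 - Example - Functions",
)
print(nav)
```

The printed string is meant to be pasted into a Markdown cell at the top or bottom of a notebook.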
  9. Layout and outline are key - Footer and Header
    Header
    Footer
    html: <img src='./img/file.png' align='right' width='80%'></img>
  10. Layout and outline are key - Table of contents → 02_jupyter_notebooks_key_concepts.ipynb
    • Helps your audience to navigate through the notebook
    • Provides an overview of the notebook's main headings
    • Helps your audience go straight to a specific section
    Markdown link: [GET ORGANIZED](#get_organized)
    html <a>...</a> statement: ## <a id='get_organized'></a> GET ORGANIZED
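A table-of-contents link only works if its target matches the heading's anchor id exactly, so it can help to derive both from the same heading text. The sketch below assumes a simple convention (lowercase, spaces replaced by underscores) that matches the `get_organized` example on the slide; the helper names are hypothetical.

```python
def anchor_id(heading):
    """Derive an anchor id from heading text, e.g. 'GET ORGANIZED' -> 'get_organized'."""
    return heading.strip().lower().replace(" ", "_")


def toc_entry(heading):
    """Markdown link for the table of contents."""
    return f"[{heading}](#{anchor_id(heading)})"


def anchored_heading(heading, level=2):
    """Markdown heading carrying the matching HTML anchor."""
    return f"{'#' * level} <a id='{anchor_id(heading)}'></a> {heading}"


for title in ["GET ORGANIZED", "LAYOUT AND OUTLINE ARE KEY"]:
    print(toc_entry(title))
    print(anchored_heading(title))
```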
  11. Layout and outline are key - Alert boxes → 02_jupyter_notebooks_key_concepts.ipynb
    • Colorize a cell based on ‘alert-levels’
    • There are four different ‘alert-levels’:
      ◦ Alert-warning (yellow)
      ◦ Alert-success (green)
      ◦ Alert-danger (red)
      ◦ Alert-info (blue)
    <div class="alert alert-block alert-warning">
    <b>NOTE:</b> Alert level 'warning' colorizes a cell <i>yellow</i>.
    </div>
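The four alert levels can be wrapped in a small helper so that every box uses the same markup. This is a rough sketch: the function `alert_box`, its parameters, and the `ALERT_LEVELS` mapping are hypothetical, while the class names and colors are those listed on the slide.

```python
# Alert levels and the colors listed on the slide
ALERT_LEVELS = {
    "warning": "yellow",
    "success": "green",
    "danger": "red",
    "info": "blue",
}


def alert_box(message, level="info", label="NOTE"):
    """Return the HTML for an alert box to paste into a Markdown cell."""
    if level not in ALERT_LEVELS:
        raise ValueError(
            f"Unknown alert level {level!r}; expected one of {sorted(ALERT_LEVELS)}"
        )
    return (
        f'<div class="alert alert-block alert-{level}">\n'
        f"<b>{label}:</b> {message}\n"
        "</div>"
    )


box = alert_box("Load all libraries at the top of the notebook.", level="warning")
print(box)
```

Rejecting unknown levels early avoids silently producing a box without any color.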
  12. Layout and outline are key - Other styling options → 02_jupyter_notebooks_key_concepts.ipynb
    • <hr> - Horizontal rule
    • <br> - Line break (adds space in a cell)
    • `code` - Text formatted as code
  13. Outsource and modularize - Example: functions → functions → 02_jupyter_notebooks_key_concepts
    def generate_geographical_subset(xarray, latmin, latmax, lonmin, lonmax):
        """
        Generates a geographical subset of a xarray DataArray

        Parameters:
            xarray (xarray DataArray): a xarray DataArray with latitude and longitude coordinates
            latmin, latmax, lonmin, lonmax (int): boundaries of the geographical subset

        Returns:
            Geographical subset of a xarray DataArray.
        """
        return xarray.where((xarray.latitude < latmax) & (xarray.latitude > latmin) & (xarray.longitude < lonmax) & (xarray.longitude > lonmin), drop=True)

    from ipynb.fs.full.functions import generate_geographical_subset
    ?generate_geographical_subset
  14. Overview - Key principles
    • Know your audience
    • Simplify code
    • Jupyter notebooks ≠ reproducibility
    • Four eyes see better than two - Quality assurance
  15. Reproducibility does not happen by chance
    Reproducibility is going the ‘extra mile’
  16. Thanks
    Julia Wagemann, independent consultant and PhD candidate
    Website: jwagemann.com
    Twitter: @JuliaWagemann
    GitHub: github.com/jwagemann