Unit Testing Jupyter Notebooks - testbook (SciPy India 2020)

Unit Testing Jupyter Notebooks - testbook
Rohit Sanjay
SciPy India 2020

About Us
• Matthew Seal, CTO @ Noteable Inc. Twitter: @codeseal, GitHub: @Mseal
• Rohit Sanjay. Twitter: @imrohitsanj, GitHub: rohitsanj

Unit testing

A little about unit testing
• A single unit of source code is tested individually.
Source: https://www.mathworks.com/help/matlab/mocking_overview.png

Context behind why we created testbook

Jupyter Notebooks can get very messy
• Code written to conduct data science experiments in Jupyter Notebooks can get messy.
• Enforcing good coding habits in Jupyter Notebooks can lead to maintainable and easily refactorable code. Some potential good habits are:
  ◦ Use functions to abstract away complexity
  ◦ Smuggle code out of Jupyter notebooks as soon as possible
  ◦ Apply test driven development
  ◦ Make small and frequent commits
Source: https://www.thoughtworks.com/insights/blog/coding-habits-data-scientists

Test driven development for Jupyter Notebooks?
A few approaches:
• Write integration tests that run the whole notebook as one unit.
  ◦ Papermill is typically used for this. It doesn’t test complex logic well.
• Write the tests in the notebook itself.
  ◦ This means your tests always run when the notebook runs. It adds a lot of noise to the document.
• Refactor code out of the notebooks into separate Python modules that can then be independently unit tested.
  ◦ This is how .py files are tested.

Why Test?
For code you wish to promote past exploration and experimentation and make shareable and reusable, you need to make the code reliably reproducible. The best way to achieve this is to:
• Simplify the code wherever possible
• Define clear method / API boundaries
• Test those method and API boundaries
• Repeat testing whenever inputs, dependencies, or code changes
In many professional settings this is referred to as “productionizing” your code, and before testbook this could be difficult for notebooks.

Source: https://www.thoughtworks.com/insights/blog/coding-habits-data-scientists

Testbook

Testbook
• Testbook is a unit testing framework for testing code in Jupyter Notebooks.
• With testbook, you can now write pytest style unit tests for notebooks, in separate .py files.
• Testbook can now help you write maintainable and reliable Jupyter Notebooks.

A simple unit test using testbook

Another example

How testbook works

How testbook works
• Testbook works by creating reference objects.
• Reference objects hold a reference to an actual object in the notebook.
• All attribute access and assertions performed on these reference objects are internally pushed down (or injected) into the Jupyter kernel.
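
To make the idea of a reference object concrete, here is a toy model in plain Python. It is not testbook's implementation: `ToyKernel` and `ToyReference` are invented names, and a dictionary plus `exec`/`eval` stands in for a real Jupyter kernel. The point is only that the test side holds a name, and every operation on it is turned into code that runs where the object actually lives.

```python
class ToyKernel:
    """Stand-in for a Jupyter kernel: runs code strings in its own namespace."""

    def __init__(self):
        self.namespace = {}

    def run(self, code):
        exec(code, self.namespace)

    def evaluate(self, expression):
        return eval(expression, self.namespace)


class ToyReference:
    """Holds only the name of an object that lives inside the kernel."""

    def __init__(self, kernel, name):
        self._kernel = kernel
        self._name = name

    def __getattr__(self, attr):
        # Attribute access builds a longer expression instead of a local lookup.
        return ToyReference(self._kernel, f"{self._name}.{attr}")

    def __call__(self, *args):
        # The call is rendered as source code and evaluated in the kernel.
        arguments = ", ".join(repr(a) for a in args)
        return self._kernel.evaluate(f"{self._name}({arguments})")

    def __eq__(self, other):
        # Comparisons are pushed down to the kernel as well.
        return self._kernel.evaluate(f"{self._name} == {other!r}")


kernel = ToyKernel()
kernel.run("def add(a, b):\n    return a + b\n\nlimits = {'low': 0, 'high': 10}")

add = ToyReference(kernel, "add")
limits = ToyReference(kernel, "limits")
assert add(2, 3) == 5
assert limits.get("high") == 10
```

A real kernel runs in another process, so results also have to be serialized on the way back; the toy skips that step entirely.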

Features of testbook

Write conventional unit tests for Jupyter Notebooks
• You do not have to learn a new type of testing to use testbook. We have designed the API in a way that is intuitive and fits well into the general notion of unit testing.
• Write tests for notebooks just like how you would write tests for Python modules.

Execute all or some specific cells before unit test
• Testbook allows you to execute a specific list of cells before a test executes.

Share kernel context across multiple tests

Perform patching of objects in the notebook
• You can patch objects like variables and functions in the notebook.
• Useful in situations where you want to patch a network request or a file I/O operation in the notebook.

Inject code into Jupyter notebooks
• Injecting code into notebooks during runtime is the secret sauce of testbook.
• All assertions are injected into the notebook.
• If you need to perform any assertions which are not (currently) supported by the testbook API, you can simply write the assertion code and inject that into the notebook.

Inject code into Jupyter notebooks

Works with any unit testing library
• Testbook provides the assertion part of the equation (no pun intended), whereas the reporting needs to be done by an existing unit testing framework.
• This was an intentional design choice.
• Testbook is pluggable into any unit testing library - pytest, unittest, nose etc.

Who is testbook for

Should I Use Testbook?
Testbook is intended to help developers ensure their code continues working after they move on to a new project. Here are some rough guidelines for when you should consider adding this library to your project:
• Are you sharing this Notebook with others who will run it?
• Will the inputs (data) for the Notebook change over time?
• Do you need this Notebook to run again in the future?
• Are you going to automate Notebook execution on a schedule?
If yes to these, then testbook is a good tool for you to consider using.

Who Should Use Testbook?
While we’ve said “developers” and referenced situations with teams or larger organizations, the tool is not only intended for individuals in such situations. Sometimes that person you’re sharing the Notebook with is your future self who doesn’t remember too well what you wrote. In essence, anyone authoring a Notebook should be able to make use of testbook.

Testbook in the wild

Ark-analysis Link

Ark-analysis https://github.com/angelolab/ark-analysis/pull/318

nbcelltests https://github.com/jpmorganchase/nbcelltests

Roadmap of testbook

What’s Coming Up?
We have more great things planned for the library beyond what was described above. If you’d like to contribute, we’re also happy to have more developers submitting Issues and PRs (even tiny ones!). Here are some recent and upcoming changes:
• Full release of feature complete library [Done]
• Documentation overhaul for testbook [Done]
• Ability to easily apply Python mocks in testbook executions [Done]
• Support for code coverage across Notebook files
• Better support for non-Python Notebooks

Thanks!
• PyPI - pypi.org/project/testbook (pip install testbook)
• GitHub - github.com/nteract/testbook (drop a star for good karma)
• Docs - testbook.readthedocs.io
• nteract - nteract.io -> nteract GitHub
