Monorepos in Python

PyCascades 2025

Avik Basu

March 01, 2025

Transcript

  1. About Me
    • Staff Data Scientist at Intuit
    • Engineering + Data Science
    • Love PS5 games
    • Soccer + Tennis
    • Driving is therapy!
  2. What is a mono-repository?
    • a.k.a. Monolith
    • A single code base for multiple projects
    • Projects can be
      • Libraries
      • Microservices
      • Jupyter notebooks/explorations
    • Projects can be in more than 1 language
  3. Mono-repos: When does it make sense?
    • Inter-related projects with a common goal
    • Small to medium-sized code base
    • Early stage projects
    • Better development velocity
    • Easier promotion
    • Different components have a similar change rate
    • System-level extensions, e.g. C/C++/Rust
  4. Python Mono-repos: Advantages
    • One place to look through everything
    • Can improve development velocity (sometimes)
    • Easier promotion of open-source projects (think GitHub ⭐)
    • Better management of other language extensions, e.g. C/C++/Rust
    • Popular mono-repositories
      • pytorch-lightning
      • azure-sdk-for-python
  5. Python Mono-repos: Disadvantages
    • Versioning decisions
    • CI/CD complexity
    • Merging and conflicts
    • Code ownership issues
    • Need for specialized tools
    • Unintended code breakage
  6. Project Scenario: Data Science
    • 2 libraries
      • 1 core library containing ML models and related tools
      • 1 supporting library for connecting to different data sources
    • 1 microservice
      • FastAPI app with a docker container
      • Serves real-time inference
    • 1 notebook project
      • Contains training jobs in Jupyter notebooks
  7. Open-source tools for managing mono-repos
    • uv workspace
      • Allows managing multiple projects
      • As a bonus, uv is a great package management tool
    • Poetry mono-ranger plugin
      • Early stage
    • nx
      • Language agnostic
    • Pants
      • Multi-language
  8. Version management: For multiple libraries in the repo
    A. One version for the whole repo
      ‣ Simpler to manage
      ‣ Easier to keep track
      ‣ Unnecessary updates for libraries with no changes
    B. Each library has its own version
      ‣ No unnecessary updates
      ‣ Difficult to maintain
      ‣ Essential to have a compatibility matrix
      ‣ Needs git submodules
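
Option B on the version management slide calls for a compatibility matrix between independently versioned libraries. The following is a minimal sketch of what such a check might look like, not something taken from the talk. The library names (`core-lib`, `connectors-lib`), the version numbers, and the helpers `parse_version` and `is_compatible` are all hypothetical, and versions are assumed to be plain dotted integers such as `1.0.2`.

```python
from __future__ import annotations

# Hypothetical compatibility matrix, for illustration only.
# Key:   (major, minor) release of the core library ("core-lib").
# Value: inclusive (lowest, highest) (major, minor) release of the
#        supporting library ("connectors-lib") known to work with it.
COMPATIBILITY: dict[tuple[int, int], tuple[tuple[int, int], tuple[int, int]]] = {
    (1, 0): ((0, 1), (0, 3)),
    (1, 1): ((0, 3), (0, 4)),
}


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as '1.0.2' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_compatible(core: str, connectors: str) -> bool:
    """Return True if the two versions are listed as compatible."""
    core_key = parse_version(core)[:2]
    allowed = COMPATIBILITY.get(core_key)
    if allowed is None:
        # Unknown core release: treat as incompatible rather than guess.
        return False
    lowest, highest = allowed
    return lowest <= parse_version(connectors)[:2] <= highest


if __name__ == "__main__":
    print(is_compatible("1.0.2", "0.2.5"))
    print(is_compatible("1.1.0", "0.1.0"))
```

A check like this could run in CI before a release so that a version bump in one library fails fast when the matrix has no entry for the pairing. Real projects with pre-release or post-release tags would likely need a proper version parser instead of the integer split assumed here.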