
[PyCon JP] Modernizing development workflows for a 7-year old 74K LoC Python project using Pantsbuild

Joongi Kim
October 17, 2022


Transcript

  1. Mic Test My Name is Joongi Kim. The title of

    my talk is Modernizing development workflow for a 7-year old 74K LoC Python project using Pantsbuild. My presentation will be in English. The presentation materials are in English. I will publish the presentation materials. I agree to having my picture taken during my presentation. I will comply with the PyCon JP Code of Conduct.
  2. Modernizing development workflow for a 7-year old 74K LoC Python

    project using Pantsbuild Joongi Kim (김준기, 金駿起) Lablup Inc. ("래블업" aka "ラボをアップグレード")
  3. About Me • Working as CTO / co-founder of Lablup

    Inc. since 2015 Has been developing Backend.AI for 7+ years An open-source enthusiast Domain: backend engineering, systems programming, distributed & accelerated computing Interests: writing code that is still manageable 3 years later • Recent status (updating code I wrote 5 years ago...) • Spoke at PyCon KR for 8 consecutive years... Mostly about asyncio-related topics 🤯 😵💫
  4. Table of Contents • Mono-repo vs. Multi-repo: That is the

    problem • Problems with the prior art in my team • Short introduction to Pantsbuild • Mono-repo migration process using Pantsbuild • Customizations to fit our use cases • Experience after migration • Recap
  5. Dependency mgmt is hard • "Software Engineering at Google" Long-term

    SW engineering is all about keeping up the pace of upgradability. The key to upgradability is dependency management. Backend.AI is passing this point (7 years old, ~74K LoC) ref) https://abseil.io/resources/swe-book/html/ch01.html#time_and_change 😵
  6. Mono-repo to the rescue? • Well-known dependency managers yarn, npm,

    cargo, poetry, pipenv, go get, ... Diamond dependency conflicts ✓ Sandboxing all n-th order dependencies vs. full resolution (NP-complete[1]) • Two axes of dependency management Internal: Cross-dependency between components written by us External: Dependency on components written by others Most existing build systems take care of external dependencies only! • Managing internal dependencies Multi-repo: multiple per-component repositories Mono-repo: single merged repository of all components [1] https://research.swtch.com/version-sat
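A diamond conflict can be illustrated with a toy check: two packages depend on the same library but accept different version sets, and the conflict shows up when the intersection is empty. This is only a sketch under simplifying assumptions (versions are plain integers, the package names are hypothetical); real resolvers handle version ranges and transitive dependencies, which is where the NP-completeness comes from.

```python
def shared_versions(requirements):
    """Intersect the versions of each dependency accepted by all dependents."""
    allowed = {}
    for package, needs in requirements.items():
        for dep, versions in needs.items():
            if dep in allowed:
                allowed[dep] &= set(versions)
            else:
                allowed[dep] = set(versions)
    return allowed


# Hypothetical packages: both depend on "lib" (the bottom of the diamond).
COMPATIBLE = {
    "app-a": {"lib": {1, 2}},
    "app-b": {"lib": {2, 3}},
}
CONFLICTING = {
    "app-a": {"lib": {1, 2}},
    "app-b": {"lib": {3}},
}
```

An empty set for a dependency means no single version satisfies every dependent, so the build has to either sandbox each dependent's copy or fail.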
  7. Mono-repo to the rescue? • There is no single right

    answer! How do your component teams collaborate? ✓ Mono-repo may be better if a single team develops multiple components. How synchronous are the release cycles of related components? How closely coupled are the components? (e.g., direct type refs, unversioned APIs)
    Mono-Repo: ✓ Less pain for refactoring across components ✓ Sharing the same development culture & process ✓ Easier onboarding with a unified view of systems ✗ Scalability issues if the repo becomes very large ✗ Difficult to fork individual components ✗ Difficult to set up CI/CD workflows
    Multi-Repo: ✓ Independent release cycle & versioning ✓ Per-repository team access control ✓ Taking advantage of existing build systems ✗ Team fragmentation by repo boundaries ✗ Sync overheads of internal dependencies ✗ Difficult to have a holistic view
    ref) https://kinsta.com/blog/monorepo-vs-multi-repo/
  8. Mono-repo + Modern build system • Mono-repo simplifies internal dependency

    mgmt. Easy to find duplications and refactor all API usage occurrences at once e.g., "One version rule" • Mono-repo at large scale needs a "modern" build system. Support both: a single unified build vs. per-component builds Reproducible builds Minimize human errors & mistakes ✓ Declarative dependency configurations ✓ Automatic dependency resolution & inference Speed up builds and CI/CD pipelines ✓ Detecting the affected modules for a changeset in CI workflows ✓ Parallelized & distributed execution with artifact caching
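Detecting the modules affected by a changeset amounts to walking the dependency graph in reverse. The following is a minimal sketch of that idea, not how any particular build system implements it; the component graph in `DEPS` is a hypothetical example.

```python
from collections import defaultdict, deque


def affected_modules(deps, changed):
    """Return the changed modules plus everything that transitively depends on them."""
    # Invert "module -> its dependencies" into "module -> its dependents".
    dependents = defaultdict(set)
    for module, requirements in deps.items():
        for requirement in requirements:
            dependents[requirement].add(module)

    affected = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


# Hypothetical internal dependency graph (module -> modules it imports).
DEPS = {
    "manager": {"common"},
    "agent": {"common"},
    "client": set(),
    "webserver": {"client", "common"},
    "common": set(),
}
```

Only the tests of the returned modules need to run in CI; a change to `client` here touches two modules, while a change to `common` touches four.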
  9. Problems with the prior art in my team Why did

    we begin to consider mono-repo?
  10. Previous practice • Per-package GitHub repositories One Python wheel →

    one repository Release each package independently using the standard setuptools/pip toolchain • Setting up Backend.AI[1] development env. The minimum set of components for the server-side (6 repositories) ✓ manager, agent, common, client-py, webserver, storage-proxy ✓ (There is yet another long story for the frontend...) A single-line installation script (install-dev.sh) ✓ Installs database containers using docker-compose ("halfstack") ✓ Clones multiple repositories, creates venvs, runs "editable-install" in each venv, and populates database schema with fixtures [1] https://github.com/lablup/backend.ai
  11. Problems with prior art (1/2) • Difficult to write and

    review multiple PRs for a single issue A single issue often consists of multiple PRs to multiple repositories We often forget to switch to the same git branch across the multi-repo clones ✓ There is an implicit rule to match the PR branch names in different repos, and new contributors often forget this, breaking CI/CD. Difficult to keep mental context when switching repositories ✓ e.g., Forgetting to add the corresponding client function for a new server API ✓ Often reviewers forget things as well. 🤯 Not very compatible with GitHub ✓ The issue resolution from multiple linked PRs: "OR" instead of "AND" ✓ GitHub Codespaces works for a single repository only. ✓ GitHub Projects v2 is still missing cross-repo label and milestone configurations.
  12. Problems with prior art (2/2) • Hesitation to refactor

    Reducing the maintenance points > Splitting components by clear purpose & semantics No way to specify explicit internal dependencies • Time-consuming release process Painful to repeat the same release workflow for 6 repositories... (error-prone) ✓ Need to repeat CI/CD config updates 6 times... (reduced motivation to improve) Waiting for depended-upon packages to be released when there are internal dependencies • Difficult to keep track of compatible combinations of component versions A minor patch release often breaks compatibility because the components are closely coupled. Causing headaches for on-site engineering staff when upgrading and applying custom patches for individual customer sites
  13. Solution • The problem Reduced motivation to refactor across components

    & improve dev process (cynicism) Too high context switching overheads for managing issues & PRs • Let's migrate to (semi-)mono-repo! Backend.AI is not yet as big as Google's repository, so we don't need to worry about the extreme scalability issues. Target repos: open-source core components that share the same release cycle and have internal cross-dependencies • Challenges How to automate internal dependency management? (e.g., parsing/generating setup.cfg?) How to run tests against only changed modules on commits? (e.g., git sparse checkout?) We need a modern build system tailored for mono-repo!
  14. What is Pants? • Main features Automatic dependency inference by

    static analysis First-class support for the Python ecosystem Graph-based parallel & async task execution Extensible with a plugin subsystem • Overview https://www.pantsbuild.org/docs/how-does-pants-work https://blog.pantsbuild.org/pycon-us-2022-talk/ https://blog.pantsbuild.org/pants-vs-bazel/ Pants 2 is a fast, scalable, user-friendly build system for codebases of all sizes. It's currently focused on Python, Go, Java, Scala, Shell, and Docker, with support for other languages and frameworks coming soon. ref) https://www.pantsbuild.org/
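To give a feel for dependency inference by static analysis, the sketch below extracts absolute imports from Python source with the standard `ast` module. It only illustrates the general idea; Pants' actual inference is part of its own engine and also maps the discovered module names to targets and third-party requirements. The function name `infer_imports` is hypothetical.

```python
import ast


def infer_imports(source: str) -> set[str]:
    """Collect absolute module names imported by a piece of Python source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b, c" yields one alias per imported module.
            for alias in node.names:
                found.add(alias.name)
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports ("from . import x") in this sketch.
            if node.module and node.level == 0:
                found.add(node.module)
    return found


SAMPLE = "import os\nfrom ai.backend.common import logging\nfrom . import utils\n"
```

Because the source is parsed rather than executed, nothing is imported while scanning, which is what makes this safe to run across a whole repository.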
  15. Pants: Architecture Rust-based DAG scheduler & async-parallel execution engine (monadic,

    pure, cancellable/interruptible, concurrent, cached) Python-based BUILD rule engine & intrinsic multi-language plugins Filesystem and OS pants.toml + BUILD configs in Git repositories (Target build may use an arbitrary Python version) Included inside PyPI's pantsbuild-pants wheel package Pre-installed Python Runtime (one of 3.7, 3.8, 3.9) PEX venv generator & dependency resolver
  16. Pants: Basic Usage • Requirements to start using Pants[1] ./pants

    script (download from https://static.pantsbuild.org/setup/pants) pants.toml & pants.ci.toml **/BUILD files ✓ What to build (including source & resource files) and what they depend on • ./pants [global-options] {goal} [goal-options] [targets] What it does: ✓ Self-bootstrap Pants itself at ~/.cache/pants/ & ./.pants.d ✓ Generate a task DAG from BUILD files ✓ Asynchronously run the DAG with parallelization when possible Refer to our team's cheatsheet to see how it fits into daily development workflows[2] [1] https://www.pantsbuild.org/docs/installation [2] https://docs.backend.ai/en/latest/dev/daily-workflows.html
  17. Restructuring GitHub repos backend.ai-manager backend.ai-agent backend.ai-common backend.ai-webserver backend.ai-client-py backend.ai-storage-proxy backend.ai-webui

    backend.ai-client-js Backend.AI Core https://github.com/lablup/... Backend.AI Frontend https://github.com/lablup/backend.ai Unify!
  18. Mono-repo structure

    VERSION: Unified package version
    **/BUILD: Pants build config for each directory (like Bazel)
    configs/{manager,agent,common,...}: "local-config" templates for Backend.AI Core
    docs/: Backend.AI developer documentation
    plugins/: Backend.AI plugin development workspace
    scripts/: Utility shell scripts for developers
    src/ai/backend/{manager,agent,common,...}: Backend.AI Core source codes
    tests/{manager,agent,common,...}: Backend.AI Core test codes
    pants.toml, pants.ci.toml: Pants main config
    pyproject.toml, .flake8: Toolchain configs (flake8, mypy, pytest)
    requirements.txt: Unified requirements for all components
    python.lock: Unified requirements dependency lock
    tools/*.lock: Toolchain requirements dependency lock
    tools/pants-plugin/setupgen/: Our Pants plugin for custom setup.py generation
    ./pants, ./py, ./backend.ai: Main entry scripts for daily use
    dist/: venvs & build artifacts generated by Pants
  19. Migrating multi-repo

    Before (per-component repository) → After (mono-repo):
    setup.cfg, requirements/*.txt → src/ai/backend/{component}/BUILD, VERSION, pants.toml, BUILD, requirements.txt (unified)
    README.md → src/ai/backend/{component}/README.md (moved)
    changes/, CHANGELOG.md → changes/, CHANGELOG.md (unified)
    configs/ → configs/{component}/ (moved)
    scripts/ → scripts/{component}/ (moved)
    src/ai/backend/{component}/ → src/ai/backend/{component}/ (moved)
    tests/ → tests/{component}/ (moved)
    src/ai/backend/{component}/__init__.py: __version__ = '22.03.1' → from pathlib import Path; __version__ = (Path(__file__).parent / 'VERSION').read_text().strip()
    $ cd src/ai/backend/{component}
    $ ln -s ../../../../VERSION
  20. The whole history • https://github.com/lablup/backend.ai/pull/417 Started : Apr 27 /

    Merged: May 31 (168 commits) / many follow-up PRs afterwards ✓ The initial plan was two weeks, but as always... 😅 More than 60 Q&A exchanges in the Pantsbuild community Slack Pants: 5 bug reports (all fixed now), 2 feature requests, 2 doc patches Pex: 3 bug reports triggering new releases ✓ Afternoon KST: bug report / Dinner KST: talk with developers / Night-morning KST: developers fix the issue and release / Next morning KST: apply the release • The size of mono-repo Backend.AI Core LoC: 74K+ LoC including all external dependencies: 1.5M+
  21. setup.py Generator • Pants plugin : tools/pants-plugins/setupgen • What's added

    Single-source the version number from the root's VERSION file Change long_description_content_type depending on the extension of README (.md, .rst) Change the trove classifier depending on the version number suffix (a, b, rc) Add the license type argument so that each wheel package may have a different license
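The two metadata rules above could look roughly like the helpers below. This is a sketch, not the plugin's code: the function names are hypothetical, and the mapping from version suffix to "Development Status" classifier (a → Alpha, b/rc → Beta, otherwise Production/Stable) is an assumption, since the slide does not state which classifier is chosen for each suffix.

```python
import re
from pathlib import Path


def content_type_for(readme_name: str) -> str:
    """Pick a long_description_content_type value from the README extension."""
    suffix = Path(readme_name).suffix.lower()
    return {".md": "text/markdown", ".rst": "text/x-rst"}.get(suffix, "text/plain")


def classifier_for(version: str) -> str:
    """Pick a trove classifier from a pre-release suffix (assumed mapping)."""
    match = re.search(r"(a|b|rc)\d*$", version)
    if match is None:
        return "Development Status :: 5 - Production/Stable"
    if match.group(1) == "a":
        return "Development Status :: 3 - Alpha"
    return "Development Status :: 4 - Beta"
```

Keeping these rules in one generator means every wheel in the repository gets consistent metadata without per-package setup.py files.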
  22. towncrier Tool • Pants plugin: tools/pants-plugins/towncrier • What's added Like

    black, isort, and flake8, defined a new "PythonTool" for towncrier Allows using independent venvs and dependency lockfiles for towncrier
  23. Platform-specific Deps • Pants plugin: tools/pants-plugins/platform_resources • What's added Use

    different resource files (pre-built executables) for Backend.AI Agent depending on the target platform argument The code needed rewriting on Pants minor version updates because the dependency management implementation in Pants changes frequently (expected to stabilize soon)
  24. Dynamic Module Loading • Wrote a module loader that searches

    & parses BUILD files using AST Backend.AI largely depends on entry_points of the package metadata for discovering plugins and replaceable modules. https://github.com/lablup/backend.ai/blob/main/src/ai/backend/plugin/entrypoint.py • Use ./pants export :: and ./py wrapper script instead of ./pants run ... In PEX envs, there are neither BUILD files nor the package metadata.
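Since BUILD files use Python syntax, entry points declared there can be read with the standard `ast` module without executing them. The sketch below shows that idea only; it is not the loader linked above. The function name `scan_entry_points`, the group name `example_plugins`, and the sample BUILD content are all hypothetical.

```python
import ast


def scan_entry_points(build_source: str, group: str) -> dict[str, str]:
    """Collect entry points of one group from literal entry_points=... arguments."""
    found = {}
    for node in ast.walk(ast.parse(build_source)):
        if not isinstance(node, ast.Call):
            continue
        for keyword in node.keywords:
            if keyword.arg != "entry_points":
                continue
            try:
                # Only plain literals are supported; anything computed is skipped.
                groups = ast.literal_eval(keyword.value)
            except ValueError:
                continue
            if isinstance(groups, dict):
                found.update(groups.get(group, {}))
    return found


# Hypothetical BUILD file content for illustration.
SAMPLE_BUILD = '''
python_sources(name="src")

python_distribution(
    name="dist",
    entry_points={
        "example_plugins": {"hello": "example.plugin:HelloPlugin"},
    },
)
'''
```

A loader built this way works in a source checkout where package metadata has not been installed, which matches the constraint described in the slide.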
  25. Satisfying points • Decreased the time needed for making new

    release (hours → ~10 min.) Automated release-related workflows e.g., Generate GitHub's release note by extracting the latest section of CHANGELOG.md • Reduced code review burdens & context switching overheads One issue completes with one PR! (single file tree & single diff) Review all things together, including documentation Now we can utilize GitHub better: Projects v2 & Codespaces • Minimized the impact on CI/CD execution times by taking diffs ./pants test --changed-since=main
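Extracting the latest changelog section for a release note can be done with a few lines of text processing. This sketch assumes each release section starts with a "## " heading, which is an assumption about the file layout rather than something stated in the slides; the sample content is made up.

```python
def latest_section(changelog: str, heading_prefix: str = "## ") -> str:
    """Return the first release section, from its heading up to the next one."""
    lines = changelog.splitlines()
    start = None
    for index, line in enumerate(lines):
        if line.startswith(heading_prefix):
            if start is None:
                start = index  # first release heading found
            else:
                return "\n".join(lines[start:index]).strip()
    if start is None:
        return ""
    return "\n".join(lines[start:]).strip()


# Made-up changelog content for illustration.
SAMPLE = "# Changes\n\n## 1.1.0\n\n* Fix A\n\n## 1.0.0\n\n* Fix B\n"
```

The returned text can then be passed as the body of the release created by the CI workflow.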
  26. Adaptation required (1/3) • Requiring multiple installs of Python versions

    Pants requires Python 3.9 on Apple Silicon Macs / Backend.AI requires Python 3.10 macOS Monterey (via Xcode CLI tools) provides Python 3.8 by default Depending on when & how you have used Homebrew or pyenv, Python 3.9 may be missing! ✓ Fastest workaround: brew install python@3.9 or pyenv local 3.9.13 For new contributors with less experience in managing multiple Python versions, this becomes a huge hurdle!
  27. Adaptation required (2/3) • ImportError due to forgetting to pass

    PYTHONPATH environment variable Subprocesses should also take pants.toml's source_roots into account • ImportError due to dynamic imports importlib.import_module() SQLAlchemy selects which engine module to import based on the database URL Such dependencies should be manually specified in BUILD files.
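The dynamic-import problem is easy to reproduce: when the module name is computed at runtime, no import statement names it, so import-based inference has nothing to find. The sketch below uses standard-library modules as stand-ins for database drivers; `DRIVERS` and `load_driver` are hypothetical names.

```python
import importlib
from urllib.parse import urlsplit

# Hypothetical mapping from URL scheme to module (stdlib stand-ins for drivers).
DRIVERS = {"json": "json", "csv": "csv"}


def load_driver(url: str):
    """Import the module selected by the URL scheme at runtime."""
    scheme = urlsplit(url).scheme
    # No "import csv" statement appears anywhere in this file, so a scan of
    # import statements would not see this dependency.
    return importlib.import_module(DRIVERS[scheme])
```

In a Pants repository, a dependency like this is declared by hand in the `dependencies` field of the relevant target in the BUILD file.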
  28. Adaptation required (3/3) • Unexpected amount of effort to parallelize

    the test suite Port number conflicts of database containers created as a test fixture ✓ pants.toml: [pytest].execution_slot_var = "BACKEND_TEST_EXEC_SLOT" ✓ Use ephemeral port numbers and/or add the slot number to a fixed constant • Ubuntu 22.04 + Snap + Docker + /tmp Snap forces Docker to use a private /tmp instead of the host /tmp. Mounting a /tmp sub-directory to fixture containers → unexpected failure This is not a problem of Pants itself, but test parallelization would lead to this pitfall in many cases. ✓ Workaround by using ./.tmp instead of /tmp
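Both port strategies mentioned above fit in a few lines. In this sketch the environment variable name comes from the pants.toml setting on the slide, while `BASE_PORT` and the function names are hypothetical.

```python
import os
import socket

BASE_PORT = 15432  # hypothetical fixed constant for the database fixture


def slot_port(base_port=BASE_PORT, env=None):
    """Offset a fixed port by the execution slot of the current test process."""
    env = os.environ if env is None else env
    slot = int(env.get("BACKEND_TEST_EXEC_SLOT", "0"))
    return base_port + slot


def ephemeral_port():
    """Ask the OS for a currently unused port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]
```

The slot-based variant gives predictable port numbers per concurrent test process, while the ephemeral variant avoids clashes with unrelated services at the cost of a small race between picking the port and using it.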
  29. Recap • Well worth the effort and time No slow-down

    of development process after migration, thanks to Pants! ✓ Introduced black + isort + git hook, automation of release note generation Opened many ways to exploit new features of GitHub ✓ Larger action runner to speed up CI, Projects v2, and Codespaces • New concerns: unifying tracking of public & private issues • Friendly technical support from the Pantsbuild community Slack The core members and contributors welcome questions. I'm trying to contribute back as well! (bug reports, PyCon talks, etc.) • Pants is a highly recommended option if you are considering a Python-based mono-repo!
  30. About our project: Backend.AI • In short: "An all-in-one enterprise

    platform to develop and operate AI services" • It is an open-source project with enterprise plugins. https://github.com/lablup/backend.ai Contributed to & created many open-source libraries to support this project ✓ aiodocker, aiohttp, aiomonitor-ng, aiotools, aiotusclient, async-timeout, callosum (async RPC), click, etcetra (async etcd3 client), janus, pyzmq, ...
  31. About our project: Backend.AI • In short: "An all-in-one enterprise

    platform to develop and operate AI services" • We are opening a small exhibition booth at Japan IT Week (Oct 26-28)!