piwheels - campug

piwheels: building a faster Python package repository for Raspberry Pi users

Talk given at campug on 1 August 2017

Ben Nuttall

August 01, 2017

Transcript

  1. piwheels:
    building a faster Python package
    repository for Raspberry Pi users
    Ben Nuttall
    Raspberry Pi Foundation
    UK Charity 1129409

  2. Ben Nuttall

    Raspberry Pi Community Manager

    Columnist on opensource.com

    github.com/bennuttall

    twitter.com/ben_nuttall

    [email protected]

  3. Space

  4. Astro Pi

  5. Dave

    Dave: I need your help. Can we install
    these libraries in space?

    Me: I think so, if we build wheels of
    them.

  6. pip wheel numpy

  7. PyPI

    Python package repository hosted at pypi.python.org

    Install packages with “pip install”

    Packages can be implemented in Python or C

    Packages implemented in C require building
    – This can take a long time

  8. Python wheels

    Wheels can be uploaded to PyPI alongside source
    distributions to save users from building themselves

    Wheels are architecture-specific
    – e.g. win32, win64, macosx, linux32, linux64

    A recent addition allowed “manylinux” wheels to be uploaded

    Most maintainers don't bother uploading wheels, but popular
    build-intensive packages tend to

  9. Python wheels

    When you build a wheel, the filename is made up of a number
    of parts:
    – The package name
    – The package version
    – The Python tag (e.g. cp34)
    – The ABI tag (e.g. cp34m)
    – The platform tag (e.g. linux_armv7l)

    e.g: numpy-1.13.1-cp34-cp34m-linux_armv7l.whl
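
The filename layout above can be pulled apart mechanically. The following is a minimal sketch, not part of the talk: the `WheelName` type and `parse_wheel_filename` helper are hypothetical names introduced here, and the sketch assumes the five-part form shown on the slide (no optional build tag).

```python
from typing import NamedTuple


class WheelName(NamedTuple):
    """The five filename parts listed on the slide (hypothetical helper type)."""
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel filename into name, version, Python, ABI and platform tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # A wheel may carry an optional build tag, giving six parts;
    # this sketch only handles the five-part form from the slide.
    if len(parts) != 5:
        raise ValueError(f"unexpected number of parts in {filename!r}")
    return WheelName(*parts)


wheel = parse_wheel_filename("numpy-1.13.1-cp34-cp34m-linux_armv7l.whl")
print(wheel.python_tag, wheel.abi_tag, wheel.platform_tag)
```

The platform tag is the part that matters for the rest of the talk: it decides whether an installer considers the file usable on a given machine.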

  10. ARM wheels

    PyPI does not support ARM wheels

    I’ll come back to that...

  11. View Slide

  12. pip install numpy

    Browse http://pypi.python.org/simple/numpy

    Look for a wheel file with a platform tag matching the current
    platform...
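
The lookup described above can be sketched roughly as follows. This is a simplified illustration, not pip's actual algorithm (pip also ranks candidates and handles compressed tag sets); `pick_download`, the file list and the tag sets are all hypothetical.

```python
def pick_download(filenames, supported_tags):
    """Return the first wheel whose tags match, else a source archive, else None."""
    for filename in filenames:
        if not filename.endswith(".whl"):
            continue
        # Five-part form only: name-version-python-abi-platform.whl
        python_tag, abi_tag, platform_tag = filename[:-4].split("-")[-3:]
        if (python_tag, abi_tag, platform_tag) in supported_tags:
            return filename
    # No compatible wheel: fall back to the source, which must be built locally.
    for filename in filenames:
        if filename.endswith((".tar.gz", ".zip")):
            return filename
    return None


# Hypothetical index listing and tag sets, for illustration only.
index = [
    "numpy-1.13.1-cp34-cp34m-manylinux1_x86_64.whl",
    "numpy-1.13.1-cp34-none-win32.whl",
    "numpy-1.13.1.zip",
]
x86_64_tags = {("cp34", "cp34m", "manylinux1_x86_64"), ("py3", "none", "any")}
pi_tags = {("cp34", "cp34m", "linux_armv7l"), ("py3", "none", "any")}

print(pick_download(index, x86_64_tags))
print(pick_download(index, pi_tags))
```

With the Pi's tag set nothing in the listing matches, so the function falls through to the source archive, which is the situation the next few slides act out.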

  13. Which type of x86/x86_64 are you?

  14. ...I have them all

  15. Actually I’m linux_armv7l

  16. What? I’ve never heard of it.
    Here’s the source, build it yourself!

    PyPI doesn’t support uploading ARM wheels :(

    “pip install numpy” takes:
    – ~20 mins on Pi 3 (1.2GHz quad-core)
    – ~2.5 hours on Pi 1 (700MHz single-core)

  17. Fine! I’ll build my own package
    repository...

    cd Projects

    mkdir piwheels

    cd piwheels

    git init

  18. piwheels

    I could build everything on PyPI and host my own repository
    – “pip wheel numpy” works on a Pi – you can distribute the wheel

    I want to log builds and store output in a database
    – I’d better write it in Python, not bash
    – Stack Overflow says “import pip” is a thing

    I need a list of all packages in PyPI
    – Stack Overflow says PyPI provides an xmlrpclib interface (whatever that is)
    and gives a Python example

    Can I host a package repository?
    – At minimum, an Apache directory listing will do the trick
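
The plan above might look something like the sketch below. It is an illustration under assumptions rather than the piwheels code: it shells out to `pip wheel` instead of using “import pip”, it uses Python 3's `xmlrpc.client` (the successor to `xmlrpclib`), the `list_packages` XML-RPC call is the one PyPI offered at the time and may no longer be available, and the function names and `wheels` directory are hypothetical.

```python
import subprocess
import sys
import xmlrpc.client


def list_all_packages(index_url="https://pypi.python.org/pypi"):
    """Ask the index for every package name over XML-RPC (network call)."""
    client = xmlrpc.client.ServerProxy(index_url)
    return client.list_packages()


def wheel_command(package, wheel_dir="wheels"):
    """Build the 'pip wheel' command line for one package."""
    return [sys.executable, "-m", "pip", "wheel", package,
            "--no-deps", "--wheel-dir", wheel_dir]


def build_wheel(package, wheel_dir="wheels"):
    """Run the build and return (succeeded, combined output) for logging."""
    result = subprocess.run(
        wheel_command(package, wheel_dir),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.returncode == 0, result.stdout


print(" ".join(wheel_command("numpy")[1:]))
```

A driver loop would call `build_wheel` for each name from `list_all_packages()` and write the success flag and output to the database; the wheel directory can then be served as a plain directory listing.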

  19. piwheels v1

    Pi 3 in my living room

    Build the latest version of every package (106k packages)

    Log output into postgres database

    Host a package repository on the same Pi

    On GitHub but not really reproducible

  20. piwheels v1: the results

    It took 10 days to complete the build run

    76% build success rate

    Repository live at piwheels.bennuttall.com

    “pip install numpy -i http://piwheels.bennuttall.com” works
    and takes 6 seconds :)

    Still running on a Pi 3 in my living room

    Proof of concept: it works, it’s probably useful

  21. Warehouse

    Next-gen PyPI project at pypi.org

    A work in progress – it mostly works now

    Changes made on pypi.org are reflected on pypi.python.org

  22. Warehouse

    A Google developer working on a Raspberry Pi related Python
    package came across piwheels

    Asked if I’m going to try to get maintainers to upload my wheels
    to PyPI

    I said “they can’t”

    He filed an issue with warehouse, suggesting they allow ARM
    platform wheels

    They said “ok”

  23. Warehouse

  24. Warehouse

    Maintainers can now upload ARMv6 and ARMv7 wheels to
    pypi.io and they appear on pypi.python.org!

  25. Planning piwheels v2

    Build every version of every package

    Keep up with new releases automatically

    Host a package repository as before

    Test suite

    Provide installation instructions & developer documentation
    so people can contribute

  26. Mythic Beasts: Pi in the cloud

  27. There is no Raspberry Pi cloud...

  28. Pete

  29. View Slide

  30. 750,000 releases

    Now 113k packages

    750k package versions to build

    At the previous rate, this will take 70 days

    Left it running for a couple of weeks:
    – Actually running slower than before due to network filesystem
    – New estimate: 100 days
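
The 70-day figure follows from the v1 run, assuming the “previous rate” is the 106k packages built in 10 days on a single Pi. A quick check, with variable names chosen here for illustration:

```python
# Figures from the slides: v1 built ~106k packages in 10 days on one Pi.
v1_builds = 106_000
v1_days = 10
v2_builds = 750_000  # every version of every package

builds_per_day = v1_builds / v1_days          # 10,600 builds per day
estimate_days = v2_builds / builds_per_day    # a little over 70 days

# The revised 100-day estimate implies a lower daily rate on the network filesystem.
slower_builds_per_day = v2_builds / 100       # 7,500 builds per day

print(f"{estimate_days:.1f} days at {builds_per_day:.0f} builds/day")
print(f"100 days implies {slower_builds_per_day:.0f} builds/day")
```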

  31. Lightning talk at EuroPython

  32. Why don’t you just cross-compile?

    It’s not all about speed

    Reliability

    Compatibility

    Familiarity

    Ease of use

    I can scale up Pis easily

    Eating my own dog food

  33. View Slide

  34. View Slide

  35. ALL the cloud Pis

  36. Adding more Pis

    Provisioned a second Pi
    – No web server or database
    – Connected to the database on first Pi
    – rsync files to first Pi

    Provisioned a third Pi

    Installed “terminator”

    Provisioned Pis 4 and 5

    This is easy. I’ll be done in no time!

  37. View Slide

  38. It didn’t scale

    Output on 3 Pis was about the same as the output on 1 Pi

    The database was getting hammered

  39. Dave Jones

    Author of picamera

    Co-author of GPIO Zero

    Self-professed SQL know-it-all

    We’ve worked together on
    open source projects a lot

  40. Make it scale!

    Pull request:
    – Query optimisations
    – Queuing system with zeromq

    Re-deployed the code

    Original Pi is now “master” running database and web server only

    Other Pis are now “builders” using master’s database and
    rsyncing files to master
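
The master/builder split can be illustrated with a shared job queue. The pull request used zeromq between machines; the sketch below swaps in Python's standard-library `queue` and threads inside a single process, purely to show the pattern. `run_builders` and the fake build function are hypothetical.

```python
import queue
import threading


def run_builders(packages, build, num_builders=3):
    """Hand out packages from one shared queue and collect results centrally."""
    jobs = queue.Queue()
    results = queue.Queue()
    for package in packages:
        jobs.put(package)

    def builder(builder_id):
        while True:
            try:
                package = jobs.get_nowait()
            except queue.Empty:
                return  # nothing left to build
            # Report the outcome together with which builder handled it.
            results.put((package, builder_id, build(package)))

    threads = [threading.Thread(target=builder, args=(i,))
               for i in range(num_builders)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    outcomes = []
    while not results.empty():
        outcomes.append(results.get())
    return outcomes


# Stand-in build function: pretend every package except "broken" succeeds.
outcomes = run_builders(["numpy", "scipy", "broken", "pillow"],
                        build=lambda package: package != "broken")
for package, builder_id, ok in sorted(outcomes):
    print(package, builder_id, "ok" if ok else "failed")
```

The point of the design is that builders never pick their own work: each package is taken from the queue exactly once, so adding builders raises throughput instead of adding contention.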

  41. It worked! Keep going!

  42. 20 Raspberry Pis

    ~6k packages per hour

    ~120k per day (15%)

    Now also logging which Pi built each package

    Pis seem to be holding up

    Dropped rsync in favour of sshfs

    It’s going well! I’ll be done in no time!

  43. It’s done!

  44. It’s done!

    Scaled down to 5 Pis to keep up with new releases

    Thanks, Pete! You can have them back now.

  45. The results

    Total packages: 113,649

    Package versions: 752,817

    Build success rate: 76%

    Total cumulative time spent building: 156 days, 18 hours (including
    duplicates)
    – In real time this was 26 days:
        – 16 days with 1 Pi building
        – 10 days with up to 19 Pis building

    Total disk usage from wheels: 250GB
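
As a rough sanity check on these totals, the cumulative build time can be divided by the number of package versions. This is only an approximation: the slide notes that the time includes duplicates, so the true per-build average may differ.

```python
# Figures from the slide.
package_versions = 752_817
cumulative_seconds = (156 * 24 + 18) * 3600   # 156 days, 18 hours

average_seconds = cumulative_seconds / package_versions
print(f"about {average_seconds:.0f} seconds per build attempt")
```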

  46. Builds over time

  47. pypi.org

    Next generation PyPI project

    A Google developer came across piwheels and filed an issue with the
    pypi.org project (warehouse)
    – github.com/pypa/warehouse/issues/2003

    pypi.org now supports uploading ARM wheels :)
    – Thanks @kpayson64 and @dstufft

    Package maintainers can upload wheels built by piwheels to pypi.org
    and they appear on pypi.python.org
    – \o/

  48. Builds per Pi over time

  49. Reasons for failure

  50. Tags

    ABI tag:
    – none: 715,081
    – cp34m: 15,548
    – noabi: 1 (oddly, this was the wheel package)

    Platform tag:
    – any: 715,062
    – linux_armv7l: 15,561
    – manylinux_armv7l: 6
    – noarch: 1 (again, wheel)

  51. People do stupid things

    Random files created in my home directory

    Random stuff appended to my .bashrc

    Some people run “git clone” in their setup.py

    Inadvertently importing numpy

  52. In the future

    Continue to build all new releases on a small number of builder Pis

    Add SSL to web domain

    Create individual package pages with build output

    Install key dependencies and try to fix failed builds

    Rebuild Python 3.4 wheels for 3.5 & 3.6

    Ensure ARMv6 wheels are available too
    – You can rename ARMv7 wheels and they work on ARMv6...

    Add the piwheels server to pip config in Raspbian (our distro) as an
    additional index
    – Users get wheels for free without needing to know about it
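
The ARMv6 renaming trick mentioned above could be scripted along these lines. This is a sketch under assumptions: `armv6_name` and `copy_as_armv6` are hypothetical helpers, the copy only changes the filename (metadata inside the archive is left untouched), and whether the resulting wheel actually runs on ARMv6 depends on how it was built.

```python
import shutil
from pathlib import Path

ARMV7_SUFFIX = "-linux_armv7l.whl"
ARMV6_SUFFIX = "-linux_armv6l.whl"


def armv6_name(filename):
    """Return the filename with its platform tag switched from ARMv7 to ARMv6."""
    if not filename.endswith(ARMV7_SUFFIX):
        raise ValueError(f"not an ARMv7 wheel: {filename!r}")
    return filename[: -len(ARMV7_SUFFIX)] + ARMV6_SUFFIX


def copy_as_armv6(wheel_path):
    """Copy an ARMv7 wheel next to itself under the ARMv6 filename."""
    source = Path(wheel_path)
    target = source.with_name(armv6_name(source.name))
    shutil.copyfile(source, target)
    return target


print(armv6_name("numpy-1.13.1-cp34-cp34m-linux_armv7l.whl"))
```

Copying rather than renaming keeps the original ARMv7 file in place, so both platform tags are offered from the same directory.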

  53. Talk is cheap. Show me the code.

  54. Talk is cheap. Show me the code.
    github.com/bennuttall/piwheels
