Slide 1

Slide 1 text

piwheels: building a faster Python package repository for Raspberry Pi users Ben Nuttall Raspberry Pi Foundation UK Charity 1129409

Slide 2

Slide 2 text

Ben Nuttall ● Raspberry Pi Community Manager ● Columnist for opensource.com ● github.com/bennuttall ● twitter.com/ben_nuttall

Slide 3

Slide 3 text

pip install numpy

Slide 4

Slide 4 text

Python wheels ● Wheels are built distributions, and can be uploaded to PyPI alongside source distributions ● This saves users from building packages themselves ● Wheels are architecture-specific – e.g. win32, win64, macosx, linux32, linux64 ● A recent addition allowed “manylinux” wheels to be uploaded ● “manylinux” wheels only cover x86 and x86_64; Raspberry Pi is ARM, so they don’t apply
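To make the tagging concrete, here is a minimal sketch of how a wheel filename breaks down into its parts, following the naming convention in PEP 427. The filenames and version numbers are hypothetical examples, not taken from the talk.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[:-len(".whl")].split("-")
    # Five parts normally; six when the optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    name, version = parts[0], parts[1]
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(name, version, python_tag, abi_tag, platform_tag)


# Hypothetical filenames, for illustration only.
compiled = parse_wheel_filename("numpy-1.13.1-cp34-cp34m-linux_armv7l.whl")
pure = parse_wheel_filename("gpiozero-1.4.0-py2.py3-none-any.whl")
print(compiled)
print(pure)
```

The last field is the platform tag that pip compares against the machine it is running on; a pure Python wheel carries "any" there and installs everywhere.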

Slide 5

Slide 5 text

ARM wheels ● Technically... – Pi 3 is ARMv8 – Pi 2 is ARMv7 – Pi 1 / Zero are ARMv6 ● But... – Wheels built on Pi 2/3 are tagged “armv7l” – Wheels built on Pi 1 / Zero are tagged “armv6l” ● And… – They are actually ARMv6 wheels, and they’re all the same ● So… – A wheel built on a Pi 3 will work on a Pi 2, as is – A wheel built on a Pi 3 will work on a Pi 1 / Zero (if renamed to “armv6l”)
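The renaming mentioned in the last point can be sketched as a small helper. This only computes the new filename; the function name and the example filename are hypothetical, and it assumes the platform tag is the final part of the name, as in PEP 427.

```python
ARMV7_SUFFIX = "linux_armv7l.whl"
ARMV6_SUFFIX = "linux_armv6l.whl"


def retag_for_armv6(filename: str) -> str:
    """Return the armv6l filename for a wheel tagged linux_armv7l."""
    if not filename.endswith(ARMV7_SUFFIX):
        raise ValueError(f"not an armv7l wheel: {filename}")
    return filename[:-len(ARMV7_SUFFIX)] + ARMV6_SUFFIX


# Hypothetical filename, for illustration only.
print(retag_for_armv6("numpy-1.13.1-cp34-cp34m-linux_armv7l.whl"))
```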

Slide 6

Slide 6 text

pip install

Slide 7

Slide 7 text

Which type of x86/x86_64 are you?

Slide 8

Slide 8 text

...I have them all

Slide 9

Slide 9 text

Actually I’m linux_armv7l

Slide 10

Slide 10 text

What? I’ve never heard of it. Here’s the source, build it yourself! ● ~20 mins on Pi 3 (1.2GHz quad-core) ● ~2.5 hours on Pi 1 (700MHz single-core)

Slide 11

Slide 11 text

Fine! I’ll build my own package repository... ● cd Projects ● mkdir piwheels ● cd piwheels ● git init

Slide 12

Slide 12 text

piwheels ● I could build everything on PyPI and host my own repository – “pip wheel numpy” works on a Pi – You can distribute the wheel and it’s a super fast install ● Can I host my own package repository? – Apparently, yes! – At minimum, an Apache directory listing will do the trick
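A directory listing works because pip only needs a page of links to wheel files for each project. As a rough sketch (not how piwheels itself generates pages), a static page along those lines could be produced like this; the function name and the filenames are hypothetical.

```python
from html import escape


def project_page(project, filenames):
    """Build a bare-bones HTML page linking to each file of one project."""
    links = "\n".join(
        f'    <a href="{escape(name)}">{escape(name)}</a><br>'
        for name in sorted(filenames)
    )
    return (
        "<!DOCTYPE html>\n<html>\n  <body>\n"
        f"    <h1>Links for {escape(project)}</h1>\n"
        f"{links}\n"
        "  </body>\n</html>\n"
    )


# Hypothetical wheel filenames, for illustration only.
page = project_page("numpy", [
    "numpy-1.13.1-cp34-cp34m-linux_armv7l.whl",
    "numpy-1.13.0-cp34-cp34m-linux_armv7l.whl",
])
print(page)
```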

Slide 13

Slide 13 text

piwheels v1 ● Pi 3 in my living room ● Build the latest version of every package (106k packages at the time) ● Log output into postgres database ● Host a package repository on the same Pi

Slide 14

Slide 14 text

piwheels v1: the results ● It took 10 days to complete the build run ● 76% build success rate ● Repository (was) live at piwheels.bennuttall.com ● “pip install numpy -i http://piwheels.bennuttall.com” works and takes 6 seconds :) ● Proof of concept: it works, it’s probably useful

Slide 15

Slide 15 text

Planning piwheels v2 ● Build every version of every package ● Keep up with new releases automatically ● Host a package repository as before ● Create a test suite ● Provide installation instructions & developer documentation so people can contribute

Slide 16

Slide 16 text

Planning piwheels v2 ● Now 113k packages on PyPI ● But now I’m building every version of every package ● 750k package versions to build ● At the previous rate, this will take 70 days on one Pi… – Maybe I could use … more than one Pi?
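The 70-day figure follows from the v1 numbers: about 106k packages in 10 days, applied to 750k package versions. A quick back-of-the-envelope check, where the per-Pi split assumes perfectly linear scaling (an idealisation, not a measured result):

```python
PACKAGES_BUILT_V1 = 106_000      # packages attempted in the v1 run
DAYS_TAKEN_V1 = 10               # duration of the v1 run
VERSIONS_TO_BUILD_V2 = 750_000   # package versions planned for v2

builds_per_day = PACKAGES_BUILT_V1 / DAYS_TAKEN_V1
estimate_days = VERSIONS_TO_BUILD_V2 / builds_per_day


def days_with_pis(n_pis: int) -> float:
    """Idealised estimate: assumes throughput scales linearly with Pi count."""
    return estimate_days / n_pis


print(f"One Pi: roughly {estimate_days:.1f} days")
for n in (5, 20):
    print(f"{n} Pis (ideal scaling): roughly {days_with_pis(n):.1f} days")
```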

Slide 17

Slide 17 text

Mythic Beasts: Pi in the cloud

Slide 18

Slide 18 text

There is no Raspberry Pi cloud...

Slide 19

Slide 19 text

Pete

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

piwheels v2 ● One Pi to start ● Left it running for a couple of weeks: – Actually running slower than before due to NFS – New estimate: 100 days

Slide 22

Slide 22 text

Lightning talk at EuroPython

Slide 23

Slide 23 text

Why don’t you just cross-compile? ● It’s not all about speed ● Reliability ● Compatibility ● Familiarity ● Ease of use ● I can scale up Pis easily ● Eating my own dog food

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

ALL the cloud Pis

Slide 27

Slide 27 text

Adding more Pis ● Provisioned a second Pi – No web server or database – Connected to the database on first Pi – rsync files to first Pi ● Provisioned a third Pi ● Installed “terminator” ● Provisioned Pis 4 and 5 ● This is easy. I’ll be done in no time!

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

It didn’t scale ● Output on 3 Pis was about the same as the output on 1 Pi ● The database was getting hammered

Slide 30

Slide 30 text

Dave Jones ● Author of picamera ● Co-author of GPIO Zero ● Self-professed SQL know-it-all ● We’ve worked together on open source projects a lot

Slide 31

Slide 31 text

Make it scale! ● Pull request: – Query optimisations – Queuing system with zeromq ● Re-deployed the code ● Original Pi is now “master” running database and web server only ● Other Pis are now “builders” using master’s database and rsyncing files to master
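The master/builder split can be illustrated with a toy work queue. This sketch swaps in Python's standard-library queue and threads in place of zeromq and separate Pis, so it shows only the pattern (one queue handing out jobs, several builders pulling from it), not the actual piwheels code. All names and package versions are hypothetical.

```python
import queue
import threading

STOP = None  # sentinel telling a builder there is no more work


def builder(name, jobs, results):
    """Pull package versions off the shared queue until told to stop."""
    while True:
        job = jobs.get()
        if job is STOP:
            break
        # A real builder would run the build here; this one only records it.
        results.append((name, job))


# Hypothetical package versions, for illustration only.
packages = ["numpy 1.13.1", "scipy 0.19.1", "pandas 0.20.3", "gpiozero 1.4.0"]

jobs = queue.Queue()
results = []  # list.append is safe to call from several threads in CPython
builders = [
    threading.Thread(target=builder, args=(f"builder{i}", jobs, results))
    for i in range(3)
]
for thread in builders:
    thread.start()
for package in packages:
    jobs.put(package)
for _ in builders:
    jobs.put(STOP)  # one sentinel per builder
for thread in builders:
    thread.join()

built = sorted(job for _, job in results)
print(built)
```

Each job is handed to exactly one builder, so builders never duplicate work and do not need to ask the database what to build next.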

Slide 32

Slide 32 text

It worked! Keep going!

Slide 33

Slide 33 text

20 Raspberry Pis ● ~3k packages per hour ● ~72k per day (about 10% of the 750k versions to build) ● Now also logging which Pi built each package ● Pis seem to be holding up ● Dropped rsync in favour of sshfs ● It’s going well! I’ll be done in no time!

Slide 34

Slide 34 text

People do stupid things ● Random files created in my home directory ● Random stuff appended to my .bashrc ● Some people run “git clone” in their setup.py ● Inadvertently importing numpy

Slide 35

Slide 35 text

The results ● Total packages processed: 113,649 ● Package versions built: 570,648 / 752,817 (76%) ● Total cumulative time spent building: 156 days, 18 hours (including duplicates) – In real time this was 26 days: ● 16 days with 1 Pi building ● 10 days with up to 19 Pis building ● 250GB disk space used by wheels

Slide 36

Slide 36 text

Reasons for failure

Slide 37

Slide 37 text

The results (round 2) ● Discount “no release” versions ● Install key missing dependencies ● Total packages: 100,802 / 117,444 (86%) ● Package versions: 586,266 / 703,571 (83%)
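The quoted percentages can be checked directly from the counts on the results slides. A small sketch, using only those figures:

```python
def success_rate(built: int, attempted: int) -> float:
    """Percentage of attempted builds that succeeded."""
    return 100 * built / attempted


# (built, attempted) pairs as reported on the results slides.
results_by_round = {
    "round 1, package versions": (570_648, 752_817),
    "round 2, packages": (100_802, 117_444),
    "round 2, package versions": (586_266, 703_571),
}

for label, (built_count, attempted) in results_by_round.items():
    print(f"{label}: {success_rate(built_count, attempted):.0f}%")
```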

Slide 38

Slide 38 text

Reasons for failure (round 2)

Slide 39

Slide 39 text

It’s done! ● Scaled down to 5 Pis to keep up with new releases ● Thanks, Pete! You can have them back now.

Slide 40

Slide 40 text

Supporting multiple Python versions ● We built everything on Raspbian Jessie (Python 3.4) ● Raspbian Stretch is now released (Python 3.5) ● We also want to support Python 3.6 (and 2.7) ● Pure Python packages are built for a whole major Python version (i.e. any Python 3.x) ● Compiled packages are built for a specific minor Python version (e.g. Python 3.4) and need rebuilding for other ABIs

Slide 41

Slide 41 text

Build procedure ● Keep package list and version list up-to-date in database ● Form a build queue for unattempted package versions ● Attempt to build everything in the queue ● If a wheel is tagged with “linux_armv7l”, create a symlink from “linux_armv6l” for Raspberry Pi 1/Zero users ● If a wheel is tagged with an ABI other than “none”, trigger it to be rebuilt for other ABIs ● Stretch builder will pick this up and attempt to build for Python 3.5 ● Same principle for Python 3.6 and 2.7
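The two post-build rules (armv6l symlink, rebuild for other ABIs) can be sketched roughly as follows. This is an illustration under stated assumptions, not the piwheels implementation: the function name is hypothetical, the wheel file is an empty placeholder, and the tags are read from the filename by position as in PEP 427.

```python
import tempfile
from pathlib import Path


def post_build(wheel: Path) -> bool:
    """Apply the post-build rules to one wheel file.

    Creates a linux_armv6l symlink next to a linux_armv7l wheel, and
    returns True if the wheel needs rebuilding for other ABIs.
    """
    parts = wheel.name[:-len(".whl")].split("-")
    abi_tag, platform_tag = parts[-2], parts[-1]
    if platform_tag == "linux_armv7l":
        link_name = "-".join(parts[:-1] + ["linux_armv6l"]) + ".whl"
        link = wheel.with_name(link_name)
        if not (link.is_symlink() or link.exists()):
            # Relative target, so the link survives moving the directory.
            link.symlink_to(wheel.name)
    # ABI "none" means pure Python: one wheel serves every Python 3.x.
    return abi_tag != "none"


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical, empty stand-in for a freshly built wheel.
    wheel = Path(tmp) / "numpy-1.13.1-cp34-cp34m-linux_armv7l.whl"
    wheel.write_bytes(b"")
    needs_rebuild = post_build(wheel)
    links = sorted(p.name for p in Path(tmp).iterdir() if p.is_symlink())

print(needs_rebuild, links)
```

Using a symlink rather than a copy means Pi 1 / Zero users get the same file without doubling the disk space used.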

Slide 42

Slide 42 text

How to get it ● Raspbian Jessie (old stable): – Manually configure pip to use piwheels (/etc/pip.conf) ● Raspbian Stretch (new stable): – pip now pre-configured to use piwheels as an additional index – “sudo apt upgrade” will bring in the config :)
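For the manual route on Jessie, the pip configuration amounts to one extra index entry. The sketch below builds that file content with Python's configparser rather than writing to /etc/pip.conf. The "extra-index-url" option under "[global]" is a standard pip setting; the exact URL (scheme and "/simple" path) is an assumption based on the hostname given in this talk.

```python
import configparser
import io

# Assumed index URL: the hostname comes from the talk, the rest is a guess.
PIWHEELS_INDEX = "https://www.piwheels.hostedpi.com/simple"

config = configparser.ConfigParser()
config["global"] = {"extra-index-url": PIWHEELS_INDEX}

buffer = io.StringIO()
config.write(buffer)
pip_conf = buffer.getvalue()

# This text is what would go in /etc/pip.conf (writing there needs root).
print(pip_conf)
```

Because it is an extra index rather than a replacement, pip still falls back to PyPI for anything piwheels does not provide.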

Slide 43

Slide 43 text

www.piwheels.hostedpi.com

Slide 44

Slide 44 text

github.com/bennuttall/piwheels

Slide 45

Slide 45 text

No content

Slide 46

Slide 46 text

piwheels: building a faster Python package repository for Raspberry Pi users Ben Nuttall Raspberry Pi Foundation UK Charity 1129409