Slide 1

Slide 1 text

Python HTTP Clients

Slide 2

Slide 2 text

Background
sethmlarson, sethmlarson.dev
● Graduated from UMN
● Medical device industry
● Python open source maintainer
● Passionate about data security, networking

Slide 3

Slide 3 text

What is HTTP?

Slide 4

Slide 4 text

What is HTTP?
● Wildly successful protocol developed at CERN
● Originally for sharing academic documents
● Now for websites, apps, cars, fridges, lightbulbs
● Simple by default, tons of optional functionality

Slide 5

Slide 5 text

What is HTTP?
● Clients send HTTP Requests
  ○ “I want to see this document”
  ○ “Here is my username and password”
● Servers reply with HTTP Responses
  ○ “Here’s that document”
  ○ “I couldn’t find that document”
  ○ “Look over here for that document”
● Semantics defined in RFC 7231
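The request/response exchange above can be sketched with nothing but the standard library. This is a minimal illustration, not how any library in this talk is implemented: `DemoHandler`, `fetch`, `run_demo`, and the paths `/document` and `/old-document` are hypothetical names invented for the example. A throwaway local server answers with the three kinds of response listed above (found, not found, look elsewhere).

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical server that knows one document and one redirect."""

    def do_GET(self):
        if self.path == "/document":
            # "Here's that document"
            body = b"Here's that document"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        elif self.path == "/old-document":
            # "Look over here for that document"
            self.send_response(301)
            self.send_header("Location", "/document")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            # "I couldn't find that document"
            self.send_response(404)
            self.send_header("Content-Length", "0")
            self.end_headers()

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def fetch(port, path):
    """Send one GET request and return (status, Location header, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        body = response.read()
        return response.status, response.getheader("Location"), body
    finally:
        conn.close()


def run_demo():
    """Start the server on a free port, make three requests, shut down."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        paths = ("/document", "/missing", "/old-document")
        return {path: fetch(port, path) for path in paths}
    finally:
        server.shutdown()
        server.server_close()
```

Calling `run_demo()` returns a dict mapping each path to its status code, `Location` header, and body. Note that `http.client` reports the redirect as-is rather than following it.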

Slide 6

Slide 6 text

HTTP is Evolving

Slide 7

Slide 7 text

HTTP is Evolving
● HTTP/2 (RFC 7540) published in 2015
  ○ Changes message format from text to binary
  ○ One TCP connection handles concurrent requests
  ○ Header compression
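To make "text to binary" concrete, here is a small sketch that packs and unpacks the fixed 9-byte header that precedes every HTTP/2 frame (24-bit payload length, 8-bit type, 8-bit flags, one reserved bit plus a 31-bit stream identifier). The function names are hypothetical and this is not taken from any library; real code would use an implementation such as h2 or hyperframe. The stream identifier is what lets one connection carry several concurrent requests.

```python
import struct

# Values from the HTTP/2 specification.
FRAME_TYPE_HEADERS = 0x1
FLAG_END_HEADERS = 0x4

# HTTP/1.1 messages are human-readable text, shown here for contrast.
HTTP11_REQUEST = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"


def pack_frame_header(length, frame_type, flags, stream_id):
    """Build the 9-byte header that precedes every HTTP/2 frame payload."""
    if not 0 <= length < 2**24:
        raise ValueError("length must fit in 24 bits")
    if not 0 <= stream_id < 2**31:
        raise ValueError("stream_id must fit in 31 bits")
    # struct has no 24-bit field, so split the length into 8 + 16 bits.
    return struct.pack(
        ">BHBBI", length >> 16, length & 0xFFFF, frame_type, flags, stream_id
    )


def unpack_frame_header(data):
    """Parse the first 9 bytes into (length, type, flags, stream_id)."""
    high, low, frame_type, flags, stream_id = struct.unpack(">BHBBI", data[:9])
    # Mask off the reserved top bit of the stream identifier.
    return (high << 16) | low, frame_type, flags, stream_id & 0x7FFFFFFF
```

For example, `pack_frame_header(16, FRAME_TYPE_HEADERS, FLAG_END_HEADERS, 1)` describes a 16-byte HEADERS frame on stream 1.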

Slide 8

Slide 8 text

HTTP is Evolving
● HTTP/3 to be released Soon™
  ○ Builds on HTTP/2 concepts
  ○ Uses QUIC instead of TCP for transport
  ○ Performance wins on lossy connections (mobile, islands, developing nations)

Slide 9

Slide 9 text

Python is Evolving

Slide 10

Slide 10 text

Python is Evolving
● Async I/O is the new tool for doing multiple things at a time.
● The event loop jumps between a set of “tasks” whenever the running task is waiting on something that doesn’t require the CPU (blocking I/O, disk, networking)
[Diagram: Task A does CPU work while Task B waits in sleep(10)]
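The task switching described above can be sketched with asyncio. The coroutine names and the shared `log` list are hypothetical, chosen to mirror the Task A / Task B picture on the slide, and the sleep is shortened so the example finishes quickly. While Task B is sleeping, the event loop keeps running Task A.

```python
import asyncio


async def task_a(log):
    """Does small pieces of 'CPU work', yielding to the loop between them."""
    for step in range(3):
        log.append(f"A: CPU work {step}")
        await asyncio.sleep(0)  # give other tasks a chance to run


async def task_b(log):
    """Waits without using the CPU, like sleep(10) on the slide."""
    log.append("B: start sleeping")
    await asyncio.sleep(0.05)
    log.append("B: done sleeping")


async def main():
    log = []
    # Run both tasks concurrently on a single thread.
    await asyncio.gather(task_a(log), task_b(log))
    return log
```

Running `asyncio.run(main())` returns the log; Task A's later work steps appear between Task B's two entries, because Task B gives up the CPU while it waits.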

Slide 11

Slide 11 text

Python is Evolving
● Asyncio
  ○ Released in Python 3.4
● async/await (PEP 492)
  ○ Released in Python 3.5
● Alternate async implementations
  ○ Curio released in 2015
  ○ Trio released in 2017

Slide 12

Slide 12 text

Python is Evolving
● The problem for libraries is that async and sync code don’t cooperate well.
● Any synchronous, blocking I/O call inside async code holds up all other tasks
● Async code can’t be called directly from a sync context
● Typically library authors pick one or the other → Fragmentation
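Both halves of the problem can be seen in a short asyncio sketch. All function names are hypothetical, and `time.sleep` / `asyncio.sleep` stand in for blocking and non-blocking network calls. Three blocking "downloads" run one after another even though they were started together, while three cooperative ones overlap; and a plain function cannot `await`, so it has to start an event loop to get a result.

```python
import asyncio
import time


async def blocking_download():
    # A sync, blocking call inside a coroutine: the event loop cannot
    # switch to any other task until this returns.
    time.sleep(0.2)


async def cooperative_download():
    # An awaitable wait: the event loop is free to run other tasks.
    await asyncio.sleep(0.2)


async def run_three(download):
    """Start three downloads together and return the elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(download(), download(), download())
    return time.perf_counter() - start


def sync_caller():
    """Sync code can't 'await'; it must run an event loop to completion."""
    return asyncio.run(run_three(cooperative_download))


def call_without_loop():
    """Calling a coroutine function from sync code only creates an object."""
    coro = cooperative_download()
    is_coroutine = asyncio.iscoroutine(coro)
    coro.close()  # never ran; close it to avoid a 'never awaited' warning
    return is_coroutine
```

With these sleeps, `asyncio.run(run_three(blocking_download))` takes roughly three times as long as `sync_caller()`.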

Slide 13

Slide 13 text

Python’s HTTP History

Slide 14

Slide 14 text

Python’s HTTP History
● httplib (Python 2), http.client (Python 3)
● Standard library HTTP client for Python 2 and 3
● Early on, didn’t have many features
● Interface is confusing to users
● Different APIs on Python 2 and 3

Slide 15

Slide 15 text

Python’s HTTP History
● urllib3 released in 2008 by Andrey Petrov
● Uses httplib and http.client to support both Python 2 and 3
● Added missing features from stdlib
● (I’m a maintainer of this library)

Slide 16

Slide 16 text

Python’s HTTP History
● Requests released in 2011
● User-friendly HTTP client interface
● Leverages urllib3 for HTTP dispatch
● Extensible via custom Adapters
● De facto standard for HTTP in Python
● Synchronous, HTTP/1.1 only

Slide 17

Slide 17 text

Python’s HTTP History
● We’re basically still here
● Supports synchronous I/O only
● Supports HTTP/1.X only
● HTTP implementation tightly coupled to the standard library
● Missing 8+ years of innovations to HTTP and async I/O

Slide 18

Slide 18 text

Python’s HTTP Future

Slide 19

Slide 19 text

Python’s HTTP Future
● aiohttp
  ○ Released in 2013 by Nikolay Kim
  ○ HTTP client written for Asyncio
  ○ HTTP/1.1 implementation tightly coupled
  ○ No synchronous interface

Slide 20

Slide 20 text

Python’s HTTP Future
● hyper
  ○ Released in 2014 by Cory Benfield
  ○ HTTP client that supports HTTP/2 and HTTP/1.1
  ○ Implements the same interface as httplib and http.client
  ○ Synchronous I/O

Slide 21

Slide 21 text

Python’s HTTP Future
● Sans-I/O
● Implement network protocols as state machines
● Written by Cory Benfield, Brett Cannon, and others
● Many benefits to this design: testable, reusable.
● No coupling to I/O, allows sync and async
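As a rough illustration of the sans-I/O idea, here is a toy state machine that parses the head of an HTTP/1.1 response. It is a hypothetical sketch written for this example, not h11 or any real library, and it skips most of what a real parser must handle. The key property is that it never touches a socket: the caller feeds it bytes from wherever they came from and gets events back, so the same class works under sync code, asyncio, or Trio, and can be tested with plain byte strings.

```python
class ResponseHeadParser:
    """Toy sans-I/O parser: bytes in, events out, no network calls."""

    def __init__(self):
        self._buffer = b""
        self.state = "STATUS_LINE"  # then "HEADERS", then "DONE"
        self.status_code = None
        self.headers = []

    @property
    def trailing_data(self):
        """Bytes received so far that have not been consumed (e.g. the body)."""
        return self._buffer

    def receive_data(self, data):
        """Feed bytes from any source; return the events they completed."""
        self._buffer += data
        events = []
        # Consume one complete line at a time; partial lines stay buffered.
        while self.state != "DONE" and b"\r\n" in self._buffer:
            line, self._buffer = self._buffer.split(b"\r\n", 1)
            if self.state == "STATUS_LINE":
                parts = line.split(b" ", 2)  # version, code, reason
                self.status_code = int(parts[1])
                events.append(("status", self.status_code))
                self.state = "HEADERS"
            elif line == b"":
                # A blank line ends the header section.
                events.append(("end_of_headers", None))
                self.state = "DONE"
            else:
                name, value = line.split(b":", 1)
                header = (name.strip().lower(), value.strip())
                self.headers.append(header)
                events.append(("header", header))
        return events
```

A sync client would call `receive_data(sock.recv(4096))` in a loop, while an async client would pass in the result of an awaited read; the parser itself does not change.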

Slide 22

Slide 22 text

Python’s HTTP Future
● h2, hyperframe, hpack
  ○ Released in 2015 by Cory Benfield
  ○ HTTP/2 implementation in sans-I/O Python
  ○ Pulled out of the hyper package into their own packages
● h11
  ○ Released in 2016 by Nathaniel Smith
  ○ HTTP/1.1 implementation in sans-I/O Python

Slide 23

Slide 23 text

Python’s HTTP Future
● asks
● Released in 2017 by Mark Jameson
● Async HTTP client that speaks HTTP/1.1
● Requests-like interface
● Supports Asyncio, Trio, and Curio via Anyio

Slide 24

Slide 24 text

Python’s HTTP Future
● aioquic
● Released in June 2019 by Jeremy Lainé
● Implementation of QUIC and HTTP/3 in sans-I/O Python
● Reusable if another application protocol (like DNS!) is implemented with QUIC

Slide 25

Slide 25 text

Python’s HTTP Future
● HTTPX
● Released in July 2019 by Tom Christie
● Supports HTTP/1.1, HTTP/2, (HTTP/3 planned)
● Requests-like interface
● Allows making synchronous and async requests
● Supports Asyncio and Trio
● (I’m a maintainer of this library)

Slide 26

Slide 26 text

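HTTP Client Libraries (✓ = supported, * = planned)

| Project  | Prod-Ready | Sync | Asyncio | Trio | Curio | Requests API | HTTP/1.1 | HTTP/2 | HTTP/3 |
|----------|------------|------|---------|------|-------|--------------|----------|--------|--------|
| urllib3  | ✓          | ✓    |         |      |       |              | ✓        |        |        |
| requests | ✓          | ✓    |         |      |       | ✓            | ✓        |        |        |
| aiohttp  | ✓          |      | ✓       |      |       |              | ✓        |        |        |
| hyper    |            | ✓    |         |      |       |              | ✓        | ✓      |        |
| asks     |            |      | ✓       | ✓    | ✓     | ✓            | ✓        |        |        |
| HTTPX    |            | ✓    | ✓       | ✓    |       | ✓            | ✓        | ✓      | *      |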

Slide 27

Slide 27 text

Conclusion
● Python is catching up with the HTTP ecosystem
● Everyone benefits from well-maintained sans-I/O protocol libraries
● All libraries mentioned here are Open Source and would benefit from your contributions
● There’s still plenty of work to do!

Slide 28

Slide 28 text

Thank You! (Come find me after if you want stickers!)

Slide 29

Slide 29 text

Mentions and Sources
● PyPI Downloads: https://pypistats.org
● urllib3: https://github.com/urllib3/urllib3
● Requests: https://github.com/psf/requests
● aiohttp: https://github.com/aio-libs/aiohttp
● Sans-I/O: https://sans-io.readthedocs.io
● h2/h11/hyper: https://github.com/python-hyper
● asks: https://github.com/theelous3/asks
● aioquic: https://github.com/aiortc/aioquic
● HTTPX: https://github.com/encode/httpx