Python HTTP Clients

A look at the past and present of HTTP client libraries in Python

Seth Michael Larson

October 10, 2019

Transcript

  1. Background
    • sethmlarson, sethmlarson.dev
    • Graduated from UMN
    • Medical device industry
    • Python open source maintainer
    • Passionate about data security, networking
  2. What is HTTP?
    • Wildly successful protocol developed at CERN
    • Originally for sharing academic documents
    • Now for websites, apps, cars, fridges, lightbulbs
    • Simple by default, tons of optional functionality
  3. What is HTTP?
    • Clients send HTTP Requests
      ◦ “I want to see this document”
      ◦ “Here is my username and password”
    • Servers reply with HTTP Responses
      ◦ “Here’s that document”
      ◦ “I couldn’t find that document”
      ◦ “Look over here for that document”
    • Semantics defined in RFC 7231
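
The three kinds of response on this slide map onto status codes 200, 404, and 301. The following is a minimal sketch using only the standard library; the handler class, the `/document` and `/old-document` paths, and the `fetch` helper are hypothetical names chosen for illustration, and the server is a throwaway local one rather than anything from the talk.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DocumentHandler(BaseHTTPRequestHandler):
    """Toy server: one known document, one redirect, everything else missing."""

    def do_GET(self):
        if self.path == "/document":
            body = b"Here's that document"
            self.send_response(200)  # "Here's that document"
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        elif self.path == "/old-document":
            self.send_response(301)  # "Look over here for that document"
            self.send_header("Location", "/document")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            self.send_response(404)  # "I couldn't find that document"
            self.send_header("Content-Length", "0")
            self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch(port, path):
    """Send one GET request and return (status, Location header, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location"), response.read()
    finally:
        conn.close()


# Port 0 asks the OS for any free port.
server = HTTPServer(("127.0.0.1", 0), DocumentHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

found = fetch(port, "/document")
missing = fetch(port, "/nope")
moved = fetch(port, "/old-document")

server.shutdown()
server.server_close()
```

Note that `http.client` does not follow the redirect by itself; it only reports the 301 and the `Location` header, which is one of the conveniences higher-level clients add.
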
  4. HTTP is Evolving
    • HTTP/2 (RFC 7540) published in 2015
      ◦ Changes message format from text to binary
      ◦ One TCP connection handles concurrent requests
      ◦ Header compression
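
To make "text to binary" concrete: every HTTP/2 frame starts with a fixed 9-byte header (24-bit payload length, 8-bit type, 8-bit flags, one reserved bit plus a 31-bit stream identifier), and the stream identifier is what lets one connection carry concurrent requests. The sketch below packs and unpacks that header by hand. The function names are hypothetical, and a real client would rely on a library such as h2 rather than doing this itself.

```python
import struct

# Frame types and flags from the HTTP/2 specification.
FRAME_TYPE_DATA = 0x0
FRAME_TYPE_HEADERS = 0x1
FLAG_END_STREAM = 0x1
FLAG_END_HEADERS = 0x4


def pack_frame_header(length, frame_type, flags, stream_id):
    """Build the 9-byte header that precedes every HTTP/2 frame payload."""
    if not 0 <= length < 2**24:
        raise ValueError("length must fit in 24 bits")
    if not 0 <= stream_id < 2**31:
        raise ValueError("stream_id must fit in 31 bits")
    # struct has no 3-byte integer, so pack 4 bytes and drop the leading one.
    return struct.pack(">I", length)[1:] + struct.pack(
        ">BBI", frame_type, flags, stream_id
    )


def unpack_frame_header(data):
    """Parse a 9-byte frame header into (length, type, flags, stream_id)."""
    if len(data) != 9:
        raise ValueError("an HTTP/2 frame header is exactly 9 bytes")
    length = int.from_bytes(data[:3], "big")
    frame_type, flags, stream_id = struct.unpack(">BBI", data[3:])
    # Mask off the reserved high bit of the stream identifier.
    return length, frame_type, flags, stream_id & 0x7FFFFFFF


# HTTP/1.1 puts the same kind of information on the wire as readable text.
text_request = b"GET /document HTTP/1.1\r\nHost: example.com\r\n\r\n"

# A DATA frame header: 5-byte payload, last frame on stream 1.
header = pack_frame_header(5, FRAME_TYPE_DATA, FLAG_END_STREAM, 1)
```
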
  5. HTTP is Evolving
    • HTTP/3 to be released Soon™
      ◦ Builds on HTTP/2 concepts
      ◦ Uses QUIC instead of TCP for transport
      ◦ Performance wins on lossy connections (mobile, islands, developing nations)
  6. Python is Evolving
    • Async I/O is the new tool for doing multiple things at a time.
    • Basically, jump between a set of “tasks” whenever a task is waiting on something that doesn’t require the CPU (disk, networking, other I/O)
    [Diagram: TASK A does CPU work while TASK B waits in sleep(10)]
  7. Python is Evolving
    • Asyncio
      ◦ Released in Python 3.4
    • async/await (PEP 492)
      ◦ Released in Python 3.5
    • Alternate async implementations
      ◦ Curio released in 2015
      ◦ Trio released in 2017
  8. Python is Evolving
    • Problem for libraries is that async / sync don’t cooperate well together.
    • Any sync (blocking) I/O call inside a task holds up all other tasks
    • Async code can’t be executed directly from a sync context
    • Typically library authors pick one or the other → Fragmentation
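
A small experiment shows why a blocking call is a problem inside async code. In this sketch, assuming asyncio as the event loop, the same "ticker" task runs next to a worker that either blocks with `time.sleep` or cooperates with `asyncio.sleep`. All function names and durations are made up for the example.

```python
import asyncio
import time

WORK_SECONDS = 0.3  # how long each worker "waits on I/O"


async def ticker(log):
    """A second task that only makes progress when the loop is free."""
    for i in range(3):
        log.append(f"tick {i}")
        await asyncio.sleep(0.01)


async def blocking_worker(log):
    log.append("worker start")
    time.sleep(WORK_SECONDS)  # sync call: the whole event loop is stuck here
    log.append("worker done")


async def cooperative_worker(log):
    log.append("worker start")
    await asyncio.sleep(WORK_SECONDS)  # async call: other tasks keep running
    log.append("worker done")


async def run(worker):
    log = []
    await asyncio.gather(worker(log), ticker(log))
    return log


# asyncio.run() is the bridge from the sync world into async code.
blocking_log = asyncio.run(run(blocking_worker))
cooperative_log = asyncio.run(run(cooperative_worker))
print(blocking_log)
print(cooperative_log)
```

With the blocking worker, no tick is logged until the worker has finished. With the cooperative worker, all ticks happen while the worker is still waiting.
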
  9. Python’s HTTP History
    • httplib, http.client
    • Standard library HTTP client for Python 2 and 3
    • Early on, didn’t have many features
    • Interface is confusing to users
    • Different APIs on Python 2 and 3
  10. Python’s HTTP History
    • urllib3 released in 2008 by Andrey Petrov
    • Uses httplib and http.client to support both Python 2 and 3
    • Added missing features from stdlib
    • (I’m a maintainer of this library)
  11. Python’s HTTP History
    • Requests released in 2011
    • User-friendly HTTP client interface
    • Leverages urllib3 for HTTP dispatch
    • Extensible via custom Adapters
    • De-facto standard for HTTP in Python
    • Synchronous, HTTP/1.1 only
  12. Python’s HTTP History
    • We’re basically still here
    • Support synchronous I/O only
    • Support HTTP/1.X only
    • HTTP implementation tightly-coupled to the standard library
    • Missing 8+ years of innovations to HTTP and async I/O
  13. Python’s HTTP Future
    • aiohttp
      ◦ Released in 2013 by Nikolay Kim
      ◦ HTTP client written for Asyncio
      ◦ HTTP/1.1 implementation tightly-coupled
      ◦ No synchronous interface
  14. Python’s HTTP Future
    • hyper
      ◦ Released in 2014 by Cory Benfield
      ◦ HTTP client that supports HTTP/2 and HTTP/1.1
      ◦ Implements the same interface as httplib and http.client
      ◦ Synchronous I/O
  15. Python’s HTTP Future
    • Sans-I/O
    • Implement network protocols as state-machines
    • Written by Cory Benfield, Brett Cannon, and others
    • Many benefits to this design: testable, reusable.
    • No coupling to I/O, allows sync and async
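
The sans-I/O idea is easiest to see in a toy example: a parser that only turns bytes into events and never touches a socket. The class below is a hypothetical, heavily simplified HTTP/1.1 response parser written for this illustration (it handles only `Content-Length` bodies). It is not the API of h11 or any other real library, though the "feed bytes in, get events out" shape is similar in spirit.

```python
class ResponseParser:
    """Toy sans-I/O state machine: bytes in, events out, no sockets."""

    def __init__(self):
        self._buffer = b""
        self._remaining = 0
        self.state = "WAITING_FOR_HEAD"

    def receive_data(self, data):
        """Feed raw bytes from any source; return the events they complete."""
        self._buffer += data
        events = []

        if self.state == "WAITING_FOR_HEAD" and b"\r\n\r\n" in self._buffer:
            head, _, self._buffer = self._buffer.partition(b"\r\n\r\n")
            status_line, *header_lines = head.split(b"\r\n")
            _version, status, *_reason = status_line.split(b" ", 2)
            headers = {}
            for line in header_lines:
                name, _, value = line.partition(b":")
                key = name.strip().lower().decode("ascii")
                headers[key] = value.strip().decode("ascii")
            self._remaining = int(headers.get("content-length", "0"))
            events.append(("response", int(status), headers))
            self.state = "READING_BODY"

        if self.state == "READING_BODY":
            chunk = self._buffer[: self._remaining]
            self._buffer = self._buffer[self._remaining :]
            self._remaining -= len(chunk)
            if chunk:
                events.append(("data", chunk))
            if self._remaining == 0:
                events.append(("end",))
                self.state = "DONE"

        return events


# The caller decides where bytes come from: a blocking socket, an asyncio
# stream, or, as here, a test that slices a canned response into pieces.
wire = b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
parser = ResponseParser()
events = []
for start in range(0, len(wire), 7):
    events.extend(parser.receive_data(wire[start : start + 7]))

body = b"".join(event[1] for event in events if event[0] == "data")
```

Because the parser never performs I/O, the same class could sit behind a sync client and an async client, and it can be tested without a network, which are the benefits the slide lists.
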
  16. Python’s HTTP Future
    • h2, hyperframe, hpack
      ◦ Released in 2015 by Cory Benfield
      ◦ HTTP/2 implementation in sans-I/O Python
      ◦ Pulled out of the hyper package into their own packages
    • h11
      ◦ Released in 2016 by Nathaniel Smith
      ◦ HTTP/1.1 implementation in sans-I/O Python
  17. Python’s HTTP Future
    • asks
    • Released in 2017 by Mark Jameson
    • Async HTTP client that speaks HTTP/1.1
    • Requests-like interface
    • Supports Asyncio, Trio, and Curio via Anyio
  18. Python’s HTTP Future
    • aioquic
    • Released in June 2019 by Jeremy Lainé
    • Implementation of QUIC and HTTP/3 in sans-I/O Python
    • Reusable if another application protocol (like DNS!) is implemented with QUIC
  19. Python’s HTTP Future
    • HTTPX
    • Released in July 2019 by Tom Christie
    • Supports HTTP/1.1, HTTP/2, (HTTP/3 planned)
    • Requests-like interface
    • Allows making synchronous and async requests
    • Supports Asyncio and Trio
    • (I’m a maintainer of this library)
  20. HTTP Client Libraries

    Project   Prod-Ready  Sync  Asyncio  Trio  Curio  Requests API  HTTP/1.1  HTTP/2  HTTP/3
    urllib3   ✓           ✓                                         ✓
    requests  ✓           ✓                           ✓             ✓
    aiohttp   ✓                 ✓                                   ✓
    hyper                 ✓                                         ✓         ✓
    asks                        ✓        ✓     ✓      ✓             ✓
    HTTPX                 ✓     ✓        ✓            ✓             ✓         ✓       *

    ✓ = supported, * = planned
  21. Conclusion
    • Python is catching up with the HTTP ecosystem
    • Everyone benefits from well-maintained sans-I/O protocol libraries
    • All libraries mentioned here are Open Source and would benefit from your contributions
    • There’s still plenty of work to do!
  22. Mentions and Sources
    • PyPI Downloads: https://pypistats.org
    • urllib3: https://github.com/urllib3/urllib3
    • Requests: https://github.com/psf/requests
    • aiohttp: https://github.com/aio-libs/aiohttp
    • Sans-I/O: https://sans-io.readthedocs.io
    • h2/h11/hyper: https://github.com/python-hyper
    • asks: https://github.com/theelous3/asks
    • aioquic: https://github.com/aiortc/aioquic
    • HTTPX: https://github.com/encode/httpx