Slide 1

Slide 1 text

High-performance networking in Python
Yury Selivanov
EuroPython 2016

Slide 2

Slide 2 text

Yury Selivanov – @1st1
About me
• Co-founder of MagicStack (http://magic.io); check our company out!
• Avid Python user since 2008.
• CPython core developer since 2013.

Slide 3

Slide 3 text

What I do for Python
• Co-authored and implemented PEP 362: inspect.signature API.
• Authored and implemented PEP 492: async/await syntax.
• Co-maintainer of asyncio with Guido and Victor Stinner.
• Created uvloop.

Slide 4

Slide 4 text

Structure of this talk
• Overview of async/await.
• asyncio and uvloop.
• Sockets, Protocols or Streams?
• Learn by example: high-performance PostgreSQL driver.
• Recap.

Slide 5

Slide 5 text

Performance is hard.

Slide 6

Slide 6 text

Part I async/await

Slide 7

Slide 7 text

One obvious way to do it
• Callbacks and Deferred.
• Stackless Python & greenlets.
• Coroutines using generators and yield.
• Coroutines using generators and yield from.
• And now we have async/await.

Slide 8

Slide 8 text

Why async/await is the answer
• Dedicated syntax; concise and readable.
• New builtin type for coroutines.
• New concepts: async for and async with.
• Generic, framework agnostic design.
• Fast: only ~2x slower than a function call.

Slide 9

Slide 9 text

async/await
• Subtype of generators; shares a lot of code.
• await -> yield from -> YIELD_FROM opcode.
• @types.coroutine.
• __await__.
• __aenter__, __aexit__, __aiter__, __anext__.
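The special methods listed above can be illustrated with a minimal sketch. The class names (Countdown, Resource, Ready) are hypothetical and exist only for illustration; they are not from the talk.

```python
import asyncio


class Countdown:
    """Asynchronous iterator: implements __aiter__ and __anext__."""

    def __init__(self, start):
        self.current = start

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.current <= 0:
            raise StopAsyncIteration
        await asyncio.sleep(0)  # yield control to the event loop
        self.current -= 1
        return self.current + 1


class Resource:
    """Asynchronous context manager: implements __aenter__ and __aexit__."""

    def __init__(self):
        self.log = []

    async def __aenter__(self):
        self.log.append("enter")
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self.log.append("exit")
        return False  # do not swallow exceptions


class Ready:
    """Custom awaitable: __await__ must return an iterator."""

    def __init__(self, value):
        self.value = value

    def __await__(self):
        # Delegate to a coroutine's own __await__ to obtain the iterator.
        return self._get().__await__()

    async def _get(self):
        await asyncio.sleep(0)
        return self.value


async def demo():
    res = Resource()
    async with res:
        items = [n async for n in Countdown(3)]
    value = await Ready(42)
    return items, value, res.log


def run_demo():
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(demo())
    finally:
        loop.close()
```

Calling run_demo() drives all three protocols through one coroutine: async with, async for, and await on a custom object.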

Slide 10

Slide 10 text

Part II asyncio, libuv, cython, uvloop

Slide 11

Slide 11 text

asyncio
• Developed and actively maintained by the BDFL.
• A toolbox for frameworks and protocols.
• Part of Standard Library.

Slide 12

Slide 12 text

asyncio: what’s inside?
• Standardized pluggable event loop.
• Interfaces for Protocols and Transports.
• Factories for servers and connections; streams.
• Futures and Tasks: callbacks + coroutines, timeouts, cancellation, etc.
• Subprocess, queues, synchronisation primitives.

Slide 13

Slide 13 text

Event Loop
• Event loop is the foundation of asyncio.
• Factories for Tasks and Futures.
• IO multiplexor.
• Low level APIs for timed events, scheduling callbacks, subprocesses and signals.
• And… you can replace it.

Slide 14

Slide 14 text

github.com/magicstack/uvloop
• 99.(9)% compatible asyncio event loop.
• Written in Cython.
• Uses libuv under the hood.
• Fast Tasks and Futures = faster async/await.
• Super fast IO.
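Because the event loop is replaceable, switching to uvloop can be as small as creating a different loop object. The sketch below assumes uvloop is installed as a third-party package and falls back to the stock asyncio loop when it is not; the helper names new_loop and run are hypothetical.

```python
import asyncio

try:
    import uvloop  # third-party package, assumed installed: pip install uvloop
except ImportError:  # fall back to the stock loop so the sketch still runs
    uvloop = None


def new_loop():
    """Return a uvloop loop when available, else a standard asyncio loop."""
    if uvloop is not None:
        return uvloop.new_event_loop()
    return asyncio.new_event_loop()


async def main():
    await asyncio.sleep(0)
    return "done"


def run():
    loop = new_loop()
    try:
        return loop.run_until_complete(main())
    finally:
        loop.close()
```

Application code in main() stays the same whichever loop implementation runs it.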

Slide 15

Slide 15 text

How fast is uvloop?
2-4x faster than asyncio.

Slide 16

Slide 16 text

OK, it’s faster than asyncio…

Slide 17

Slide 17 text

Just use uvloop
Thank you for your time. Questions?
It’s not that easy, unfortunately…

Slide 18

Slide 18 text

Part III Sockets, Streams and Protocols

Slide 19

Slide 19 text

One obvious way to do it: episode 2
• Low level sock_sendall, sock_recv, sock_connect.
• High level StreamReader and StreamWriter.
• Low level Protocols and Transports.

Slide 20

Slide 20 text

loop.sock_* methods
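The original slide's code is not preserved in this text. As a stand-in, here is a minimal sketch of the loop.sock_* style; it uses socket.socketpair() in place of a real connection (so loop.sock_connect is not needed), and the function names are hypothetical.

```python
import asyncio
import socket


async def echo_once(loop, sock):
    """Read one chunk with loop.sock_recv and send it back."""
    data = await loop.sock_recv(sock, 1024)
    await loop.sock_sendall(sock, data)


async def roundtrip(payload):
    loop = asyncio.get_running_loop()
    # A connected socket pair stands in for a real client/server connection.
    client, server = socket.socketpair()
    client.setblocking(False)  # loop.sock_* methods require non-blocking sockets
    server.setblocking(False)
    try:
        echo = asyncio.create_task(echo_once(loop, server))
        await loop.sock_sendall(client, payload)
        reply = await loop.sock_recv(client, 1024)
        await echo
        return reply
    finally:
        client.close()
        server.close()


def run_roundtrip(payload=b"ping"):
    return asyncio.run(roundtrip(payload))
```

The code reads much like blocking socket code with await added, which is why this API suits porting synchronous code.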

Slide 21

Slide 21 text

Streams
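The original slide's code is not preserved in this text. A minimal sketch of the Streams API, assuming a toy line-based protocol on the loopback interface (the handler and its upper-casing behaviour are made up for illustration):

```python
import asyncio


async def handle(reader, writer):
    """Server side: echo one line back in upper case."""
    line = await reader.readline()
    writer.write(line.upper())
    await writer.drain()  # flow control: wait until the write buffer is flushed
    writer.close()


async def main():
    # Port 0 asks the OS for any free port on the loopback interface.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"hello\n")
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    await writer.wait_closed()
    server.close()
    return reply


if __name__ == "__main__":
    print(asyncio.run(main()))
```

StreamReader buffers incoming data and StreamWriter.drain() applies back-pressure, so the user code never touches callbacks.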

Slide 22

Slide 22 text

Protocols and Transports
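The original slide's code is not preserved in this text. A minimal sketch of the callback-based style, assuming a socket pair as the connection and a hypothetical CollectProtocol class:

```python
import asyncio
import socket


class CollectProtocol(asyncio.Protocol):
    """The loop pushes data in via callbacks; we reply through the transport."""

    def __init__(self, done):
        self.done = done  # future resolved when the peer disconnects
        self.chunks = []
        self.transport = None

    def connection_made(self, transport):
        self.transport = transport
        transport.write(b"hello")  # non-blocking: the transport queues the data

    def data_received(self, data):
        self.chunks.append(data)

    def connection_lost(self, exc):
        if not self.done.done():
            self.done.set_result(b"".join(self.chunks))


async def main():
    loop = asyncio.get_running_loop()
    ours, peer = socket.socketpair()
    ours.setblocking(False)
    peer.setblocking(False)
    done = loop.create_future()
    transport, _ = await loop.create_connection(
        lambda: CollectProtocol(done), sock=ours)
    # The peer replies in upper case, then closes; closing triggers connection_lost().
    data = await loop.sock_recv(peer, 1024)
    await loop.sock_sendall(peer, data.upper())
    peer.close()
    result = await done
    transport.close()
    return result


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Note that no method on the protocol is a coroutine: the loop calls connection_made, data_received and connection_lost directly.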

Slide 23

Slide 23 text

Downsides
• loop.sock_*: loop cannot buffer data for you. No flow control.
• Streams: easy to use, but too generic.
• Protocols & Transports: say hello to callbacks.

Slide 24

Slide 24 text

Flow Control.

Slide 25

Slide 25 text

So which API should I use?
• Use loop.sock_* to ease porting synchronous code to asyncio. Better yet, consider Streams.
• Use Streams for implementing protocols with async/await.
• Use Protocols and Transports for performance.

Slide 26

Slide 26 text

Let’s focus on Protocols.
• loop pushes data to Protocols.
• Protocols send data back using Transports.
• Protocols can implement specialized read and write buffers.
• Protocols can do flow-control.
• Full control over how IO is performed.
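Write-side flow control in a Protocol relies on the pause_writing and resume_writing callbacks that asyncio transports invoke. The sketch below is one possible policy (queue messages while paused); the class name, the send method and the pending list are assumptions for illustration.

```python
import asyncio


class FlowControlledProtocol(asyncio.Protocol):
    """Queues outgoing messages while the transport's write buffer is full."""

    def __init__(self):
        self.transport = None
        self.paused = False
        self.pending = []  # messages waiting for resume_writing()

    def connection_made(self, transport):
        self.transport = transport

    def pause_writing(self):
        # Called by the transport when its buffer goes above the high-water mark.
        self.paused = True

    def resume_writing(self):
        # Called when the buffer drains below the low-water mark.
        self.paused = False
        pending, self.pending = self.pending, []
        for message in pending:
            self.send(message)

    def send(self, message):
        if self.paused:
            self.pending.append(message)
        else:
            self.transport.write(message)
```

The water marks themselves can be tuned with transport.set_write_buffer_limits().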

Slide 27

Slide 27 text

Protocols & async/await
• Strategy #1: develop custom streaming abstraction tailored for the concrete protocol and use async/await. A good example is aiohttp.

Slide 28

Slide 28 text

Protocols & async/await
• Strategy #2: Implement protocol parsing and logic using callbacks. Hide that under an easy to use async/await facade.
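Strategy #2 can be sketched as a Protocol whose parsing lives in callbacks and whose public method is a coroutine that awaits a future. The newline-terminated wire format, the RequestProtocol class and the fake server are assumptions made for illustration, not the design of any real driver.

```python
import asyncio
import socket


class RequestProtocol(asyncio.Protocol):
    """Callback-based core: parses newline-terminated replies."""

    def __init__(self, loop):
        self.loop = loop
        self.transport = None
        self.buffer = bytearray()
        self.waiter = None  # future for the request currently in flight

    def connection_made(self, transport):
        self.transport = transport

    def data_received(self, data):
        self.buffer.extend(data)
        if b"\n" in self.buffer and self.waiter is not None:
            line, _, rest = bytes(self.buffer).partition(b"\n")
            self.buffer = bytearray(rest)
            waiter, self.waiter = self.waiter, None
            if not waiter.done():
                waiter.set_result(line)

    def connection_lost(self, exc):
        if self.waiter is not None and not self.waiter.done():
            self.waiter.set_exception(exc or ConnectionError("connection lost"))

    # The async/await facade: the only part a user of the driver would call.
    async def request(self, payload):
        self.waiter = self.loop.create_future()
        self.transport.write(payload + b"\n")
        return await self.waiter


async def main():
    loop = asyncio.get_running_loop()
    ours, peer = socket.socketpair()
    ours.setblocking(False)
    peer.setblocking(False)
    transport, proto = await loop.create_connection(
        lambda: RequestProtocol(loop), sock=ours)

    async def fake_server():
        data = await loop.sock_recv(peer, 1024)
        await loop.sock_sendall(peer, b"echo:" + data)

    server = asyncio.create_task(fake_server())
    reply = await proto.request(b"ping")
    await server
    transport.close()
    peer.close()
    return reply


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The caller only writes `await proto.request(...)`; the future is the single bridge between the callback world and async/await.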

Slide 29

Slide 29 text

Part IV asyncpg

Slide 30

Slide 30 text

github.com/magicstack/asyncpg
• The fastest PostgreSQL driver for asyncio. And for Python in general.
• Open-sourced 2 hours ago!
• Written from scratch in Cython and Python. Does not use libpq.

Slide 31

Slide 31 text

github.com/magicstack/asyncpg
• Uses Postgres binary data format.
• That means more efficient encoding of data -> less network traffic -> faster IO.
• Binary means much faster parsing. Always prefer binary.
• Not all Postgres types can be unambiguously parsed when transferred as text.

Slide 32

Slide 32 text

github.com/magicstack/asyncpg
• Ditches DB API.
• The idea: let’s build an efficient driver for Postgres. Use its features to the max.
• Supports all built-in Postgres types; composite and custom types.

Slide 33

Slide 33 text

github.com/magicstack/asyncpg
• Server loves prepared statements.
• So we use them extensively.
• Each connection has an LRU cache of prepared statements.
• We dynamically build and cache pipelines of data encoders and decoders for each statement.

Slide 34

Slide 34 text

Did I say asyncpg is fast?

Slide 35

Slide 35 text

asyncpg architecture
• Protocol is implemented in two classes: CoreProtocol and Protocol(CoreProtocol).
• CoreProtocol class knows almost nothing about asyncio.
• Protocol class is the bridge between the world of callbacks and async/await.

Slide 36

Slide 36 text

Parsing PostgreSQL protocol
• Naïve approach: use Python bytes type and/or memoryviews.
• Problem: the driver will spend a lot of time allocating and deallocating memory. The performance will suffer.

Slide 37

Slide 37 text

Parsing PostgreSQL protocol
• Solution: use Cython to work with ‘char*’; create high-level Pythonic buffers for that.
• asyncpg has three types of buffers!

Slide 38

Slide 38 text

ReadBuffer
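The original slide's code is not preserved in this text, and asyncpg's actual ReadBuffer is written in Cython over ‘char*’. The pure-Python sketch below only illustrates the general idea of a read buffer that accumulates chunks and yields complete messages; the class name and methods are hypothetical. It assumes the PostgreSQL message framing of a one-byte type followed by a 4-byte big-endian length that counts itself.

```python
import struct


class SimpleReadBuffer:
    """Accumulates network chunks and hands out complete framed messages."""

    def __init__(self):
        self._buf = bytearray()

    def feed_data(self, data):
        self._buf.extend(data)

    def take_message(self):
        """Return (type, payload) for one complete message, or None."""
        if len(self._buf) < 5:
            return None  # header not complete yet
        mtype = bytes(self._buf[:1])
        (length,) = struct.unpack_from("!i", self._buf, 1)
        end = 1 + length  # the length field counts itself but not the type byte
        if len(self._buf) < end:
            return None  # payload not complete yet
        payload = bytes(self._buf[5:end])
        del self._buf[:end]
        return mtype, payload
```

A Protocol would call feed_data() from data_received() and loop on take_message() until it returns None. The copies made here are exactly the allocation cost the talk recommends avoiding with C-level buffers.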

Slide 39

Slide 39 text

asyncpg & async/await
• High level logic and API is built in pure Python with async/await.

Slide 40

Slide 40 text

Part V Recap

Slide 41

Slide 41 text

Don’t be afraid of Protocols
• Use asyncio Protocols and Transports to create high performance protocols.
• Use Cython for low level code. It’s amazing.
• async/await should always be the preferred public API.
• Once you have fast drivers, your application will work fast. Use async/await.

Slide 42

Slide 42 text

loop.create_future()
• Use `loop.create_future()` instead of `asyncio.Future(loop=loop)`.
• Added in Python 3.5.2.
• Allows uvloop to inject high performance Future implementation in your code.
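A minimal sketch of the recommended call, using only standard asyncio APIs:

```python
import asyncio


async def main():
    loop = asyncio.get_running_loop()
    # Ask the loop for a future instead of instantiating asyncio.Future directly,
    # so an alternative loop can return its own optimized implementation.
    fut = loop.create_future()
    loop.call_soon(fut.set_result, "ready")
    return await fut


if __name__ == "__main__":
    print(asyncio.run(main()))
```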

Slide 43

Slide 43 text

Binary Protocols
• Always use binary protocols.
• Less traffic between server and clients, faster parsing.

Slide 44

Slide 44 text

Profiling and Benchmarking
• Always profile your code. Always.
• Cython code can be profiled with valgrind; results visualized with KCachegrind.
• Use `cython -a` option to generate HTML output for your Cython code.

Slide 45

Slide 45 text

Zero-copy
• Implement custom buffer classes.
• Prefer C data types; working with Python bytes is expensive.
• Use one efficient buffer to build a request and send it with a single ‘transport.write’. Or use multiple buffers and a single ‘transport.writelines’.

Slide 46

Slide 46 text

TCP options
• When you have no control over how transport.write is called, use TCP_CORK option to buffer writes and send as few TCP packets as possible.
• Use TCP_NODELAY to send data as soon as you call ‘transport.write’.
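Both options are set with setsockopt on the underlying socket. The helper below is a hypothetical sketch; TCP_CORK is Linux-specific, so the code checks for it before use.

```python
import socket


def configure_tcp(sock, *, nodelay=True):
    """Set TCP_NODELAY, or TCP_CORK where the platform provides it."""
    if nodelay:
        # Send each write immediately instead of waiting to coalesce packets.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    elif hasattr(socket, "TCP_CORK"):  # Linux-only option
        # Hold partial frames back so writes are merged into fewer packets.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 0)
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 1)
    return sock
```

Inside a Protocol, the socket object is available from connection_made via transport.get_extra_info("socket").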

Slide 47

Slide 47 text

Timeouts
• Implement timeouts as part of your APIs.
• ‘asyncio.wait_for’ has a lot of overhead; use ‘loop.call_later’ and build custom cancellation logic at the Protocol level.
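One way to build a timeout from loop.call_later is to pair each pending future with a timer that fails it on expiry. The TimedWaiter class below is a hypothetical sketch of that idea, not the implementation used by any particular driver; real protocol code would also need to abort or resynchronize the connection on timeout.

```python
import asyncio


class TimedWaiter:
    """Pairs a future with a loop.call_later timer that fails it on expiry."""

    def __init__(self, loop, timeout):
        self.future = loop.create_future()
        self._timer = loop.call_later(timeout, self._on_timeout)
        self.future.add_done_callback(self._cancel_timer)

    def _on_timeout(self):
        if not self.future.done():
            self.future.set_exception(asyncio.TimeoutError())

    def _cancel_timer(self, fut):
        self._timer.cancel()  # no-op if the timer has already fired


async def main():
    loop = asyncio.get_running_loop()
    fast = TimedWaiter(loop, timeout=1.0)
    loop.call_soon(fast.future.set_result, "ok")
    first = await fast.future

    slow = TimedWaiter(loop, timeout=0.01)  # nobody resolves this one
    try:
        await slow.future
        second = "no timeout"
    except asyncio.TimeoutError:
        second = "timed out"
    return first, second


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Compared with asyncio.wait_for, this avoids wrapping each call in an extra Task; the cost is that cancellation semantics must be handled by hand.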

Slide 48

Slide 48 text

That’s it. Thank you. Questions?