Writing Redis in Python with asyncio

Python has been adding more and more async features to the language and the standard library. Between asyncio, introduced in Python 3.4, and the async/await keywords added in Python 3.5, it’s difficult to understand how all these pieces fit together. More importantly, it’s hard to envision how to use these new language features in a real-world application. In this talk we’re going to move beyond the basic examples of TCP echo servers and toy servers that add numbers together. Instead I’ll show you a realistic asyncio application: a port of Redis, a popular data structure server, written in Python using asyncio. In addition to basic topics such as handling simple Redis commands (GET, SET, RPUSH, etc.), we’ll look at notifications using pub/sub and how to implement blocking queues.

James Saryerwinnie

July 21, 2016

Transcript

  1. WRITING REDIS IN PYTHON WITH ASYNCIO James Saryerwinnie / @jsaryer

  2. GOALS ‣ How to structure a “larger” network server application

    ‣ Request/Response structure ‣ Publish/Subscribe ‣ Blocking queues
  3. ABOUT ME ‣ AWS CLI ‣ Boto3/botocore/boto ‣ JMESPath ‣ AWS Shell ‣ Chalice
  4. [Cartoon: “How Projects Really Work (version 1.0)”, http://www.projectcartoon.com/cartoon/3. Panels: How the customer explained it ‣ How the project leader understood it ‣ How the analyst designed it ‣ How the programmer wrote it ‣ How the business consultant described it ‣ How the authors envisioned it. Caption: What might happen here]
  5. REDIS ‣ Data structure server ‣ Set and get key/value pairs ‣ Values can be more than strings
  6. [Diagram: two client/Redis exchanges: SET foo bar is answered with OK, and GET foo is answered with bar]
  7. [Diagram: RPUSH foo a, RPUSH foo b and RPUSH foo c build a list; LPOP foo returns a; LRANGE foo 0 2 then returns b, c]
  8. REQUEST / RESPONSE

  9. What we want [Diagram: the client sends GET foo and Redis replies bar]

  11. server.py

        import asyncio

        loop = asyncio.get_event_loop()
        coro = loop.create_server(RedisServerProtocol, '127.0.0.1', 6379)
        server = loop.run_until_complete(coro)
        try:
            loop.run_forever()
        except KeyboardInterrupt:
            pass
        server.close()
        loop.run_until_complete(server.wait_closed())
        loop.close()
  16. protocol.py

        class RedisServerProtocol(asyncio.Protocol):
            def connection_made(self, transport):
                self.transport = transport

            def data_received(self, data):
                pass

    How do these work?
  17. Let’s look under the hood

  20. asyncio/selector_events.py

        def _accept_connection2(
                self, protocol_factory, conn, extra, server=None):
            protocol = None
            transport = None
            try:
                protocol = protocol_factory()  # RedisServerProtocol
                waiter = futures.Future(loop=self)
                # conn is the accepted client socket
                transport = _SelectorSocketTransport(
                    self, conn, protocol, waiter, extra, server)
                # ...
            except Exception as exc:
                # ...
                pass
  22. [Diagram: three client_connected events, each creating its own Protocol and Transport pair]

  23. asyncio/selector_events.py

        class _SelectorSocketTransport(_SelectorTransport):
            def __init__(self, loop, sock, protocol, waiter=None,
                         extra=None, server=None):
                super().__init__(loop, sock, protocol, extra, server)
                self._eof = False
                self._paused = False
                self._loop.call_soon(self._protocol.connection_made, self)
                # only start reading when connection_made() has been called
                self._loop.call_soon(self._loop.add_reader,
                                     self._sock_fd, self._read_ready)
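The transport constructor on slide 23 depends on call_soon running callbacks in the order they were scheduled: that is what lets connection_made() run before the socket is registered for reading. A minimal sketch of that ordering follows. The list appends and string labels are stand-ins for the real callbacks, not asyncio internals.

```python
import asyncio


def run_scheduled_callbacks():
    """Schedule two callbacks with call_soon and return the order they ran in."""
    loop = asyncio.new_event_loop()
    order = []
    try:
        # Stand-in for: call_soon(protocol.connection_made, transport)
        loop.call_soon(order.append, "connection_made")
        # Stand-in for: call_soon(loop.add_reader, fd, transport._read_ready)
        loop.call_soon(order.append, "add_reader")
        # stop() is scheduled last, so it runs after both callbacks above.
        loop.call_soon(loop.stop)
        loop.run_forever()
    finally:
        loop.close()
    return order


print(run_scheduled_callbacks())  # ['connection_made', 'add_reader']
```

Because the loop processes call_soon callbacks first in, first out, the protocol always holds its transport before any data_received() call can arrive.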
  26. asyncio/selector_events.py

        def _read_ready(self):
            try:
                data = self._sock.recv(self.max_size)
            except (BlockingIOError, InterruptedError):
                pass
            except Exception as exc:
                self._fatal_error(exc, 'Fatal read error')
            else:
                if data:
                    self._protocol.data_received(data)
                else:
                    pass
  28. protocol.py

        class RedisServerProtocol(asyncio.Protocol):
            def connection_made(self, transport):
                self.transport = transport

            def data_received(self, data):
                pass

    Callbacks
  29. [Diagram: the Event Loop calls data_received() on each of four Protocol/Transport pairs]
  30. rserver/protocol.py

        class RedisServerProtocol(asyncio.Protocol):
            def __init__(self, db):
                self._db = db

            def connection_made(self, transport):
                self.transport = transport

            def data_received(self, data):
                parsed = parser.parse_wire_protocol(data)
                # [b"SET", b"foo", b"bar"]
                command = parsed[0].lower()
                if command == b'get':
                    response = self._db.get(parsed[1])
                elif command == b'set':
                    response = self._db.set(parsed[1], parsed[2])
                wire_response = serializer.serialize_to_wire(response)
                self.transport.write(wire_response)
  35. rserver/protocol.py (same code as slide 30, annotated with the bytes on the wire)

        request:  b'*3\r\n$3\r\nSET\r\n$3\r\nfoo\r\n$3\r\nbar\r\n'
        response: b'+OK\r\n'
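The talk does not show parser.parse_wire_protocol or serializer.serialize_to_wire, so the following is only a sketch of what such helpers might look like for the bytes on slide 35. It assumes the whole request arrives in one chunk and that values never contain \r\n. Mapping True to +OK is also an assumption, chosen because DB.set returns True and the slide shows +OK.

```python
def parse_wire_protocol(data):
    """Parse one complete RESP array of bulk strings into a list of bytes."""
    lines = data.split(b"\r\n")
    if not lines[0].startswith(b"*"):
        raise ValueError("expected a RESP array")
    count = int(lines[0][1:])            # b"*3" -> 3 elements
    parts = []
    index = 1
    for _ in range(count):
        length = int(lines[index][1:])   # b"$3" -> 3 bytes
        value = lines[index + 1]
        if len(value) != length:
            raise ValueError("bulk string length mismatch")
        parts.append(value)
        index += 2
    return parts


def serialize_to_wire(value):
    """Serialize a Python value into a RESP reply (assumed mapping)."""
    if value is True:                    # checked before int: bool is an int
        return b"+OK\r\n"
    if value is None:
        return b"$-1\r\n"                # null bulk string
    if isinstance(value, int):
        return b":%d\r\n" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"$%d\r\n%s\r\n" % (len(value), value)
    if isinstance(value, list):
        items = b"".join(serialize_to_wire(item) for item in value)
        return b"*%d\r\n" % len(value) + items
    raise TypeError("cannot serialize %r" % (value,))


request = b"*3\r\n$3\r\nSET\r\n$3\r\nfoo\r\n$3\r\nbar\r\n"
print(parse_wire_protocol(request))   # [b'SET', b'foo', b'bar']
print(serialize_to_wire(True))        # b'+OK\r\n'
```

A production parser would also need to buffer partial reads, since data_received() can deliver a request split across several calls.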
  36. rserver/db.py

        _DB = {}

        class DB:
            def __init__(self, db=None):
                if db is None:
                    db = _DB
                self._db = db

            def get(self, item):
                return self._db.get(item)

            def set(self, item, value):
                self._db[item] = value
                return True

    ‣ DB is in its own separate module
    ‣ It doesn’t know anything about asyncio
  37. rserver/db.py

        class DB:
            def rpush(self, item, values):
                current_list = self._db.setdefault(item, [])
                current_list.extend(values)
                return len(current_list)

            def lrange(self, key, start, stop):
                # Redis ranges include stop; -1 means "through the last element"
                if stop == -1:
                    stop = None
                else:
                    stop += 1
                return self._db.get(key, [])[start:stop]

            def lpop(self, key):
                value = self._db.get(key, [])
                if value:
                    return value.pop(0)
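To see how these list methods reproduce the RPUSH / LPOP / LRANGE exchange from slide 7, here is a self-contained sketch. ListDB is a hypothetical stand-in that copies the three methods into a class with its own dictionary; it is not the talk's module.

```python
class ListDB:
    """Stand-in for the list commands of DB (hypothetical name)."""

    def __init__(self):
        self._db = {}

    def rpush(self, key, values):
        current_list = self._db.setdefault(key, [])
        current_list.extend(values)
        return len(current_list)

    def lrange(self, key, start, stop):
        # Redis treats stop as inclusive, Python slices treat it as exclusive.
        stop = None if stop == -1 else stop + 1
        return self._db.get(key, [])[start:stop]

    def lpop(self, key):
        value = self._db.get(key, [])
        if value:
            return value.pop(0)
        return None


db = ListDB()
length = db.rpush(b"foo", [b"a", b"b", b"c"])   # 3
first = db.lpop(b"foo")                          # b"a"
rest = db.lrange(b"foo", 0, 2)                   # [b"b", b"c"]
```

The +1 adjustment is the only translation needed between Redis ranges and Python slices, apart from the special case of -1, where adding one would produce an empty slice.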
  38. rserver/protocol.py

        class RedisServerProtocol(asyncio.Protocol):
            def data_received(self, data):
                parsed = parser.parse_wire_protocol(data)
                # [b"SET", b"foo", b"bar"]
                command = parsed[0].lower()
                if command == b'get':
                    response = self._db.get(parsed[1])
                elif command == b'set':
                    response = self._db.set(parsed[1], parsed[2])
                elif command == b'rpush':
                    response = self._db.rpush(parsed[1], parsed[2:])
                elif command == b'lrange':
                    response = self._db.lrange(
                        parsed[1], int(parsed[2]), int(parsed[3]))
                wire_response = serializer.serialize_to_wire(response)
                self.transport.write(wire_response)
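In the if/elif chain on the slide, an unrecognized command leaves response unassigned, so the serialize call would raise UnboundLocalError. One possible alternative, not something the talk proposes, is a dispatch table keyed by command name. DictDB and CommandDispatcher below are hypothetical names used only for this sketch.

```python
class DictDB:
    """Tiny stand-in for the talk's DB class (hypothetical)."""

    def __init__(self):
        self._db = {}

    def get(self, key):
        return self._db.get(key)

    def set(self, key, value):
        self._db[key] = value
        return True


class CommandDispatcher:
    def __init__(self, db):
        # Map lower-cased command names to callables taking the argument list.
        self._handlers = {
            b"get": lambda args: db.get(args[0]),
            b"set": lambda args: db.set(args[0], args[1]),
        }

    def dispatch(self, parsed):
        command = parsed[0].lower()
        handler = self._handlers.get(command)
        if handler is None:
            raise ValueError("unknown command: %r" % command)
        return handler(parsed[1:])


dispatcher = CommandDispatcher(DictDB())
set_result = dispatcher.dispatch([b"SET", b"foo", b"bar"])   # True
get_result = dispatcher.dispatch([b"GET", b"foo"])           # b"bar"
```

A protocol class could catch the ValueError and write an error reply instead of letting the exception escape data_received().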
  40. What we have [Diagram: the client sends GET foo and Redis replies bar]

  41. PUBLISH / SUBSCRIBE

  42. What we want - PUBLISH/SUBSCRIBE [Diagram: two clients send SUBSCRIBE foo; a third client sends PUBLISH foo hello; Redis delivers hello to both subscribers]
  45. [Diagram: a ProtocolFactory producing one Protocol per connection]

        def _accept_connection2(…):
            try:
                protocol = protocol_factory()
                waiter = futures.Future(loop=self)
                # conn is the accepted client socket
                transport = _SelectorSocketTransport(
                    self, conn, protocol, waiter, extra, server)
                # ...
            except Exception as exc:
                # ...
                pass
  46. rserver/server.py

        class PubSub:
            def __init__(self):
                self._channels = {}

            def subscribe(self, channel, transport):
                self._channels.setdefault(channel, []).append(transport)
                return ['subscribe', channel, 1]

            def publish(self, channel, message):
                transports = self._channels.get(channel, [])
                message = serializer.serialize_to_wire(
                    ['message', channel, message])
                for transport in transports:
                    transport.write(message)
                return len(transports)
  49. rserver/server.py

        class RedisServerProtocol(asyncio.Protocol):
            def data_received(self, data):
                parsed = parser.parse_wire_protocol(data)
                # [COMMAND, arg1, arg2]
                command = parsed[0].lower()
                if command == b'subscribe':
                    response = self._pubsub.subscribe(
                        parsed[1], self.transport)
                elif command == b'publish':
                    response = self._pubsub.publish(parsed[1], parsed[2])
                wire_response = serializer.serialize_to_wire(response)
                self.transport.write(wire_response)
  52. server.py

        import asyncio

        loop = asyncio.get_event_loop()
        factory = ProtocolFactory(
            RedisServerProtocol,
            db.DB(),
            PubSub(),
        )
        coro = loop.create_server(factory, '127.0.0.1', 6379)
        server = loop.run_until_complete(coro)
        try:
            loop.run_forever()
        except KeyboardInterrupt:
            pass
        server.close()
        loop.run_until_complete(server.wait_closed())
        loop.close()
  53. rserver/server.py

        class ProtocolFactory:
            def __init__(self, protocol_cls, *args, **kwargs):
                self._protocol_cls = protocol_cls
                self._args = args
                self._kwargs = kwargs

            def __call__(self):
                # No arg callable is used to instantiate
                # protocols in asyncio.
                return self._protocol_cls(*self._args, **self._kwargs)
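Since create_server only needs a zero-argument callable, the standard library's functools.partial can play the same role as the ProtocolFactory class. The sketch below shows the equivalence with a hypothetical DemoProtocol standing in for RedisServerProtocol; it is an alternative, not the approach used in the talk.

```python
import functools


class DemoProtocol:
    """Hypothetical stand-in for RedisServerProtocol."""

    def __init__(self, db, pubsub):
        self.db = db
        self.pubsub = pubsub


shared_db = {}
shared_pubsub = object()

# Calling factory() with no arguments builds a new protocol each time,
# while every instance receives the same db and pubsub objects.
factory = functools.partial(DemoProtocol, shared_db, shared_pubsub)

first = factory()
second = factory()
print(first is second)        # False
print(first.db is second.db)  # True
```

Either way, the point is the same: each connection gets its own protocol instance, and shared state is injected through the factory rather than through module globals.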
  54. [Diagram: a ProtocolFactory creates three Protocols, each with its own Transport; the Protocols share one PubSub and one DB]

  55. BLOCKING LIST POP

  56. What we want - BLPOP [Diagram: two clients send BLPOP foo 0 and wait; a third client sends RPUSH foo bar; Redis delivers bar to a waiting client]
  58. How do we do this?

  59. [Diagram: a ProtocolFactory, three Protocols, and a KeyBlocker]

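  60. rserver/db.py

        from rserver import types

        class DB:
            def blpop(self, key, timeout=0):
                # timeout is accepted because the protocol passes it;
                # it is not enforced in this excerpt
                value = self._db.get(key, [])
                if value:
                    element = value.pop(0)
                    return element
                return types.MUST_WAIT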
  63. rserver/protocol.py

        class RedisServerProtocol(asyncio.Protocol):
            def __init__(self, db, keyblocker, loop):
                self._db = db
                self._keyblocker = keyblocker
                self._loop = loop

            def data_received(self, data):
                # …
                if command == b'blpop':
                    response = self._db.blpop(
                        parsed[1], timeout=int(parsed[2]))
                    if response is types.MUST_WAIT:
                        q = self._keyblocker.wait_for_key(
                            parsed[1], self.transport)
                        self._loop.create_task(q)
                        return
  68. rserver/protocol.py

        class RedisServerProtocol(asyncio.Protocol):
            def data_received(self, data):
                # …
                command = parsed[0].lower()
                if command == b'rpush':
                    response = self._db.rpush(parsed[1], parsed[2:])
                    self._loop.create_task(
                        self._keyblocker.data_for_key(parsed[1], parsed[2]))
  70. rserver/server.py

        import asyncio
        import logging

        _LOG = logging.getLogger(__name__)

        class KeyBlocker:
            def __init__(self):
                self._blocked_keys = {}

            async def wait_for_key(self, key, transport):
                if key not in self._blocked_keys:
                    self._blocked_keys[key] = asyncio.Queue()
                q = self._blocked_keys[key]
                value = await q.get()
                transport.write(
                    serializer.serialize_to_wire(value)
                )

            async def data_for_key(self, key, value):
                _LOG.debug("Running data_for_key: %s, value: %s", key, value)
                if key in self._blocked_keys:
                    q = self._blocked_keys[key]
                    await q.put(value)
                    _LOG.debug("item put in q via q.put()")
  75-88. [Animated diagram, one frame per slide: the Event Loop runs wait_for_key, which calls q.get(); q.get() yields a future back to the Event Loop; the Event Loop then runs data_for_key, which calls q.put() with a value; the future is resolved and the value is handed back through q.get() to wait_for_key]

  89. rserver/server.py

        class KeyBlocker:
            def __init__(self):
                self._blocked_keys = {}

            async def wait_for_key(self, key, transport):
                if key not in self._blocked_keys:
                    self._blocked_keys[key] = asyncio.Queue()
                q = self._blocked_keys[key]
                value = await q.get()
                transport.write(
                    serializer.serialize_to_wire(value)
                )
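The hand-off between wait_for_key and data_for_key can be exercised without a network connection. The sketch below is a simplified copy of KeyBlocker that writes the raw value instead of a serialized reply. FakeTransport is a hypothetical test double, and asyncio.run assumes Python 3.7 or later, whereas the talk's code targets the 3.5 loop API.

```python
import asyncio


class FakeTransport:
    """Hypothetical transport that records writes instead of sending them."""

    def __init__(self):
        self.written = []

    def write(self, data):
        self.written.append(data)


class KeyBlocker:
    def __init__(self):
        self._blocked_keys = {}

    async def wait_for_key(self, key, transport):
        if key not in self._blocked_keys:
            self._blocked_keys[key] = asyncio.Queue()
        value = await self._blocked_keys[key].get()   # suspends until put()
        transport.write(value)   # the real server serializes the value first

    async def data_for_key(self, key, value):
        if key in self._blocked_keys:
            await self._blocked_keys[key].put(value)


async def demo():
    blocker = KeyBlocker()
    transport = FakeTransport()
    waiter = asyncio.ensure_future(blocker.wait_for_key(b"foo", transport))
    await asyncio.sleep(0)            # let the waiter run until it blocks
    was_blocked = not waiter.done()
    await blocker.data_for_key(b"foo", b"bar")   # plays the role of RPUSH
    await waiter
    return was_blocked, transport.written


result = asyncio.run(demo())
print(result)  # (True, [b'bar'])
```

Note that data_for_key silently drops the value when no client has waited on the key yet, so in this sketch the order of the two calls matters.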
  90. ADDITIONAL CONSIDERATIONS ‣ “Real” parsing is more complicated ‣ Pub/sub must handle clients disconnecting ‣ Pub/sub globs ‣ Blocking queues can wait on multiple keys
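Two of the considerations above, client disconnects and glob subscriptions, can be sketched as extensions to the PubSub class. This is an illustration only: psubscribe and remove_transport are hypothetical method names, fnmatch is used as an approximation of Redis glob matching, and messages are written unserialized to keep the example short.

```python
import fnmatch


class FakeTransport:
    """Hypothetical transport that records writes."""

    def __init__(self):
        self.written = []

    def write(self, data):
        self.written.append(data)


class PubSub:
    def __init__(self):
        self._channels = {}   # exact channel name -> list of transports
        self._patterns = {}   # glob pattern -> list of transports

    def subscribe(self, channel, transport):
        self._channels.setdefault(channel, []).append(transport)

    def psubscribe(self, pattern, transport):
        self._patterns.setdefault(pattern, []).append(transport)

    def remove_transport(self, transport):
        # Intended to be called from the protocol's connection_lost().
        for registry in (self._channels, self._patterns):
            for name in list(registry):
                subscribers = registry[name]
                if transport in subscribers:
                    subscribers.remove(transport)
                if not subscribers:
                    del registry[name]

    def publish(self, channel, message):
        receivers = list(self._channels.get(channel, []))
        for pattern, transports in self._patterns.items():
            if fnmatch.fnmatchcase(channel, pattern):
                receivers.extend(transports)
        for transport in receivers:
            transport.write(message)
        return len(receivers)


exact, by_glob = FakeTransport(), FakeTransport()
pubsub = PubSub()
pubsub.subscribe("news.sports", exact)
pubsub.psubscribe("news.*", by_glob)
delivered_before = pubsub.publish("news.sports", b"hi")      # 2
pubsub.remove_transport(exact)                                # client went away
delivered_after = pubsub.publish("news.sports", b"again")    # 1
```

Without the cleanup step, publish() would keep writing to transports whose connections have already closed.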
  91. PERFORMANCE ‣ redis-benchmark -n 100000 -t set,get -c 50 ‣ redis-server: 82563 requests per second (gets/sets) ‣ pyredis-server: 24192 requests per second ‣ pyredis-server (uvloop): 38285 requests per second
  92. WHAT WE LEARNED ‣ Transports and Protocols ‣ Simple request/response ‣ Publish / Subscribe ‣ Blocking-queue-like behavior
  93. THANKS! ‣ For more info: @jsaryer