Using NSQ in Python

This talk was presented at the PyCon Russia 2021 conference together with Oleg Ivashov.

Serge Matveenko

September 06, 2021

Transcript

  1. Assaia ApronAI
    • Real-time
    • Computer Vision
    • Generate Insights
    • Optimize Operations
    • Reduce Accidents
    The future of airside operations
  2. Advanced Real-time Inference
    [Diagram: Video Stream, Step 1, Step 2, DB, View, plus Side Step 1 … Side Step N; data labels: frames, some data, final data]
  3. Real-time Inference
    [Diagram: Video Stream, Step 1, Step 2, DB, View, Side Step; data labels: frames, some data, any data, final data; the transport between the steps is marked "?"]
  4. Message Brokers
    • RabbitMQ
      ◦ Erlang
      ◦ AMQP
    • Apache Kafka
      ◦ Java
      ◦ Large memory footprint
    • NSQ
      ◦ Go
      ◦ Lightweight
      ◦ Easy to maintain
      ◦ Worth a try ☺
  5. Real-time Inference
    [Diagram: Video Stream, Step 1, Step 2, DB, View, Side Step; data labels: frames, some data, any data, final data; the transport between the steps is NSQ]
  6. NSQ routing
    [Diagram: Application A → topic A (NSQ); topic A :: channel B → Application B; topic A :: channel C → Application C1 and Application C2]
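The routing on the slide above follows NSQ's topic/channel model: every channel on a topic receives its own copy of each message, and consumers that share a channel split that channel's messages between them. Below is a minimal in-memory sketch of that fan-out with no NSQ involved. The `Topic` class and the strict round-robin choice of consumer are illustrative assumptions; a real nsqd distributes messages according to which consumers are ready.

```python
from collections import defaultdict
from itertools import cycle


class Topic:
    """Toy model of an NSQ topic (hypothetical helper, not part of any NSQ library)."""

    def __init__(self, name):
        self.name = name
        self._consumers = defaultdict(list)  # channel name -> consumer inboxes
        self._pickers = {}                   # channel name -> round-robin iterator

    def subscribe(self, channel, inbox):
        """Attach a consumer inbox (a plain list) to a channel of this topic."""
        self._consumers[channel].append(inbox)
        # Rebuild the round-robin iterator so it sees the new consumer.
        self._pickers[channel] = cycle(list(self._consumers[channel]))

    def publish(self, message):
        """Copy the message to every channel; one consumer per channel receives it."""
        for picker in self._pickers.values():
            next(picker).append(message)


topic_a = Topic("A")
app_b, app_c1, app_c2 = [], [], []
topic_a.subscribe("B", app_b)    # channel B has a single consumer
topic_a.subscribe("C", app_c1)   # channel C is shared by two consumers
topic_a.subscribe("C", app_c2)

for n in range(4):
    topic_a.publish(f"msg-{n}")

print(app_b)   # every message
print(app_c1)  # half of channel C's messages
print(app_c2)  # the other half
```

Application B sees all four messages, while C1 and C2 each see two of them, which is the behaviour that makes channels useful for scaling one pipeline step horizontally.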
  7. Serialization
    • JSON
      ◦ Text
    • BSON
      ◦ MongoDB oriented
      ◦ No RFC
    • Pickle
      ◦ Python specific
    • Protobuf
      ◦ Code generation
      ◦ Complex external systems interoperability
    • CBOR
      ◦ RFC
      ◦ Schemaless
      ◦ Has Schema (CDDL) RFC
      ◦ Worth a try ☺
  8. NSQ Adoption
    • Simple binary message payload
    • Need to choose serialization
    • Maintain schema
    • Manage topics/channels structure
    • Manage channel overflow
  9. Journey plan
    • Choose a Python library
    • Support CBOR serialization
    • A way to maintain the schema
    • We need a better buffer for NSQ
  10. Journey plan (repeated; see slide 9)
  11. Choose a Python library
    • PyNSQ
      ◦ Official
      ◦ Tornado
    • Gnsq
      ◦ Gevent
    • Asyncnsq
      ◦ asyncio
    • Ansq
      ◦ asyncio too
  12. ansq: how it works
    from ansq import open_connection
    # The calls below run inside a coroutine.
    nsq = await open_connection()
    await nsq.pub(topic='books', message='War and Peace')
    await nsq.subscribe(topic='books', channel='student')
    async for message in nsq.messages():
        print(f"Book updated: {message}")
        await message.fin()
  13. Journey plan (repeated; see slide 9)
  14. Support CBOR serialization: cbor2 how-to
    >>> import cbor2
    >>> data = cbor2.dumps(['py', 'con'])
    >>> data
    b'\x82bpyccon'
    >>> cbor2.loads(data)
    ['py', 'con']
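To see where the bytes `b'\x82bpyccon'` on the slide above come from, here is a hand-rolled sketch of the CBOR encoding (RFC 8949) for the two types used there: short text strings and short arrays. The `encode` and `_head` helpers are hypothetical and exist only for illustration; real code should use cbor2 as the slide does.

```python
def _head(major_type, length):
    """Build the one-byte CBOR head: 3 bits of major type, 5 bits of length."""
    if length >= 24:
        # Longer items need extra length bytes, which this sketch leaves out.
        raise ValueError("this sketch only handles lengths below 24")
    return bytes([(major_type << 5) | length])


def encode(value):
    """Encode a str or a list of encodable values into CBOR bytes."""
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return _head(3, len(raw)) + raw  # major type 3: text string
    if isinstance(value, list):
        items = b"".join(encode(item) for item in value)
        return _head(4, len(value)) + items  # major type 4: array
    raise TypeError(f"unsupported type: {type(value).__name__}")


data = encode(['py', 'con'])
print(data)  # b'\x82bpyccon'
```

The first byte `0x82` is "array of 2 items"; `b` (`0x62`) is "text of 2 bytes" followed by `py`; `c` (`0x63`) is "text of 3 bytes" followed by `con`. The payload carries no field names or schema, which is why the talk pairs CBOR with a separate schema layer.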
  15. Journey plan (repeated; see slide 9)
  16. A way to maintain the schema
    • Around 20 different schemas
    • 11 different applications from video stream to view
    • Data types
    • Soft schema upgrade
  17. A way to maintain the schema: attrs
    >>> import attr
    >>> @attr.s(auto_attribs=True)
    ... class Book:
    ...     title: str = attr.ib(
    ...         converter=str.capitalize,
    ...         validator=is_str,
    ...         default=''
    ...     )
  18. A way to maintain the schema: attrs
    >>> book = Book(title='War and Peace')
    >>> payload = attr.asdict(book)
    >>> payload
    {'title': 'War and peace'}
    >>> Book(**payload)
    Book(title='War and peace')
  19. A way to maintain the schema: nested objects
    >>> class Box(BaseSchema):
    ...     x0: float; y0: float; ...
    ...
    >>> class Prediction(BaseSchema):
    ...     box: Box
    ...     value: float
    ...
  20. A way to maintain the schema: nested objects
    >>> message = Prediction(bbox=BBox(value=1))
    >>> payload = attr.asdict(message)
    >>> payload
    {'bbox': {'value': 1}}
    >>> Prediction(**payload)  # the nested bbox stays a plain dict
    Prediction(bbox={'value': 1})
  21. A way to maintain the schema: cattrs
    >>> import cattr
    >>> payload = cattr.unstructure(message)
    >>> payload
    {'bbox': {'value': 1}}
    >>> cattr.structure(payload, Prediction)
    Prediction(bbox=BBox(value=1))
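The nested-object problem from slides 20 and 21 can be reproduced with the standard library alone. The sketch below swaps attrs for stdlib `dataclasses` and swaps cattrs for a small hand-written `structure` function, so it is an illustration of the idea and not how cattrs works internally. `BBox` and `Prediction` mirror the slide names; `structure` here is a hypothetical helper.

```python
from dataclasses import asdict, dataclass, fields, is_dataclass


@dataclass
class BBox:
    value: float


@dataclass
class Prediction:
    bbox: BBox


def structure(payload, cls):
    """Rebuild `cls` from a plain dict, recursing into dataclass-typed fields."""
    kwargs = {}
    for field in fields(cls):
        item = payload[field.name]
        # A nested dataclass arrives as a dict and must be rebuilt explicitly.
        if is_dataclass(field.type) and isinstance(item, dict):
            item = structure(item, field.type)
        kwargs[field.name] = item
    return cls(**kwargs)


message = Prediction(bbox=BBox(value=1))
payload = asdict(message)                  # nested objects become plain dicts
naive = Prediction(**payload)              # bbox is left as a dict
restored = structure(payload, Prediction)  # bbox is a BBox again

print(payload)
print(type(naive.bbox).__name__)
print(restored == message)
```

Unpacking the payload with `**` only restores the top level, so anything that later reads `naive.bbox.value` fails. A structuring step driven by the type annotations closes that gap, which is the role cattrs plays in the talk.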
  22. Journey plan (repeated; see slide 9)
  23. NSQ can’t deal with backpressure
    • “Backpressure explained — the resisted flow of data through software”, Jay Phelps, VP of Engineering @ Outsmartly
      https://medium.com/@jayphelps/backpressure-explained-the-flow-of-data-through-software-2350b3e77ce7
    • “I'm not feeling the async pressure”, Armin Ronacher, Director Of Engineering @ Sentry
      https://lucumr.pocoo.org/2020/1/1/async-pressure/
  24. NSQ and backpressure: ephemeral channels
    await nsq.subscribe(
        topic='book', channel='student#ephemeral'
    )
    async for message in nsq.messages():
        await message.fin()
    # Channel is empty
    await nsq.subscribe(
        topic='book', channel='student#ephemeral'
    )
  25. NSQ and backpressure: application-side buffer
    class Buffer(collections.deque): ...
    async def write_buffer():
        async for message in reader:
            buffer.put_nowait(message)
    loop.create_task(write_buffer())
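Slide 25 elides the body of `Buffer`. One possible reading, sketched below, is a bounded deque that silently drops the oldest message when it is full, so a slow consumer always works on the freshest frames. The drop-oldest policy, the `maxlen` value, the `get_nowait` method, and the `fake_reader` generator that stands in for the NSQ reader are all assumptions made for this example and are not taken from the talk.

```python
import asyncio
import collections


class Buffer(collections.deque):
    """Bounded buffer; when full, the oldest message is discarded (assumed policy)."""

    def put_nowait(self, message):
        # A deque created with maxlen drops items from the left on append.
        self.append(message)

    def get_nowait(self):
        # Raises IndexError when the buffer is empty.
        return self.popleft()


async def fake_reader(count):
    """Stand-in for an NSQ reader: yields `count` messages."""
    for n in range(count):
        yield f"frame-{n}"
        await asyncio.sleep(0)  # give other tasks a chance to run


async def write_buffer(reader, buffer):
    """Copy every incoming message into the buffer without ever blocking."""
    async for message in reader:
        buffer.put_nowait(message)


async def main():
    buffer = Buffer(maxlen=3)
    writer = asyncio.create_task(write_buffer(fake_reader(5), buffer))
    await writer  # a real consumer would drain the buffer concurrently
    return list(buffer)


result = asyncio.run(main())
print(result)  # ['frame-2', 'frame-3', 'frame-4']
```

Five messages arrive but only the last three survive, so memory stays bounded regardless of how far the consumer falls behind. Whether dropping data is acceptable depends on the pipeline; for video frames it often is, for final results it usually is not.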
  26. NSQ in Python
    • NSQ just works
    • ansq library
    • CBOR for serialization
    • attrs + cattrs for schema
    • … some customization may be required