PyBay
August 21, 2017

2017 - Performant Asynchronous Programming at Quora

Description
In this talk, we will discuss the design of Quora's asynq framework, which provides an asynchronous API to a global scheduler for data requests. We will explore in depth the common problem that motivated it, the design of the framework, and how it has been used in practice to make both the product and development faster at Quora.

Abstract
In order to provide a fast distributed web application to millions of Quora users, we need to be smart about batching data requests to minimize the time spent blocked on network I/O. Moreover, it's important to accomplish this batching in a general way that doesn't require repetitive work every time we make a change or require new data. In this talk, we will discuss the design of Quora's asynq framework, which provides an asynchronous API to a global scheduler for data requests. We will explore in depth the common problem that motivated it, the design of the framework, and how it has been used in practice to make both the product and development faster at Quora.

Bio
Riley Patterson is a software engineering manager on the Platform Frameworks Team at Quora. Our core web application platform is built in Python on top of a web framework that we built on the core of Pylons. As such, the Platform Team uses and builds a wide variety of Python tools and abstractions to enable faster, more effective, and more enjoyable development across the entire team at Quora.

https://www.youtube.com/watch?v=0iqibyfxw3w


Transcript

  1. • Most of our data is stored in MySQL or HBase
     • These services are reliable but they are also slow
  2. def render_profile(uid):
         name = render_name(uid)
         photo = render_profile_photo(uid)
         short_bio = render_short_bio(uid)
         follow_button = render_follow_button(uid)
         ...
  3. def prime_render_profile(uid):
         mc.multiget([
             'user-name',
             'short-bio',
             'image-url',
             'follow-count',
             ...
         ])

     def render_profile(uid):
         name = render_name(uid)
         photo = render_profile_photo(uid)
         short_bio = render_short_bio(uid)
         follow_button = render_follow_button(uid)
         ...
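The cost that priming avoids can be made concrete with a small sketch. `FakeCache` below is a hypothetical in-memory stand-in for the `mc` client on the slide, not Quora's actual client; it only counts how many simulated round trips each approach makes, and the sample values are invented for illustration.

```python
class FakeCache:
    """Hypothetical in-memory stand-in for a memcache-style client."""

    def __init__(self, data):
        self.data = dict(data)
        self.round_trips = 0  # each call models one network round trip

    def get(self, key):
        self.round_trips += 1
        return self.data.get(key)

    def multiget(self, keys):
        self.round_trips += 1
        return {key: self.data.get(key) for key in keys}


KEYS = ["user-name", "short-bio", "image-url", "follow-count"]
DATA = {
    "user-name": "Ada",
    "short-bio": "Engineer",
    "image-url": "/a.png",
    "follow-count": 3,
}

# Unbatched: every render helper fetches its own key, one round trip each.
unbatched = FakeCache(DATA)
unbatched_values = [unbatched.get(key) for key in KEYS]

# Primed: one multiget up front, then the helpers read the local results.
primed = FakeCache(DATA)
primed_by_key = primed.multiget(KEYS)
primed_values = [primed_by_key[key] for key in KEYS]
```

Both approaches return the same values; the difference is four simulated round trips versus one, at the price of listing the keys separately from the code that uses them.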
  4. • Developers have to write most application logic twice
       ◦
       ◦ Once for the actual logic that uses that data
     • And we have to maintain code in two places
  5. def gen(arg):
         g1 = yield arg * 2
         g2 = yield g1 + 1
         yield g2 / 2

     def caller():
         obj = gen(10)
         v1 = obj.next()
         v2 = obj.send(v1 - 10)
         return obj.send(v2 + 1)
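The slide's example is written for Python 2. A Python 3 rendering, annotated with the value that crosses the generator boundary at each step, might look like the following. It assumes `next(obj)` in place of `obj.next()` and `//` in place of `/` so that the final result stays an integer, as it would under Python 2.

```python
def gen(arg):
    g1 = yield arg * 2   # the first next() runs to here and hands back arg * 2
    g2 = yield g1 + 1    # send(x) resumes with g1 = x, then hands back g1 + 1
    yield g2 // 2        # send(y) resumes with g2 = y, then hands back g2 // 2


def caller():
    obj = gen(10)
    v1 = next(obj)            # v1 = 20
    v2 = obj.send(v1 - 10)    # g1 = 10, so v2 = 11
    return obj.send(v2 + 1)   # g2 = 12, so the result is 6
```

Each `yield` is both an output (the value handed to the caller) and an input (the value the caller passes back with `send`), which is the property a scheduler can build on.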
  10. @async()
      def render_name(uid):
          name = yield Future('user-name:' + uid)
          return '<b>' + name + '</b>'

      def scheduler():
          gen = render_name(uid)
          obj = gen.next()
          val = mc.get(obj.key)
          gen.send(val)
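A runnable Python 3 approximation of this slide is sketched below. It is not asynq's implementation: `Future` and `run` are minimal hypothetical stand-ins, a plain dict replaces `mc`, and the `@async()` decorator is omitted because `async` is a reserved keyword from Python 3.7 onward. The generator's `return` value is read from `StopIteration.value`.

```python
class Future:
    """Hypothetical placeholder describing a value the generator is waiting for."""

    def __init__(self, key):
        self.key = key


def render_name(uid):
    name = yield Future("user-name:" + uid)
    return "<b>" + name + "</b>"


def run(generator, cache):
    """Drive one generator to completion, resolving each yielded Future from cache."""
    try:
        future = next(generator)
        while True:
            # Resume the generator with the fetched value; it yields the next Future.
            future = generator.send(cache[future.key])
    except StopIteration as stop:
        # A generator's return value travels on the StopIteration exception.
        return stop.value


CACHE = {"user-name:42": "Ada"}  # illustrative data only
html = run(render_name("42"), CACHE)
```

The rendering function never talks to the cache itself; it only describes what it needs, which leaves the scheduler free to decide when and how to fetch.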
  11. def render_profile(uid):
          name = render_name(uid)
          photo = render_profile_photo(uid)
          bio = render_bio(uid)
          follow_button = render_follow_button(uid)
          ...
  12. @async()
      def render_profile(uid):
          (photo, name, bio, follow_button) = yield (
              render_profile_photo(uid),
              render_name(uid),
              render_bio(uid),
              render_follow_button(uid)
          )
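Yielding several calls at once is what gives a scheduler the chance to batch. The sketch below illustrates the idea in simplified form and is not asynq's actual scheduler: `run_all`, `Future`, and `CountingCache` are hypothetical names, and the caller passes the list of generators directly instead of yielding a tuple. Every generator is advanced until it blocks on a `Future`, then all blocked keys are fetched with a single `multiget`.

```python
class Future:
    """Hypothetical placeholder for a value fetched by key."""

    def __init__(self, key):
        self.key = key


class CountingCache:
    """Hypothetical cache client that counts simulated round trips."""

    def __init__(self, data):
        self.data = dict(data)
        self.round_trips = 0

    def multiget(self, keys):
        self.round_trips += 1
        return {key: self.data[key] for key in keys}


def step(generator, value):
    """Advance one generator; return (done, payload)."""
    try:
        return False, generator.send(value)  # payload is the next Future
    except StopIteration as stop:
        return True, stop.value              # payload is the return value


def run_all(generators, cache):
    """Run generators together, issuing one multiget per round of blocked Futures."""
    results = [None] * len(generators)
    blocked = {}  # index -> Future that generator is waiting on
    for index, generator in enumerate(generators):
        # send(None) on a fresh generator is equivalent to next().
        done, payload = step(generator, None)
        if done:
            results[index] = payload
        else:
            blocked[index] = payload
    while blocked:
        values = cache.multiget([future.key for future in blocked.values()])
        still_blocked = {}
        for index, future in blocked.items():
            done, payload = step(generators[index], values[future.key])
            if done:
                results[index] = payload
            else:
                still_blocked[index] = payload
        blocked = still_blocked
    return tuple(results)


def render_name(uid):
    name = yield Future("user-name:" + uid)
    return "<b>" + name + "</b>"


def render_bio(uid):
    bio = yield Future("short-bio:" + uid)
    return "<p>" + bio + "</p>"


cache = CountingCache({"user-name:42": "Ada", "short-bio:42": "Engineer"})
name, bio = run_all([render_name("42"), render_bio("42")], cache)
```

Both helpers are written as if they fetched their own data, yet the cache sees a single round trip, with no separate priming function to keep in sync.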
  13. • Relationship to asyncio
        ◦ More constrained API for users
        ◦ Minority of our code is the scheduler, which we originally built in Python 2.7 and which could largely be replaced with asyncio
  14. • Adventures with making this intuitive
        ◦ At Quora, developers with a wide variety of backgrounds work in our Python codebase, including designers with relatively little coding experience.
        ◦ Several API decisions were made to make using this as easy as or easier than priming
        ◦ Fun story about returning from generators
  15. • Migrating huge codebase to new data fetching API
        ◦ Made heavy use of static-analysis/AST-based auto migration scripts
        ◦ Saved 50%+ time (including developing those scripts) vs. an estimated manual approach
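The slides do not show the migration scripts themselves. As a rough illustration of what an AST-based first pass could look like, the sketch below uses Python's standard `ast` module to list functions that call `mc.get` directly, which would make them candidates for conversion. The function name `functions_calling` and the sample source are hypothetical, and a real migration tool would also need to rewrite the code, which this sketch does not attempt.

```python
import ast

# Illustrative input only; not taken from any real codebase.
SOURCE = '''
def render_name(uid):
    name = mc.get("user-name:" + uid)
    return "<b>" + name + "</b>"

def render_static():
    return "<hr>"
'''


def functions_calling(source, obj_name, method_name):
    """Return (function name, line number) for each obj_name.method_name(...) call."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.FunctionDef):
            continue
        for inner in ast.walk(node):
            if (
                isinstance(inner, ast.Call)
                and isinstance(inner.func, ast.Attribute)
                and inner.func.attr == method_name
                and isinstance(inner.func.value, ast.Name)
                and inner.func.value.id == obj_name
            ):
                found.append((node.name, inner.lineno))
    return found


candidates = functions_calling(SOURCE, "mc", "get")
```

Working on the syntax tree instead of on raw text means the search is not confused by comments, string contents, or formatting differences.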