High-Performance Asynchronous Applications with (or without) Ruby - By Scott Tadman

Scott Tadman's presentation at Burlington Ruby Conf. 2012

adam bouchard

July 30, 2012

Transcript

  1. PostageApp • Simply: “Email as a Service” • Created in 2009 as an experiment. • Boils down to: Laziness, Impatience and Hubris. 2 Monday, July 30, 12
  2. Why PostageApp? • Sending email is generally not fun. • Sending email correctly is even less fun.
  3. Why PostageApp? • Sending email is generally not fun: • Email support in most frameworks is usually primitive. • SMTP is not a reliable transport. • Creating a background job is not always possible.
  4. Why PostageApp? • Sending email correctly is even less fun: • Standards for composing HTML email are tricky. • Changes to templates usually require re-deploying. • Previews not always easy to simulate.
  5. Mission Accomplished • Prototype is a Ruby on Rails application • Sending engine is a Rails-based process. • 100% ActiveRecord and MySQL
  6. How does Rails Scale? • Front-end performance is great! • Rails background workers...not so great.
  7. Rails-Based Workers • Each worker process was limited: • Could only send one email at a time. • Every email required a new connection. • Process footprint was large (~100MB)
  8. Why Use Rails for Workers? • Easy. • Literally. • Synchronous, blocking code is: • Easy to write. • Not hard to debug.
  9. Why is Rails Synchronous? • Long tradition in PHP, Python, Perl, etc. • Database calls are supposed to be fast. • Not fast enough? • Optimize. • Cache. • Optimize some more! Get creative.
  10. Limits of Synchronous • Some things can never be cached or optimized: • Third-party external servers. • Third-party external networks. • Basically stuff you have no control over.
  11. Problems with Blocking • Entire process grinds to a halt while waiting. • Memory is still in use. • Lights on, nobody home. • More workers requires more memory.
  12. If only... • Your processing application could: • Do other things while waiting for a response? • Juggle multiple jobs at the same time? • Make use of multiple CPU cores?
  13. Multi-Threaded Code • So you make your application multi-threaded. • Now yohaveu protwoblems: • Partition resources carefully? • Lock shared resources aggressively? • No magic bullet here. • Oh no. Deadlocks.
  14. Let’s Go Async! • All the cool kids are doing it. • JavaScript in the browser: • jQuery $.ajax • JavaScript on the server: • Node.js
  15. Asynchronous JavaScript • In the browser: • Blocking calls can lock the JavaScript VM. • Your computer never needs more beach-ball. • The “A” in AJAX actually stands for “Awesome” • The “X” stands for “JSON”
  16. Asynchronous Server? • Services implemented this way: • IRC • DNS • SNMP • SMTP!
  17. select() • Workhorse of traditional UNIX networking. • Improved upon with epoll and kqueue • Want to use those in Ruby? • EventMachine
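
A minimal sketch of the select() idea using only Ruby's standard `IO.select`, with pipes standing in for network sockets. The helper name `read_ready` and the labels "dns" and "smtp" are made up for illustration; EventMachine itself is not used here.

```ruby
# Returns [label, line] pairs for every pipe that has data ready
# before the timeout (in seconds) expires. Returns [] on timeout.
def read_ready(readers, timeout)
  ready, = IO.select(readers.keys, nil, nil, timeout)
  (ready || []).map { |io| [readers[io], io.gets.chomp] }
end

dns_reader, _dns_writer = IO.pipe
smtp_reader, smtp_writer = IO.pipe

# Only the "smtp" peer has said anything so far.
smtp_writer.write("220 ready\n")

read_ready({ dns_reader => "dns", smtp_reader => "smtp" }, 1).each do |label, line|
  puts "#{label}: #{line}"
end
```

One call watches many descriptors at once, so a single process can wait on all of its connections without blocking on any one of them.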
  18. EventMachine • Ruby’s answer to Node.js • Implements the “Event Machine” pattern. • Built on similar libraries. • Recognizes slow networking is the problem to solve.
  19. Network Calls • If your application is heavily networked: • Database queries. • HTTP requests. • External APIs. • Shared cache systems.
  20. Event Driven • An “event” is something that happens. • Client example: Clicking on a page. • Server example: Receiving DNS request.
  21. Event Loop • In a nutshell: • Wait for stuff to happen. • When stuff happens, deal with it... • ...quickly. • Seriously, are you done yet?
  22. Event Responses • Break up longer operations into small units of work. • Keep the event stream flowing. • For best results: • Use external resources to do heavy lifting (DB, etc.) • Enable long-running tasks to pause and resume. • “Break time”
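
The two slides above can be sketched as a toy event loop. This is an illustration in plain Ruby, not EventMachine's implementation; `TinyLoop`, `next_tick`, and `sum_in_chunks` are hypothetical names. A long summation is broken into small slices, and the loop gets control back between slices so other events are not starved.

```ruby
# A toy event loop: a queue of pending callbacks, drained one at a time.
class TinyLoop
  def initialize
    @queue = []
  end

  # Schedule a block to run on a later pass of the loop.
  def next_tick(&block)
    @queue << block
  end

  # Run until no work remains. Each callback is expected to return quickly.
  def run
    @queue.shift.call until @queue.empty?
  end
end

# Sums the numbers one slice at a time, rescheduling itself between slices.
def sum_in_chunks(reactor, numbers, chunk_size, total = 0, &done)
  total += numbers.first(chunk_size).sum
  rest = numbers.drop(chunk_size)
  if rest.empty?
    done.call(total)
  else
    reactor.next_tick { sum_in_chunks(reactor, rest, chunk_size, total, &done) }
  end
end

reactor = TinyLoop.new
log = []
sum_in_chunks(reactor, (1..10).to_a, 3) { |total| log << "sum=#{total}" }
reactor.next_tick { log << "other event" }
reactor.run
puts log.inspect
```

The "other event" is handled while the summation is still in progress, which is the point of keeping each unit of work small.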
  23. Blocks • A block of code that’s passed in to a function. • Many ways to create: • lambda • Proc.new • do ... end • Blocks make Ruby very flexible.
  24. Definition of a Block • A small unit of code passed to a method. • May be executed zero or more times. • May be executed immediately... • ...or at some unspecified time in the future.
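
A short illustration of the points above, using only core Ruby. The method names `capture`, `run_times`, and `run_later` are invented for this sketch.

```ruby
# Three ways to create a callable, as listed on the slide.
doubler_lambda = lambda { |x| x * 2 }
doubler_proc   = Proc.new { |x| x * 2 }

# A method can capture its do...end block as a Proc object.
def capture(&block)
  block
end

doubler_block = capture do |x|
  x * 2
end

# A block may be executed zero or more times, immediately...
def run_times(count)
  count.times { |i| yield(i) }
end

# ...or stored and executed at some later time.
DEFERRED = []
def run_later(&block)
  DEFERRED << block
end

puts [doubler_lambda, doubler_proc, doubler_block].map { |callable| callable.call(21) }.inspect
run_times(0) { |i| puts "never printed (#{i})" }
run_later { puts "executed later" }
DEFERRED.each(&:call)
```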
  25. lambda vs. function() • Ruby has lambda { } • JavaScript has function () • Superficially very similar. • Ruby’s syntax advantage: • Append do...end to any method call. • Makes passing methods almost too easy.
  26. Async JavaScript • Example: • async_method(function(r) { ... }); • async_method({ callback: function(r) { ... } }); • $.ajax(...).done(function (data) { ... });
  27. Async JavaScript • Example from jQuery’s documentation:

      var jqxhr = $.ajax( "example.php" )
        .done(function() { alert("success"); })
        .fail(function() { alert("error"); })
        .always(function() { alert("complete"); });

    • That looks easy enough. • What about chaining operations? • Uh oh...
  28. Asynchronous Ruby • Give a method a block to call: • ...when the operation is complete. • ...when something went wrong. • ...when it timed out.
  29. Asynchronous Ruby • Chaining operations: • Often the result of one action informs the next: • Fetching additional records. • Error recovery. • May skip steps if data already cached.
  30. Nested Calls • Typical Example: • User... • ...belongs to an Account... • ...which has Notices. • ...has many Messages.
  31. Nested Calls • Synchronous Implementation:

      user = User.find(...)
      account = user.account
      notice = Notice.find(account.last_notice_id)
      m_count = Message.where(:user_id => user.id).count
  32. Nested Calls • Asynchronous Implementation:

      User.async_find(...) do |user|
        Account.async_find(user.account_id) do |account|
          Notice.async_find(account.last_notice_id) do |notice|
            Message.where(:user_id => user.id).async_count ...
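
A runnable sketch of the nested callback shape, assuming in-memory hashes in place of a database. Everything here is hypothetical scaffolding: `async_find`, `run_pending`, `load_notice_text`, and the sample records are not from ActiveRecord or any real library.

```ruby
# Hypothetical in-memory "tables" standing in for a real database.
USERS    = { 1 => { id: 1, account_id: 10 } }.freeze
ACCOUNTS = { 10 => { id: 10, last_notice_id: 100 } }.freeze
NOTICES  = { 100 => { id: 100, text: "Invoice ready" } }.freeze

PENDING = [] # callbacks waiting for the loop to run them

# Hypothetical async finder: hands the record to the block on a later pass.
def async_find(table, id, &callback)
  PENDING << -> { callback.call(table[id]) }
end

def run_pending
  PENDING.shift.call until PENDING.empty?
end

# Each step needs the result of the previous one, so the calls nest.
def load_notice_text(user_id, &done)
  async_find(USERS, user_id) do |user|
    async_find(ACCOUNTS, user[:account_id]) do |account|
      async_find(NOTICES, account[:last_notice_id]) do |notice|
        done.call(notice[:text])
      end
    end
  end
end

load_notice_text(1) { |text| puts "Latest notice: #{text}" }
run_pending
```

Three dependent lookups already produce three levels of indentation, and nothing is printed until the loop has made three passes.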
  33. Asynchronous Calls • Without convention you have anarchy: • Multiple callback styles. • Differing return types and result codes. • Predictable, dependable behavior is essential. • Limitations imposed by Ruby need to be respected: • Don’t try to make it something it isn’t.
  34. Asynchronous House Rules • A well-behaving asynchronous method will... • Call the supplied block with a well-defined response: • Once and once only. • Always. • Even if something unexpected happens. • Never trigger any exceptions it can’t handle.
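
One possible way to follow these rules, sketched with a hypothetical `safe_fetch` method. The `[status, value]` response shape is an assumption made for this example, not a convention taken from the talk. For brevity the callback is invoked inline rather than from an event loop.

```ruby
# Looks up a key and reports the outcome through the block.
# The block receives a status (:ok or :error) and a value, exactly once.
def safe_fetch(source, key, &callback)
  status, value =
    begin
      [:ok, source.fetch(key)]
    rescue StandardError => e
      # Something unexpected happened: report it instead of raising.
      [:error, e.message]
    end

  # Called outside the begin/rescue, so a failure inside the callback
  # cannot cause it to be invoked a second time.
  callback.call(status, value)
  nil
end

safe_fetch({ a: 1 }, :a)       { |status, value| puts "#{status}: #{value}" }
safe_fetch({ a: 1 }, :missing) { |status, value| puts "#{status}: #{value}" }
```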
  35. Asynchronous Exceptions • Don’t do things that might cause trouble: • Be aware of what exceptions methods can cause. • Catch and handle them where they occur. • Exceptions will not be caught by the caller. • Exceptions will crash your entire application.
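
A small demonstration of why the caller cannot catch these exceptions, using a hypothetical deferred queue (`defer`, `run_loop`) in place of a real event loop. By the time the callback runs, the method that scheduled it has already returned, so its `rescue` is no longer on the stack.

```ruby
PENDING_CALLBACKS = []

def defer(&block)
  PENDING_CALLBACKS << block
end

def run_loop
  PENDING_CALLBACKS.shift.call until PENDING_CALLBACKS.empty?
end

# The rescue here guards only the scheduling, not the later execution.
def schedule_unsafely
  defer { raise ArgumentError, "boom" }
  :scheduled
rescue ArgumentError
  :rescued_by_caller
end

# Catch the error where it occurs and report it through the callback.
def schedule_safely(&callback)
  defer do
    begin
      raise ArgumentError, "boom"
    rescue ArgumentError => e
      callback.call(:error, e.message)
    end
  end
end

schedule_safely { |status, message| puts "#{status}: #{message}" }
run_loop
```

Running `schedule_unsafely` followed by `run_loop` would raise out of the loop itself, which in a real reactor means the whole process goes down.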
  36. Asynchronous Conditions

      conditional_async_method do |result_id|
        if (cached = @cache[result_id])
          yield(cached)
        else
          fetch_and_cache(result_id) do |result|
            yield(result)
          end
        end
      end
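
A runnable variant of the conditional pattern on this slide, assuming a plain Hash as the slow backend. `NoticeStore`, `find`, and `run_pending` are hypothetical names; only `fetch_and_cache` and the cache check come from the slide.

```ruby
class NoticeStore
  attr_reader :fetches

  def initialize(backend)
    @backend = backend # hypothetical slow data source (a Hash here)
    @cache = {}
    @pending = []
    @fetches = 0
  end

  # The caller gets the record through the block either way:
  # immediately if cached, on a later loop pass if it must be fetched.
  def find(id, &callback)
    if (cached = @cache[id])
      callback.call(cached)
    else
      fetch_and_cache(id, &callback)
    end
  end

  def run_pending
    @pending.shift.call until @pending.empty?
  end

  private

  def fetch_and_cache(id, &callback)
    @fetches += 1
    @pending << lambda do
      @cache[id] = @backend[id]
      callback.call(@cache[id])
    end
  end
end

store = NoticeStore.new(1 => "Invoice ready")
store.find(1) { |notice| puts "first: #{notice}" }  # deferred fetch
store.run_pending
store.find(1) { |notice| puts "second: #{notice}" } # served from cache
```

The optional step (the fetch) is skipped on the second call, but the caller sees the same callback interface in both cases.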
  37. Leaving the Nest • Multiply nested asynchronous calls: • Tend to grow more complicated. • Optional steps are hard to express. • End up very hard to debug. • Make for a very deep stack. • Maybe there’s a better way.
  38. Leaving the Nest • Reasons to find an alternative: • Do you know what your application is doing? • What callbacks are still outstanding? • Why the application is not responding? • Where that asynchronous call was initiated?
  39. Stack Trace • EventMachine’s core is an event loop. • while (true) do ... end • The ... is the magical EventMachine stuff. • Asynchronous code not reflected in stack trace... • ...unless executing at that exact moment. • ...which is unlikely.
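
The stack trace point can be observed with plain Ruby and `Kernel#caller`. In this sketch `start_request` and `event_loop` are hypothetical stand-ins for an async API and EventMachine's loop: the callback's backtrace shows the loop that ran it, not the method that scheduled it.

```ruby
CALLBACK_QUEUE = []

# Schedules the callback and returns immediately.
def start_request(&callback)
  CALLBACK_QUEUE << callback
end

# Stand-in for the reactor: runs whatever callbacks are waiting.
def event_loop
  CALLBACK_QUEUE.shift.call until CALLBACK_QUEUE.empty?
end

# Returns the call stack as seen from inside the callback.
def frames_seen_by_callback
  frames = nil
  start_request { frames = caller.join("\n") }
  event_loop
  frames
end

puts frames_seen_by_callback
```

The printed frames mention `event_loop` but never `start_request`, which is why a backtrace alone rarely tells you where an asynchronous call was initiated.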
  40. State Machine Pattern • A state machine is: • One or more formally defined steps: • To complete an operation. • To handle conditions. • Basically like a flow-chart.
  41. State Machine Benefits • Using a state machine to track async code: • Encapsulates a multi-stage process. • Provides insight into completion status. • Easy to hook in to for logging and benchmarking. • Makes it easy to find where things have stalled. • Easily rendered as a diagram.
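
A minimal sketch of these benefits, assuming a made-up delivery process with five states. `DeliveryTracker` and its state names are hypothetical. The explicit transition table doubles as the flow-chart, and the history gives a hook for logging and for finding where a job stalled.

```ruby
class DeliveryTracker
  # Allowed transitions: the formally defined steps of the process.
  TRANSITIONS = {
    connecting: [:greeting, :failed],
    greeting:   [:sending, :failed],
    sending:    [:done, :failed]
  }.freeze

  attr_reader :state, :history

  def initialize
    @state = :connecting
    @history = [@state]
  end

  # Moves to a new state, rejecting transitions the table does not allow.
  def enter_state(new_state)
    allowed = TRANSITIONS.fetch(@state, [])
    unless allowed.include?(new_state)
      raise ArgumentError, "cannot go from #{@state} to #{new_state}"
    end
    @state = new_state
    @history << new_state
  end

  def finished?
    %i[done failed].include?(@state)
  end
end

tracker = DeliveryTracker.new
tracker.enter_state(:greeting)
tracker.enter_state(:sending)
puts "stalled in #{tracker.state}" unless tracker.finished?
puts tracker.history.inspect
```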
  42. Use Case: SMTP • Line-based protocol • Very simple syntax • Well documented in various RFCs
  43. Use Case: SMTP • Example command stream sent to remote server: • HELO myhostname.net • MAIL FROM:<[email protected]> • RCPT TO:<[email protected]> • DATA • QUIT
  44. State Machine Example • Example from remailer library:

      state :helo do
        enter do
          send_line("HELO #{hostname}")
        end

        interpret(250) do
          if (requires_authentication?)
            enter_state(:auth)
          else
            enter_state(:established)
          end
        end
      end
  45. State Machine Example • Example from remailer library:

      state :auth do
        enter do
          send_line("AUTH PLAIN #{encode_authentication(username, password)}")
        end

        interpret(235) do
          enter_state(:established)
        end

        interpret(535) do |reply_message, continues|
          handle_reply_continuation(535, reply_message, continues) do |reply_code, reply_message|
            error_notification(reply_code, reply_message)
            enter_state(:quit)
          end
        end
      end
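
To show how a `state` / `enter` / `interpret` DSL of this shape could be wired up, here is a toy interpreter. It is a guess at the general mechanism and not remailer's actual implementation. `TinyMachine`, `StateBuilder`, `receive`, `HelloMachine`, and the hostname `example.test` are all hypothetical, and lines are collected in an array instead of being written to a socket.

```ruby
class TinyMachine
  State = Struct.new(:on_enter, :replies)

  # Collects the enter and interpret blocks declared inside one state.
  class StateBuilder
    attr_reader :state

    def initialize
      @state = State.new(nil, {})
    end

    def enter(&block)
      @state.on_enter = block
    end

    def interpret(code, &block)
      @state.replies[code] = block
    end
  end

  def self.states
    @states ||= {}
  end

  # Class-level DSL entry point: state :name do ... end
  def self.state(name, &definition)
    builder = StateBuilder.new
    builder.instance_eval(&definition)
    states[name] = builder.state
  end

  attr_reader :current, :sent

  def initialize
    @sent = []
  end

  # Switches state and runs that state's enter block on this machine.
  def enter_state(name)
    @current = name
    on_enter = self.class.states.fetch(name).on_enter
    instance_exec(&on_enter) if on_enter
  end

  # Feeds a numeric reply code from the server into the current state.
  def receive(code)
    handler = self.class.states.fetch(@current).replies[code]
    instance_exec(&handler) if handler
  end

  def send_line(line)
    @sent << line
  end
end

class HelloMachine < TinyMachine
  state :helo do
    enter { send_line("HELO example.test") }
    interpret(250) { enter_state(:established) }
  end

  state :established do
  end
end

machine = HelloMachine.new
machine.enter_state(:helo)
machine.receive(250)
puts machine.sent.inspect
puts machine.current
```

Because every transition goes through `enter_state`, that single method is a natural place to add logging or timing.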
  46. State Machine Use Cases • Best applied to problems that: • Have a complicated, multi-step procedure. • May need to recover from crashes. • Require a standardized way of tracking progress. • Need insight into what’s “going on”.
  47. Fibers • Facility within Ruby since 1.8.6 • Even more confusing than blocks to the uninitiated. • Not many libraries make use of them. • ...but that seems to be changing.
  48. Fibers in Ruby • Fibers are like blocks that you can pause and resume. • Asynchronous operations often involve a lot of that. • So... • Fibers + asynchronous code are best friends?
  49. High Fiber Ruby • Instead of passing callback blocks to async methods... • ...yield control to that method... • ...then resume control when a response is received. • Seems simple enough, right? • Simplicity is a compelling reason to use them.
  50. High Fiber Ruby • Blocking synchronous code: • user = User.find(...) • account = user.account • ...
  51. High Fiber Ruby • Callback-driven asynchronous code: • User.find_async(...) do |user| • Account.where(...).find_async do |account| • ...
  52. High Fiber Ruby • Fibered asynchronous code: • user = User.find(...) • account = user.account • ...
  53. Fiber Wrapper Example

      def find_async(id)
        fiber = Fiber.current
        standard_async_query(...) do |result|
          fiber.resume(result)
        end
        Fiber.yield
      end

      Fiber.new do
        user = find_async(id)
      end
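
A self-contained version of the wrapper that can be run without EventMachine. The slide elides the query arguments, so this sketch substitutes a hypothetical `standard_async_query(id)` that queues its reply, plus a `run_replies` helper standing in for the event loop. The returned record is invented sample data.

```ruby
require 'fiber' # needed for Fiber.current on older Rubies

PENDING_REPLIES = []

# Hypothetical async query: the block fires later, when the loop reaches it.
def standard_async_query(id, &callback)
  PENDING_REPLIES << -> { callback.call({ id: id, name: "user-#{id}" }) }
end

# Reads like a blocking call, but suspends the current fiber until the
# reply arrives and then returns the result to the caller.
def find_async(id)
  fiber = Fiber.current
  standard_async_query(id) { |result| fiber.resume(result) }
  Fiber.yield
end

# Stand-in for the event loop delivering replies.
def run_replies
  PENDING_REPLIES.shift.call until PENDING_REPLIES.empty?
end

found = []
Fiber.new { found << find_async(7) }.resume
puts "before the loop runs: #{found.inspect}"
run_replies
puts "after the loop runs: #{found.inspect}"
```

The code inside the fiber has the same shape as the blocking version, while the fiber stays paused in `Fiber.yield` until the callback resumes it.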
  54. Fiber Future • Making Rails more Fiber-friendly. • Need more Fiber-aware libraries. • Not unlike being “thread-aware” or “thread-safe” • All non-trivial applications can benefit. • ...if the cost of change is low. • Fibers can make it easy.
  55. GitHub Resources • https://github.com/eventmachine/eventmachine • eventmachine • https://github.com/igrigorik • em-synchrony, em-websocket, etc. • em-mysqlplus based on em-mysql • https://github.com/tmm1/ • em-mysql