
Bringing Concurrency to Ruby - RubyConf India 2014

headius
April 10, 2014


A talk on the why and how of high-concurrency Ruby, delivered at RubyConf India 2014 in Goa.



Transcript

  1. Examples • Thread APIs: concurrency • Actor APIs: concurrency •

    Native thread, process: parallelism • If the underlying system supports it • SIMD, GPU, vector operations: parallelism
  2. You Need Both • Work that can split into concurrent

    jobs • Platform that runs those jobs in parallel • In an ideal world, scales with job count • In our world, each job adds overhead
  3. Process-level Concurrency • Separate processes running concurrently • As parallel

    as OS/CPU can make them • Low risk due to isolated memory space • High memory requirements • High communication overhead
  4. Thread-level Concurrency • Threads in-process running concurrently • As parallel

    as OS/CPU can make them • Higher risk due to shared memory space • Lower memory requirements • Low communication overhead
  5. Popular Platforms

    Platform | Concurrency | Parallelism | GC | Notes
    MRI 1.8.7 | ✔ | ✘ | Single thread, stop-the-world | Large C core would need much work
    MRI 1.9+ | ✔ | ✘ | Single thread, stop-the-world | Few changes since 1.9.3
    JRuby (JVM) | ✔ | ✔ | Many concurrent and parallel options | JVM is the “best” platform for conc
    Rubinius | ✔ | ✔ | Single thread, stop-the-world, partial concurrent old gen | Promising, but a long road ahead
    Topaz | ✘ | ✘ | Single thread, stop-the-world | Incomplete impl
    Node.js (V8) | ✘ | ✘ | Single thread, stop-the-world | No threads in JS
    CPython | ✔ | ✘ | Reference-counting | Reference counting kills parallelism
    Pypy | ✔ | ✘ | Single thread, stop-the-world | Exploring STM to enable concurrency
  6. Timeslicing

    “Green” or “virtual” or “userspace” threads share a single native thread. The CPU then schedules that thread on available CPUs. [Diagram: Thread 1 to Thread 4 time-sliced onto a native thread, each switch labeled “Time’s up”]
  7. GVL: Global VM Lock

    In 1.9+, each thread gets its own native thread, but a global lock prevents concurrent execution. Time slices are finer grained and variable, but threads still can’t run in parallel. [Diagram: Thread 1 to Thread 4 on CPUs, with a “Lock xfer” at each switch]
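
One rough way to observe the effect of a global lock is to time the same CPU-bound job run serially and split across threads. The sketch below is illustrative only: `count_primes`, `serial_count`, and `threaded_count` are hypothetical helper names, and the timings depend entirely on the runtime and machine, so no numbers are claimed.

```ruby
require 'benchmark'

# Counts primes in an enumerable of integers (deliberately naive and CPU-bound).
def count_primes(numbers)
  numbers.count { |n| n > 1 && (2..Integer.sqrt(n)).none? { |d| (n % d).zero? } }
end

def serial_count(limit)
  count_primes(1..limit)
end

# Splits 1..limit into roughly equal chunks, one thread per chunk.
def threaded_count(limit, thread_count)
  chunk_size = (limit / thread_count.to_f).ceil
  threads = (1..limit).each_slice(chunk_size).map do |chunk|
    Thread.new { count_primes(chunk) }
  end
  threads.sum(&:value) # Thread#value joins the thread and returns its result
end

if __FILE__ == $PROGRAM_NAME
  limit = 50_000
  puts "serial:   #{Benchmark.realtime { serial_count(limit) }.round(3)}s"
  puts "threaded: #{Benchmark.realtime { threaded_count(limit, 4) }.round(3)}s"
end
```

On a runtime with a global lock the two timings would be expected to land close together; on a runtime with parallel threads the threaded run has room to drop.
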
  8. Why Do We See Parallelism? • Hotspot JVM has many

    background threads • GC with concurrent and parallel options • JIT threads • Signal handling • Monitoring and management
  9. Time Matters Too

    [Chart: time per iteration for MRI 1.8.7, MRI 1.9.3, and JRuby] • Nearly 10x faster than 1.9.3
  10. Rules of Concurrency 1. Don’t do it, if you don’t

    have to. 2. If you must do it, don’t share data. 3. If you must share data, make it immutable. 4. If it must be mutable, coordinate all access.
  11. #1: Don’t • Many problems won’t benefit • Explicitly sequential

    things, e.g. • Bad code can get worse • Multiply perf, GC, alloc overhead by N • Fixes may not be easy (esp. in Ruby) • The risks can get tricky to address
  12. I’m Not Perfect • Wrote a naive algorithm • Measured

    it taking N seconds • Wrote the concurrent version • Measured it taking roughly N seconds • Returned to original to optimize
  13. Fix Single-thread First!

    [Chart: big_list time in seconds for versions v1 to v4] • String slice instead of unpack/pack • Simpler loops • Stream from file
  14. [Chart: time in seconds processing a 23M word file, comparing non-threaded, two threads, and four threads]
  15. Before Conc Work • Fix excessive allocation (and GC) •

    Fix algorithmic complexity • Test on the runtime you want to target • If serial perf is still poor after optimization, the task, runtime, or system may not be appropriate for a concurrent version.
  16. #2: Don’t Share Data • Process-level concurrency • …have to

    sync up eventually, though • Threads with their own data objects • Rails request objects, e.g. • APIs with a “master” object, usually • Weakest form of concurrency
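
A minimal sketch of "threads with their own data objects", assuming a simple word-length tally as the job. The method name `word_lengths_by_thread` is made up for illustration. Each thread builds a private Hash, and the results are merged only after the threads finish, which is the "sync up eventually" step.

```ruby
# Each thread receives its own slice and fills its own Hash;
# nothing is shared while the threads are running.
def word_lengths_by_thread(words, thread_count)
  slice_size = [(words.size / thread_count.to_f).ceil, 1].max
  threads = words.each_slice(slice_size).map do |slice|
    Thread.new(slice) do |own_words|
      own_words.each_with_object(Hash.new(0)) { |word, counts| counts[word.length] += 1 }
    end
  end

  # Merge on the calling thread once every worker is done.
  threads.map(&:value).reduce(Hash.new(0)) do |total, partial|
    total.merge(partial) { |_length, a, b| a + b }
  end
end
```
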
  17. #3: Immutable Data • In other words… • Data can

    be shared • Threads can pass it around safely • Cross-thread view of data can’t mutate • Threads can’t see concurrent mutations as they happen, avoiding data races
  18. Object#freeze • Simplest mechanism for immutability • For read-only: make

    changes, freeze • Read-mostly: dup, change, freeze, replace • Write-mostly: same, but O(n) complexity
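
The read-mostly recipe (dup, change, freeze, replace) can be sketched as below. `FrozenConfig` is a hypothetical class, not part of any library. The freeze is shallow, so values stored in the Hash would need their own freeze. Whether other threads see the swapped reference promptly depends on the runtime, which this sketch does not address.

```ruby
class FrozenConfig
  def initialize(settings = {})
    @settings = settings.dup.freeze
    @write_lock = Mutex.new # serializes writers only; readers take no lock
  end

  # Readers get the current frozen snapshot and can keep using it safely.
  def snapshot
    @settings
  end

  def update(key, value)
    @write_lock.synchronize do
      copy = @settings.dup     # dup of a frozen Hash is not frozen (O(n) copy)
      copy[key] = value        # change
      @settings = copy.freeze  # freeze, then replace the reference
    end
  end
end
```

A reader holding an older snapshot keeps seeing the old contents, which is the point: it never observes a half-finished change.
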
  19. Immutable Data Structure • Designed to avoid visible mutation but

    still have good performance characteristics • Copy-on-write is poor-man’s IDS • Better: persistent data structures like Ctrie http://en.wikipedia.org/wiki/Ctrie
  20. Persistent? • Collection you have a reference to is guaranteed

    never to change • Modifications return a new reference • …and only duplicate affected part of trie
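
To make "modifications return a new reference" concrete, here is a sketch using a much simpler structure than a trie: a singly linked persistent list. `PersistentList` is a hypothetical class written for this illustration. It is not how Hamster or a Ctrie is implemented, but it shows the same sharing idea, since the new list reuses the old one as its tail.

```ruby
class PersistentList
  include Enumerable
  attr_reader :head, :tail

  def initialize(head, tail)
    @head = head
    @tail = tail
    freeze # nodes never change after construction
  end

  EMPTY = new(nil, nil)

  def empty?
    equal?(EMPTY)
  end

  # Returns a NEW list; the receiver is reused as the tail, not copied.
  def prepend(item)
    PersistentList.new(item, self)
  end

  def each
    node = self
    until node.empty?
      yield node.head
      node = node.tail
    end
  end
end
```
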
  21. Hamster • Pure-Ruby persistent data structures • Set, List, Stack,

    Queue, Vector, Hash • Based on Clojure’s Ctrie collections • https://github.com/hamstergem/hamster
  22.

    person = Hamster.hash(
      :name => "Simon",
      :gender => :male)
    # => {:name => "Simon", :gender => :male}

    person[:name]
    # => "Simon"
    person.get(:gender)
    # => :male

    friend = person.put(:name, "James")
    # => {:name => "James", :gender => :male}
    person
    # => {:name => "Simon", :gender => :male}
    friend[:name]
    # => "James"
    person[:name]
    # => "Simon"
  23. Coming Soon • Reimplementation by Smit Shah • Mostly “native”

    impl of Ctrie • Considerably better perf than Hamster • https://github.com/Who828/persistent_data_structures
  24. Other Techniques • Known-immutable data like Symbol, Fixnum • Mutate

    for a while, then freeze • Hand-off: if you pass mutable data, assume you can’t mutate it anymore • Sometimes enforced by runtime, e.g. “thread-owned objects”
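
A small sketch of "mutate for a while, then freeze", with hypothetical helper names `build_word_list` and `longest_per_thread`. The list is built with ordinary mutation while only one method can see it, then frozen (elements included) before any thread touches it.

```ruby
# Build phase: plain mutation, no other thread has a reference yet.
def build_word_list(raw_lines)
  words = []
  raw_lines.each { |line| words << line.strip.downcase }
  words.each(&:freeze) # freeze the elements too; Array#freeze alone is shallow
  words.freeze
end

# Share phase: threads only read the frozen data.
def longest_per_thread(words, thread_count)
  slice_size = [(words.size / thread_count.to_f).ceil, 1].max
  words.each_slice(slice_size).map do |slice|
    Thread.new { slice.max_by(&:length) }
  end.map(&:value)
end
```
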
  25. #4: Synchronize Mutation • Trickiest to get right; usually best

    perf • Fully-immutable generates lots of garbage • Locks, atomics, and specialized collections
  26. Locks • Avoid concurrent operations • Read + write, in

    general • Many varieties: reentrant, read/write • Many implementations
  27. Mutex • Simplest form of lock • Acquire, do work, release • Not reentrant

    semaphore = Mutex.new
    ...
    a = Thread.new {
      semaphore.synchronize {
        # access shared resource
      }
    }
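
A runnable sketch of acquire, do work, release, using a hypothetical `Counter` class and `hammer` helper. It also shows the "not reentrant" point: locking a Mutex again from the thread that already holds it raises `ThreadError`.

```ruby
class Counter
  def initialize
    @count = 0
    @lock = Mutex.new
  end

  def increment
    @lock.synchronize { @count += 1 } # read-modify-write under the lock
  end

  def count
    @lock.synchronize { @count }
  end
end

# Runs thread_count threads that each increment the counter `increments` times.
def hammer(counter, thread_count, increments)
  thread_count.times.map do
    Thread.new { increments.times { counter.increment } }
  end.each(&:join)
  counter.count
end

# Returns false for a plain Mutex, because nested locking raises ThreadError.
def reentrant?(lock)
  lock.synchronize { lock.synchronize { true } }
rescue ThreadError
  false
end
```
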
  28. ConditionVariable • Release mutex temporarily • Signal others waiting on

    the mutex • …and be signaled • Similar to wait/notify/notifyAll in Java
  29.

    mutex = Mutex.new
    resource = ConditionVariable.new

    a = Thread.new {
      mutex.synchronize {
        # Thread 'a' now needs the resource
        resource.wait(mutex)
        # 'a' can now have the resource
      }
    }

    b = Thread.new {
      mutex.synchronize {
        # Thread 'b' has finished using the resource
        resource.signal
      }
    }
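
In practice a wait is usually wrapped in a loop that rechecks a condition, so a signal sent before the waiter arrives is not lost. A minimal sketch, assuming a hypothetical `Mailbox` class:

```ruby
class Mailbox
  def initialize
    @lock = Mutex.new
    @arrived = ConditionVariable.new
    @messages = []
  end

  def deliver(message)
    @lock.synchronize do
      @messages << message
      @arrived.signal # wake one waiting receiver, if any
    end
  end

  def receive
    @lock.synchronize do
      # wait releases the mutex while sleeping and reacquires it on wake-up;
      # the loop guards against early signals and spurious wake-ups.
      @arrived.wait(@lock) while @messages.empty?
      @messages.shift
    end
  end
end
```
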
  30. Monitor

    require 'monitor'

    class SynchronizedArray < Array

      include MonitorMixin

      alias :old_shift :shift

      def shift(n=1)
        self.synchronize do
          self.old_shift(n)
        end
      end
      ...
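
Unlike a plain Mutex, a monitor can be re-entered by the thread that already holds it. The sketch below uses a hypothetical `SafeStack` class to show one synchronized method calling another.

```ruby
require 'monitor'

class SafeStack
  include MonitorMixin

  def initialize
    super() # lets MonitorMixin set up its internal state
    @items = []
  end

  def push(item)
    synchronize { @items.push(item) }
    self
  end

  def pop
    synchronize { @items.pop }
  end

  # Calls pop while already holding the monitor; this nested
  # synchronize is allowed because monitors are reentrant.
  def pop_many(n)
    synchronize { Array.new(n) { pop } }
  end
end
```
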
  31. Atomics • Without locking… • …replace a value only if

    unchanged • …increment, decrement safely • Thread-safe code can use atomics instead of locks, usually with better performance
  32.

    require 'atomic'

    my_atomic = Atomic.new(0)
    my_atomic.value
    # => 0
    my_atomic.value = 1
    my_atomic.swap(2)
    # => 1
    my_atomic.compare_and_swap(2, 3)
    # => true, updated to 3
    my_atomic.compare_and_swap(2, 3)
    # => false, current is not 2
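
The usual way to build on compare-and-swap is a retry loop: read, compute, try to swap, repeat if someone else got there first. The sketch below shows that loop with a hypothetical `SketchAtomic` stand-in. It uses a Mutex internally so it runs on the standard library alone, which means it is NOT lock-free and is not the atomic gem's implementation.

```ruby
class SketchAtomic
  def initialize(value)
    @value = value
    @lock = Mutex.new # stand-in for a hardware CAS; a real atomic avoids this
  end

  def value
    @lock.synchronize { @value }
  end

  # Replaces the value only if it is still the expected object.
  def compare_and_swap(expected, new_value)
    @lock.synchronize do
      return false unless @value.equal?(expected)
      @value = new_value
      true
    end
  end

  # CAS retry loop: recompute from the latest value until the swap succeeds.
  def update
    loop do
      current = value
      updated = yield(current)
      return updated if compare_and_swap(current, updated)
    end
  end
end
```

The block passed to `update` may run more than once, so it should be free of side effects.
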
  33. Specialized Collections • thread_safe gem • Fully-synchronized Array and Hash

    • Atomic-based hash impl (“Cache”) • java.util.concurrent • Numerous tools for concurrency
  34.

    thread_count = (ARGV[2] || 1).to_i
    queue = SizedQueue.new(thread_count * 4)

    word_file.each_line.each_slice(50) do |words|
      queue << words
    end
    queue << nil # terminating condition
  35.

    threads = thread_count.times.map do |i|
      Thread.new do
        while true
          words = queue.pop
          if words.nil? # terminating condition
            queue.shutdown
            break
          end
          words.each do |word|
            # analyze the word
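
A self-contained variant of the same producer/consumer shape, using only the standard library. This is a sketch, not the talk's code: `count_long_words` and `min_length` are hypothetical, the "analysis" is a trivial length check, and it pushes one `nil` per worker instead of shutting the queue down.

```ruby
def count_long_words(lines, thread_count, min_length = 5)
  queue = SizedQueue.new(thread_count * 4) # bounded, so the producer can't run far ahead

  # Start the consumers first so the bounded queue always has someone draining it.
  workers = thread_count.times.map do
    Thread.new do
      found = 0
      while (words = queue.pop) # nil is the terminating condition
        found += words.count { |word| word.strip.length >= min_length }
      end
      found
    end
  end

  lines.each_slice(50) { |words| queue << words }
  thread_count.times { queue << nil } # one terminator per worker

  workers.sum(&:value)
end
```
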
  36. Putting It All Together • These are a lot of

    tools to sort out • Others have sorted them out for you
  37. Celluloid • Actor model implementation • OO/Ruby sensibilities • Normal

    classes, normal method calls • Async support • Growing ecosystem • Celluloid-IO and DCell (distributed actors) • https://github.com/celluloid/celluloid
  38.

    class Sheen
      include Celluloid

      def initialize(name)
        @name = name
      end

      def set_status(status)
        @status = status
      end

      def report
        "#{@name} is #{@status}"
      end
    end
  39.

    >> charlie = Sheen.new "Charlie Sheen"
    => #<Celluloid::Actor(Sheen:0x00000100a312d0) @name="Char
    >> charlie.set_status "winning!"
    => "winning!"
    >> charlie.report
    => "Charlie Sheen is winning!"
    >> charlie.async.set_status "asynchronously winning!"
    => nil
    >> charlie.report
    => "Charlie Sheen is asynchronously winning!"
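
To show the idea behind an actor (one thread owns the object, everyone else sends it messages), here is a hand-rolled sketch on top of `Thread` and `Queue`. `TinyActor` and `Status` are hypothetical names. This is not Celluloid's API or implementation, and it omits error handling, so an exception in the target would stop the actor thread.

```ruby
class TinyActor
  def initialize(target)
    @target = target
    @mailbox = Queue.new
    @thread = Thread.new { run } # the only thread that ever touches @target
  end

  # Fire-and-forget: enqueue the call and return immediately.
  def async(method_name, *args)
    @mailbox << [method_name, args, nil]
    nil
  end

  # Synchronous call: enqueue, then block until the actor replies.
  def call(method_name, *args)
    reply = Queue.new
    @mailbox << [method_name, args, reply]
    reply.pop
  end

  def stop
    @mailbox << :stop
    @thread.join
  end

  private

  # Processes messages one at a time, in arrival order.
  def run
    while (message = @mailbox.pop) != :stop
      method_name, args, reply = message
      result = @target.public_send(method_name, *args)
      reply << result if reply
    end
  end
end

class Status
  def initialize(name)
    @name = name
    @status = "idle"
  end

  def set_status(status)
    @status = status
  end

  def report
    "#{@name} is #{@status}"
  end
end
```

Because the mailbox is first-in, first-out, a `call` made after an `async` from the same thread sees the effect of the earlier message.
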
  40. Sidekiq • Simple, efficient background processing • Think Resque or

    DelayedJob but better • Normal-looking Ruby class is the job • Simple call to start it running in background • http://mperham.github.io/sidekiq/
  41.

    class HardWorker
      include Sidekiq::Worker

      def perform(name, count)
        puts 'Doing hard work'
      end
    end

    # ...later, in a controller...

    HardWorker.perform_async('bob', 5)
  42. Concurrent Ruby • Grab bag of concurrency patterns • Actor,

    Agent, Channel, Future, Promise, ScheduledTask, TimerTask, Supervisor • Thread pools, executors, timeouts, conditions, latches, atomics • May grow into a central lib for conc stuff • https://github.com/jdantonio/concurrent-ruby
  43. Recap • The future of Ruby is concurrent • The

    tools are there to help you • Let’s all help move Ruby forward