Concurrency in Ruby: In search of inspiration

Concurrency has been a popular topic in the programming community in the last decade and has received special attention in the Ruby community in recent years. José Valim will start his talk by explaining why concurrency has become so important and why it is particularly hard to write safe concurrent software in Ruby, based on his experience from working on Ruby on Rails. The goal of this talk is to highlight the current limitations and start the search for possible solutions, looking at how other languages are tackling this issue today.

Plataformatec

May 31, 2013
Transcript

  1. Concurrency in Ruby. @josevalim / Plataformatec

  2. About me

  3. About me: • Started with Ruby in 2006

  4. About me: • Started with Ruby in 2006 • Open Source Contributor

  5. About me: • Started with Ruby in 2006 • Open Source Contributor • Joined Rails Core in 2010

  6. About me: • Started with Ruby in 2006 • Open Source Contributor • Joined Rails Core in 2010 • Author of elixir-lang.org

  7. None
  8. Topics

  9. Topics: • Why concurrency?

  10. Topics: • Why concurrency? • The problem today

  11. Topics: • Why concurrency? • The problem today • Learning from others

  12. Why concurrency?

  13. Server Server Server Server

  14. 50 cores $2600

  15. It is no longer about the future, it is about now

  16. Server

  17. In Ruby, we talk about threads

  18. We need our applications to be thread-safe

  19. The problem

  20. class User
        def self.current_user=(user)
          @@current_user = user
        end

        def self.current_user
          @@current_user
        end
      end

  21. post "/do_something" do
        User.current_user = user_from_request
        # some work
        Mailer.deliver_to(User.current_user)
      end

  22. 1. Assign current_user 2. Do some work 3. Get current_user to send an e-mail

  23-34. (Animated diagram, built up one frame per slide: a timeline with the columns "thread 1", "current_user" and "thread 2", filled in step by step with the values "jose" and "matz" as both threads assign and read the shared current_user.)

  35. User.current_user is shared mutable state (global)
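
The race described on slides 20 to 35 can be reproduced deterministically. The following is a minimal sketch, not code from the talk: the two Queue objects (`assigned`, `overwritten`) and the variable `seen_by_thread_1` are hypothetical helpers that force one particular interleaving, in which a second request overwrites the class variable while the first request is still "doing some work".

```ruby
# Same shape as the User class on slide 20: one class variable shared by all threads.
class User
  def self.current_user=(user)
    @@current_user = user
  end

  def self.current_user
    @@current_user
  end
end

# Hypothetical helpers that pin down the order in which the two threads run.
assigned    = Queue.new  # thread 1 -> thread 2: "current_user has been assigned"
overwritten = Queue.new  # thread 2 -> thread 1: "current_user has been overwritten"

seen_by_thread_1 = nil

thread_1 = Thread.new do
  User.current_user = "jose"             # 1. assign current_user
  assigned << true
  overwritten.pop                        # 2. "some work"; thread 2 runs meanwhile
  seen_by_thread_1 = User.current_user   # 3. read current_user to send the e-mail
end

thread_2 = Thread.new do
  assigned.pop
  User.current_user = "matz"             # a second request assigns its own user
  overwritten << true
end

[thread_1, thread_2].each(&:join)
puts seen_by_thread_1  # prints "matz": thread 1 would e-mail the wrong user
```

In a real application nothing forces this ordering, so the bug shows up only occasionally, which is what makes shared mutable state hard to debug.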

  36. ActionMailer::Base.from ActionController::Base.logger Devise.password_length SimpleForm.form_options

  37. What is Ruby?

  38. Ruby is a programming language with different implementations

  39. Rails 2.2 threadsafe

  40. Not really ...

  41. Thread-safety means different things for different implementations

  42. class ActionController::Base
        prepend_view_path(
          ActionView::Resolver.new("app/views")
        )
      end

  43. # It finds Rails templates and
      # caches them in between requests
      class ActionView::Resolver
        def initialize
          @cached = Hash.new { ... }
        end
      end

  44. @cached[template] = File.read(template)

  45. Hashes
                 Thread-safe
        YARV     Yes
        JRUBY    No
        RBX 2.0  No

  46. Global virtual machine lock (YARV)
        @cached[template] = File.read(template)
      hash.c:
        static VALUE
        env_aset(VALUE obj, VALUE nm, VALUE val)
        {
          ...

  47. Global virtual machine lock (YARV)
        @cached[template] = File.read(template)
      hash.c:
        static VALUE
        env_aset(VALUE obj, VALUE nm, VALUE val)
        {
          ...

  48. Without the GVL, other Ruby implementations may try to change the same Hash at the same time, corrupting it
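
One way to make the cache pattern from slide 44 safe on every implementation today is to guard the Hash with a Mutex. This is a swapped-in, coarse-grained technique and not one of the talk's proposals; the class name `TemplateCache`, its `fetch` method and the fake template loader are assumptions made up for this sketch.

```ruby
# Hypothetical cache: every read and write of @cached happens while holding @lock,
# so no two threads can modify the Hash at the same time.
class TemplateCache
  def initialize
    @cached = {}
    @lock = Mutex.new
  end

  # Returns the cached value for `template`, computing it with the block on a miss.
  def fetch(template)
    @lock.synchronize do
      @cached[template] ||= yield(template)
    end
  end
end

cache = TemplateCache.new
loads = 0  # counts how often the (fake) template loader actually runs

threads = 8.times.map do
  Thread.new do
    # The block stands in for File.read(template); it runs while the lock is held.
    cache.fetch("users/show") { |name| loads += 1; "<h1>#{name}</h1>" }
  end
end

results = threads.map(&:value)
puts loads         # prints 1: the template is loaded only once
puts results.uniq  # prints <h1>users/show</h1>
```

The price is that all readers serialize on one lock, which is the kind of low-level trade-off the following slides argue Ruby should hide behind better abstractions.
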
  49. None
  50. None
  51. None
  52. None
  53. Very low-level abstractions

  54. None
  55. Code maintainability

  56. Code maintainability YARV performance

  57. Code maintainability YARV performance Developer happiness

  58. • Class definitions • Instance variables • Class variables • Data structures ...

  59. We need to define proper semantics and provide better abstractions

  60. Learning from others

  61. Java

  62. thread.rb: Thread, Mutex, ConditionVariable, Queue, SizedQueue

  63. java.util.concurrent: ArrayBlockingQueue, ConcurrentHashMap, ConcurrentLinkedQueue, ConcurrentSkipListMap, ConcurrentSkipListSet, CopyOnWriteArrayList, CopyOnWriteArraySet, CountDownLatch, CyclicBarrier, DelayQueue, Exchanger, FutureTask, LinkedBlockingDeque, LinkedBlockingQueue, PriorityBlockingQueue, PriorityQueue, Semaphore, SynchronousQueue

  64. java.util.concurrent: ArrayBlockingQueue, ConcurrentHashMap, ConcurrentLinkedQueue, ConcurrentSkipListMap, ConcurrentSkipListSet, CopyOnWriteArrayList, CopyOnWriteArraySet, CountDownLatch, CyclicBarrier, DelayQueue, Exchanger, FutureTask, LinkedBlockingDeque, LinkedBlockingQueue, PriorityBlockingQueue, PriorityQueue, Semaphore, SynchronousQueue. ETOOMANYCLASSES

  65. Hashes

  66. None
  67. “There are heaps of non thread-safe usage of Hashes in Rails and the surrounding gems”

  68. Using hashes as a cache is an extremely common pattern in Ruby

  69. @cached[template] = File.read(template)

  70. Hashes
                 Thread-safe
        YARV     Yes
        JRUBY    No
        RBX 2.0  No

  71. - @cached = {}
      + @cached = ConcurrentHash.new

  72. - @cached = {}
      + @cached = ConcurrentHash.new

  73. Ruby Under a Microscope: “How hashes scale from one to one million elements”

  74. One class to rule them all!

  75. Proposal #1

  76. Proposal #1: 1. add Hash#concurrent_read and Hash#concurrent_write

  77. Proposal #1: 1. add Hash#concurrent_read and Hash#concurrent_write 2. allow implementations to have different default values

  78. # YARV
      hash = Hash.new
      hash.concurrent_read  #=> true
      hash.concurrent_write #=> true

      # JRUBY
      hash = Hash.new
      hash.concurrent_read  #=> false
      hash.concurrent_write #=> false

      # RBX
      hash = Hash.new
      hash.concurrent_read  #=> ?
      hash.concurrent_write #=> ?

  79. # When you need thread safety
      # in any implementation:
      hash = Hash.new
      hash.concurrent_read!
      hash.concurrent_write!

  80. AtomicReference

  81. class User
        def name=(name)
          @name = name
        end

        def name
          @name
        end
      end

      user = User.new

  82. 10.times do
        Thread.new { user.name = "Hello #{rand}" }
      end

      Threadsafe?

  83. Instance variables
                 Defining  Updating
        YARV     Yes       Yes
        JRUBY    Yes       No
        RBX 2.0  ?         ?

  84. class User
        def initialize
          @name = AtomicReference.new
        end

        def name
          @name.get
        end

        def name=(name)
          @name.set(name)
        end
      end

  85. class User
        atomic_accessor :name
      end

  86. class User
        def initialize
          # Provide a default
          # lazily calculated value
          @name = AtomicReference.new do
            expensive_calculation
          end
        end
      end

  87. Instance variables
                 Defining  Updating
        YARV     Yes       Yes
        JRUBY    Yes       No
        RBX 2.0  ?         ?

  88. Proposal #2

  89. Proposal #2: 1. updating instance variables is declared unsafe

  90. Proposal #2: 1. updating instance variables is declared unsafe 2. any thread-safe Ruby code needs to treat them as such

  91. Proposal #2

  92. Proposal #2: 3. implementations like YARV can still provide safe semantics due to technical limitations

  93. Proposal #2: 3. implementations like YARV can still provide safe semantics due to technical limitations 4. but code that relies on such behavior works by “accident”

  94. Proposal #2

  95. Proposal #2: 5. add AtomicReference to thread.rb

  96. Proposal #2: 5. add AtomicReference to thread.rb: AtomicReference#new(&default), AtomicReference#set(value), AtomicReference#get()

  97. AtomicReference#get_and_set(new), AtomicReference#compare_and_swap(old, new)

  98. compare_and_swap is the first step required for implementing lock-free data structures
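
To make the proposed interface from slides 96 to 98 concrete, here is a minimal sketch of an AtomicReference. It is an assumption-laden illustration, not the talk's implementation: it uses a Mutex internally, whereas a production version would presumably rely on a native compare-and-swap instruction. The private helper `resolve` and the counter example are hypothetical.

```ruby
# Illustrative AtomicReference with the method names listed on the slides.
# Internally it is just a Mutex around one value (an assumption of this sketch).
class AtomicReference
  def initialize(&default)
    @lock = Mutex.new
    @value = nil
    @default = default  # optional block that lazily computes the initial value
  end

  def get
    @lock.synchronize { resolve }
  end

  def set(value)
    @lock.synchronize do
      @default = nil  # an explicit set wins over the lazy default
      @value = value
    end
  end

  # Stores new_value and returns the value that was there before.
  def get_and_set(new_value)
    @lock.synchronize do
      old = resolve
      @value = new_value
      old
    end
  end

  # Stores new_value only if the current value is still old_value (same object).
  def compare_and_swap(old_value, new_value)
    @lock.synchronize do
      return false unless resolve.equal?(old_value)
      @value = new_value
      true
    end
  end

  private

  # Hypothetical helper: runs the default block at most once, on first access.
  def resolve
    if @default
      @value = @default.call
      @default = nil
    end
    @value
  end
end

# The typical compare_and_swap pattern: read, compute, retry if someone else won.
counter = AtomicReference.new { 0 }
threads = 4.times.map do
  Thread.new do
    1000.times do
      loop do
        current = counter.get
        break if counter.compare_and_swap(current, current + 1)
      end
    end
  end
end
threads.each(&:join)
puts counter.get  # prints 4000: no increment is lost
```

The retry loop is the building block slide 98 refers to: an update is computed optimistically and only published if nobody changed the value in between.
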
  99. Go

  100. Go shows us we can go a long way with simple (but powerful) abstractions and great education

  101. “Do not communicate by sharing memory; instead, share memory by communicating”

  102. is_done = false
       Thread.new {
         # do something expensive
         is_done = true
         # do something else
       }
       sleep(0.5) until is_done
       # a bit more work
       puts :DONE

  103. Go made Channels and Goroutines first-class citizens of the language

  104. done := make(chan bool, 1)
       go func() {
           // do something expensive
           done <- true
           // do something else
       }()
       <-done
       // a bit more work

  105. queue = SizedQueue.new(1)
       Thread.new {
         # do something expensive
         queue << true
         # do something else
       }
       queue.pop
       # a bit more work
       puts :DONE

  106. Ruby already has Queues! Let’s do more on top of them!

  107. select {
       case <-ch:
           // a read from ch has occurred
       case <-done:
           // the done channel is done
       }

  108. thread.rb: Thread, Mutex, ConditionVariable, Queue, SizedQueue

  109. But there is something more...

  110. Erlang

  111. Both Erlang and Go have lightweight concurrent primitives, scheduled by the language

  112. How much does a thread cost in Ruby?
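
The transcript gives no number for the question on slide 112. A rough way to measure it on a given machine and implementation is sketched below; the thread count of 200 is an arbitrary assumption, and the result varies widely between YARV, JRuby and Rubinius as well as between operating systems.

```ruby
require "benchmark"

count = 200  # arbitrary; kept small so the sketch runs anywhere

# Time how long it takes to create `count` threads that do nothing and wait for them.
elapsed = Benchmark.realtime do
  threads = count.times.map { Thread.new { :noop } }
  threads.each(&:join)
end

puts format("%d threads: %.2f ms total, about %.1f microseconds each",
            count, elapsed * 1000, elapsed / count * 1_000_000)
```

This only captures creation and teardown time, not the memory each thread reserves for its stack, so it is a lower bound on the real cost.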

  113. Erlang -> Processes Go -> Goroutines Ruby -> ?

  114. Proposal #3

  115. Proposal #3: 1. Let’s make our Queue more powerful (with sizes and de-multiplexing)
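
Ruby's standard Queue offers nothing like Go's select from slide 107 for waiting on several queues at once. Until something along the lines of Proposal #3 exists, one workaround is to funnel every source into a single queue and tag each message. The sketch below shows that workaround under assumptions of its own: the names `inbox`, `:ch` and `:done` are made up to mirror the Go example.

```ruby
# One shared inbox stands in for the two Go channels `ch` and `done`.
# Each message is a [tag, payload] pair so the consumer can tell the sources apart.
inbox = Queue.new

worker = Thread.new do
  3.times { |i| inbox << [:ch, i] }  # regular messages, like sends on `ch`
  inbox << [:done, true]             # completion signal, like a send on `done`
end

received = []
loop do
  tag, payload = inbox.pop  # blocks until any source has produced a message
  case tag
  when :ch   then received << payload  # "a read from ch has occurred"
  when :done then break                # "the done channel is done"
  end
end

worker.join
p received  # prints [0, 1, 2]
```

The limitation is that producers must know about the shared inbox up front, which is why first-class de-multiplexing on Queue itself would be the more general solution.
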
  116. Proposal #4

  117. Proposal #4: 1. Provide lightweight, concurrent scheduled primitives

  118. Finally, what about the Actor model?

  119. github.com/celluloid

  120. The Actor model is about building systems. Ruby should make it easier for them to flourish.

  121. Conclusion

  122. The different semantics existing in different implementations make concurrency in Ruby harder

  123. We are also not used to thinking about concurrency. We need more education on how to “think concurrently”

  124. We need...

  125. We need... • well-defined semantics

  126. We need... • well-defined semantics • a thread-safe stdlib (example: Hash#concurrent_write!)

  127. We need... • well-defined semantics • a thread-safe stdlib (example: Hash#concurrent_write!) • high-level abstractions & education (example: atomic, queues)

  128. My goal is not to propose final solutions but to instigate discussion. I hope I have fulfilled it!

  129. Thank you! @josevalim / Plataformatec