Thread Mist

Zete
October 11, 2015

Ruby Conf China 2015 Talk

Transcript

  1. Thread Mist luikore

  2. About Me

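  3. Master of all kinds of… hello world; fond of all kinds of… rainbow ponies and GUI things; digging into all kinds of… virtual machines and compilers; and, when time allows, writing an… editor that has been delayed for years. @luikore
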
  4. Odigo https://www.odigo.travel

  5. Threads

  6. Theory

  7. Practice

  8. Threads are Hard: mutex, futex, condvar, semaphore, deadlock, livelock…

  9. Questions: No "TRUE" threads in Ruby? Is the GIL bad?

  10. Remove Threads?

  11. Remove Threads? Celluloid! Fiber! CSP! pi-calculus!

  12. But…​

  13. Threads are what we get from the OS or C

  14. Threads are working great in…​

  15. And… Thread Multiplexing Libraries: I18n.locale, Mongoid query cache, …
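
Libraries like these typically keep per-request state in thread-local storage, which is one reason they depend on threads. A minimal sketch of that pattern, assuming a hypothetical `CurrentLocale` module (it is not the real I18n or Mongoid code):

```ruby
# CurrentLocale is a hypothetical module for illustration only.
module CurrentLocale
  # Thread.current[:key] is local to the current thread
  # (strictly speaking, to its current fiber).
  def self.locale
    Thread.current[:demo_locale] || :en
  end

  def self.locale=(value)
    Thread.current[:demo_locale] = value
  end
end

CurrentLocale.locale = :zh

# A new thread starts with its own empty storage, so it sees the default.
seen_in_other_thread = Thread.new { CurrentLocale.locale }.value

puts CurrentLocale.locale   # value set on the main thread
puts seen_in_other_thread   # default seen by the other thread
```

Setting the locale on one thread does not leak into another, so each request thread can carry its own value.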

  16. And… The system provides threads. OSes have optimized threads over the years. Preemptive scheduling is a necessity.

  17. So let's look deep down at how threads are scheduled.

  18. Glossary: CRuby = MRI; yield; GVL = GIL; lock, mutex, spin lock; condvar; time slice

  19. Threads Scheduling

  20. Green Threads: Ruby <= 1.8 comes with green threads. Blocking IO (read/write) blocks other threads.

  21. Native Threads: Introduced in Ruby 1.9. Very much the same implementation as green threads at first.

  22. Experiment: Let's check a simple program:

      ruby -e 'gets'

  23. Experiment

  24. Timer Thread: There is one thread, not visible from the Ruby program, that schedules the other threads.

  25. Why Timer? It is a supervisor: it ensures threads run "fairly".

  26. Timer Overview

  27. We are Going Down…​

  28. When is the timer initialized?

  29. What does the timer thread actually do?

      static void
      timer_thread_function(void *arg)
      {
          rb_vm_t *vm = GET_VM();

          native_mutex_lock(&vm->thread_destruct_lock);
          if (vm->running_thread)
              ATOMIC_OR(vm->running_thread->interrupt_flag, TIMER_INTERRUPT_MAS…
          native_mutex_unlock(&vm->thread_destruct_lock);
      }

  30. What do the other threads do at safe-points?

      if ((th)->interrupt_flag & ~(th)->interrupt_mask) {
          rb_threadptr_execute_interrupts();
      }

  31. What are safe-points? Returning from a function …​

  32. What is "execute interrupts"?

      void
      rb_threadptr_execute_interrupts()
      {
          timer_interrupt = interrupt & TIMER_INTERRUPT_MASK;
          ...
          if (timer_interrupt) {
              ...
              th->running_time_us += running_time_us;
              ...
              rb_thread_schedule_limits(limits_us);
          }
      }

  33. What does "schedule limits" do?

      static void
      rb_thread_schedule_limits(unsigned long limits_us)
      {
          ...
          if (th->running_time_us >= limits_us) {
              RB_GC_SAVE_MACHINE_CONTEXT(th);
              gvl_yield(th->vm, th); // release the GVL, then compete for it again
          }
      }

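
The flow across the last few slides (the timer sets a flag, the running thread polls it at safe-points, and a thread that has used up its slice yields) can be modeled in a few lines of Ruby. This is a toy, single-threaded sketch: the names are borrowed from the slides for readability, while `TIME_QUANTUM_US`, the `ToyThread` struct, and the reset of the counter after yielding are assumptions of the sketch, not MRI's implementation.

```ruby
TIMER_INTERRUPT_MASK = 0x01
TIME_QUANTUM_US = 100_000 # assumed length of one timer tick

ToyThread = Struct.new(:interrupt_flag, :running_time_us, :yields)

# What the timer thread does on every tick: it only sets a flag.
def timer_tick(th)
  th.interrupt_flag |= TIMER_INTERRUPT_MASK
end

# What the running thread does at every safe-point.
def safe_point(th, limits_us)
  return if (th.interrupt_flag & TIMER_INTERRUPT_MASK).zero?

  th.interrupt_flag &= ~TIMER_INTERRUPT_MASK
  th.running_time_us += TIME_QUANTUM_US
  return if th.running_time_us < limits_us

  th.yields += 1          # stands in for gvl_yield
  th.running_time_us = 0  # assumption of this sketch: start a fresh slice
end

th = ToyThread.new(0, 0, 0)
6.times do
  timer_tick(th)
  3.times { safe_point(th, 2 * TIME_QUANTUM_US) } # several safe-points per tick
end
puts th.yields
```

The point of the model is that the timer never switches threads by itself; the running thread gives up the GVL voluntarily once it notices the flag.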
  34. What does the GVL look like?

      typedef struct rb_global_vm_lock_struct {
          /* fast path */
          unsigned long acquired;
          rb_nativethread_lock_t lock;

          /* slow path */
          volatile unsigned long waiting;
          rb_nativethread_cond_t cond;

          /* yield */
          rb_nativethread_cond_t switch_cond;
          rb_nativethread_cond_t switch_wait_cond;
          int need_yield;
          int wait_yield;
      } rb_global_vm_lock_t;

  35. What is gvl_yield?

      native_mutex_lock(&vm->gvl.lock);
      gvl_release_common(vm);
      native_mutex_unlock(&vm->gvl.lock);
      ...
      sched_yield();
      ...
      native_mutex_lock(&vm->gvl.lock);
      native_cond_broadcast(&vm->gvl.switch_wait_cond);
      gvl_acquire_common(vm);
      native_mutex_unlock(&vm->gvl.lock);

  36. What is gvl_release_common?

      vm->gvl.acquired = 0;
      if (vm->gvl.waiting > 0)
          native_cond_signal(&vm->gvl.cond);

  37. What is gvl_acquire_common?

      while (vm->gvl.acquired) {
          native_cond_wait(&vm->gvl.cond, &vm->gvl.lock);
      }

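
The release and acquire helpers above amount to a flag guarded by a mutex and a condition variable. A rough Ruby model of the same idea follows. `ToyGVL` is a hypothetical class for illustration, and the `@waiting` bookkeeping plus setting the flag after the wait loop are filled in by this sketch rather than taken from the slides.

```ruby
# ToyGVL is a hypothetical Ruby model, not MRI code.
class ToyGVL
  def initialize
    @lock = Mutex.new
    @cond = ConditionVariable.new
    @acquired = false
    @waiting = 0
  end

  def acquire
    @lock.synchronize do
      @waiting += 1
      @cond.wait(@lock) while @acquired  # like gvl_acquire_common
      @waiting -= 1
      @acquired = true
    end
  end

  def release
    @lock.synchronize do
      @acquired = false                  # like gvl_release_common
      @cond.signal if @waiting > 0
    end
  end

  def with_lock
    acquire
    yield
  ensure
    release
  end
end

gvl = ToyGVL.new
counter = 0
threads = 4.times.map do
  Thread.new do
    1_000.times { gvl.with_lock { counter += 1 } }
  end
end
threads.each(&:join)
puts counter
```

Only one thread at a time is inside `with_lock`, so the counter ends up at exactly the number of increments requested.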
  38. Where does sched_yield come from? It is an OS call that makes the current thread give up the CPU and go to the back of the run queue. Think of Fiber.yield.

  39. Condvars: rb_nativethread_cond_t represents a condition variable. A thread waiting on a condition variable releases a mutex and sleeps until some condition is met.

  40. Condvars, Resource Consumer:

      # pseudo code
      lock(mutex)
      while cond
        wait(condvar, mutex)
      end
      unlock(mutex)

  41. Condvars, Resource Producer:

      pthread_cond_signal(condvar)
      pthread_cond_broadcast(condvar)
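
Ruby exposes the same consumer and producer pattern through `Mutex` and `ConditionVariable`. A small sketch, with the `items` and `received` arrays made up for the example:

```ruby
mutex = Mutex.new
condvar = ConditionVariable.new
items = []
received = []

consumer = Thread.new do
  3.times do
    mutex.synchronize do
      # Re-check the condition in a loop: wakeups can be spurious.
      condvar.wait(mutex) while items.empty?
      received << items.shift
    end
  end
end

producer = Thread.new do
  [:a, :b, :c].each do |item|
    mutex.synchronize do
      items << item
      condvar.signal # wake one waiting consumer, like pthread_cond_signal
    end
  end
end

[producer, consumer].each(&:join)
p received
```

`ConditionVariable#broadcast` plays the role of `pthread_cond_broadcast` when every waiter should be woken.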

  42. Conclusion: Threads are preemptive… right? Ruby threads are cooperative too…

  43. What the GIL Guarantees: Standalone C functions are "atomic" in MRI. That makes writing C extensions easy. Release the GIL when you know you are going into a "blocking region".

  44. Blocking Region

      VALUE
      rb_thread_io_blocking_region(rb_blocking_function_t *func, void *data1, …
      {
          ...
          th->waiting_fd = fd;
          ...
          BLOCKING_REGION({
              val = func(data1);
              saved_errno = errno;
          }, ubf_select, th, FALSE);
      }

  45. Blocking Region (2): What the BLOCKING_REGION macro does

      RB_GC_SAVE_MACHINE_CONTEXT(th);
      gvl_release(th->vm);
      ...
      // do your blocking work!
      gvl_acquire(th->vm, th);

  46. "Blocking work" is a call that may put the current thread to sleep and wake it up later.

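
One way to observe a blocking region from Ruby code is to time several threads that all block at once. The sketch below uses `sleep` as the blocking call; the thread count and the durations are arbitrary choices for the example.

```ruby
require "benchmark"

# Each thread blocks in sleep, a call during which MRI does not hold the GVL.
elapsed = Benchmark.realtime do
  4.times.map { Thread.new { sleep 0.2 } }.each(&:join)
end

# Run one after another, the four sleeps would take about 0.8 s.
puts format("4 sleeping threads took %.2f s", elapsed)
```

Because the GVL is released while each thread sleeps, the waits overlap and the total should stay close to a single sleep.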
  47. Blocking Calls:

      - accept: accept an incoming connection
      - read: read some data from a socket or file
      - select: choose available file descriptors
      - poll: simpler select
      - epoll: inverse control, wake up by OS
      - kevent: the kqueue version of epoll
      - … many more

  48. Blocking Calls: They cost a lot of CPU, or cost an unknown amount of time.

  49. Optimization: Regarding threads with the GIL

  50. Parallelism? Don't Use Threads For Parallelism. It is slower…

      a = 1..20
      b = 1..20
      asum, bsum = 0, 0
      t = Thread.new { asum = a.inject :+ }
      bsum = b.inject :+
      t.join
      puts asum + bsum

  51. Parallelism? Just:

      a.inject(:+) + b.inject(:+)

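
To compare the two versions on your own machine, something along these lines could be used. It is only a measuring sketch: the variable names are made up, and it makes no claim about which timing you will see.

```ruby
require "benchmark"

a = 1..20
b = 1..20

threaded = nil
threaded_time = Benchmark.realtime do
  asum = 0
  t = Thread.new { asum = a.inject(:+) }
  bsum = b.inject(:+)
  t.join
  threaded = asum + bsum
end

plain = nil
plain_time = Benchmark.realtime { plain = a.inject(:+) + b.inject(:+) }

puts "same result: #{threaded == plain}"
puts format("threaded: %.6f s, plain: %.6f s", threaded_time, plain_time)
```

Both versions compute the same sum; the threaded one additionally pays for creating and joining a thread while still running under the GIL.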
  52. Choose # of Threads. Too many: time is wasted in context switching. Too few: users have to wait in a queue while your CPU io-waits.

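
A fixed-size worker pool is a common way to make the thread count an explicit, tunable number. A minimal sketch, where `POOL_SIZE`, the job list, and the `sleep` standing in for IO are all assumptions of the example:

```ruby
POOL_SIZE = 4 # assumed value; tune it for your workload

jobs = Queue.new
results = Queue.new
20.times { |i| jobs << i }
jobs.close # pop returns nil once the queue is closed and empty

workers = POOL_SIZE.times.map do
  Thread.new do
    while (job = jobs.pop)
      sleep 0.01          # stands in for IO wait
      results << job * 2
    end
  end
end
workers.each(&:join)

doubled = Array.new(results.size) { results.pop }.sort
puts doubled.length
```

Changing `POOL_SIZE` and timing the run is a simple way to look for the point where extra threads stop helping.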
  53. Reduce Thread Stack Size: less middleware, MVC

  54. Manually Release GVL: Already done in zlib, so multi-threaded Ruby spiders can utilize up to 2 cores.

      gvl_release(th)
      ... // you know that code doesn't affect Ruby
      gvl_acquire(th)

  55. When It’s not Ruby’s Fault…​

  56. IO: Your Bottleneck

      100.times do
        # `body` is an assumed attribute name; the slide gives no key for the string
        Comment.create post: post, body: "rabbit#{rand}"
      end

  57. Batch: Reduce IO Latency. A simple way to batch SQL: a transaction

      transaction do
        100.times do
          # `body` is an assumed attribute name; the slide gives no key for the string
          Comment.create post: post, body: "rabbit#{rand}"
        end
      end

  58. IO is Everywhere…

      - NUMA abstracts memory access
      - DMA reduces memory copying
      - Sharing between APU and FPU
      - Sharing between LLC and main memory
      - Sharing between CPU and GPU
      - …

  59. Connection Pool Limit: Some long jobs take too long to finish, exhausting the connection pool and blocking other threads…

  60. Manually Release Connection: Release it manually before starting something that takes a long time:

      # in a Rails action or Sidekiq job
      require 'net/http'

      ActiveRecord::Base.clear_active_connections!
      Net::HTTP.get URI('http://example.com')

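
Why the early release helps can be shown without a database. The sketch below uses a hypothetical `TinyPool` class holding a single fake connection; it is not ActiveRecord's pool, and the job bodies and sleep durations are invented for the example.

```ruby
# TinyPool is a hypothetical stand-in for a real connection pool.
class TinyPool
  def initialize(size)
    @slots = Queue.new
    size.times { |i| @slots << "conn-#{i}" }
  end

  def checkout
    @slots.pop # blocks while the pool is exhausted
  end

  def checkin(conn)
    @slots << conn
  end

  def with_connection
    conn = checkout
    yield conn
  ensure
    checkin(conn) if conn
  end
end

pool = TinyPool.new(1)
log = Queue.new

long_job = Thread.new do
  pool.with_connection { |c| log << "long job queried with #{c}" }
  sleep 0.2 # slow non-database work, done without holding a connection
  log << "long job finished"
end

short_job = Thread.new do
  sleep 0.05 # start while the long job is still busy
  pool.with_connection { |c| log << "short job queried with #{c}" }
end

[long_job, short_job].each(&:join)
events = Array.new(log.size) { log.pop }
puts events
```

Because the long job hands its connection back before the slow part, the short job gets served in the middle instead of waiting for the long job to finish.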
  61. Removing the GIL: The JVM and Rubinius possess no GIL… How?

  62. Fine-Grained Locks. Problem: a very complex VM, with 50+ locks!

  63. Use HTM Instead: Hardware transactional memory, a "spinlock". Problem: usage differs across hardware platforms and compilers.

  64. Thread-safe Data Structures. Problem: slower single-threaded code.

  65. Concurrent GC: GC must make sure mark/sweep. Problem: slower single-threaded code.

  66. Thread-safe Extensions. Simple solution: still acquire the GVL when calling a non-thread-safe cfunc.

  67. Long, Long Way to Go. Removing threads is actually easier:

  68. Ref

      http://www.jstorimer.com/blogs/workingwithcode/8100871- understands-the-gil-part-2-implementation
      http://www.cs.fsu.edu/~baker/realtime/restricted/notes/prodc