Chasing Real-Time Observability for CRuby

White-Green

April 23, 2026

Transcript

  1. Background: It is difficult to understand program behavior

    • It is difficult to accurately grasp the behavior from the literal text of the program (especially in Ruby).
    • We can only observe the results of the operation, not the operation itself.
    • How many times is a method called?
    • How many threads are running?
    • How often does GC run?
    • How do these change when a button is clicked?
  2. Background: Tools such as profilers exist

    • Convenient for performance analysis
    • Focused on performance analysis, so only statistical information can be seen
    • Not real-time
    • Challenging for interactive applications
  3. Background: We should be able to do better today

    • Development machines have many (>=10) cores.
    • All cores other than the single core Ruby uses can be employed for analysis.
    • The z-axis we perceive is not being utilized.
    • CRuby is well-suited for creating such an observation tool.
    • Wouldn't it be fun to see CRuby's behavior in real-time?
  4. rrtrace

    • Real-time 3D visualization of CRuby behavior: Time x Thread x Stack.
    • https://github.com/White-Green/rrtrace
  5. internal: Overview

    • A separate Visualizer Process runs apart from CRuby.
    • Events like TracePoint are sent to the Visualizer Process for handling.

    [Diagram labels: CRuby, YARV, C-ext, TracePoint, INTERNAL_THREAD_EVENT, Visualizer Process, Event Processor, Event, Window]
  6. internal: Capturing events in CRuby

    • Use observation APIs available in CRuby
    • TracePoint API
      ◦ CALL, RETURN
      ◦ INTERNAL_GC_ENTER, INTERNAL_GC_EXIT ← C-ext only
    • INTERNAL_THREAD_EVENT
      ◦ STARTED, READY, EXITED
      ◦ SUSPENDED, RESUMED
      ◦ unavailable on Windows…
    • Convert into a unified format struct along with a timestamp
  7. internal: Capturing events in CRuby

    def add(a, b)
      a + b
    end
    add(1, 2)

    Events: ←CALL :add, ←RETURN :add

    t1 = Thread.new { heavy_process }
    t2 = Thread.new { heavy_process }
    t1.join
    t2.join

    Events: STARTED :t1, READY :t1, RESUMED :t1, STARTED :t2, READY :t2, SUSPENDED :t1, READY :t1, RESUMED :t2, SUSPENDED :t2, …, EXITED :t1
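The CALL and RETURN events for `add` can be reproduced from plain Ruby with the standard `TracePoint` class. This is only a rough sketch: rrtrace itself hooks events from a C extension, and the helper name `record_calls` and the choice of `Process.clock_gettime` for the timestamp are assumptions made for illustration.

```ruby
# Hypothetical helper (not part of rrtrace): runs the given block with a
# TracePoint enabled and returns the recorded
# [event, timestamp_ns, method_id] triples.
def record_calls
  events = []
  trace = TracePoint.new(:call, :return) do |tp|
    # A monotonic clock keeps timestamps ordered even if the wall clock jumps.
    timestamp = Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond)
    events << [tp.event, timestamp, tp.method_id]
  end
  # :call and :return fire only for methods defined in Ruby, so the C methods
  # used inside the handler above do not trigger it recursively.
  trace.enable { yield }
  events
end

def add(a, b)
  a + b
end

record_calls { add(1, 2) }.each do |type, timestamp, method_id|
  puts "#{type} #{method_id} @ #{timestamp}ns"
end
```

The GC and thread events listed on the previous slide are not reachable this way, which is one reason a C extension is needed.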
  8. internal: Capturing events in CRuby

    • Convert to a common-format data structure along with a timestamp.
    • Fields: timestamp (60bit), method id / thread id (64bit), event type (4bit)
    • Event types: 0 = CALL, 1 = RETURN, 2 = GC_START, …
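As a rough illustration of this 128-bit record, the sketch below packs one event into two 64-bit words. The exact bit positions are an assumption (event type in the top 4 bits of the first word, timestamp in the lower 60 bits, id in the second word), as are the little-endian byte order and the names `pack_event` and `unpack_event`. Only the three event type codes shown on the slide are included.

```ruby
# Event type codes taken from the slide; the rest of the list is elided there.
EVENT_TYPES = { call: 0, return: 1, gc_start: 2 }.freeze
TIMESTAMP_BITS = 60
TIMESTAMP_MASK = (1 << TIMESTAMP_BITS) - 1

# Assumed layout: [type:4 | timestamp:60] [method id / thread id:64]
def pack_event(type, timestamp, id)
  head = (EVENT_TYPES.fetch(type) << TIMESTAMP_BITS) | (timestamp & TIMESTAMP_MASK)
  [head, id].pack("Q<Q<") # two little-endian unsigned 64-bit words, 16 bytes
end

def unpack_event(bytes)
  head, id = bytes.unpack("Q<Q<")
  [EVENT_TYPES.key(head >> TIMESTAMP_BITS), head & TIMESTAMP_MASK, id]
end

record = pack_event(:return, 123_456_789, 42)
p record.bytesize
p unpack_event(record)
```

A fixed 16-byte record keeps every slot in a buffer the same size, so a reader can locate any event by index alone.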
  9. internal: Sending events to the visualizer process

    • Create OS-managed shared memory for sharing between both processes
    • Construct a ring buffer on the shared memory
      ◦ Memory space for data + read and write indices
    • Most transmissions involve 2 atomic rcw + 1 atomic addition
    • Occasional reads including cache misses
    • If the Visualizer processing is blocked, wait
      ◦ All events obtained from TracePoint etc. must be processed without omission
      ◦ We must ensure the Visualizer implementation is efficient to prevent this
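The index arithmetic of such a ring buffer can be sketched in plain Ruby. This is a single-process illustration only: it uses an ordinary Array instead of OS shared memory and performs no atomic operations, so it shows the read/write index bookkeeping and the "wait when full" rule, not rrtrace's actual implementation. The class name `RingBuffer` and its methods are hypothetical.

```ruby
# Single-producer / single-consumer ring buffer sketch (no shared memory,
# no atomics; both are assumed away for illustration).
class RingBuffer
  def initialize(capacity)
    @slots = Array.new(capacity)
    @capacity = capacity
    @read_index = 0  # advanced only by the consumer
    @write_index = 0 # advanced only by the producer
  end

  def size
    @write_index - @read_index
  end

  # Producer side: returns false instead of overwriting when the buffer is
  # full, so the caller has to wait and retry and no event is dropped.
  def push(event)
    return false if size == @capacity
    @slots[@write_index % @capacity] = event
    @write_index += 1
    true
  end

  # Consumer side: drains up to `limit` events in one batch, freeing the
  # slots for the producer as early as possible.
  def pop_batch(limit = @capacity)
    count = [limit, size].min
    batch = Array.new(count) { |i| @slots[(@read_index + i) % @capacity] }
    @read_index += count
    batch
  end
end

buffer = RingBuffer.new(4)
5.times { |i| puts "push #{i}: #{buffer.push(i)}" }
p buffer.pop_batch(2)
```

Letting the indices grow without wrapping and reducing them modulo the capacity on access keeps "full" and "empty" distinguishable without a spare slot.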
  10. internal: Receiving events on the visualizer process

    • Treat the shared memory as a ring buffer with the same memory layout
    • Collect events in batches and pass them to an internal processing queue
    • Prioritize clearing the ring buffer space to avoid interfering with the CRuby process
  11. internal: Visualize

    • Traverse events to simulate the stack for each thread.

    stack = []
    events.each do |event|
      case event
      in [:call, timestamp, method_id]
        # Remember when the method started.
        stack << [timestamp, method_id]
      in [:return, end_timestamp, _method_id]
        # The matching CALL is on top of the stack.
        start_timestamp, method_id = stack.pop
        # x: start..end, y: thread_id, z: stack.size, color: method_id
      end
    end
  12. internal: Visualize

    • Traverse events to simulate the stack for each thread
    • Thread-related events switch the active stack
    • Drawing with the GPU
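One possible way to combine the stack simulation with thread switching is sketched below. The event tuples, the assumption that a RESUMED event marks the thread whose CALL/RETURN events follow, and the name `build_spans` are all illustrative rather than taken from rrtrace. A RETURN with no matching CALL is not handled here.

```ruby
# Hypothetical sketch: turn a flat event stream into drawable spans,
# keeping one simulated stack per thread.
def build_spans(events)
  stacks = Hash.new { |hash, thread_id| hash[thread_id] = [] }
  active = nil # thread whose stack receives CALL/RETURN events
  spans = []

  events.each do |event|
    case event
    in [:resumed, _timestamp, thread_id]
      active = thread_id # switch the active stack
    in [:call, timestamp, method_id]
      stacks[active] << [timestamp, method_id]
    in [:return, end_timestamp, _method_id]
      start_timestamp, method_id = stacks[active].pop
      # x: start..stop, y: thread, z: depth, color: method
      spans << { thread: active, method: method_id, depth: stacks[active].size,
                 start: start_timestamp, stop: end_timestamp }
    else
      # Other events (e.g. :suspended) leave the stacks untouched in this sketch.
    end
  end
  spans
end

events = [
  [:resumed, 0, :t1], [:call, 1, :a], [:suspended, 2, :t1],
  [:resumed, 2, :t2], [:call, 3, :b], [:return, 4, :b], [:suspended, 5, :t2],
  [:resumed, 5, :t1], [:return, 6, :a]
]
build_spans(events).each { |span| p span }
```

Each span carries the coordinates named in the slide's comment, so it could be handed to a renderer as one box.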
  13. internal: Parallel visualize

    • Visualization must proceed at the same speed as CRuby's method calls
    • Stack simulation is slower than calling the Integer#+ method
    • Parallelization is the solution
      ◦ Since visualization doesn't need to be single-threaded
  14. internal: Parallel visualize

    • Stack depth depends on the result of processing previous events
    • Cannot be parallelized naively
    • “Method information that hasn't been pushed cannot be popped”

    [Figure: stack for the event sequence CALL 1, CALL 2, RETURN 2, RETURN 1, CALL 3]
  15. internal: Parallel visualize

    • Aggregate from blocks of several events to determine “what couldn't be popped” and “what is finally left on the stack”
      ◦ Parallelizable per block
    • Merge the results for each block.
    • By using this aggregate result and the events within each block again, the stack state at any point can be determined
      ◦ Parallelizable per block

    [Figure: events CALL 1, CALL 2, RETURN 2, RETURN 1, CALL 3, CALL 4 split into blocks; block results “pop: [] stack: 1” and “pop: [1] stack: 3 4”; merged result “pop: [] stack: 3 4”]
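The per-block aggregation and merge can be sketched as follows, using the event sequence from the slide split into blocks of three. The names `Summary`, `summarize`, `merge` and `block_start_stacks` are hypothetical, and the code runs sequentially. The point is that each `summarize` call depends only on its own block, so those calls could be distributed across threads, while the merge step is the small serial part.

```ruby
# Summary of one block: `pops` lists RETURNs that found nothing to pop inside
# the block, `stack` is what the block leaves pushed at its end.
Summary = Struct.new(:pops, :stack)

# Pass 1 (independent per block): simulate the stack starting from empty.
def summarize(block)
  block.each_with_object(Summary.new([], [])) do |(type, id), summary|
    if type == :call
      summary.stack << id
    elsif summary.stack.empty?
      summary.pops << id # must be resolved against an earlier block
    else
      summary.stack.pop
    end
  end
end

# Merge two adjacent summaries: the right block's unmatched RETURNs cancel
# entries from the top of the left block's leftover stack.
def merge(left, right)
  cancelled = [left.stack.size, right.pops.size].min
  Summary.new(
    left.pops + right.pops.drop(cancelled),
    left.stack[0, left.stack.size - cancelled] + right.stack
  )
end

# Serial step: the stack at the start of each block. With this, every block
# can be replayed independently (pass 2) to recover exact stack states.
def block_start_stacks(summaries)
  acc = Summary.new([], [])
  summaries.map do |summary|
    start = acc.stack
    acc = merge(acc, summary)
    start
  end
end

events = [[:call, 1], [:call, 2], [:return, 2], [:return, 1], [:call, 3], [:call, 4]]
summaries = events.each_slice(3).map { |block| summarize(block) }
summaries.each { |s| puts "pop: #{s.pops} stack: #{s.stack}" }
merged = summaries.reduce { |left, right| merge(left, right) }
puts "merged pop: #{merged.pops} stack: #{merged.stack}"
p block_start_stacks(summaries)
```

Because `merge` is associative, the summaries could also be combined pairwise in a tree instead of left to right.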
  16. internal: Parallel scan algorithm

    [Figure: the event sequence CALL 1, CALL 2, RETURN 2, RETURN 1, CALL 3 and its stack, shown twice; annotation: “Only serial processing is needed here”]
  17. Performance Benchmark

                                      function call / s      rails server rps
    plain CRuby                       73,417,127 (x1.00)     203.19 (x1.00)
    empty TracePoint handler          26,094,059 (x0.36)     153.94 (x0.76)
    rrtrace without sending event     13,866,260 (x0.19)     134.30 (x0.66)
    rrtrace                           12,760,131 (x0.17)     110.84 (x0.55)

    [Figure annotations: TracePoint, Timestamp, Ring buffer]
  18. Conclusion

    • 3D real-time visualization.
      ◦ https://github.com/White-Green/rrtrace
    • Real-time visualization of CRuby internals is possible with modern resources
      ◦ though with a non-trivial performance overhead...
    • Open problems
      ◦ Multi-process/Ractor support
      ◦ GUI design