Upgrade to Pro — share decks privately, control downloads, hide ads and more …

How I debugged debugger #RubyKaigi 2015

How I debugged debugger #RubyKaigi 2015

RubyKaigi 2015 Lightning Talk
"How I debugged debugger"

http://rubykaigi.org/2015

Takashi Kokubun

December 12, 2015
Tweet

More Decks by Takashi Kokubun

Other Decks in Programming

Transcript

  1. How I debugged debugger
    @k0kubun

    View full-size slide

  2. Hi, I'm @k0kubun
    • Takashi Kokubun
    • Twitter, GitHub: @k0kubun
    • Developer Productivity Group,
    Cookpad Inc.

    View full-size slide

  3. What's debugger?
    • "A tool to test and debug other programs"
    • Typically it has:
    • breakpoint
    • step execution
    • tracking values of variables

    View full-size slide

  4. Ruby's debuggers
    • Ruby 1.9.2+
    • debugger, pry-debugger
    • Ruby 2.0+
    • byebug, pry-byebug
    • partially working but not supported in 2.0

    View full-size slide

  5. byebug's
    performance issue

    View full-size slide

  6. Ruby process got over
    10x slower after using it
    • Rails' response: 41ms -> 685ms

    View full-size slide

  7. But why did it matter for us?
    • We also use it in rails
    • The issue is just 40ms -> 685ms, isn't it?
    • Why don't you restart rails server?

    View full-size slide

  8. But why did it matter for us?
    • We also use it in rails
    • The issue is just 40ms -> 685ms, isn't it?
    • Why don't you restart rails server?
    No

    View full-size slide

  9. Our rails application is
    "World's Largest Rails Monolith"
    https://speakerdeck.com/a_matsuda/the-recipe-for-the-worlds-largest-rails-monolith

    View full-size slide

  10. Our rails application is
    "World's Largest Rails Monolith"

    View full-size slide

  11. Our rails application is
    "World's Largest Rails Monolith"
    • It was TOO hard for us...
    • Some heavy req: 6,000ms -> 80,000ms
    • Restarting `rails s` takes 40,000ms

    View full-size slide

  12. Our rails application is
    "World's Largest Rails Monolith"
    80,053ms

    View full-size slide

  13. Let's optimize

    View full-size slide

  14. Optimization idea
    • byebug author commented about how to
    optimize on the issue
    • "disable the debugger when user hits continue"

    View full-size slide

  15. Optimization idea
    • byebug author commented about how to
    optimize on the issue
    • but he hadn't got to it for a long time
    • so I got to it

    View full-size slide

  16. First enemy:
    uncertain backtrace

    View full-size slide

  17. Many test failures
    • Just disabling tracepoints (breakpoints for Ruby)
    after `continue` made byebug test suite broken
    and it had never finished running
    # Running:
    ...........FFEE.......EEE.E...EE.......F.........................F...........
    .........F............................................................S..F..F
    .............................E.EFEEEEE....S....E..E..E...F...............F.F.
    ........................EE...FF..F.FF.........E.EEEE...........E..EE.EEE..EE.
    ...F........................S......EE.....E.E..........
    No output has been received in the last 10 minutes, this potentially
    https://travis-ci.org/deivid-rodriguez/byebug/jobs/74734077

    View full-size slide

  18. Reproduce a test failure
    1. module Byebug
    2. Context.interface = TestInterface.new
    3. Context.interface.input << 'continue'
    4. byebug
    5. b = 5
    5. end
    • Minimum code to reproduce failure of a
    test case

    View full-size slide

  19. Execution result
    1. module Byebug
    2. Context.interface = TestInterface.new
    3. Context.interface.input << 'continue'
    4. byebug
    5. b = 5
    5. end
    $ bundle exec ruby -Ilib -rbyebug test.rb
    test.rb:5:in `': wrong argument type nil
    (expected Data) (TypeError)
    from test.rb:1:in `'

    View full-size slide

  20. ????
    1. module Byebug
    2. Context.interface = TestInterface.new
    3. Context.interface.input << 'continue'
    4. byebug
    5. b = 5
    5. end
    $ bundle exec ruby -Ilib -rbyebug test.rb
    test.rb:5:in `': wrong argument type nil
    (expected Data) (TypeError)
    from test.rb:1:in `'

    View full-size slide

  21. No information in backtrace
    • I could understand nothing in Ruby-level
    backtrace...

    View full-size slide

  22. I tried to debug with gdb
    • Since byebug is written in C, byebug can't
    debug itself
    • I had to use gdb to debug byebug

    View full-size slide

  23. Still no luck
    (gdb) run -Ilib -rbyebug test.rb
    Starting program: /usr/local/bin/ruby -Ilib -rbyebug
    test.rb
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-
    gnu/libthread_db.so.1".
    [New Thread 0x7ffff7ff8700 (LWP 14327)]
    test.rb:5:in `': wrong argument type
    nil (expected Data) (TypeError)
    from test.rb:1:in `'
    [Thread 0x7ffff7ff8700 (LWP 14327) exited]
    [Inferior 1 (process 14324) exited with code 01]

    View full-size slide

  24. Debugging C extension
    • I could understand nothing with gdb too
    • Since it's not SEGV, gdb does not stop
    • Data? But where is it raised from?

    View full-size slide

  25. Break with rb_raise
    • My solution was to break in rb_raise
    • This was my first experience of using gdb and
    debugging C extension, so please tell me if
    there is smarter solution

    View full-size slide

  26. Breakpoint 1, rb_raise (exc=93824997653520, fmt=0x555555760d70 "wrong argument type %s
    (expected %s)") at error.c:1939
    1939 va_start(args, fmt);
    (gdb) bt
    #0 rb_raise (exc=93824997653520, fmt=0x555555760d70 "wrong argument type %s (expected
    %s)") at error.c:1939
    #1 0x000055555571e598 in rb_check_type (x=8, t=12) at error.c:567
    #2 0x00007ffff5fa4d79 in rb_data_object_get (obj=8) at /usr/local/include/ruby-2.2.0/
    ruby/ruby.h:1191
    #3 0x00007ffff5fa4f91 in cleanup_dead_threads () at ../../../../ext/byebug/threads.c:105
    #4 0x00007ffff5fa50c5 in release_lock () at ../../../../ext/byebug/threads.c:157
    #5 0x00007ffff5fa0e96 in cleanup (dc=0x555556204310) at ../../../../ext/byebug/byebug.c:
    132
    Break with rb_raise
    • A place raising an error is found!!

    View full-size slide

  27. I could see proper C-level
    backtrace by breaking in rb_raise
    • Cause:
    • Unexpected nil reference in Data_Get_Struct
    • I fixed to return before Data_Get_Struct in
    hooked function
    • I managed to pass all tests!

    View full-size slide

  28. Second enemy:
    dead locks

    View full-size slide

  29. Dead locks
    • My fix was not enough
    • While all tests are passing, it didn't work with
    Rails app

    View full-size slide

  30. Dead locks
    • I became crazy

    View full-size slide

  31. Debug hard!
    • Using gdb, I checked the state of not-
    debugging threads
    • I found that it breaked inside hooked
    function

    View full-size slide

  32. What's wrong?
    • Since byebug supports multi-thread
    debugging, it locks not-debugging threads
    • byebug releases it after debugging
    • My patch prevented thread release by
    byebug...

    View full-size slide

  33. How to release locks safely?
    • I asked byebug author how to release locks safely

    View full-size slide

  34. How to release locks safely?
    • NO HELP FROM AUTHOR

    View full-size slide

  35. How to release locks safely?
    • So I read code deeply and managed to
    find a way to achieve it

    View full-size slide

  36. THE SOLUTION
    • ALL WE NEED WAS JUST DELETING ONE LINE

    View full-size slide

  37. THE SOLUTION
    • ALL WE NEED WAS JUST DELETING ONE LINE

    View full-size slide

  38. Final boss:
    Random SEGV by ruby's bug

    View full-size slide

  39. • It worked with our Rails app and
    sometimes it could pass all tests
    • But randomly crashed
    Random SEGV

    View full-size slide

  40. SEGV debugging
    • Load core file with gdb
    • Search which variable caused a crash

    View full-size slide

  41. SEGV debugging
    • How to do?: http://techlife.cookpad.com

    View full-size slide

  42. See backtrace
    #4 0x00007f7e2c74a452 in sigsegv (sig=11, info=0x7f7e2e0ccbb0,
    ctx=0x7f7e2e0cca80) at signal.c:879
    #5
    #6 0x0000000000000000 in ?? ()
    #7 0x00007f7e2c7db126 in exec_hooks_body (th=0x7f7e2e05e5e0,
    list=0x7f7e2e05e4e8, trace_arg=0x7fff6f732880)
    at vm_trace.c:256
    #8 0x00007f7e2c7db327 in exec_hooks_protected (th=0x7f7e2e05e5e0,
    list=0x7f7e2e05e4e8, trace_arg=0x7fff6f732880)
    at vm_trace.c:299

    View full-size slide

  43. Debug and find unexpected xfree

    View full-size slide

  44. Sent patch to fix it

    View full-size slide

  45. Sent patch to fix it

    View full-size slide

  46. All problems are resolved!!!
    • Released as byebug v7.0.0!

    View full-size slide

  47. Conclusion
    • I made byebug over 10x faster in v7.0.0
    • Use latest byebug and pry-byebug

    View full-size slide