How I debugged debugger #RubyKaigi 2015

How I debugged debugger #RubyKaigi 2015

RubyKaigi 2015 Lightning Talk
"How I debugged debugger"

http://rubykaigi.org/2015

08d5432a5bc31e6d9edec87b94cb1db1?s=128

Takashi Kokubun

December 12, 2015
Tweet

Transcript

  1. 2.

    Hi, I'm @k0kubun • Takashi Kokubun • Twitter, GitHub: @k0kubun

    • Developer Productivity Group, Cookpad Inc.
  2. 4.

    What's debugger? • "A tool to test and debug other

    programs" • Typically it has: • breakpoint • step execution • tracking values of variables
  3. 5.

    Ruby's debuggers • Ruby 1.9.2+ • debugger, pry-debugger • Ruby

    2.0+ • byebug, pry-byebug • partially working but not supported in 2.0
  4. 7.
  5. 8.

    But why did it matter for us? • We also

    use it in rails • The issue is just 40ms -> 685ms, isn't it? • Why don't you restart rails server?
  6. 9.

    But why did it matter for us? • We also

    use it in rails • The issue is just 40ms -> 685ms, isn't it? • Why don't you restart rails server? No
  7. 12.

    Our rails application is "World's Largest Rails Monolith" • It

    was TOO hard for us... • Some heavy req: 6,000ms -> 80,000ms • Restarting `rails s` takes 40,000ms
  8. 15.

    Optimization idea • byebug author commented about how to optimize

    on the issue • "disable the debugger when user hits continue"
  9. 16.

    Optimization idea • byebug author commented about how to optimize

    on the issue • but he hadn't got to it for a long time • so I got to it
  10. 18.

    Many test failures • Just disabling tracepoints (breakpoints for Ruby)

    after `continue` made byebug test suite broken and it had never finished running # Running: ...........FFEE.......EEE.E...EE.......F.........................F........... .........F............................................................S..F..F .............................E.EFEEEEE....S....E..E..E...F...............F.F. ........................EE...FF..F.FF.........E.EEEE...........E..EE.EEE..EE. ...F........................S......EE.....E.E.......... No output has been received in the last 10 minutes, this potentially https://travis-ci.org/deivid-rodriguez/byebug/jobs/74734077
  11. 19.

    Reproduce a test failure 1. module Byebug 2. Context.interface =

    TestInterface.new 3. Context.interface.input << 'continue' 4. byebug 5. b = 5 5. end • Minimum code to reproduce failure of a test case
  12. 20.

    Execution result 1. module Byebug 2. Context.interface = TestInterface.new 3.

    Context.interface.input << 'continue' 4. byebug 5. b = 5 5. end $ bundle exec ruby -Ilib -rbyebug test.rb test.rb:5:in `<module:Byebug>': wrong argument type nil (expected Data) (TypeError) from test.rb:1:in `<main>'
  13. 21.

    ???? 1. module Byebug 2. Context.interface = TestInterface.new 3. Context.interface.input

    << 'continue' 4. byebug 5. b = 5 5. end $ bundle exec ruby -Ilib -rbyebug test.rb test.rb:5:in `<module:Byebug>': wrong argument type nil (expected Data) (TypeError) from test.rb:1:in `<main>'
  14. 23.

    I tried to debug with gdb • Since byebug is

    written in C, byebug can't debug itself • I had to use gdb to debug byebug
  15. 24.

    Still no luck (gdb) run -Ilib -rbyebug test.rb Starting program:

    /usr/local/bin/ruby -Ilib -rbyebug test.rb [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux- gnu/libthread_db.so.1". [New Thread 0x7ffff7ff8700 (LWP 14327)] test.rb:5:in `<module:Byebug>': wrong argument type nil (expected Data) (TypeError) from test.rb:1:in `<main>' [Thread 0x7ffff7ff8700 (LWP 14327) exited] [Inferior 1 (process 14324) exited with code 01]
  16. 25.

    Debugging C extension • I could understand nothing with gdb

    too • Since it's not SEGV, gdb does not stop • Data? But where is it raised from?
  17. 26.

    Break with rb_raise • My solution was to break in

    rb_raise • This was my first experience of using gdb and debugging C extension, so please tell me if there is smarter solution
  18. 27.

    Breakpoint 1, rb_raise (exc=93824997653520, fmt=0x555555760d70 "wrong argument type %s (expected

    %s)") at error.c:1939 1939 va_start(args, fmt); (gdb) bt #0 rb_raise (exc=93824997653520, fmt=0x555555760d70 "wrong argument type %s (expected %s)") at error.c:1939 #1 0x000055555571e598 in rb_check_type (x=8, t=12) at error.c:567 #2 0x00007ffff5fa4d79 in rb_data_object_get (obj=8) at /usr/local/include/ruby-2.2.0/ ruby/ruby.h:1191 #3 0x00007ffff5fa4f91 in cleanup_dead_threads () at ../../../../ext/byebug/threads.c:105 #4 0x00007ffff5fa50c5 in release_lock () at ../../../../ext/byebug/threads.c:157 #5 0x00007ffff5fa0e96 in cleanup (dc=0x555556204310) at ../../../../ext/byebug/byebug.c: 132 Break with rb_raise • A place raising an error is found!!
  19. 28.

    I could see proper C-level backtrace by breaking in rb_raise

    • Cause: • Unexpected nil reference in Data_Get_Struct • I fixed to return before Data_Get_Struct in hooked function • I managed to pass all tests!
  20. 30.

    Dead locks • My fix was not enough • While

    all tests are passing, it didn't work with Rails app
  21. 32.

    Debug hard! • Using gdb, I checked the state of

    not- debugging threads • I found that it breaked inside hooked function
  22. 33.

    What's wrong? • Since byebug supports multi-thread debugging, it locks

    not-debugging threads • byebug releases it after debugging • My patch prevented thread release by byebug...
  23. 36.

    How to release locks safely? • So I read code

    deeply and managed to find a way to achieve it
  24. 40.

    • It worked with our Rails app and sometimes it

    could pass all tests • But randomly crashed Random SEGV
  25. 41.
  26. 43.

    See backtrace #4 0x00007f7e2c74a452 in sigsegv (sig=11, info=0x7f7e2e0ccbb0, ctx=0x7f7e2e0cca80) at

    signal.c:879 #5 <signal handler called> #6 0x0000000000000000 in ?? () #7 0x00007f7e2c7db126 in exec_hooks_body (th=0x7f7e2e05e5e0, list=0x7f7e2e05e4e8, trace_arg=0x7fff6f732880) at vm_trace.c:256 #8 0x00007f7e2c7db327 in exec_hooks_protected (th=0x7f7e2e05e5e0, list=0x7f7e2e05e4e8, trace_arg=0x7fff6f732880) at vm_trace.c:299
  27. 47.
  28. 49.

    Conclusion • I made byebug over 10x faster in v7.0.0

    • Use latest byebug and pry-byebug