Our Rails application is the "World's Largest Rails Monolith"
• It was TOO hard for us...
• Some heavy requests: 6,000ms -> 80,000ms
• Restarting `rails s` takes 40,000ms
Many test failures
• Just disabling tracepoints (breakpoints for Ruby) after `continue` broke the byebug test suite, and the run never finished

# Running:
...........FFEE.......EEE.E...EE.......F.........................F...........
.........F............................................................S..F..F
.............................E.EFEEEEE....S....E..E..E...F...............F.F.
........................EE...FF..F.FF.........E.EEEE...........E..EE.EEE..EE.
...F........................S......EE.....E.E..........
No output has been received in the last 10 minutes, this potentially

https://travis-ci.org/deivid-rodriguez/byebug/jobs/74734077
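To make "disabling tracepoints" concrete, here is a minimal sketch using Ruby's built-in TracePoint API. It is not byebug's code: the method `traced_step` and the `events` counter are made up for the illustration. It only shows that no further line events arrive once a tracepoint is disabled.

```ruby
# Hypothetical method whose lines we want to observe.
def traced_step
  x = 1
  x + 1
end

events = 0
# Fire the block on every :line event while the tracepoint is enabled.
trace = TracePoint.new(:line) { |_tp| events += 1 }

trace.enable
traced_step
trace.disable            # roughly what happens after `continue`

seen_while_enabled = events
traced_step              # runs untraced, so the counter stays put
seen_after_disable = events
```

In this sketch `seen_after_disable` equals `seen_while_enabled`, because the second call to `traced_step` produces no events.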
Reproduce a test failure

    module Byebug
      Context.interface = TestInterface.new
      Context.interface.input << 'continue'
      byebug
      b = 5
    end

• Minimal code to reproduce a failing test case
Break with rb_raise
• My solution was to set a breakpoint in rb_raise
• This was my first time using gdb and debugging a C extension, so please tell me if there is a smarter approach
Break with rb_raise

Breakpoint 1, rb_raise (exc=93824997653520, fmt=0x555555760d70 "wrong argument type %s (expected %s)") at error.c:1939
1939        va_start(args, fmt);
(gdb) bt
#0  rb_raise (exc=93824997653520, fmt=0x555555760d70 "wrong argument type %s (expected %s)") at error.c:1939
#1  0x000055555571e598 in rb_check_type (x=8, t=12) at error.c:567
#2  0x00007ffff5fa4d79 in rb_data_object_get (obj=8) at /usr/local/include/ruby-2.2.0/ruby/ruby.h:1191
#3  0x00007ffff5fa4f91 in cleanup_dead_threads () at ../../../../ext/byebug/threads.c:105
#4  0x00007ffff5fa50c5 in release_lock () at ../../../../ext/byebug/threads.c:157
#5  0x00007ffff5fa0e96 in cleanup (dc=0x555556204310) at ../../../../ext/byebug/byebug.c:132

• The place that raises the error is found!!
Breaking in rb_raise gave me a proper C-level backtrace
• Cause:
  • Unexpected nil reference in Data_Get_Struct
• I fixed it by returning before Data_Get_Struct in the hooked function
• I managed to pass all tests!
What's wrong?
• Since byebug supports multi-threaded debugging, it locks the threads that are not being debugged
• byebug releases them after debugging
• My patch prevented byebug from releasing those threads...
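The failure mode described here can be sketched with a toy model in plain Ruby. This is only an illustration under assumed names (`ToyLock`, `run_session`, and `skip_release` are hypothetical); byebug's real locking lives in its C extension and is not shown. The point is that if the release step is skipped, for example by an early return, the waiting thread never resumes.

```ruby
# Toy lock: non-debugging threads wait here until the debugger releases them.
class ToyLock
  def initialize
    @mutex = Mutex.new
    @cond = ConditionVariable.new
    @held = true
  end

  # Called by a non-debugging thread: block until released.
  def wait_until_released
    @mutex.synchronize { @cond.wait(@mutex) while @held }
  end

  # Called when debugging ends: wake every waiting thread.
  def release
    @mutex.synchronize do
      @held = false
      @cond.broadcast
    end
  end
end

# Runs one "debugging session" and reports whether the worker got to finish.
def run_session(skip_release:)
  lock = ToyLock.new
  worker = Thread.new { lock.wait_until_released }
  lock.release unless skip_release    # an early return would skip this step
  finished = !worker.join(0.5).nil?   # Thread#join returns nil on timeout
  worker.kill unless finished         # clean up the stuck worker
  finished
end

released = run_session(skip_release: false)
stuck    = run_session(skip_release: true)
```

With the release step in place the worker finishes. With it skipped, the worker is still blocked when the timeout expires, which mirrors the hang that the patch introduced.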
See backtrace

#4  0x00007f7e2c74a452 in sigsegv (sig=11, info=0x7f7e2e0ccbb0, ctx=0x7f7e2e0cca80) at signal.c:879
#5
#6  0x0000000000000000 in ?? ()
#7  0x00007f7e2c7db126 in exec_hooks_body (th=0x7f7e2e05e5e0, list=0x7f7e2e05e4e8, trace_arg=0x7fff6f732880) at vm_trace.c:256
#8  0x00007f7e2c7db327 in exec_hooks_protected (th=0x7f7e2e05e5e0, list=0x7f7e2e05e4e8, trace_arg=0x7fff6f732880) at vm_trace.c:299