Bug #16689

[BUG] try to mark T_NONE object

Added by byroot (Jean Boussier) 3 months ago. Updated 2 months ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
ruby -v: ruby 2.7.0p0 (2019-12-25 revision 647ee6f091)
[ruby-core:97449]

Description

This kind of supersedes https://bugs.ruby-lang.org/issues/16682. I'm now able to trigger what I believe to be the true root cause, without having any active TracePoint.

This crash happens very reliably (in over 50% of cases) when booting our application with 2.7.0p0.

Repro script: https://github.com/Shopify/ruby-repro

It crashes after a semi-random number of iterations.

It also crashes on the current ruby-core master.

ruby/2.7.0/json/common.rb:156: [BUG] try to mark T_NONE object
ruby 2.7.0p0 (2019-12-25 revision 647ee6f091) [x86_64-darwin19]

-- Crash Report log information --------------------------------------------
   See Crash Report log file under the one of following:                    
     * ~/Library/Logs/DiagnosticReports                                     
     * /Library/Logs/DiagnosticReports                                      
   for more details.                                                        
Don't forget to include the above Crash Report log file in bug reports.     

-- Control frame information -----------------------------------------------
c:0006 p:---- s:0026 e:000025 CFUNC  :parse
c:0005 p:0033 s:0022 e:000021 METHOD ~/.rubies/ruby-2.7.0/lib/ruby/2.7.0/json/common.rb:156
c:0004 p:0042 s:0016 e:000013 BLOCK  crash.rb:30 [FINISH]
c:0003 p:---- s:0011 e:000010 CFUNC  :loop
c:0002 p:0041 s:0007 E:0008c8 EVAL   crash.rb:28 [FINISH]
c:0001 p:0000 s:0003 E:0014a0 (none) [FINISH]

-- Ruby level backtrace information ----------------------------------------
crash.rb:28:in `<main>'
crash.rb:28:in `loop'
crash.rb:30:in `block in <main>'
~/.rubies/ruby-2.7.0/lib/ruby/2.7.0/json/common.rb:156:in `parse'
~/.rubies/ruby-2.7.0/lib/ruby/2.7.0/json/common.rb:156:in `parse'

-- C level backtrace information -------------------------------------------
~/.rubies/ruby-2.7.0/bin/ruby(rb_vm_bugreport+0x96) [0x1057dd1f6]
~/.rubies/ruby-2.7.0/bin/ruby(rb_bug+0xcc) [0x1057e9b86]
~/.rubies/ruby-2.7.0/bin/ruby(gc_mark_ptr+0x17a) [0x10563b72a]
~/.rubies/ruby-2.7.0/bin/ruby(mark_keyvalue+0x49) [0x10563c4d9]
~/.rubies/ruby-2.7.0/bin/ruby(st_general_foreach+0xa9) [0x105747389]
~/.rubies/ruby-2.7.0/bin/ruby(rb_st_foreach+0x33) [0x105747a53]
~/.rubies/ruby-2.7.0/bin/ruby(gc_mark_children+0x8e8) [0x105631078]
~/.rubies/ruby-2.7.0/bin/ruby(gc_mark_stacked_objects_incremental+0x9e) [0x105639e0e]
~/.rubies/ruby-2.7.0/bin/ruby(newobj_slowpath+0x50f) [0x105636e9f]
~/.rubies/ruby-2.7.0/bin/ruby(newobj_slowpath_wb_protected+0x14) [0x105636964]
~/.rubies/ruby-2.7.0/bin/ruby(rb_str_buf_new+0x1e) [0x10575314e]
~/.rubies/ruby-2.7.0/lib/ruby/2.7.0/x86_64-darwin19/json/ext/parser.bundle(JSON_parse_string+0x35) [0x109082e35]
~/.rubies/ruby-2.7.0/lib/ruby/2.7.0/x86_64-darwin19/json/ext/parser.bundle(JSON_parse_value+0x437) [0x109081487]
~/.rubies/ruby-2.7.0/lib/ruby/2.7.0/x86_64-darwin19/json/ext/parser.bundle(JSON_parse_value+0xee7) [0x109081f37]
~/.rubies/ruby-2.7.0/lib/ruby/2.7.0/x86_64-darwin19/json/ext/parser.bundle(JSON_parse_value+0xee7) [0x109081f37]
~/.rubies/ruby-2.7.0/lib/ruby/2.7.0/x86_64-darwin19/json/ext/parser.bundle(JSON_parse_value+0xee7) [0x109081f37]
~/.rubies/ruby-2.7.0/lib/ruby/2.7.0/x86_64-darwin19/json/ext/parser.bundle(JSON_parse_value+0xee7) [0x109081f37]
~/.rubies/ruby-2.7.0/lib/ruby/2.7.0/x86_64-darwin19/json/ext/parser.bundle(JSON_parse_value+0xee7) [0x109081f37]
~/.rubies/ruby-2.7.0/lib/ruby/2.7.0/x86_64-darwin19/json/ext/parser.bundle(JSON_parse_value+0xee7) [0x109081f37]
~/.rubies/ruby-2.7.0/lib/ruby/2.7.0/x86_64-darwin19/json/ext/parser.bundle(JSON_parse_value+0xee7) [0x109081f37]
~/.rubies/ruby-2.7.0/lib/ruby/2.7.0/x86_64-darwin19/json/ext/parser.bundle(cParser_parse+0x142) [0x109080c62]
~/.rubies/ruby-2.7.0/bin/ruby(vm_call_cfunc+0x170) [0x1057ce220]
~/.rubies/ruby-2.7.0/bin/ruby(vm_exec_core+0x38df) [0x1057b40af]
~/.rubies/ruby-2.7.0/bin/ruby(rb_vm_exec+0xadc) [0x1057c903c]
~/.rubies/ruby-2.7.0/bin/ruby(loop_i+0x29) [0x1057d8f99]
~/.rubies/ruby-2.7.0/bin/ruby(rb_vrescue2+0x114) [0x10561b024]
~/.rubies/ruby-2.7.0/bin/ruby(rb_rescue2+0x7b) [0x10561aeeb]
~/.rubies/ruby-2.7.0/bin/ruby(vm_call_cfunc+0x170) [0x1057ce220]
~/.rubies/ruby-2.7.0/bin/ruby(vm_exec_core+0x3782) [0x1057b3f52]
~/.rubies/ruby-2.7.0/bin/ruby(rb_vm_exec+0xadc) [0x1057c903c]
~/.rubies/ruby-2.7.0/bin/ruby(rb_ec_exec_node+0xc6) [0x10561a5a6]
~/.rubies/ruby-2.7.0/bin/ruby(ruby_run_node+0x55) [0x10561a485]
~/.rubies/ruby-2.7.0/bin/ruby(main+0x5d) [0x105571c9d]

Updated by alanwu (Alan Wu) 3 months ago

The repro also crashes on a recent master. It looks like a hash somewhere is corrupted?

-- C level backtrace information -------------------------------------------
~/tip-de15a26/bin/ruby(rb_vm_bugreport+0xde6) [0x55ccbe087d96] vm_dump.c:763
~/tip-de15a26/bin/ruby(rb_bug+0xe4) [0x55ccbdea41a3] error.c:659
~/tip-de15a26/bin/ruby(gc_mark_ptr+0x19d) [0x55ccbdec4d5d] gc.c:5270
~/tip-de15a26/bin/ruby(mark_keyvalue+0x41) [0x55ccbdec6291] gc.c:5301
~/tip-de15a26/bin/ruby(apply_functor+0x13) [0x55ccbdff15a3] st.c:1565
~/tip-de15a26/bin/ruby(rb_st_foreach) st.c:1475
~/tip-de15a26/bin/ruby(mark_hash+0xf) [0x55ccbdec5b17] gc.c:4942
~/tip-de15a26/bin/ruby(gc_mark_children) gc.c:5491
~/tip-de15a26/bin/ruby(gc_mark_stacked_objects+0x31) [0x55ccbdec734d] gc.c:5612
~/tip-de15a26/bin/ruby(gc_mark_stacked_objects_all) gc.c:5652
~/tip-de15a26/bin/ruby(gc_marks_rest) gc.c:6552
~/tip-de15a26/bin/ruby(gc_marks+0x8) [0x55ccbdec88d8] gc.c:6611
~/tip-de15a26/bin/ruby(gc_start) gc.c:7396
~/tip-de15a26/bin/ruby(garbage_collect_with_gvl+0x90) [0x55ccbdec8d70] gc.c:7293
~/tip-de15a26/bin/ruby(objspace_malloc_fixup+0x17) [0x55ccbdecf1ab] gc.c:9865
~/tip-de15a26/bin/ruby(ruby_xmalloc) gc.c:9905

EDIT:
Seems to be a write barrier (WB) miss somewhere, with an old hash pointing at young hashes:

$ ruby -v crash.rb
ruby 2.8.0dev (2020-03-13T13:27:54Z master de15a26e9e) [x86_64-linux]
2.8.0
start repro (should crash after 14 dots)
.verify_internal_consistency_reachable_i: WB miss (O->Y) 0x000055c526ceaa90 [3LM   ] T_HASH (Hash)[S ] 563 -> 0x000055c52eea4180 [0     ] T_HASH (Hash)[S ] 9
verify_internal_consistency_reachable_i: WB miss (O->Y) 0x000055c526ceaa90 [3LM   ] T_HASH (Hash)[S ] 563 -> 0x000055c52eeb40d0 [0     ] T_HASH (Hash)[AT] 3
verify_internal_consistency_reachable_i: WB miss (O->Y) 0x000055c526ceaa90 [3LM   ] T_HASH (Hash)[S ] 563 -> 0x000055c52eeba9f8 [0     ] T_HASH (Hash)[AT] 3
verify_internal_consistency_reachable_i: WB miss (O->Y) 0x000055c526ceaa90 [3LM   ] T_HASH (Hash)[S ] 563 -> 0x000055c52eeb91e8 [0     ] T_HASH (Hash)[AT] 3
verify_internal_consistency_reachable_i: WB miss (O->Y) 0x000055c526ceaa90 [3LM   ] T_HASH (Hash)[S ] 563 -> 0x000055c52eebfbb0 [0     ] T_HASH (Hash)[AT] 3
verify_internal_consistency_reachable_i: WB miss (O->Y) 0x000055c526ceaa90 [3LM   ] T_HASH (Hash)[S ] 563 -> 0x000055c52eebe6e8 [0     ] T_HASH (Hash)[AT] 7

Updated by alanwu (Alan Wu) 3 months ago

I bisected it to this commit:

commit 21994b7fd686f263544fcac1616ecf3189fb78b3
Avoid rehashing keys in transform_values

    Previously, calling transform_values would call rb_hash_aset for each
    key, needing to rehash it and look up its location.

    Instead, we can use rb_hash_stlike_foreach_with_replace to replace the
    values as we iterate without rehashing the keys.

We should revert. Actually I think this can be fixed easily enough. Let me take a shot.
https://github.com/ruby/ruby/pull/2964
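
For readers unfamiliar with the methods touched by the bisected commit, here is a minimal sketch of their Ruby-level behavior. The hash contents are made up for illustration; the point is only that the keys stay untouched while every value is replaced, which is why the commit could skip rehashing the keys.

```ruby
# Hypothetical data, for illustration only.
prices = { "apple" => 1, "pear" => 2 }

# transform_values returns a new hash with the same keys and mapped values.
doubled = prices.transform_values { |v| v * 2 }

# transform_values! replaces the values of the receiver in place;
# the keys (and their positions in the table) do not change.
prices.transform_values! { |v| v.to_s }

p doubled
p prices
```

In both cases the block's return value is stored into an existing hash entry, so each store is a new reference from the hash to an object.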


Updated by alanwu (Alan Wu) 2 months ago

  • Status changed from Open to Closed

Applied in changeset git|713dc619f5372a645b66bef9dacee217c4101cb4.


Add missing write barrier for Hash#transform_values{,!}

21994b7fd686f263544fcac1616ecf3189fb78b3 removed the write barrier that
was present in rb_hash_aset(). Re-insert it to not crash during GC.

[Bug #16689]
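
The failure mode described in this commit message can be pictured with a small stress sketch. This is not the script from the Shopify/ruby-repro repository, and it is not guaranteed to crash an affected 2.7.0 build; the names `long_lived` and `intact` are hypothetical. It only sets up the situation the write barrier exists for: a hash that has probably been promoted to the old generation receives freshly allocated values, and minor GCs then run.

```ruby
# Hypothetical stress sketch; not the original repro script.
long_lived = {}
100.times { |i| long_lived[i] = i }

# Let the hash survive several GC cycles so it is likely promoted
# to the old generation (promotion details are implementation-specific).
4.times { GC.start }

# Replace every value in place with a newly allocated (young) hash.
long_lived.transform_values! { |v| { value: v } }

# Minor GCs depend on write barriers to learn about old -> young references.
10.times do
  Array.new(10_000) { |i| "garbage #{i}" } # allocation pressure
  GC.start(full_mark: false)               # request a minor GC
end

# On a fixed Ruby, every new value is still a live, intact Hash.
intact = long_lived.all? { |k, v| v.is_a?(Hash) && v[:value] == k }
puts intact
```

Without the barrier, a minor GC has no record that the old hash now points at young objects, so those objects can be freed while still referenced, which matches the "try to mark T_NONE object" report above.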


Updated by naruse (Yui NARUSE) 2 months ago

  • Backport changed from 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN to 2.5: DONTNEED, 2.6: DONTNEED, 2.7: REQUIRED

Updated by byroot (Jean Boussier) 2 months ago

I tested the patch against our CI and can confirm it does fix the problem.

Updated by naruse (Yui NARUSE) 2 months ago

  • Backport changed from 2.5: DONTNEED, 2.6: DONTNEED, 2.7: REQUIRED to 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONE

ruby_2_7 2a3027b7b54a3118731f70c9e88aabbd495bb9fe.
