Bug #18501
Closed
[BUG] try to mark T_NONE object in RubyVM::InstructionSequence.load_from_binary
Description
<OBJ_INFO:gc_mark_ptr@gc.c:6709> 0x00007fbf1fba1270 [2 M ] T_NONE
/tmp/bundle/ruby/3.1.0/gems/bootsnap-1.10.1/lib/bootsnap/compile_cache/iseq.rb:49: [BUG] try to mark T_NONE object
ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-linux]
-- Control frame information -----------------------------------------------
c:0024 p:---- s:0126 e:000125 CFUNC :load_from_binary
c:0023 p:0017 s:0121 e:000120 METHOD /tmp/bundle/ruby/3.1.0/gems/bootsnap-1.10.1/lib/bootsnap/compile_cache/iseq.rb:49 [FINISH]
c:0022 p:---- s:0114 e:000113 CFUNC :fetch
c:0021 p:0061 s:0106 e:000105 METHOD /tmp/bundle/ruby/3.1.0/gems/bootsnap-1.10.1/lib/bootsnap/compile_cache/iseq.rb:60
c:0020 p:0053 s:0099 e:000098 METHOD /tmp/bundle/ruby/3.1.0/gems/bootsnap-1.10.1/lib/bootsnap/compile_cache/iseq.rb:85 [FINISH]
c:0019 p:---- s:0093 e:000092 CFUNC :require
c:0018 p:0065 s:0088 e:000087 METHOD /tmp/bundle/ruby/3.1.0/gems/bootsnap-1.10.1/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:28
c:0017 p:0066 s:0077 e:000076 METHOD /tmp/bundle/ruby/3.1.0/gems/zeitwerk-2.5.3/lib/zeitwerk/kernel.rb:35
c:0016 p:0005 s:0069 e:000068 BLOCK /app/test/load_selected_tests.rb:11 [FINISH]
c:0015 p:---- s:0065 e:000064 CFUNC :each
c:0014 p:0006 s:0061 e:000060 METHOD /app/test/load_selected_tests.rb:10
c:0013 p:0025 s:0057 e:000056 TOP /app/test/load_selected_tests.rb:28 [FINISH]
c:0012 p:---- s:0054 e:000053 CFUNC :require
c:0011 p:0016 s:0049 e:000048 BLOCK /tmp/bundle/ruby/3.1.0/gems/ci-queue-0.22.0/lib/minitest/queue/runner.rb:287 [FINISH]
c:0010 p:---- s:0045 e:000044 CFUNC :each
c:0009 p:0008 s:0041 e:000040 METHOD /tmp/bundle/ruby/3.1.0/gems/ci-queue-0.22.0/lib/minitest/queue/runner.rb:286
c:0008 p:0221 s:0037 E:001fd0 METHOD /tmp/bundle/ruby/3.1.0/gems/ci-queue-0.22.0/lib/minitest/queue/runner.rb:78 [FINISH]
c:0007 p:---- s:0031 e:000030 CFUNC :public_send
c:0006 p:0073 s:0026 e:000025 METHOD /tmp/bundle/ruby/3.1.0/gems/ci-queue-0.22.0/lib/minitest/queue/runner.rb:35
c:0005 p:0007 s:0021 e:000020 METHOD /tmp/bundle/ruby/3.1.0/gems/ci-queue-0.22.0/lib/minitest/queue/runner.rb:19
c:0004 p:0034 s:0016 e:000015 TOP /tmp/bundle/ruby/3.1.0/gems/ci-queue-0.22.0/exe/minitest-queue:5 [FINISH]
c:0003 p:---- s:0013 e:000012 CFUNC :load
c:0002 p:0124 s:0008 E:0023a0 EVAL /tmp/bundle/ruby/3.1.0/bin/minitest-queue:25 [FINISH]
c:0001 p:0000 s:0003 E:002230 (none) [FINISH]
-- Ruby level backtrace information ----------------------------------------
/tmp/bundle/ruby/3.1.0/bin/minitest-queue:25:in `<main>'
/tmp/bundle/ruby/3.1.0/bin/minitest-queue:25:in `load'
/tmp/bundle/ruby/3.1.0/gems/ci-queue-0.22.0/exe/minitest-queue:5:in `<top (required)>'
/tmp/bundle/ruby/3.1.0/gems/ci-queue-0.22.0/lib/minitest/queue/runner.rb:19:in `invoke'
/tmp/bundle/ruby/3.1.0/gems/ci-queue-0.22.0/lib/minitest/queue/runner.rb:35:in `run!'
/tmp/bundle/ruby/3.1.0/gems/ci-queue-0.22.0/lib/minitest/queue/runner.rb:35:in `public_send'
/tmp/bundle/ruby/3.1.0/gems/ci-queue-0.22.0/lib/minitest/queue/runner.rb:78:in `run_command'
/tmp/bundle/ruby/3.1.0/gems/ci-queue-0.22.0/lib/minitest/queue/runner.rb:286:in `load_tests'
/tmp/bundle/ruby/3.1.0/gems/ci-queue-0.22.0/lib/minitest/queue/runner.rb:286:in `each'
/tmp/bundle/ruby/3.1.0/gems/ci-queue-0.22.0/lib/minitest/queue/runner.rb:287:in `block in load_tests'
/tmp/bundle/ruby/3.1.0/gems/ci-queue-0.22.0/lib/minitest/queue/runner.rb:287:in `require'
/app/test/load_selected_tests.rb:28:in `<top (required)>'
/app/test/load_selected_tests.rb:10:in `require_tests'
/app/test/load_selected_tests.rb:10:in `each'
/app/test/load_selected_tests.rb:11:in `block in require_tests'
/tmp/bundle/ruby/3.1.0/gems/zeitwerk-2.5.3/lib/zeitwerk/kernel.rb:35:in `require'
/tmp/bundle/ruby/3.1.0/gems/bootsnap-1.10.1/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:28:in `require'
/tmp/bundle/ruby/3.1.0/gems/bootsnap-1.10.1/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:28:in `require'
/tmp/bundle/ruby/3.1.0/gems/bootsnap-1.10.1/lib/bootsnap/compile_cache/iseq.rb:85:in `load_iseq'
/tmp/bundle/ruby/3.1.0/gems/bootsnap-1.10.1/lib/bootsnap/compile_cache/iseq.rb:60:in `fetch'
/tmp/bundle/ruby/3.1.0/gems/bootsnap-1.10.1/lib/bootsnap/compile_cache/iseq.rb:60:in `fetch'
/tmp/bundle/ruby/3.1.0/gems/bootsnap-1.10.1/lib/bootsnap/compile_cache/iseq.rb:49:in `storage_to_output'
/tmp/bundle/ruby/3.1.0/gems/bootsnap-1.10.1/lib/bootsnap/compile_cache/iseq.rb:49:in `load_from_binary'
-- C level backtrace information -------------------------------------------
/usr/local/ruby/bin/ruby(rb_print_backtrace+0x11) [0x555912686188] vm_dump.c:759
/usr/local/ruby/bin/ruby(rb_vm_bugreport) vm_dump.c:1045
/usr/local/ruby/bin/ruby(bug_report_end+0x0) [0x555912732edf] error.c:797
/usr/local/ruby/bin/ruby(rb_bug_without_die) error.c:797
/usr/local/ruby/bin/ruby(die+0x0) [0x5559124a492c] error.c:805
/usr/local/ruby/bin/ruby(rb_bug) error.c:807
/usr/local/ruby/bin/ruby(gc_mark_ptr+0x138) [0x5559124c6468] gc.c:6710
/usr/local/ruby/bin/ruby(gc_mark+0xb) [0x5559124c6b9b] gc.c:6743
/usr/local/ruby/bin/ruby(mark_keyvalue) gc.c:6303
/usr/local/ruby/bin/ruby(apply_functor+0x13) [0x5559125e7516] st.c:1570
/usr/local/ruby/bin/ruby(st_general_foreach) st.c:1480
/usr/local/ruby/bin/ruby(rb_st_foreach) st.c:1577
/usr/local/ruby/bin/ruby(mark_hash+0xf) [0x5559124c81d7] gc.c:6335
/usr/local/ruby/bin/ruby(gc_mark_children) gc.c:6947
/usr/local/ruby/bin/ruby(gc_marks_wb_unprotected_objects_plane+0x1e) [0x5559124c997e] gc.c:7864
/usr/local/ruby/bin/ruby(rgengc_rememberset_mark_plane) gc.c:7856
/usr/local/ruby/bin/ruby(rgengc_rememberset_mark) gc.c:8319
/usr/local/ruby/bin/ruby(gc_marks_start) gc.c:7844
/usr/local/ruby/bin/ruby(gc_marks) gc.c:8145
/usr/local/ruby/bin/ruby(gc_start) gc.c:8963
/usr/local/ruby/bin/ruby(rb_multi_ractor_p+0x0) [0x5559124ce372] gc.c:8849
/usr/local/ruby/bin/ruby(rb_vm_lock_leave) vm_sync.h:92
/usr/local/ruby/bin/ruby(garbage_collect) gc.c:8851
/usr/local/ruby/bin/ruby(garbage_collect_with_gvl) gc.c:9221
/usr/local/ruby/bin/ruby(objspace_malloc_increase_body) gc.c:11294
/usr/local/ruby/bin/ruby(objspace_malloc_increase_body) gc.c:11272
/usr/local/ruby/bin/ruby(objspace_malloc_fixup) gc.c:11372
/usr/local/ruby/bin/ruby(objspace_xmalloc0) gc.c:11443
/usr/local/ruby/bin/ruby(rb_st_init_table_with_size+0x79) [0x5559125e4f59] st.c:551
/usr/local/ruby/bin/ruby(rebuild_table+0x1e7) [0x5559125e51f7] st.c:727
/usr/local/ruby/bin/ruby(rebuild_table_if_necessary+0xc) [0x5559125e6f14] st.c:1071
/usr/local/ruby/bin/ruby(st_add_direct_with_hash) st.c:1131
/usr/local/ruby/bin/ruby(rb_st_update) st.c:1428
/usr/local/ruby/bin/ruby(register_fstring+0x34) [0x55591260af6e] string.c:450
/usr/local/ruby/bin/ruby(rb_enc_interned_str) string.c:11960
/usr/local/ruby/bin/ruby(ibf_load_object+0xca) [0x5559126cb7aa] compile.c:12567
/usr/local/ruby/bin/ruby(ibf_load_code+0x427) [0x5559126f03e7] compile.c:12541
/usr/local/ruby/bin/ruby(ibf_load_iseq_each+0xca1) [0x5559126ec99b] compile.c:11866
/usr/local/ruby/bin/ruby(rb_ibf_load_iseq_complete) compile.c:12748
/usr/local/ruby/bin/ruby(ibf_load_iseq+0xf2) [0x5559126eff82] compile.c:12803
/usr/local/ruby/bin/ruby(ibf_load_code+0x478) [0x5559126f0438] compile.c:11166
/usr/local/ruby/bin/ruby(ibf_load_iseq_each+0xca1) [0x5559126ec99b] compile.c:11866
/usr/local/ruby/bin/ruby(rb_ibf_load_iseq_complete) compile.c:12748
/usr/local/ruby/bin/ruby(ibf_load_iseq+0xf2) [0x5559126eff82] compile.c:12803
/usr/local/ruby/bin/ruby(rb_iseq_ibf_load+0xb3) [0x5559126f1393] compile.c:12909
/usr/local/ruby/bin/ruby(iseqw_s_load_from_binary+0x10) [0x555912506dc0] iseq.c:3600
/usr/local/ruby/bin/ruby(vm_cfp_consistent_p+0x0) [0x555912666909] vm_insnhelper.c:3037
/usr/local/ruby/bin/ruby(vm_call_cfunc_with_frame) vm_insnhelper.c:3039
/usr/local/ruby/bin/ruby(vm_sendish+0xe) [0x555912677a93] vm_insnhelper.c:4751
/usr/local/ruby/bin/ruby(vm_exec_core) insns.def:778
/usr/local/ruby/bin/ruby(rb_vm_exec+0xb6) [0x555912668f96] vm.c:2211
/usr/local/ruby/bin/ruby(vm_call0_body+0x4db) [0x555912670e7b] vm_eval.c:205
/usr/local/ruby/bin/ruby(rb_funcallv_scope+0x1aa) [0x555912674caa] vm_eval.c:86
/tmp/bundle/ruby/3.1.0/gems/bootsnap-1.10.1/lib/bootsnap/bootsnap.so(try_storage_to_output+0x6c) [0x7fc01c93fc4c] bootsnap.c:943
/usr/local/ruby/bin/ruby(rb_protect+0xfc) [0x5559124aaeec] eval.c:967
/tmp/bundle/ruby/3.1.0/gems/bootsnap-1.10.1/lib/bootsnap/bootsnap.so(fetch_cached_data+0x48) [0x7fc01c9407e0] bootsnap.c:955
/tmp/bundle/ruby/3.1.0/gems/bootsnap-1.10.1/lib/bootsnap/bootsnap.so(bs_fetch) bootsnap.c:737
/tmp/bundle/ruby/3.1.0/gems/bootsnap-1.10.1/lib/bootsnap/bootsnap.so(bs_rb_fetch) bootsnap.c:359
/usr/local/ruby/bin/ruby(vm_cfp_consistent_p+0x0) [0x555912666909] vm_insnhelper.c:3037
/usr/local/ruby/bin/ruby(vm_call_cfunc_with_frame) vm_insnhelper.c:3039
/usr/local/ruby/bin/ruby(vm_sendish+0xe) [0x555912677a93] vm_insnhelper.c:4751
/usr/local/ruby/bin/ruby(vm_exec_core) insns.def:778
/usr/local/ruby/bin/ruby(rb_vm_exec+0xb6) [0x555912668f96] vm.c:2211
/usr/local/ruby/bin/ruby(vm_call0_body+0x4db) [0x555912670e7b] vm_eval.c:205
/usr/local/ruby/bin/ruby(rb_vm_call0+0xc7) [0x5559126716d7] vm_eval.c:57
/usr/local/ruby/bin/ruby(rb_vm_call_kw+0x1e) [0x555912672a1c] vm_eval.c:302
/usr/local/ruby/bin/ruby(rb_check_funcall_default_kw) vm_eval.c:690
/usr/local/ruby/bin/ruby(RB_IMMEDIATE_P+0x0) [0x55591250d79a] iseq.c:956
/usr/local/ruby/bin/ruby(RB_SPECIAL_CONST_P) ./include/ruby/internal/special_consts.h:262
/usr/local/ruby/bin/ruby(rb_iseq_load_iseq) iseq.c:958
/usr/local/ruby/bin/ruby(load_iseq_eval+0xa) [0x5559125152c6] load.c:644
/usr/local/ruby/bin/ruby(require_internal) load.c:1132
/usr/local/ruby/bin/ruby(rb_require_string+0x2c) [0x5559125154b9] load.c:1223
/usr/local/ruby/bin/ruby(rb_f_require) load.c:904
/usr/local/ruby/bin/ruby(vm_cfp_consistent_p+0x0) [0x555912666909] vm_insnhelper.c:3037
/usr/local/ruby/bin/ruby(vm_call_cfunc_with_frame) vm_insnhelper.c:3039
/usr/local/ruby/bin/ruby(vm_call_method_each_type+0x79) [0x55591266a0d9] vm_insnhelper.c:3639
/usr/local/ruby/bin/ruby(vm_call_alias+0x87) [0x55591266b987] vm_insnhelper.c:3189
/usr/local/ruby/bin/ruby(vm_sendish+0xe) [0x555912677a93] vm_insnhelper.c:4751
/usr/local/ruby/bin/ruby(vm_exec_core) insns.def:778
/usr/local/ruby/bin/ruby(rb_vm_exec+0xb6) [0x555912668f96] vm.c:2211
/usr/local/ruby/bin/ruby(rb_yield+0x256) [0x55591266e346] vm.c:1316
/usr/local/ruby/bin/ruby(RB_FL_TEST_RAW+0x0) [0x55591269630c] array.c:2522
/usr/local/ruby/bin/ruby(RB_FL_ANY_RAW) ./include/ruby/internal/fl_type.h:558
/usr/local/ruby/bin/ruby(rb_array_len) ./include/ruby/internal/core/rarray.h:302
/usr/local/ruby/bin/ruby(rb_ary_each) array.c:2521
/usr/local/ruby/bin/ruby(vm_cfp_consistent_p+0x0) [0x555912666909] vm_insnhelper.c:3037
/usr/local/ruby/bin/ruby(vm_call_cfunc_with_frame) vm_insnhelper.c:3039
/usr/local/ruby/bin/ruby(vm_sendish+0xc) [0x555912677b9d] vm_insnhelper.c:4751
/usr/local/ruby/bin/ruby(vm_exec_core) insns.def:759
/usr/local/ruby/bin/ruby(rb_vm_exec+0xb6) [0x555912668f96] vm.c:2211
/usr/local/ruby/bin/ruby(load_iseq_eval+0xa) [0x5559125152fa] load.c:656
/usr/local/ruby/bin/ruby(require_internal) load.c:1132
/usr/local/ruby/bin/ruby(rb_require_string+0x2c) [0x5559125154b9] load.c:1223
/usr/local/ruby/bin/ruby(rb_f_require) load.c:904
/usr/local/ruby/bin/ruby(vm_cfp_consistent_p+0x0) [0x555912666909] vm_insnhelper.c:3037
/usr/local/ruby/bin/ruby(vm_call_cfunc_with_frame) vm_insnhelper.c:3039
/usr/local/ruby/bin/ruby(vm_call_method_each_type+0x79) [0x55591266a0d9] vm_insnhelper.c:3639
/usr/local/ruby/bin/ruby(vm_call_alias+0x87) [0x55591266b987] vm_insnhelper.c:3189
/usr/local/ruby/bin/ruby(vm_call_method_each_type+0x269) [0x55591266a2c9] vm_insnhelper.c:3675
/usr/local/ruby/bin/ruby(vm_call_method+0xb4) [0x55591266a9a4] vm_insnhelper.c:3750
/usr/local/ruby/bin/ruby(vm_sendish+0xe) [0x555912677a93] vm_insnhelper.c:4751
/usr/local/ruby/bin/ruby(vm_exec_core) insns.def:778
/usr/local/ruby/bin/ruby(rb_vm_exec+0xb6) [0x555912668f96] vm.c:2211
/usr/local/ruby/bin/ruby(rb_yield+0x256) [0x55591266e346] vm.c:1316
/usr/local/ruby/bin/ruby(RB_FL_TEST_RAW+0x0) [0x55591269630c] array.c:2522
/usr/local/ruby/bin/ruby(RB_FL_ANY_RAW) ./include/ruby/internal/fl_type.h:558
/usr/local/ruby/bin/ruby(rb_array_len) ./include/ruby/internal/core/rarray.h:302
/usr/local/ruby/bin/ruby(rb_ary_each) array.c:2521
/usr/local/ruby/bin/ruby(vm_cfp_consistent_p+0x0) [0x555912666909] vm_insnhelper.c:3037
/usr/local/ruby/bin/ruby(vm_call_cfunc_with_frame) vm_insnhelper.c:3039
/usr/local/ruby/bin/ruby(vm_sendish+0xc) [0x555912677b9d] vm_insnhelper.c:4751
/usr/local/ruby/bin/ruby(vm_exec_core) insns.def:759
/usr/local/ruby/bin/ruby(rb_vm_exec+0xb6) [0x555912668f96] vm.c:2211
/usr/local/ruby/bin/ruby(vm_call0_body+0x4db) [0x555912670e7b] vm_eval.c:205
/usr/local/ruby/bin/ruby(rb_call0+0x20d) [0x555912673bad] vm_eval.c:86
/usr/local/ruby/bin/ruby(send_internal+0xef) [0x5559126742af] vm_eval.c:1261
/usr/local/ruby/bin/ruby(vm_cfp_consistent_p+0x0) [0x555912666909] vm_insnhelper.c:3037
/usr/local/ruby/bin/ruby(vm_call_cfunc_with_frame) vm_insnhelper.c:3039
/usr/local/ruby/bin/ruby(vm_call_method_each_type+0x79) [0x55591266a0d9] vm_insnhelper.c:3639
/usr/local/ruby/bin/ruby(vm_call_method+0xb4) [0x55591266a9a4] vm_insnhelper.c:3750
/usr/local/ruby/bin/ruby(vm_sendish+0xe) [0x555912677a93] vm_insnhelper.c:4751
/usr/local/ruby/bin/ruby(vm_exec_core) insns.def:778
/usr/local/ruby/bin/ruby(rb_vm_exec+0xb6) [0x555912668f96] vm.c:2211
/usr/local/ruby/bin/ruby(raise_load_if_failed+0x0) [0x5559125140cc] load.c:656
/usr/local/ruby/bin/ruby(rb_load_internal) load.c:719
/usr/local/ruby/bin/ruby(rb_f_load+0xb1) [0x555912514271] load.c:792
/usr/local/ruby/bin/ruby(vm_cfp_consistent_p+0x0) [0x555912666909] vm_insnhelper.c:3037
/usr/local/ruby/bin/ruby(vm_call_cfunc_with_frame) vm_insnhelper.c:3039
/usr/local/ruby/bin/ruby(vm_call_method_each_type+0x79) [0x55591266a0d9] vm_insnhelper.c:3639
/usr/local/ruby/bin/ruby(vm_call_method+0xb4) [0x55591266a9a4] vm_insnhelper.c:3750
/usr/local/ruby/bin/ruby(vm_sendish+0xe) [0x555912677a93] vm_insnhelper.c:4751
/usr/local/ruby/bin/ruby(vm_exec_core) insns.def:778
/usr/local/ruby/bin/ruby(rb_vm_exec+0xb6) [0x555912668f96] vm.c:2211
/usr/local/ruby/bin/ruby(rb_ec_exec_node+0xbb) [0x5559124a4efb] eval.c:280
/usr/local/ruby/bin/ruby(ruby_run_node+0x4f) [0x5559124aa13f] eval.c:321
/usr/local/ruby/bin/ruby(main+0x5f) [0x5559124a4c7f] error.c:3180
From my limited understanding it seems to happen if GC triggers at a very specific point.
Updated by tenderlovemaking (Aaron Patterson) over 2 years ago
Interesting. Are you able to get a core file? I can poke around that code, but a core file would be very helpful.
Thanks!
Updated by byroot (Jean Boussier) over 2 years ago
I sent the core files to Aaron privately.
Updated by tenderlovemaking (Aaron Patterson) over 2 years ago
Hash is writing T_NONE references
I think it's possible that T_NONE objects are being written in to a hash. Basically we dup the string key, and if the hash needs to expand, it does so before actually inserting the key. The expansion of the hash causes a malloc which kicks the GC, and the compiler has optimized the code such that the GC doesn't see a reference to the object in the stack or a register.
The GC ends up collecting the string, then the Hash writes a T_NONE to the underlying table.
Let's follow the order of events for code like this:
hash["foo"] = "bar"
1. Ruby -> rb_hash_aset is called
This function checks if the key is a string, and if so it does something special. Namely, it calls RHASH_UPDATE_ITER with a special callback, hash_aset_str. RHASH_UPDATE_ITER is just a wrapper for tbl_update.
2. rb_hash_aset -> tbl_update
This function just calls in to rb_hash_stlike_update along with a struct. The func member of the struct is our callback hash_aset_str.
3. tbl_update -> rb_hash_stlike_update
This function just checks if we have an AR table or not. In this case, we have an AR table, so we just call ar_update. arg is passed to ar_update and it has the hash_aset_str function pointer. We also pass tbl_update_modify as a callback.
4. rb_hash_stlike_update -> ar_update
ar_update does some checks, then eventually calls tbl_update_modify as a callback, passing arg.
5. ar_update -> tbl_update_modify
tbl_update_modify finally calls hash_aset_str as a callback.
6. tbl_update_modify -> hash_aset_str
hash_aset_str checks the key. The key doesn't exist (it's new), but it's also not frozen. So we allocate a new frozen string object and assign that to *key so that the caller can read it.
Now we return up the stack. hash_aset_str returns control to tbl_update_modify.
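The key duplication described in step 6 is visible from plain Ruby. The snippet below is only an illustration of that behaviour and is not taken from the thread; the variable names are made up for the example.

```ruby
# An unfrozen String used as a Hash key is not stored directly:
# the Hash stores a frozen copy instead.
key = +"foo"          # unary plus guarantees an unfrozen String
hash = {}
hash[key] = "bar"

stored = hash.keys.first
puts stored.frozen?        # true:  the stored key is frozen
puts stored.equal?(key)    # false: it is a different object than the caller's String
puts key.frozen?           # false: the caller's String is left untouched

# A key that is already frozen does not need a copy and is stored as-is.
frozen_key = "baz".freeze
hash[frozen_key] = "qux"
puts hash.keys.last.equal?(frozen_key)  # true
```

The freshly allocated copy is the object at risk here: until it is actually written into the table, nothing but the C stack or a register keeps it alive.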
7. tbl_update_modify
tbl_update_modify executes a write barrier on the key and value.
This is odd because the key and value haven't been written yet. Neither are reachable via the hash yet. Since we have allocated a new string, the string's liveness depends on being found in the C stack or a register. It's not referenced from the hash yet.
I think it's OK if the write barrier happens here, but it is odd that we execute the barrier but the reference isn't "real" yet.
tbl_update_modify returns control to ar_update.
8. ar_update
ar_update finally adds the reference to the hash.
I think this is where the bug is. Adding a reference to the hash could cause the hash to expand which would cause xmalloc to execute, and this could cause GC to run. The compiler could have optimized this code in such a way that the reference to the key is not on the stack and no longer in a register.
Updated by byroot (Jean Boussier) over 2 years ago
Thanks @tenderlove, your explanation makes sense to me.
From your description we should be able to reproduce with GC.stress = true, no?
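A minimal sketch of what such an attempt could look like. This is an assumption about how one might probe the issue; the thread does not confirm that it reproduces the crash, and the method name and parameters are hypothetical. It inserts fresh, unfrozen String keys under GC.stress so that every insertion dups the key and the hash has to grow several times.

```ruby
# Hypothetical stress harness (not from the thread).
# Each insert uses a new unfrozen String key, so the Hash must dup it,
# and each hash grows past its initial capacity while GC.stress is on.
def stress_hash_inserts(rounds: 20, keys_per_hash: 12)
  previous = GC.stress
  GC.stress = true
  rounds.times.map do |r|
    h = {}
    keys_per_hash.times do |i|
      h["key-#{r}-#{i}"] = "value"
    end
    h
  end
ensure
  GC.stress = previous   # always restore the previous setting
end

hashes = stress_hash_inserts
ok = hashes.all? do |h|
  h.size == 12 && h.keys.all? { |k| k.is_a?(String) && k.frozen? }
end
puts ok
```

On an affected build a failure would more likely show up as a [BUG] abort than as a false result, since the interpreter crashes when it tries to mark the collected key.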
Updated by byroot (Jean Boussier) over 2 years ago
- Backport changed from 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN to 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: REQUIRED
@tenderlovemaking (Aaron Patterson) implemented a patch for this: https://github.com/ruby/ruby/pull/5525. We deployed it on our infra and so far the bug is gone. I'd like to let it run for a few more days to be fully certain though.
@tenderlovemaking (Aaron Patterson), do you think this bug might have been present on any older version?
Updated by tenderlovemaking (Aaron Patterson) over 2 years ago
- Status changed from Open to Closed
Applied in changeset git|2a76440fac62bb0f6e53ccada07caf4b47b78cf9.
[Bug #18501] Fire write barrier after hash has been written
Before this change the write barrier was executed before the key and
value were actually reachable via the Hash. This could cause
inconsistencies in object coloration which would lead to accidental
collection of dup'd keys.
Example:
- Object O is grey, Object P is white.
- Write barrier fires O -> P
- Write barrier does nothing
- Malloc happens, which starts GC
- GC colors O black
- P is written in to O (now we have O -> P reference)
- P is now accidentally treated as garbage
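The example above can be replayed with a toy tri-color marker. This is a simplified model written for illustration, not Ruby's actual GC: the ToyHeap class, its methods, and the simulate helper are all hypothetical, and the barrier only models the "black parent gains a white child" case.

```ruby
# Toy incremental tri-color marker (illustrative only, not CRuby's GC).
class ToyHeap
  Obj = Struct.new(:name, :color, :refs)

  def initialize
    @grey = []   # objects waiting to be scanned
  end

  def new_obj(name)
    Obj.new(name, :white, [])
  end

  def mark_grey(obj)
    return unless obj.color == :white
    obj.color = :grey
    @grey << obj
  end

  # The barrier only acts when a black parent points at a white child.
  def write_barrier(parent, child)
    mark_grey(child) if parent.color == :black && child.color == :white
  end

  # One incremental step: scan one grey object and turn it black.
  def mark_step
    obj = @grey.pop
    return if obj.nil?
    obj.refs.each { |child| mark_grey(child) }
    obj.color = :black
  end

  def finish_marking
    mark_step until @grey.empty?
  end
end

# Returns P's final color; :white means P would be swept as garbage.
def simulate(barrier_first:)
  heap = ToyHeap.new
  obj_o = heap.new_obj("O")
  obj_p = heap.new_obj("P")
  heap.mark_grey(obj_o)                            # O is grey, P is white
  heap.write_barrier(obj_o, obj_p) if barrier_first
  heap.mark_step                                   # malloc starts GC: O turns black
  obj_o.refs << obj_p                              # the actual O -> P write
  heap.write_barrier(obj_o, obj_p) unless barrier_first
  heap.finish_marking
  obj_p.color
end

puts simulate(barrier_first: true)    # white: barrier fired too early, P is lost
puts simulate(barrier_first: false)   # black: barrier after the write keeps P alive
```

In this model the early barrier sees a grey parent and does nothing, which matches the "Write barrier does nothing" step; firing it after the write is what lets the marker reach P.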
Updated by byroot (Jean Boussier) over 2 years ago
- Backport changed from 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: REQUIRED to 2.6: DONTNEED, 2.7: DONTNEED, 3.0: DONTNEED, 3.1: REQUIRED
> I'd like to let it run for a few more days to be fully certain though.
We ran it long enough and saw no more crashes, so we're quite confident the patches work.
I'm marking 3.0 and older as DONTNEED because we never noticed that crash when we were running those versions.
Updated by naruse (Yui NARUSE) over 2 years ago
- Backport changed from 2.6: DONTNEED, 2.7: DONTNEED, 3.0: DONTNEED, 3.1: REQUIRED to 2.6: DONTNEED, 2.7: DONTNEED, 3.0: DONTNEED, 3.1: DONE
ruby_3_1 86c8e15170484fe23b311e567717053f147ffd9c merged revision(s) 2a76440fac62b.