Project

General

Profile

Bug #10768

segfault during ruby_vm_destruct() in cont_free()

Added by tmm1 (Aman Gupta) almost 5 years ago. Updated over 4 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:67734]

Description

(gdb) where
#0  rb_vm_bugreport () at vm_dump.c:738
#1  0x00007ff4f279de2c in report_bug (file=<optimized out>, line=<optimized out>, fmt=0x7ff4f27cfce7 "Segmentation fault at %p", args=0x7ff4f4afd998) at error.c:312
#2  0x00007ff4f279f747 in rb_bug (fmt=0x7ff4f27cfce7 "Segmentation fault at %p") at error.c:339
#3  0x00007ff4f26c0057 in sigsegv (sig=<optimized out>, info=<optimized out>, ctx=<optimized out>) at signal.c:812
#4  <signal handler called>
#5  0x00007ff4f274eee6 in cont_free (ptr=0x7ff4f96fd200) at cont.c:244
#6  0x00007ff4f261af6a in obj_free (obj=140690513703000, objspace=0x7ff4f4aee000) at gc.c:1619
#7  gc_page_sweep (sweep_page=0x7ff50c621100, heap=0x7ff4f4aee010, objspace=0x7ff4f4aee000) at gc.c:2787
#8  gc_heap_lazy_sweep (objspace=0x7ff4f4aee000, heap=0x7ff4f4aee010) at gc.c:3058
#9  0x00007ff4f261b4e3 in gc_heap_rest_sweep (heap=0x7ff4f4aee010, objspace=0x7ff4f4aee000) at gc.c:3083
#10 gc_rest_sweep (objspace=0x7ff4f4aee000) at gc.c:3093
#11 rb_objspace_free (objspace=0x7ff4f4aee000) at gc.c:923
#12 0x00007ff4f273be71 in ruby_vm_destruct (vm=0x7ff4f4af4000) at vm.c:1840
#13 0x00007ff4f26018d7 in ruby_cleanup (ex=0) at eval.c:236
#14 0x00007ff4f2601c0d in ruby_run_node (n=<optimized out>) at eval.c:310
#15 0x00007ff4f25fe35b in main (argc=12, argv=0x7fff35737458) at main.c:36
(gdb) frame 5
#5  0x00007ff4f274eee6 in cont_free (ptr=0x7ff4f96fd200) at cont.c:244
244         if (GET_THREAD()->fiber != cont->self) {
(gdb) disas
Dump of assembler code for function cont_free:
   0x00007ff4f274eea0 <+0>: test   %rdi,%rdi
   0x00007ff4f274eea3 <+3>: push   %rbx
   0x00007ff4f274eea4 <+4>: mov    %rdi,%rbx
   0x00007ff4f274eea7 <+7>: je     0x7ff4f274ef58 <cont_free+184>
   0x00007ff4f274eead <+13>:    mov    0x60(%rdi),%rdi
   0x00007ff4f274eeb1 <+17>:    test   %rdi,%rdi
   0x00007ff4f274eeb4 <+20>:    je     0x7ff4f274eec3 <cont_free+35>
   0x00007ff4f274eeb6 <+22>:    callq  0x7ff4f261cef0 <ruby_xfree>
   0x00007ff4f274eebb <+27>:    movq   $0x0,0x60(%rbx)
   0x00007ff4f274eec3 <+35>:    mov    0x3100de(%rip),%rax        # 0x7ff4f2a5efa8
   0x00007ff4f274eeca <+42>:    mov    (%rax),%rdi
   0x00007ff4f274eecd <+45>:    callq  0x7ff4f25fcc80 <fflush@plt>
   0x00007ff4f274eed2 <+50>:    mov    (%rbx),%eax
   0x00007ff4f274eed4 <+52>:    test   %eax,%eax
   0x00007ff4f274eed6 <+54>:    je     0x7ff4f274ef30 <cont_free+144>
   0x00007ff4f274eed8 <+56>:    mov    0x30fec1(%rip),%rdx        # 0x7ff4f2a5eda0
   0x00007ff4f274eedf <+63>:    mov    0x8(%rbx),%rcx
   0x00007ff4f274eee3 <+67>:    mov    (%rdx),%rdx
=> 0x00007ff4f274eee6 <+70>:    cmp    %rcx,0x2f0(%rdx)
   0x00007ff4f274eeed <+77>:    je     0x7ff4f274ef0c <cont_free+108>
   0x00007ff4f274eeef <+79>:    mov    0x548(%rbx),%rdi
   0x00007ff4f274eef6 <+86>:    test   %rdi,%rdi

(gdb) info registers
rax            0x1  1
rbx            0x7ff4f96fd200   140690133602816
rcx            0x7ff51017b058   140690513703000
rdx            0x0  0

It appears GET_THREAD() is returning a NULL pointer.

Does the following patch make sense?

diff --git a/cont.c b/cont.c
index 78ae089..a94a408 100644
--- a/cont.c
+++ b/cont.c
@@ -236,8 +236,9 @@ cont_free(void *ptr)
    else {
        /* fiber */
        rb_fiber_t *fib = (rb_fiber_t*)cont;
+       rb_thread_t *th = GET_THREAD();
 #ifdef _WIN32
-       if (GET_THREAD()->fiber != fib && cont->type != ROOT_FIBER_CONTEXT) {
+       if (th && th->fiber != fib && cont->type != ROOT_FIBER_CONTEXT) {
        /* don't delete root fiber handle */
        rb_fiber_t *fib = (rb_fiber_t*)cont;
        if (fib->fib_handle) {
@@ -245,7 +246,7 @@ cont_free(void *ptr)
        }
        }
 #else /* not WIN32 */
-       if (GET_THREAD()->fiber != fib) {
+       if (th && th->fiber != fib) {
                 rb_fiber_t *fib = (rb_fiber_t*)cont;
                 if (fib->ss_sp) {
                     if (cont->type == ROOT_FIBER_CONTEXT) {

Associated revisions

Revision 829fcdb2
Added by tmm1 (Aman Gupta) almost 5 years ago

gc.c: ensure GC state is consistent during VM shutdown

  • gc.c (rb_objspace_free): cause rb_bug if lazy sweep is in progress during rb_objspace_free. Adds extra protection for r46340. Patch by Vicent Marti. [Bug #10768] [ruby-core:67734]
  • gc.c (rb_objspace_call_finalizer): Ensure GC is completed after finalizers have run. We already call gc_rest() before invoking finalizers, but finalizer can allocate new objects and start new GC cycle, so we call gc_rest() again after finalizers are complete.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@49474 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 49474
Added by tmm1 (Aman Gupta) almost 5 years ago

gc.c: ensure GC state is consistent during VM shutdown

  • gc.c (rb_objspace_free): cause rb_bug if lazy sweep is in progress during rb_objspace_free. Adds extra protection for r46340. Patch by Vicent Marti. [Bug #10768] [ruby-core:67734]
  • gc.c (rb_objspace_call_finalizer): Ensure GC is completed after finalizers have run. We already call gc_rest() before invoking finalizers, but finalizer can allocate new objects and start new GC cycle, so we call gc_rest() again after finalizers are complete.

Revision 49474
Added by tmm1 (Aman Gupta) almost 5 years ago

gc.c: ensure GC state is consistent during VM shutdown

  • gc.c (rb_objspace_free): cause rb_bug if lazy sweep is in progress during rb_objspace_free. Adds extra protection for r46340. Patch by Vicent Marti. [Bug #10768] [ruby-core:67734]
  • gc.c (rb_objspace_call_finalizer): Ensure GC is completed after finalizers have run. We already call gc_rest() before invoking finalizers, but finalizer can allocate new objects and start new GC cycle, so we call gc_rest() again after finalizers are complete.

Revision 49474
Added by tmm1 (Aman Gupta) almost 5 years ago

gc.c: ensure GC state is consistent during VM shutdown

  • gc.c (rb_objspace_free): cause rb_bug if lazy sweep is in progress during rb_objspace_free. Adds extra protection for r46340. Patch by Vicent Marti. [Bug #10768] [ruby-core:67734]
  • gc.c (rb_objspace_call_finalizer): Ensure GC is completed after finalizers have run. We already call gc_rest() before invoking finalizers, but finalizer can allocate new objects and start new GC cycle, so we call gc_rest() again after finalizers are complete.

Revision 49474
Added by tmm1 (Aman Gupta) almost 5 years ago

gc.c: ensure GC state is consistent during VM shutdown

  • gc.c (rb_objspace_free): cause rb_bug if lazy sweep is in progress during rb_objspace_free. Adds extra protection for r46340. Patch by Vicent Marti. [Bug #10768] [ruby-core:67734]
  • gc.c (rb_objspace_call_finalizer): Ensure GC is completed after finalizers have run. We already call gc_rest() before invoking finalizers, but finalizer can allocate new objects and start new GC cycle, so we call gc_rest() again after finalizers are complete.

Revision 49474
Added by tmm1 (Aman Gupta) almost 5 years ago

gc.c: ensure GC state is consistent during VM shutdown

  • gc.c (rb_objspace_free): cause rb_bug if lazy sweep is in progress during rb_objspace_free. Adds extra protection for r46340. Patch by Vicent Marti. [Bug #10768] [ruby-core:67734]
  • gc.c (rb_objspace_call_finalizer): Ensure GC is completed after finalizers have run. We already call gc_rest() before invoking finalizers, but finalizer can allocate new objects and start new GC cycle, so we call gc_rest() again after finalizers are complete.

Revision 513fefdd
Added by ko1 (Koichi Sasada) over 4 years ago

  • gc.c (rb_objspace_call_finalizer): control GC execution during force firnalizations at the end of interpreter process. [Bug #10768] 1) Prohibit incremental GC while running Ruby-level finalizers to avoid any danger. 2) Prohibit GC while invoking T_DATA/T_FILE data structure because these operations break object relations consistency. This patch can introduce another memory consuming issue because Ruby-level finalizers can run after (2), GC is disabled. However, basically object consistency was broken at (2) as I described above. So that running Ruby-level finalizers contains danger originally. Because of this point, I need to suggest to remove these 3 lines (invoking remaining finalizers). And add a rule to add that finalizers should not add new finalizers, or say there is no guarantee to invoke finalizers that added by another finalizer.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@49684 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 49684
Added by ko1 (Koichi Sasada) over 4 years ago

  • gc.c (rb_objspace_call_finalizer): control GC execution during force firnalizations at the end of interpreter process. [Bug #10768] 1) Prohibit incremental GC while running Ruby-level finalizers to avoid any danger. 2) Prohibit GC while invoking T_DATA/T_FILE data structure because these operations break object relations consistency. This patch can introduce another memory consuming issue because Ruby-level finalizers can run after (2), GC is disabled. However, basically object consistency was broken at (2) as I described above. So that running Ruby-level finalizers contains danger originally. Because of this point, I need to suggest to remove these 3 lines (invoking remaining finalizers). And add a rule to add that finalizers should not add new finalizers, or say there is no guarantee to invoke finalizers that added by another finalizer.

Revision 49684
Added by ko1 (Koichi Sasada) over 4 years ago

  • gc.c (rb_objspace_call_finalizer): control GC execution during force firnalizations at the end of interpreter process. [Bug #10768] 1) Prohibit incremental GC while running Ruby-level finalizers to avoid any danger. 2) Prohibit GC while invoking T_DATA/T_FILE data structure because these operations break object relations consistency. This patch can introduce another memory consuming issue because Ruby-level finalizers can run after (2), GC is disabled. However, basically object consistency was broken at (2) as I described above. So that running Ruby-level finalizers contains danger originally. Because of this point, I need to suggest to remove these 3 lines (invoking remaining finalizers). And add a rule to add that finalizers should not add new finalizers, or say there is no guarantee to invoke finalizers that added by another finalizer.

Revision 49684
Added by ko1 (Koichi Sasada) over 4 years ago

  • gc.c (rb_objspace_call_finalizer): control GC execution during force firnalizations at the end of interpreter process. [Bug #10768] 1) Prohibit incremental GC while running Ruby-level finalizers to avoid any danger. 2) Prohibit GC while invoking T_DATA/T_FILE data structure because these operations break object relations consistency. This patch can introduce another memory consuming issue because Ruby-level finalizers can run after (2), GC is disabled. However, basically object consistency was broken at (2) as I described above. So that running Ruby-level finalizers contains danger originally. Because of this point, I need to suggest to remove these 3 lines (invoking remaining finalizers). And add a rule to add that finalizers should not add new finalizers, or say there is no guarantee to invoke finalizers that added by another finalizer.

Revision 49684
Added by ko1 (Koichi Sasada) over 4 years ago

  • gc.c (rb_objspace_call_finalizer): control GC execution during force firnalizations at the end of interpreter process. [Bug #10768] 1) Prohibit incremental GC while running Ruby-level finalizers to avoid any danger. 2) Prohibit GC while invoking T_DATA/T_FILE data structure because these operations break object relations consistency. This patch can introduce another memory consuming issue because Ruby-level finalizers can run after (2), GC is disabled. However, basically object consistency was broken at (2) as I described above. So that running Ruby-level finalizers contains danger originally. Because of this point, I need to suggest to remove these 3 lines (invoking remaining finalizers). And add a rule to add that finalizers should not add new finalizers, or say there is no guarantee to invoke finalizers that added by another finalizer.

Revision 49684
Added by ko1 (Koichi Sasada) over 4 years ago

  • gc.c (rb_objspace_call_finalizer): control GC execution during force firnalizations at the end of interpreter process. [Bug #10768] 1) Prohibit incremental GC while running Ruby-level finalizers to avoid any danger. 2) Prohibit GC while invoking T_DATA/T_FILE data structure because these operations break object relations consistency. This patch can introduce another memory consuming issue because Ruby-level finalizers can run after (2), GC is disabled. However, basically object consistency was broken at (2) as I described above. So that running Ruby-level finalizers contains danger originally. Because of this point, I need to suggest to remove these 3 lines (invoking remaining finalizers). And add a rule to add that finalizers should not add new finalizers, or say there is no guarantee to invoke finalizers that added by another finalizer.

Revision 3bfc1d3d
Added by naruse (Yui NARUSE) over 4 years ago

merge revision(s) 49474,49541,49545,49684: [Backport #10768]

    * gc.c (rb_objspace_free): cause rb_bug if lazy sweep is in progress
      during rb_objspace_free. Adds extra protection for r46340.
      Patch by Vicent Marti. [Bug #10768] [ruby-core:67734]

    * gc.c (rb_objspace_call_finalizer): Ensure GC is completed after
      finalizers have run. We already call gc_rest() before invoking
      finalizers, but finalizer can allocate new objects and start new GC
      cycle, so we call gc_rest() again after finalizers are complete.

    * gc.c (rb_objspace_call_finalizer): control GC execution during
      force firnalizations at the end of interpreter process.
      [Bug #10768]
      1) Prohibit incremental GC while running Ruby-level finalizers
         to avoid any danger.
      2) Prohibit GC while invoking T_DATA/T_FILE data structure
         because these operations break object relations consistency.
      This patch can introduce another memory consuming issue because
      Ruby-level finalizers can run after (2), GC is disabled.
      However, basically object consistency was broken at (2) as I
      described above. So that running Ruby-level finalizers contains
      danger originally. Because of this point, I need to suggest to
      remove these 3 lines (invoking remaining finalizers). And add a
      rule to add that finalizers should not add new finalizers, or
      say there is no guarantee to invoke finalizers that added by
      another finalizer.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_2@49716 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 49716
Added by naruse (Yui NARUSE) over 4 years ago

merge revision(s) 49474,49541,49545,49684: [Backport #10768]

* gc.c (rb_objspace_free): cause rb_bug if lazy sweep is in progress
  during rb_objspace_free. Adds extra protection for r46340.
  Patch by Vicent Marti. [Bug #10768] [ruby-core:67734]

* gc.c (rb_objspace_call_finalizer): Ensure GC is completed after
  finalizers have run. We already call gc_rest() before invoking
  finalizers, but finalizer can allocate new objects and start new GC
  cycle, so we call gc_rest() again after finalizers are complete.

* gc.c (rb_objspace_call_finalizer): control GC execution during
  force firnalizations at the end of interpreter process.
  [Bug #10768]
  1) Prohibit incremental GC while running Ruby-level finalizers
     to avoid any danger.
  2) Prohibit GC while invoking T_DATA/T_FILE data structure
     because these operations break object relations consistency.
  This patch can introduce another memory consuming issue because
  Ruby-level finalizers can run after (2), GC is disabled.
  However, basically object consistency was broken at (2) as I
  described above. So that running Ruby-level finalizers contains
  danger originally. Because of this point, I need to suggest to
  remove these 3 lines (invoking remaining finalizers). And add a
  rule to add that finalizers should not add new finalizers, or
  say there is no guarantee to invoke finalizers that added by
  another finalizer.

History

Updated by tmm1 (Aman Gupta) almost 5 years ago

There are also some other threads present in this app at shutdown time, created by a c-extension as worker threads. These threads do not interact with the ruby vm directly, but instead communicate over a queue. I guess this must be related to the segfault, but I'm not sure why it would cause ruby_current_thread to be NULL.

(gdb) info threads
  Id   Target Id         Frame
  5    Thread 0x7ff4d8f02700 (LWP 10720) 0x00007ff4f1eb2d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
  4    Thread 0x7ff4d9703700 (LWP 10719) 0x00007ff4f1eb2d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
  3    Thread 0x7ff4d9f04700 (LWP 10718) 0x00007ff4f1eb2d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
  2    Thread 0x7ff4ea437700 (LWP 10717) 0x00007ff4f1eb2d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
* 1    Thread 0x7ff4f25c2740 (LWP 10051) rb_vm_bugreport () at vm_dump.c:738

(gdb) thread 5
[Switching to thread 5 (Thread 0x7ff4d8f02700 (LWP 10720))]
#0  0x00007ff4f1eb2d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) bt
#0  0x00007ff4f1eb2d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ff4f0cfab7c in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00007ff4e217f798 in pop (item=..., this=0x7ff4e241eb80) at src/queue.hpp:29
#3  highlight_thread (thread_n=<optimized out>, theme=..., new_jobs=0x7ff4e241eb80, completed_jobs=0x7ff4e241ec40) at src/c.cpp:107
#4  0x00007ff4f0cfac78 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007ff4f1eaee9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#6  0x00007ff4f129b2ed in clone () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x0000000000000000 in ?? ()

(gdb) thread 4
[Switching to thread 4 (Thread 0x7ff4d9703700 (LWP 10719))]
#0  0x00007ff4f1eb2d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) where
#0  0x00007ff4f1eb2d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ff4f0cfab7c in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00007ff4e217f798 in pop (item=..., this=0x7ff4e241eb80) at src/queue.hpp:29
#3  highlight_thread (thread_n=<optimized out>, theme=..., new_jobs=0x7ff4e241eb80, completed_jobs=0x7ff4e241ec40) at src/c.cpp:107
#4  0x00007ff4f0cfac78 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007ff4f1eaee9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#6  0x00007ff4f129b2ed in clone () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x0000000000000000 in ?? ()

(gdb) thread 3
[Switching to thread 3 (Thread 0x7ff4d9f04700 (LWP 10718))]
#0  0x00007ff4f1eb2d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) where
#0  0x00007ff4f1eb2d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ff4f0cfab7c in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00007ff4e217f798 in pop (item=..., this=0x7ff4e241eb80) at src/queue.hpp:29
#3  highlight_thread (thread_n=<optimized out>, theme=..., new_jobs=0x7ff4e241eb80, completed_jobs=0x7ff4e241ec40) at src/c.cpp:107
#4  0x00007ff4f0cfac78 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007ff4f1eaee9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#6  0x00007ff4f129b2ed in clone () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x0000000000000000 in ?? ()

(gdb) thread 2
[Switching to thread 2 (Thread 0x7ff4ea437700 (LWP 10717))]
#0  0x00007ff4f1eb2d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) where
#0  0x00007ff4f1eb2d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ff4f0cfab7c in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00007ff4e217f798 in pop (item=..., this=0x7ff4e241eb80) at src/queue.hpp:29
#3  highlight_thread (thread_n=<optimized out>, theme=..., new_jobs=0x7ff4e241eb80, completed_jobs=0x7ff4e241ec40) at src/c.cpp:107
#4  0x00007ff4f0cfac78 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007ff4f1eaee9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#6  0x00007ff4f129b2ed in clone () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x0000000000000000 in ?? ()

Updated by tmm1 (Aman Gupta) almost 5 years ago

After some investigation, it appears the background threads in our app are unrelated to this segfault.

ruby_vm_destruct will call thread_free, which sets ruby_current_thread to NULL (https://github.com/ruby/ruby/blob/v2_1_4/vm.c#L2117-L2118).

This means any attempt to access GET_THREAD from objspace_free will cause a segfault.

We propose the following patch: https://github.com/github/ruby/commit/1ae74499395d

Updated by ko1 (Koichi Sasada) almost 5 years ago

Thank you. This patch seems good.

Updated by tmm1 (Aman Gupta) almost 5 years ago

We deployed the lazy-sweep/finalizer patch to production a few days ago and have confirmed that it stopped the segfaults we have been seeing.

Would you like me to commit it to trunk?

Updated by ko1 (Koichi Sasada) almost 5 years ago

Thank you for confirmation.

Would you like me to commit it to trunk?

Yes, please!

#6

Updated by tmm1 (Aman Gupta) almost 5 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

Applied in changeset r49474.


gc.c: ensure GC state is consistent during VM shutdown

  • gc.c (rb_objspace_free): cause rb_bug if lazy sweep is in progress during rb_objspace_free. Adds extra protection for r46340. Patch by Vicent Marti. [Bug #10768] [ruby-core:67734]
  • gc.c (rb_objspace_call_finalizer): Ensure GC is completed after finalizers have run. We already call gc_rest() before invoking finalizers, but finalizer can allocate new objects and start new GC cycle, so we call gc_rest() again after finalizers are complete.

Updated by tmm1 (Aman Gupta) almost 5 years ago

  • Backport changed from 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN to 2.0.0: UNKNOWN, 2.1: REQUIRED, 2.2: REQUIRED

For backporting into 2.1, include r46340 first.

Updated by normalperson (Eric Wong) almost 5 years ago

r49474 (gc.c: ensure GC state is consistent during VM shutdown)
introduces a failure for me in test/ruby/test_io.rb.
Reverting this commit solves the problem for me on one of my
x86-64 (Debian 7.0) systems (could not reproduce the issue on
32-bit x86 nor some of my other, similar systems)

1) Error:
TestIO#test_io_select_with_many_files:
ArgumentError: invalid byte sequence in UTF-8
/home/ew/ruby/test/lib/envutil.rb:308:in gsub'
/home/ew/ruby/test/lib/envutil.rb:308:in
block (2 levels) in module:Assertions'

This is because bugreporter spew is dumping junk, so
I dumped the output via:

--- a/test/lib/envutil.rb
+++ b/test/lib/envutil.rb
@@ -286,6 +286,7 @@ module Test
FailDesc = proc do |status, message = "", out = ""|
pid = status.pid
now = Time.now

  • $stderr.write "#{out}\n" faildesc = proc do if signo = status.termsig signame = Signal.signame(signo)

To get the following output:

Using built-in specs.
COLLECT_GCC=/usr/bin/gcc-4.7.real
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/4.7/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.7.2-5' --with-bugurl=file:///usr/share/doc/gcc-4.7/README.Bugs --enable-languages=c,c++,go,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.7 --enable-shared --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.7 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin --enable-objc-gc --with-arch-32=i586 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.7.2 (Debian 4.7.2-5)

p:Z++: [BUG] rb_gc_mark(): 0x002b2b1dda6b00 is T_NONE
ruby 2.3.0dev (2015-02-06 starla 49516) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0001 p:5933022005200 s:0002 E:000610 TOP [FINISH]

-- C level backtrace information -------------------------------------------
/home/ew/ruby/ruby(rb_vm_bugreport+0x550) [0x2b2b1befd6e0] vm_dump.c:693
/home/ew/ruby/ruby(rb_bug+0xca) [0x2b2b1bf6508a] error.c:409
/home/ew/ruby/ruby(gc_mark_children+0x65a) [0x2b2b1bdd556a] gc.c:4252
/home/ew/ruby/ruby(gc_marks_rest+0x71) [0x2b2b1bdd83d1] gc.c:4280
/home/ew/ruby/ruby(gc_rest.part.64+0x6a) [0x2b2b1bdd924a] gc.c:5974
/home/ew/ruby/ruby(rb_gc_call_finalizer_at_exit+0x237) [0x2b2b1bddb7d7] gc.c:5967
/home/ew/ruby/ruby(ruby_cleanup+0x4b7) [0x2b2b1bdbc727] eval.c:129
/home/ew/ruby/ruby(ruby_run_node+0x4e) [0x2b2b1bdbca1e] eval.c:310
/home/ew/ruby/ruby(main+0x4b) [0x2b2b1bdb88db] parse.y:5350

-- Other runtime information -----------------------------------------------

  • Loaded script: -

  • Loaded features:

    0 enumerator.so
    1 rational.so
    2 complex.so
    3 /home/ew/ruby/.ext/x86_64-linux/enc/encdb.so
    4 /home/ew/ruby/.ext/x86_64-linux/enc/trans/transdb.so
    5 /home/ew/ruby/lib/unicode_normalize.rb
    6 /home/ew/ruby/rbconfig.rb
    7 thread.rb
    8 /home/ew/ruby/.ext/x86_64-linux/thread.so
    9 /home/ew/ruby/lib/rubygems/compatibility.rb
    10 /home/ew/ruby/lib/rubygems/defaults.rb
    11 /home/ew/ruby/lib/rubygems/deprecate.rb
    12 /home/ew/ruby/lib/rubygems/errors.rb
    13 /home/ew/ruby/lib/rubygems/version.rb
    14 /home/ew/ruby/lib/rubygems/requirement.rb
    15 /home/ew/ruby/lib/rubygems/platform.rb
    16 /home/ew/ruby/lib/rubygems/basic_specification.rb
    17 /home/ew/ruby/lib/rubygems/stub_specification.rb
    18 /home/ew/ruby/lib/rubygems/util/stringio.rb
    19 /home/ew/ruby/lib/rubygems/specification.rb
    20 /home/ew/ruby/lib/rubygems/exceptions.rb
    21 /home/ew/ruby/lib/rubygems/core_ext/kernel_gem.rb
    22 /home/ew/ruby/lib/monitor.rb
    23 /home/ew/ruby/lib/rubygems/core_ext/kernel_require.rb
    24 /home/ew/ruby/lib/rubygems.rb
    25 /home/ew/ruby/lib/delegate.rb
    26 /home/ew/ruby/.ext/x86_64-linux/etc.so
    27 /home/ew/ruby/lib/fileutils.rb
    28 /home/ew/ruby/lib/tmpdir.rb
    29 /home/ew/ruby/lib/tempfile.rb

  • Process memory map:

2b2b1bd91000-2b2b1c041000 r-xp 00000000 08:01 1494973 /home/ew/ruby/ruby
2b2b1c041000-2b2b1c043000 rw-p 00000000 00:00 0
2b2b1c043000-2b2b1c044000 ---p 00000000 00:00 0
2b2b1c044000-2b2b1c048000 rw-p 00000000 00:00 0 [stack:3577]
2b2b1c05c000-2b2b1c160000 rw-p 00000000 00:00 0
2b2b1c240000-2b2b1c246000 rw-p 002af000 08:01 1494973 /home/ew/ruby/ruby
2b2b1c246000-2b2b1c258000 rw-p 00000000 00:00 0
2b2b1c258000-2b2b1c278000 r-xp 00000000 08:01 4697265 /lib/x86_64-linux-gnu/ld-2.13.so
2b2b1c278000-2b2b1c379000 rw-p 00000000 00:00 0
2b2b1c477000-2b2b1c478000 r--p 0001f000 08:01 4697265 /lib/x86_64-linux-gnu/ld-2.13.so
2b2b1c478000-2b2b1c479000 rw-p 00020000 08:01 4697265 /lib/x86_64-linux-gnu/ld-2.13.so
2b2b1c479000-2b2b1c47a000 rw-p 00000000 00:00 0
2b2b1c47a000-2b2b1c491000 r-xp 00000000 08:01 4697270 /lib/x86_64-linux-gnu/libpthread-2.13.so
2b2b1c491000-2b2b1c690000 ---p 00017000 08:01 4697270 /lib/x86_64-linux-gnu/libpthread-2.13.so
2b2b1c690000-2b2b1c691000 r--p 00016000 08:01 4697270 /lib/x86_64-linux-gnu/libpthread-2.13.so
2b2b1c691000-2b2b1c692000 rw-p 00017000 08:01 4697270 /lib/x86_64-linux-gnu/libpthread-2.13.so
2b2b1c692000-2b2b1c696000 rw-p 00000000 00:00 0
2b2b1c696000-2b2b1c69d000 r-xp 00000000 08:01 4697261 /lib/x86_64-linux-gnu/librt-2.13.so
2b2b1c69d000-2b2b1c89c000 ---p 00007000 08:01 4697261 /lib/x86_64-linux-gnu/librt-2.13.so
2b2b1c89c000-2b2b1c89d000 r--p 00006000 08:01 4697261 /lib/x86_64-linux-gnu/librt-2.13.so
2b2b1c89d000-2b2b1c89e000 rw-p 00007000 08:01 4697261 /lib/x86_64-linux-gnu/librt-2.13.so
2b2b1c89e000-2b2b1c905000 r-xp 00000000 08:01 786903 /usr/lib/x86_64-linux-gnu/libgmp.so.10.0.5
2b2b1c905000-2b2b1cb05000 ---p 00067000 08:01 786903 /usr/lib/x86_64-linux-gnu/libgmp.so.10.0.5
2b2b1cb05000-2b2b1cb0d000 rw-p 00067000 08:01 786903 /usr/lib/x86_64-linux-gnu/libgmp.so.10.0.5
2b2b1cb0d000-2b2b1cb0f000 r-xp 00000000 08:01 4697262 /lib/x86_64-linux-gnu/libdl-2.13.so
2b2b1cb0f000-2b2b1cd0f000 ---p 00002000 08:01 4697262 /lib/x86_64-linux-gnu/libdl-2.13.so
2b2b1cd0f000-2b2b1cd10000 r--p 00002000 08:01 4697262 /lib/x86_64-linux-gnu/libdl-2.13.so
2b2b1cd10000-2b2b1cd11000 rw-p 00003000 08:01 4697262 /lib/x86_64-linux-gnu/libdl-2.13.so
2b2b1cd11000-2b2b1cd19000 r-xp 00000000 08:01 4697264 /lib/x86_64-linux-gnu/libcrypt-2.13.so
2b2b1cd19000-2b2b1cf18000 ---p 00008000 08:01 4697264 /lib/x86_64-linux-gnu/libcrypt-2.13.so
2b2b1cf18000-2b2b1cf19000 r--p 00007000 08:01 4697264 /lib/x86_64-linux-gnu/libcrypt-2.13.so
2b2b1cf19000-2b2b1cf1a000 rw-p 00008000 08:01 4697264 /lib/x86_64-linux-gnu/libcrypt-2.13.so
2b2b1cf1a000-2b2b1cf48000 rw-p 00000000 00:00 0
2b2b1cf48000-2b2b1cfc9000 r-xp 00000000 08:01 4697257 /lib/x86_64-linux-gnu/libm-2.13.so
2b2b1cfc9000-2b2b1d1c8000 ---p 00081000 08:01 4697257 /lib/x86_64-linux-gnu/libm-2.13.so
2b2b1d1c8000-2b2b1d1c9000 r--p 00080000 08:01 4697257 /lib/x86_64-linux-gnu/libm-2.13.so
2b2b1d1c9000-2b2b1d1ca000 rw-p 00081000 08:01 4697257 /lib/x86_64-linux-gnu/libm-2.13.so
2b2b1d1ca000-2b2b1d34c000 r-xp 00000000 08:01 4697260 /lib/x86_64-linux-gnu/libc-2.13.so
2b2b1d34c000-2b2b1d54c000 ---p 00182000 08:01 4697260 /lib/x86_64-linux-gnu/libc-2.13.so
2b2b1d54c000-2b2b1d550000 r--p 00182000 08:01 4697260 /lib/x86_64-linux-gnu/libc-2.13.so
2b2b1d550000-2b2b1d551000 rw-p 00186000 08:01 4697260 /lib/x86_64-linux-gnu/libc-2.13.so
2b2b1d551000-2b2b1d656000 rw-p 00000000 00:00 0
2b2b1d656000-2b2b1d658000 r-xp 00000000 08:01 1491407 /home/ew/ruby/.ext/x86_64-linux/enc/encdb.so
2b2b1d658000-2b2b1d857000 ---p 00002000 08:01 1491407 /home/ew/ruby/.ext/x86_64-linux/enc/encdb.so
2b2b1d857000-2b2b1d858000 rw-p 00001000 08:01 1491407 /home/ew/ruby/.ext/x86_64-linux/enc/encdb.so
2b2b1d858000-2b2b1d85a000 r-xp 00000000 08:01 1495011 /home/ew/ruby/.ext/x86_64-linux/enc/trans/transdb.so
2b2b1d85a000-2b2b1da5a000 ---p 00002000 08:01 1495011 /home/ew/ruby/.ext/x86_64-linux/enc/trans/transdb.so
2b2b1da5a000-2b2b1da5b000 rw-p 00002000 08:01 1495011 /home/ew/ruby/.ext/x86_64-linux/enc/trans/transdb.so
2b2b1da5b000-2b2b1db5b000 rw-p 00000000 00:00 0
2b2b1db5b000-2b2b1db5e000 r-xp 00000000 08:01 1494964 /home/ew/ruby/.ext/x86_64-linux/thread.so
2b2b1db5e000-2b2b1dd5e000 ---p 00003000 08:01 1494964 /home/ew/ruby/.ext/x86_64-linux/thread.so
2b2b1dd5e000-2b2b1dd5f000 rw-p 00003000 08:01 1494964 /home/ew/ruby/.ext/x86_64-linux/thread.so
2b2b1dd5f000-2b2b1df5f000 rw-p 00000000 00:00 0
2b2b1df5f000-2b2b1df65000 r-xp 00000000 08:01 1494885 /home/ew/ruby/.ext/x86_64-linux/etc.so
2b2b1df65000-2b2b1e164000 ---p 00006000 08:01 1494885 /home/ew/ruby/.ext/x86_64-linux/etc.so
2b2b1e164000-2b2b1e165000 rw-p 00005000 08:01 1494885 /home/ew/ruby/.ext/x86_64-linux/etc.so
2b2b1e165000-2b2b1e265000 rw-p 00000000 00:00 0
2b2b1e265000-2b2b1e27a000 r-xp 00000000 08:01 1515685 /lib/x86_64-linux-gnu/libgcc_s.so.1
2b2b1e27a000-2b2b1e47a000 ---p 00015000 08:01 1515685 /lib/x86_64-linux-gnu/libgcc_s.so.1
2b2b1e47a000-2b2b1e47b000 rw-p 00015000 08:01 1515685 /lib/x86_64-linux-gnu/libgcc_s.so.1
2b2b1e47b000-2b2b1f262000 r--s 00000000 08:01 1494973 /home/ew/ruby/ruby
2b2b1f262000-2b2b1f3ea000 r--s 00000000 08:01 4697260 /lib/x86_64-linux-gnu/libc-2.13.so
7fff6f825000-7fff6f846000 rw-p 00000000 00:00 0 [stack]
7fff6f9fc000-7fff6f9fd000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

[NOTE]
You may have encountered a bug in the Ruby interpreter or extension libraries.
Bug reports are welcome.
For details: http://www.ruby-lang.org/bugreport.html

make: *** [yes-test-all] Error 1

Updated by ko1 (Koichi Sasada) over 4 years ago

  • Status changed from Closed to Open

tmm1 (Aman Gupta) Could you try r49684?

Updated by naruse (Yui NARUSE) over 4 years ago

  • Backport changed from 2.0.0: UNKNOWN, 2.1: REQUIRED, 2.2: REQUIRED to 2.0.0: UNKNOWN, 2.1: REQUIRED, 2.2: DONE

ruby_2_2 r49716 merged revision(s) 49474,49541,49545,49684.

#11

Updated by naruse (Yui NARUSE) over 4 years ago

  • Status changed from Open to Closed

Applied in changeset backport22:r49716.


merge revision(s) 49474,49541,49545,49684: [Backport #10768]

* gc.c (rb_objspace_free): cause rb_bug if lazy sweep is in progress
  during rb_objspace_free. Adds extra protection for r46340.
  Patch by Vicent Marti. [Bug #10768] [ruby-core:67734]

* gc.c (rb_objspace_call_finalizer): Ensure GC is completed after
  finalizers have run. We already call gc_rest() before invoking
  finalizers, but finalizer can allocate new objects and start new GC
  cycle, so we call gc_rest() again after finalizers are complete.

* gc.c (rb_objspace_call_finalizer): control GC execution during
  force firnalizations at the end of interpreter process.
  [Bug #10768]
  1) Prohibit incremental GC while running Ruby-level finalizers
     to avoid any danger.
  2) Prohibit GC while invoking T_DATA/T_FILE data structure
     because these operations break object relations consistency.
  This patch can introduce another memory consuming issue because
  Ruby-level finalizers can run after (2), GC is disabled.
  However, basically object consistency was broken at (2) as I
  described above. So that running Ruby-level finalizers contains
  danger originally. Because of this point, I need to suggest to
  remove these 3 lines (invoking remaining finalizers). And add a
  rule to add that finalizers should not add new finalizers, or
  say there is no guarantee to invoke finalizers that added by
  another finalizer.

Also available in: Atom PDF