Process termination three seconds after thread termination dumps core

Added by mame (Yusuke Endoh) over 1 year ago. Updated over 1 year ago.

The following code occasionally dumps core.

sleep 2.99

$ ruby test.rb
[BUG] Segmentation fault at 0x0000000000000440
ruby 3.0.1p64 (2021-04-05 revision 0fb782ee38) [x86_64-linux]
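
Because the crash is intermittent, running the script many times and counting abnormal exits makes it easier to observe. The following is only a sketch: the helper names `child_script` and `count_abnormal_exits` are hypothetical, and the child script (a thread that finishes immediately, followed by a sleep just short of three seconds) is an assumption about what triggers the timing window, not the exact test.rb from this report.

```ruby
require "rbconfig"

# Hypothetical child script: start a thread, let it finish, then exit
# shortly before the assumed three-second timeout expires.
def child_script(delay)
  "Thread.new { }.join; sleep #{delay}"
end

# Runs the child script `runs` times with the current interpreter and
# counts the runs that were killed by a signal or exited non-zero.
def count_abnormal_exits(runs:, delay: 2.99)
  abnormal = 0
  runs.times do
    system(RbConfig.ruby, "-e", child_script(delay),
           out: File::NULL, err: File::NULL)
    # success? is nil when the child was killed by a signal (e.g. SIGABRT
    # after a [BUG] report) and false for a non-zero exit status.
    abnormal += 1 unless $?.success?
  end
  abnormal
end
```

For example, `count_abnormal_exits(runs: 20)` takes about a minute and returns the number of failed runs. On a Ruby build that includes the fix it should return 0.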


An internal batch script in our company aborts randomly due to this issue.

@ko1 (Koichi Sasada) and I investigated the issue. When a Ruby Thread terminates, its internal pthread is reused by another Ruby Thread if a new Ruby Thread is created within three seconds. If it is not reused, the pthread frees its sigaltstack memory (by using xfree) and terminates. However, if the Ruby process starts terminating while the pthread is waiting for those three seconds, the Ruby VM and GC are destroyed. After that, xfree is no longer available, which leads to the segfault.
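
The ordering problem described above can be illustrated with a toy model in Ruby. This is not the interpreter's C code: `FakeVM` and `cached_worker` are hypothetical names, the worker thread stands in for a cached pthread, and the 0.2-second wait stands in for the three-second timeout.

```ruby
# Toy model only: FakeVM stands in for the Ruby VM and GC.
# None of these names exist in CRuby.
class FakeVM
  def initialize
    @alive = true
    @lock = Mutex.new
  end

  def alive?
    @lock.synchronize { @alive }
  end

  def destroy
    @lock.synchronize { @alive = false }
  end
end

# Simulates a pthread whose Ruby Thread has finished. It waits
# `cache_time` seconds to be reused; no new work arrives in this model,
# so it always times out and then releases its resources.
def cached_worker(vm, cache_time)
  Thread.new do
    sleep cache_time
    # In the real bug, this is the point where xfree is called on the
    # sigaltstack memory, whether or not the VM still exists.
    vm.alive? ? :freed_while_vm_alive : :freed_after_vm_destroyed
  end
end

vm = FakeVM.new
worker = cached_worker(vm, 0.2) # stands in for the three-second timeout
sleep 0.05                      # the process starts terminating mid-wait
vm.destroy
puts worker.value
```

With these timings the worker normally wakes up after the model VM is gone and the script prints `freed_after_vm_destroyed`, which corresponds to the case where xfree runs against a destroyed GC.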

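A simple solution is to use plain malloc/free instead of Ruby's xmalloc/xfree. The drawback is that plain malloc does not trigger GC and retry when memory allocation fails, as xmalloc does, so the patch below just calls rb_memerror() in that case.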

diff --git a/signal.c b/signal.c
index 764031e78a..f0ed7f90d4 100644
--- a/signal.c
+++ b/signal.c
@@ -560,7 +560,9 @@ rb_allocate_sigaltstack(void)
     if (!rb_sigaltstack_size_value) {
        rb_sigaltstack_size_value = rb_sigaltstack_size();
     }
-    return xmalloc(rb_sigaltstack_size_value);
+    void *altstack = malloc(rb_sigaltstack_size_value);
+    if (!altstack) rb_memerror();
+    return altstack;
 }
 
 /* alternate stack for SIGSEGV */
diff --git a/vm_core.h b/vm_core.h
index 5db3080b43..2962356212 100644
--- a/vm_core.h
+++ b/vm_core.h
@@ -136,7 +136,7 @@
 void *rb_allocate_sigaltstack(void);
 void *rb_register_sigaltstack(void *);
 #  define RB_ALTSTACK_INIT(var, altstack) var = rb_register_sigaltstack(altstack)
-#  define RB_ALTSTACK_FREE(var) xfree(var)
+#  define RB_ALTSTACK_FREE(var) free(var)
 #  define RB_ALTSTACK(var)  var
 #else /* noop */
 #  define RB_ALTSTACK_INIT(var, altstack)

Updated by nobu (Nobuyoshi Nakada) over 1 year ago

  • Backport changed from 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN to 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: REQUIRED

Seems OK.
As I can't reproduce it with 2.7.2p137 on Ubuntu 21.04, leaving backports to 2.6 and 2.7 UNKNOWN.

Updated by mame (Yusuke Endoh) over 1 year ago

Applied in changeset git|f336a3eb6c76890f3d8f878725b3d328c8fdcf33.

Use free instead of xfree to free altstack

The altstack memory of a thread may be free'ed even after the VM is
destructed. After that, GC is no longer available, so calling xfree
may lead to a segfault.

This changeset uses the bare free function to free the altstack memory
instead of xfree. [Bug #18126]

Updated by nagachika (Tomoyuki Chikanaga) over 1 year ago

  • Backport changed from 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: REQUIRED to 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: DONE

ruby_3_0 13f64b65e0476c2fe416a29274fcc91e3c0cf5d3 merged revision(s) f336a3eb6c76890f3d8f878725b3d328c8fdcf33.


