Bug #21655

closed

segfault when building 3.3.10 with GCC 15.2.1, regression from 3.3.9

Added by kurly (Greg Kubaryk) 21 days ago. Updated 8 days ago.

Status: Closed
Assignee: -
Target version: -
ruby -v: ruby 3.3.10 (2025-10-23 revision 343ea05002) [x86_64-linux]
[ruby-core:123582]

Description

Ref. downstream bug https://bugs.gentoo.org/965095. Reporting upstream because I was able to reproduce the problem manually from ruby-3.3.10.tar.xz.

Build log excerpt; the full log is provided as an attachment.

gcc -O2 -pipe -march=amdfam10  -L. -fstack-protector-strong -rdynamic -Wl,-export-dynamic -fstack-protector-strong -pie  main.o dmydln.o miniinit.o dmyext.o array.o ast.o bignum.o class.o compar.o compile.o complex.o cont.o debug.o debug_counter.o dir.o dln_find.o encoding.o enum.o enumerator.o error.o eval.o file.o gc.o hash.o inits.o io.o io_buffer.o iseq.o load.o marshal.o math.o memory_view.o rjit.o rjit_c.o node.o node_dump.o numeric.o object.o pack.o parse.o parser_st.o proc.o process.o ractor.o random.o range.o rational.o re.o regcomp.o regenc.o regerror.o regexec.o regparse.o regsyntax.o ruby.o ruby_parser.o scheduler.o shape.o signal.o sprintf.o st.o strftime.o string.o struct.o symbol.o thread.o time.o transcode.o util.o variable.o version.o vm.o vm_backtrace.o vm_dump.o vm_sync.o vm_trace.o weakmap.o prism/api_node.o prism/api_pack.o prism/diagnostic.o prism/encoding.o prism/extension.o prism/node.o prism/options.o prism/pack.o prism/prettyprint.o prism/regexp.o prism/serialize.o prism/token_type.o prism/util/pm_buffer.o prism/util/pm_char.o prism/util/pm_constant_pool.o prism/util/pm_list.o prism/util/pm_memchr.o prism/util/pm_newline_list.o prism/util/pm_state_stack.o prism/util/pm_string.o prism/util/pm_string_list.o prism/util/pm_strncasecmp.o prism/util/pm_strpbrk.o prism/prism.o prism_init.o yjit.o yjit/target/release/libyjit.o coroutine/amd64/Context.o  enc/ascii.o enc/us_ascii.o enc/unicode.o enc/utf_8.o enc/trans/newline.o setproctitle.o addr2line.o  -lz -lrt -lrt -lgmp -ldl -lcrypt -lm -lpthread  -o miniruby
:
./miniruby -I./lib -I. -I.ext/common  ./tool/generic_erb.rb -o builtin_binary.inc \
	./template/builtin_binary.inc.tmpl
make: *** [uncommon.mk:1316: builtin_binary.inc] Segmentation fault (core dumped)

Files

buildlog (76.5 KB) - buildlog output of: ./configure CFLAGS="-O2 -pipe -march=amdfam10" && make -j8 V=1 - kurly (Greg Kubaryk), 10/29/2025 05:33 AM

Related issues 1 (0 open, 1 closed)

Has duplicate Ruby - Bug #21516: Segfault in String#succ! on 32-bit i686 (Closed)

Updated by kurly (Greg Kubaryk) 21 days ago #1 [ruby-core:123583]

backtrace using a gentoo-built build with -ggdb3 added to CFLAGS

beans ~ # cd /var/tmp/portage/dev-lang/ruby-3.3.10/work/ruby-3.3.10/
beans /var/tmp/portage/dev-lang/ruby-3.3.10/work/ruby-3.3.10 # gdb --args ./miniruby -I./lib -I. -I.ext/common  -n -e 'BEGIN{version=ARGV.shift;mis=ARGV.dup}' -e 'END{abort "UNICODE version mismatch: #{mis}" unless mis.empty?}' -e '(mis.delete(ARGF.path); ARGF.close) if /ONIG_UNICODE_VERSION_STRING +"#{Regexp.quote(version)}"/o' 15.0.0 ./enc/unicode/15.0.0/casefold.h ./enc/unicode/15.0.0/name2ctype.h ./miniruby -I./lib -I. -I.ext/common  ./tool/generic_erb.rb -o builtin_binary.inc ./template/builtin_binary.inc.tmpl
GNU gdb (Gentoo 16.3 vanilla) 16.3
Copyright (C) 2024 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./miniruby...
warning: File "/var/tmp/portage/dev-lang/ruby-3.3.10/work/ruby-3.3.10/.gdbinit" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
	add-auto-load-safe-path /var/tmp/portage/dev-lang/ruby-3.3.10/work/ruby-3.3.10/.gdbinit
line to your configuration file "/root/.config/gdb/gdbinit".
To completely disable this security protection add
	set auto-load safe-path /
line to your configuration file "/root/.config/gdb/gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
	info "(gdb)Auto-loading safe path"
(gdb) run
Starting program: /var/tmp/portage/dev-lang/ruby-3.3.10/work/ruby-3.3.10/miniruby -I./lib -I. -I.ext/common -n -e BEGIN\{version=ARGV.shift\;mis=ARGV.dup\} -e END\{abort\ \"UNICODE\ version\ mismatch:\ \#\{mis\}\"\ unless\ mis.empty\?\} -e \(mis.delete\(ARGF.path\)\;\ ARGF.close\)\ if\ /ONIG_UNICODE_VERSION_STRING\ +\"\#\{Regexp.quote\(version\)\}\"/o 15.0.0 ./enc/unicode/15.0.0/casefold.h ./enc/unicode/15.0.0/name2ctype.h ./miniruby -I./lib -I. -I.ext/common ./tool/generic_erb.rb -o builtin_binary.inc ./template/builtin_binary.inc.tmpl
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7c76d69 in malloc_usable_size () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff7c76d69 in malloc_usable_size () from /lib64/libc.so.6
#1  0x000055555563b490 in objspace_malloc_size (objspace=0x5555559a7550, ptr=0x45005555559e6dc0, hint=<optimized out>)
    at gc.c:12381
#2  objspace_xrealloc (objspace=0x5555559a7550, ptr=0x45005555559e6dc0, new_size=18, old_size=<optimized out>) at gc.c:12689
#3  0x0000555555760a8a in rb_str_resize (str=str@entry=140737348585000, len=17) at string.c:3179
#4  0x0000555555760c2d in rb_fstring (str=str@entry=140737348585000) at ./include/ruby/internal/core/rstring.h:369
#5  0x00005555557a32c8 in build_const_pathname (head=<optimized out>, tail=140737348585040) at variable.c:397
#6  rb_set_class_path_string (klass=klass@entry=140737348547760, under=under@entry=140737348551920, name=140737348585040)
    at variable.c:417
#7  0x000055555559c2f4 in rb_define_class_id_under_no_pin (outer=140737348551920, id=10891, super=140737348552240, 
    super@entry=93824992527378) at class.c:1035
#8  0x000055555559c412 in rb_define_class_id_under (outer=<optimized out>, id=<optimized out>, super=super@entry=93824992527378)
    at class.c:1045
#9  0x000055555559c470 in rb_define_class_under (outer=<optimized out>, name=name@entry=0x555555846fdc "EADDRINUSE", 
    super=93824992527378) at class.c:1006
#10 0x000055555560e153 in set_syserr (n=n@entry=98, name=name@entry=0x555555846fdc "EADDRINUSE") at error.c:2711
#11 0x00005555556148e9 in Init_syserr () at /var/tmp/portage/dev-lang/ruby-3.3.10/work/ruby-3.3.10/known_errors.inc:18
#12 0x0000555555647eaa in rb_call_inits () at inits.c:40
#13 0x0000555555618943 in ruby_setup () at eval.c:89
#14 0x000055555561a40d in ruby_init () at eval.c:101
#15 0x0000555555576193 in rb_main (argc=22, argv=0x7fffffffe098) at ./main.c:38
#16 main (argc=<optimized out>, argv=<optimized out>) at ./main.c:64
(gdb) frame 3
#3  0x0000555555760a8a in rb_str_resize (str=str@entry=140737348585000, len=17) at string.c:3179
3179	            SIZED_REALLOC_N(RSTRING(str)->as.heap.ptr, char,
(gdb) p str
$1 = 140737348585000
(gdb) p *str
$2 = 8396805
(gdb) 

Updated by hsbt (Hiroshi SHIBATA) 21 days ago #2

  • Description updated (diff)

Updated by kurly (Greg Kubaryk) 21 days ago #3 [ruby-core:123584]

Thank you for fixing the markdown in comment 0.

On an affected machine, I was able to bisect the git repo between tags v3_3_9 and v3_3_10:

5a8d7642168f4ea0d9331fded3033c225bbc36c5 is the first bad commit
commit 5a8d7642168f4ea0d9331fded3033c225bbc36c5 (HEAD)
Author:     nagachika <nagachika@ruby-lang.org>
AuthorDate: Wed Oct 8 22:55:33 2025 +0900
Commit:     nagachika <nagachika@ruby-lang.org>
CommitDate: Wed Oct 8 22:56:02 2025 +0900

    merge revision(s) 43dbb9a93f4de3f1170d7d18641c30e81cc08365, 2bb6fe3854e2a4854bb89bfce4eaaea9d848fd1b, 7c9dd0ecff61153b96473c6c51d5582e809da489: [Backport #21629]
    
            [PATCH] [Bug #21629] Enable `nonstring` attribute on clang 21
    
            [PATCH] [Bug #21629] Initialize `struct RString`
    
            [PATCH] [Bug #21629] Initialize `struct RArray`

 error.c                                | 2 +-
 ext/-test-/string/fstring.c            | 2 +-
 include/ruby/internal/attr/nonstring.h | 8 ++++++++
 include/ruby/internal/core/rbasic.h    | 3 +++
 include/ruby/internal/core/rstring.h   | 2 +-
 load.c                                 | 4 ++--
 marshal.c                              | 2 +-
 string.c                               | 8 ++++----
 symbol.c                               | 8 ++++----
 version.h                              | 2 +-
 10 files changed, 26 insertions(+), 15 deletions(-)

I was not able to reproduce the build failure for ruby 3.3.10 on an Ubuntu 24.04 machine using gcc-13.3.0.

Updated by kurly (Greg Kubaryk) 21 days ago #4 [ruby-core:123585]

I manually bisected inside that "bad" commit and found that this minimal diff on top of v3_3_10 eliminates the build failure:

diff --git a/include/ruby/internal/core/rstring.h b/include/ruby/internal/core/rstring.h
index 9cf9daa97c..0bca74e688 100644
--- a/include/ruby/internal/core/rstring.h
+++ b/include/ruby/internal/core/rstring.h
@@ -395,7 +395,7 @@ rbimpl_rstring_getmem(VALUE str)
     }
     else {
         /* Expecting compilers to optimize this on-stack struct away. */
-        struct RString retval = {RBASIC_INIT};
+        struct RString retval;
         retval.len = RSTRING_LEN(str);
         retval.as.heap.ptr = RSTRING(str)->as.embed.ary;
         return retval;

Updated by alanwu (Alan Wu) 20 days ago · Edited #5 [ruby-core:123604]

It's surprising that leaving the temporary struct uninitialized avoids the crash. Smells like a GCC bug or some UB on our end the optimizer is exploiting.

Does ./configure optflags=-fno-strict-aliasing ... help?

Updated by kurly (Greg Kubaryk) 20 days ago #6 [ruby-core:123605]

alanwu (Alan Wu) wrote in #note-5:

It's surprising that leaving the temporary struct uninitialized avoids the crash. Smells like a GCC bug or some UB on our end the optimizer is exploiting.

Does ./configure optflags=-fno-strict-aliasing ... help?

It doesn't appear to, whether added to CFLAGS or to optflags.

Updated by kurly (Greg Kubaryk) 20 days ago #7 [ruby-core:123606]

alanwu (Alan Wu) wrote in #note-5:

It's surprising that leaving the temporary struct uninitialized avoids the crash. Smells like a GCC bug or some UB on our end the optimizer is exploiting.

Does ./configure optflags=-fno-strict-aliasing ... help?

Per a suggestion on the downstream bug, I tried adding -fno-ipa-modref to CFLAGS, and that was sufficient to fix the build.

Updated by alanwu (Alan Wu) 20 days ago #8 [ruby-core:123607]

Looks like you're not building with LTO, so the miscomp from ipa-modref should be in rb_str_resize(). That should be enough for a bug report for GCC, since they need a preprocessed C file.

Maybe this is hitting the same GCC bug as this: https://patchwork.sourceware.org/project/gdb/patch/20250712131649.8372-1-tdevries@suse.de/#206552 which -fno-ipa-modref also fixes. Unfortunately the bug on GCC side is still unresolved: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120987

Updated by thesamesam (Sam James) 20 days ago · Edited #9 [ruby-core:123609]

alanwu (Alan Wu) wrote in #note-8:

Looks like you're not building with LTO, so the miscomp from ipa-modref should be in rb_str_resize(). That should be enough for a bug report for GCC, since they need a preprocessed C file.

Maybe this is hitting the same GCC bug as this: https://patchwork.sourceware.org/project/gdb/patch/20250712131649.8372-1-tdevries@suse.de/#206552 which -fno-ipa-modref also fixes. Unfortunately the bug on GCC side is still unresolved: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120987

That bug has a patch which should work around (or fix, it's not clear) that specific issue, but I asked OP to test it and it didn't help with this bug. It may (or may not) still be an issue in modref, just probably not that bug.

Updated by alanwu (Alan Wu) 10 days ago · Edited #11 [ruby-core:123735]

  • Status changed from Open to Third Party's Issue

Thanks, I was able to repro this locally. I confirmed that it's a miscompilation,
ran a reduction, and filed a GCC bug report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122610

The following workaround fixes the build for me:

diff --git a/include/ruby/internal/core/rstring.h b/patched-rstring.h
index 9cf9daa..d76ba9c 100644
--- a/include/ruby/internal/core/rstring.h
+++ b/patched-rstring.h
@@ -415,7 +415,9 @@ RBIMPL_ATTR_ARTIFICIAL()
 static inline char *
 RSTRING_PTR(VALUE str)
 {
-    char *ptr = rbimpl_rstring_getmem(str).as.heap.ptr;
+    char *ptr = RB_FL_TEST_RAW(str, RSTRING_NOEMBED) ?
+        RSTRING(str)->as.heap.ptr :
+        RSTRING(str)->as.embed.ary;
 
     if (RUBY_DEBUG && RB_UNLIKELY(! ptr)) {
         /* :BEWARE: @shyouhei thinks  that currently, there are  rooms for this

It fixes this particular instance, but with an optimizer bug in play, who knows where else
we're hitting it. To dodge the bug, maybe all usages of rbimpl_rstring_getmem() need to be
rewritten. I'll defer to @nagachika (Tomoyuki Chikanaga) whether we want to apply a workaround for ruby_3_3.
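
For readers following the two diffs, the difference between the by-value accessor and the direct branch can be sketched with a toy model. Everything here (struct toy_str, toy_getmem, toy_ptr_via_temp, toy_ptr_direct, toy_accessors_agree) is hypothetical and far simpler than Ruby's real struct RString; the heap branch of toy_getmem in particular is an assumption, since only the embedded branch appears in the diffs above. With a correctly working compiler both shapes return the same pointer; the sketch does not reproduce the miscompilation.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not Ruby's real struct RString. */
struct toy_str {
    int noembed;                 /* 0: bytes are embedded, 1: bytes are on the heap */
    size_t len;
    union {
        struct { char *ptr; } heap;
        struct { char ary[16]; } embed;
    } as;
};

/* Shape 1: build an on-stack temporary and return it by value, loosely
 * following the lines of rbimpl_rstring_getmem() shown in the diffs. */
static inline struct toy_str toy_getmem(struct toy_str *s)
{
    if (s->noembed) {
        return *s;               /* assumption: heap strings are copied as-is */
    }
    else {
        struct toy_str retval = {0};
        retval.len = s->len;
        retval.as.heap.ptr = s->as.embed.ary;
        return retval;
    }
}

static inline char *toy_ptr_via_temp(struct toy_str *s)
{
    return toy_getmem(s).as.heap.ptr;
}

/* Shape 2: branch directly on the flag, as in the workaround diff. */
static inline char *toy_ptr_direct(struct toy_str *s)
{
    return s->noembed ? s->as.heap.ptr : s->as.embed.ary;
}

/* Returns 1 when both accessor shapes yield the same pointer. */
static inline int toy_accessors_agree(struct toy_str *s)
{
    assert(s != NULL);
    return toy_ptr_via_temp(s) == toy_ptr_direct(s);
}
```

The direct form never materializes a temporary struct, so the optimizer has less to reason about; whether that is enough to avoid the bug everywhere is exactly the open question raised above.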

This shows up as a miscompilation of str_buf_cat4(), particularly this part:

RESIZE_CAPA_TERM(str, capa, termlen);
sptr = RSTRING_PTR(str);

GCC deletes the RSTRING_PTR() reload, which is needed in case the string grows and
switches from embedded to heap. The memcpy afterwards, using the stale sptr, then
stomps on the newly allocated pointer.
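
A minimal sketch of why that reload matters, using a toy string type rather than Ruby's real one. All names (toy_str, toy_init, toy_ptr, toy_resize_capa, toy_cat, toy_free) are hypothetical, and the 16-byte embedded buffer is an arbitrary choice. Because the embedded bytes and the heap pointer share storage in a union, a pointer taken before the resize refers to the union itself once the string has moved to the heap, so writing through it would overwrite the heap pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: hypothetical names and layout, not Ruby's struct RString. */
enum { TOY_EMBED_CAPA = 16 };

struct toy_str {
    int noembed;                     /* 0: bytes in as.embed, 1: bytes at as.heap */
    size_t len;
    union {
        char embed[TOY_EMBED_CAPA];  /* shares storage with the heap pointer */
        char *heap;
    } as;
};

/* Start as a short embedded string. */
static void toy_init(struct toy_str *s, const char *src)
{
    size_t n = strlen(src);
    assert(n < TOY_EMBED_CAPA);
    s->noembed = 0;
    s->len = n;
    memcpy(s->as.embed, src, n + 1);
}

/* Plays the role of RSTRING_PTR(): where the bytes live right now. */
static char *toy_ptr(struct toy_str *s)
{
    return s->noembed ? s->as.heap : s->as.embed;
}

/* Plays the role of RESIZE_CAPA_TERM(): may move the bytes to the heap. */
static void toy_resize_capa(struct toy_str *s, size_t capa)
{
    if (!s->noembed) {
        if (capa < TOY_EMBED_CAPA) return;   /* still fits in the embedded buffer */
        char *p = malloc(capa + 1);
        if (!p) abort();
        memcpy(p, s->as.embed, s->len + 1);
        s->as.heap = p;                      /* overwrites the embedded bytes */
        s->noembed = 1;
    }
    else {
        char *p = realloc(s->as.heap, capa + 1);
        if (!p) abort();
        s->as.heap = p;
    }
}

/* Append src. The pointer is loaded AFTER the resize; a pointer loaded
 * before it may still aim at the embedded buffer, i.e. at the union that
 * now holds the heap pointer. */
static void toy_cat(struct toy_str *s, const char *src)
{
    size_t n = strlen(src);
    toy_resize_capa(s, s->len + n);
    char *sptr = toy_ptr(s);                 /* the reload that must not be elided */
    memcpy(sptr + s->len, src, n + 1);
    s->len += n;
}

static void toy_free(struct toy_str *s)
{
    if (s->noembed) free(s->as.heap);
}
```

Appending "::EADDRINUSE" to an embedded "Errno" pushes the toy string past its embedded capacity, so toy_ptr() returns a different address after the call than before it.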

Updated by alanwu (Alan Wu) 10 days ago #12

  • Has duplicate Bug #21516: Segfault in String#succ! on 32-bit i686 added

Updated by alanwu (Alan Wu) 10 days ago #13

  • Subject changed from "segfault when building 3.3.10, regression from 3.3.9" to "segfault when building 3.3.10 with GCC 15.2.1, regression from 3.3.9"

Updated by nagachika (Tomoyuki Chikanaga) 10 days ago #14 [ruby-core:123740]

  • Backport changed from 3.2: UNKNOWN, 3.3: UNKNOWN, 3.4: UNKNOWN to 3.2: UNKNOWN, 3.3: REQUIRED, 3.4: REQUIRED

Thanks @kurly and @alanwu (Alan Wu). I understand the situation now.
I'm ready to apply a workaround for ruby_3_3.

The implementation of rbimpl_rstring_getmem() is the same on master and ruby_3_4. The issue with GCC 15.2.1 likely exists on these branches as well. I prefer to apply the workaround on master first and backport it to the stable branches.
Do you know whether the issue is reproducible on ruby-3.4.7 or the master branch?

Updated by kurly (Greg Kubaryk) 10 days ago #15 [ruby-core:123741]

nagachika (Tomoyuki Chikanaga) wrote in #note-14:

Do you know whether the issue is reproducible on ruby-3.4.7 or the master branch?

Yes, the issue can be reproduced on both the master branch (529dd8d76efbe655fabce8933852504851266b2b) and the ruby_3_4 tag (9e426489f00a1b7816e8c6f299daa6116c5f505d).

Updated by alanwu (Alan Wu) 10 days ago #16 [ruby-core:123742]

A build of the master branch with options from OP crashes the same way as ruby_3_3 for me... I'll commit something later.
Because inter-procedural analysis is key to triggering the miscompilation, changes to seemingly unrelated places can dodge it.
Finding the best place to change will be easier after we learn how the GCC folks fix it.
Making code changes to dodge this while ipa-modref is active seems like playing with fire, though.
The GCC issue exists from version 12 through 15.2.1 and on a recent trunk build.
Maybe we need to resort to adding a check for this in configure.ac because even after a GCC fix not everyone will update.
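
One crude form such a check could take is a compile-time test on the compiler version. The sketch below is only an illustration under stated assumptions: the macro TOY_GCC_MAYBE_AFFECTED and both functions are hypothetical and not part of Ruby's build system, the lower bound of 12 comes from the range reported in this comment, and there is no upper bound because no fixed release was known at the time. A real configure.ac check would more likely compile and run a reduced reproducer, which is not attempted here. __GNUC__, __GNUC_MINOR__ and __GNUC_PATCHLEVEL__ are GCC's standard predefined macros; clang and the classic Intel compiler also define __GNUC__, so they are excluded explicitly.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical macro: 1 when compiled by GCC 12 or newer, the range
 * reported as affected in this thread (assumption: no fixed version known). */
#if defined(__GNUC__) && !defined(__clang__) && !defined(__INTEL_COMPILER)
# if __GNUC__ >= 12
#  define TOY_GCC_MAYBE_AFFECTED 1
# endif
#endif
#ifndef TOY_GCC_MAYBE_AFFECTED
# define TOY_GCC_MAYBE_AFFECTED 0
#endif

static int toy_gcc_maybe_affected(void)
{
    return TOY_GCC_MAYBE_AFFECTED;
}

/* Writes a one-line report to out and returns fprintf's result. */
static int toy_report_compiler(FILE *out)
{
    assert(out != NULL);
    if (toy_gcc_maybe_affected()) {
#if defined(__GNUC__)
        return fprintf(out,
                       "GCC %d.%d.%d is in the reported range; "
                       "-fno-ipa-modref was reported to fix the build\n",
                       __GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__);
#endif
    }
    return fprintf(out, "compiler is not in the reported range\n");
}
```

A version test like this is deliberately conservative: it flags every GCC from 12 onward, including builds where the miscompilation may never trigger.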

Updated by nagachika (Tomoyuki Chikanaga) 10 days ago #17 [ruby-core:123745]

Just FYI: I have filed a pull request based on Alan's patch: https://github.com/ruby/ruby/pull/15113

IMO, rbimpl_rstring_getmem() is a little tricky, so I feel it's reasonable to remove it.
However, it's possible that I'm unaware of a reason why this function is necessary, such as for optimization purposes...

Updated by nagachika (Tomoyuki Chikanaga) 8 days ago #18 [ruby-core:123764]

  • Status changed from Third Party's Issue to Closed