Bug #10206
Closed
garbage symbols crash symbol GC
Description
This is reproducible with just a test loop running for several minutes/hours:
while make test-all TESTS=-j8; do :; done
It looks like SYM2ID/rb_sym2id interacts badly with dsymbol_check
when it encounters garbage objects.
dsymbol_check replaces an invalid object and returns a new object
for the caller, but the original arg for SYM2ID remains usable
to the caller:
id = SYM2ID(garbage_sym);
do_something(garbage_sym); /* bad invalid object used */
Changing: rb_sym2id(VALUE) to rb_sym2id(VALUE *)
might solve the issue, but introduces many incompatibilities in existing
code:
id = rb_sym2id(&garbage_sym);
do_something(garbage_sym); /* id == garbage_sym, safe to use */
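The difference between the two signatures can be modeled outside of Ruby. What follows is a toy sketch, not the actual symbol.c code: `toy_sym`, `toy_check`, `toy_sym2id_by_value` and `toy_sym2id_by_ref` are hypothetical names, and the `garbage` flag merely stands in for whatever the real garbage check inspects.

```c
#include <assert.h>

/* Toy stand-in for a dynamic symbol; not Ruby's real RSymbol. */
struct toy_sym {
    int garbage;   /* nonzero once the "GC" has condemned this object */
    long id;       /* the ID this symbol maps to */
};

/* Hypothetical analogue of dsymbol_check: if sym is garbage, hand back
 * a fresh replacement; otherwise return sym itself. */
static struct toy_sym *
toy_check(struct toy_sym *sym, struct toy_sym *fresh)
{
    if (sym->garbage) {
        fresh->garbage = 0;
        fresh->id = sym->id;
        return fresh;
    }
    return sym;
}

/* By-value style: the replacement is visible only inside the callee,
 * so the caller keeps holding the condemned object. */
static long
toy_sym2id_by_value(struct toy_sym *sym, struct toy_sym *fresh)
{
    sym = toy_check(sym, fresh);
    return sym->id;
}

/* By-reference style: the caller's handle is updated as well. */
static long
toy_sym2id_by_ref(struct toy_sym **symp, struct toy_sym *fresh)
{
    *symp = toy_check(*symp, fresh);
    return (*symp)->id;
}
```

In this model both variants return the same ID, but only the by-reference variant leaves the caller with a handle that is safe to pass on.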
ref: ruby-core thread starting at [ruby-core:64671]
backtraces:
http://80x24.org/r35240/rb-dump.txt
http://80x24.org/r35240/gdb-bt.txt
Updated by nobu (Nobuyoshi Nakada) over 9 years ago
Or revert dsymbol_check()?
Updated by normalperson (Eric Wong) over 9 years ago
nobu@ruby-lang.org wrote:
Or revert dsymbol_check()?
But we need to replace with rb_gc_resurrect, right?
I tried http://80x24.org/10206/resurrect.patch but test-all loop
failed with:
http://80x24.org/10206/resurrect-gdb-bt.txt
http://80x24.org/10206/resurrect-dump.txt
Haven't had time to investigate. I'm also not sure what the problem was
with rb_gc_resurrect originally.
Updated by ko1 (Koichi Sasada) over 9 years ago
First of all, a Symbol is a VALUE, so it should be marked.
Therefore the following code should not be allowed:
id = SYM2ID(garbage_sym);
In this case, after sweeping, garbage_sym becomes a freed VALUE.
What happens to it?
Updated by normalperson (Eric Wong) over 9 years ago
ko1@atdot.net wrote:
First of all, a Symbol is a VALUE, so it should be marked.
Therefore the following code should not be allowed:
id = SYM2ID(garbage_sym);
In this case, after sweeping, garbage_sym becomes a freed VALUE.
What happens to it?
Looking at this more, we may run dsymbol_check too late in
dsymbol_pindown. I think we must run dsymbol_check immediately after
looking up dynamic syms from global_symbol.str_id, and not later.
I think this may be a fix (still testing):
--- a/symbol.c
+++ b/symbol.c
@@ -458,7 +458,10 @@ dsymbol_pindown(VALUE sym)
if (UNLIKELY(SYMBOL_PINNED_P(sym) == 0)) {
VALUE fstr = RSYMBOL(sym)->fstr;
- sym = dsymbol_check(sym);
+
+ if (UNLIKELY(rb_objspace_garbage_object_p(sym))) {
+ rb_bug("attempted to pindown garbage sym");
+ }
FL_SET(sym, SYMBOL_PINNED);
/* make it permanent object */
@@ -525,6 +528,9 @@ rb_intern_cstr_without_pindown(const char *name, long len, rb_encoding *enc)
OBJ_FREEZE(str);
if (st_lookup(global_symbols.str_id, str, &id)) {
+ if (ID_DYNAMIC_SYM_P((ID)id)) {
+ return (ID)dsymbol_check((VALUE)id);
+ }
return (ID)id;
}
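The idea being tried in the second hunk, validating an entry at the moment it is looked up rather than at some later point, can be sketched with a toy table. This is only an illustration under stated assumptions: `toy_entry`, `toy_entries` and `toy_lookup_checked` are hypothetical, and the `garbage` flag stands in for rb_objspace_garbage_object_p.

```c
#include <assert.h>
#include <string.h>

/* Toy table entry; not the real global_symbols.str_id table. */
struct toy_entry {
    const char *name;
    int garbage;     /* stands in for the real garbage-object check */
    int generation;  /* bumped each time the entry is recreated */
};

static struct toy_entry toy_entries[] = {
    { "foo", 0, 0 },
    { "bar", 1, 0 },   /* condemned by the "GC" but still in the table */
};

/* Look up name. If the stored entry is garbage, recreate it before it
 * leaves this function, so callers never see a condemned entry.
 * Returns NULL when the name is not in the table. */
static struct toy_entry *
toy_lookup_checked(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(toy_entries) / sizeof(toy_entries[0]); i++) {
        struct toy_entry *e = &toy_entries[i];
        if (strcmp(e->name, name) == 0) {
            if (e->garbage) {
                e->garbage = 0;
                e->generation++;
            }
            return e;
        }
    }
    return NULL;
}
```

The check covers only values that leave through this one lookup path; handles obtained some other way are not repaired by it.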
Updated by normalperson (Eric Wong) over 9 years ago
Eric Wong normalperson@yhbt.net wrote:
I think this may be a fix (still testing):
Nope. However, I think it takes longer in the test-all loop to
reproduce the problem.
--- a/symbol.c
+++ b/symbol.c
@@ -458,7 +458,10 @@ dsymbol_pindown(VALUE sym)
if (UNLIKELY(SYMBOL_PINNED_P(sym) == 0)) {
VALUE fstr = RSYMBOL(sym)->fstr;
- sym = dsymbol_check(sym);
+
+ if (UNLIKELY(rb_objspace_garbage_object_p(sym))) {
+ rb_bug("attempted to pindown garbage sym");
+ }
I still hit this rb_bug (similar backtraces as before).
FL_SET(sym, SYMBOL_PINNED);
/* make it permanent object */
@@ -525,6 +528,9 @@ rb_intern_cstr_without_pindown(const char *name, long len, rb_encoding *enc)
OBJ_FREEZE(str);
if (st_lookup(global_symbols.str_id, str, &id)) {
+ if (ID_DYNAMIC_SYM_P((ID)id)) {
+ return (ID)dsymbol_check((VALUE)id);
+ }
However, I think this dsymbol_check still is worthwhile.
return (ID)id;
}
Updated by normalperson (Eric Wong) over 9 years ago
I'm looking into uses of intern_cstr_without_pindown in parse.y causing
garbage syms.
Unfortunately, I do not yet understand why we avoid pindown in parse.y
(or much of parse.y). I thought symbol GC was only to help users who
use String#to_sym too aggressively.
compile.c:
case TS_ID: /* ID */
generated_iseq[pos + 1 + j] = SYM2ID(operands[j]);
Updated by ko1 (Koichi Sasada) over 9 years ago
Unfortunately, I do not yet understand why we avoid pindown in parse.y
(or much of parse.y). I thought symbol GC was only to help users who
use String#to_sym too aggressively.
Exactly. However, nobu wants to reduce immortal symbols from parse.y.
(I'm strongly against such optimization.)
My proposal is to avoid such `without_pindown' functions completely.
Updated by nobu (Nobuyoshi Nakada) over 9 years ago
One problem about it is ripper.
The result of Ripper.parse is transient, but symbols by its side-effect are permanent, right now.
So now I'm thinking the plan:
- make all IDs permanent, as ko1 claims
- isolate Symbols in ripper from IDs
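Why an ID handed out without pinning is dangerous can be shown with a toy sweep. This is a sketch under assumptions, not Ruby's GC: `toy_dsym`, `toy_table`, `toy_sym2id_pinned`, `toy_sym2id_unpinned` and `toy_sweep` are hypothetical names, and a slot index plays the role of an ID.

```c
#include <assert.h>

#define TOY_NSYMS 4

/* Toy dynamic symbol slot; not Ruby's real object layout. */
struct toy_dsym {
    int live;    /* slot holds a symbol that has not been swept */
    int marked;  /* reachable as an object during the current GC cycle */
    int pinned;  /* made permanent because an ID refers to it */
};

static struct toy_dsym toy_table[TOY_NSYMS];

/* Hypothetical "pindown" conversion: handing out an ID pins the symbol. */
static int
toy_sym2id_pinned(int slot)
{
    toy_table[slot].pinned = 1;
    return slot;
}

/* Hypothetical "without pindown" conversion: the ID is handed out but
 * the symbol stays collectable. */
static int
toy_sym2id_unpinned(int slot)
{
    return slot;
}

/* Sweep: free every live symbol that is neither marked nor pinned,
 * clear the mark bits, and return the number of symbols freed. */
static int
toy_sweep(void)
{
    int i, freed = 0;

    for (i = 0; i < TOY_NSYMS; i++) {
        struct toy_dsym *s = &toy_table[i];
        if (s->live && !s->marked && !s->pinned) {
            s->live = 0;
            freed++;
        }
        s->marked = 0;
    }
    return freed;
}
```

In this model an ID from the unpinned path can outlive its symbol after one sweep, whereas the pinned path keeps the symbol alive at the cost of never reclaiming it.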
Updated by normalperson (Eric Wong) over 9 years ago
nobu@ruby-lang.org wrote:
One problem about it is ripper.
The result of Ripper.parse is transient, but symbols by its side-effect are permanent, right now.
So now I'm thinking the plan:
- make all IDs permanent, as ko1 claims
- isolate Symbols in ripper from IDs
nobu: can you fix this in time for 2.2.0-preview1 release?
Otherwise, I propose the following temporary fix:
--- a/parse.y
+++ b/parse.y
@@ -285,7 +285,7 @@ struct parser_params {
#ifdef RIPPER
#define intern_cstr_without_pindown(n,l,en) rb_intern3(n,l,en)
#else
-#define intern_cstr_without_pindown(n,l,en) rb_intern_cstr_without_pindown(n,l,en)
+#define intern_cstr_without_pindown(n,l,en) rb_intern3(n,l,en)
#endif
#define STR_NEW(p,n) rb_enc_str_new((p),(n),current_enc)
Updated by nobu (Nobuyoshi Nakada) over 9 years ago
- Status changed from Open to Closed
- % Done changed from 0 to 100
Applied in changeset r47569.
parse.y: intern_cstr
- parse.y (intern_cstr): remove _without_pindown suffix and use
rb_intern3() as well as RIPPER, for the time being.
[ruby-core:65009] [Bug #10206]
Updated by ko1 (Koichi Sasada) over 9 years ago
(2014/09/13 7:59), Eric Wong wrote:
- make all IDs permanent, as ko1 claims
+1
--
// SASADA Koichi at atdot dot net