Bug #10206
closedgarbage symbols crash symbol GC
Description
This is reproducible with just a test loop running for serveral minutes/hours:
while make test-all TESTS=-j8; do :; done
It looks like SYM2ID/rb_sym2id interacts badly with dsymbol_check
when it encounters garbage objects.
dsymbol_check replaces an invalid object and returns a new object
for the caller, but the original arg for SYM2ID remains usable
to the caller:
id = SYM2ID(garbage_sym);
do_something(garbage_sym); /* bad invalid object used */
Changing: rb_sym2id(VALUE) to rb_sym2id(VALUE *)
might solve the issue, but introduces many incompatibilities in existing
code:
id = rb_sym2id(&garbage_sym);
do_something(garbage_sym); /* id == garbage_sym, safe to use */
ref: ruby-core thread starting at [ruby-core:64671]
backtraces:
http://80x24.org/r35240/rb-dump.txt
http://80x24.org/r35240/gdb-bt.txt
Updated by nobu (Nobuyoshi Nakada) over 10 years ago
Or revert dsymbol_check()
?
Updated by normalperson (Eric Wong) over 10 years ago
nobu@ruby-lang.org wrote:
Or revert
dsymbol_check()
?
But we need to replace with rb_gc_resurrect, right?
I tried http://80x24.org/10206/resurrect.patch but test-all loop
failed with:
http://80x24.org/10206/resurrect-gdb-bt.txt
http://80x24.org/10206/resurrect-dump.txt
Haven't had time to investigate. I'm also not sure what the problem was
with rb_gc_resurrect originally
Updated by ko1 (Koichi Sasada) over 10 years ago
At first, Symbol is VALUE and it should be marked.
So that the following code should not be allowed.
id = SYM2ID(garbage_sym);
In this case, afeter sweeping, garbage_sym becomes freed VALUE.
What happen on it?
Updated by normalperson (Eric Wong) over 10 years ago
ko1@atdot.net wrote:
At first, Symbol is VALUE and it should be marked.
So that the following code should not be allowed.
id = SYM2ID(garbage_sym);
In this case, afeter sweeping, garbage_sym becomes freed VALUE.
What happen on it?
Looking at this more, we may run dsymbol_check too late in
dsymbol_pindown. I think we must run dsymbol_check immediately after
looking up dynamic syms from global_symbol.str_id, and not later.
I think this may be a fix (still testing):
--- a/symbol.c
+++ b/symbol.c
@@ -458,7 +458,10 @@ dsymbol_pindown(VALUE sym)
if (UNLIKELY(SYMBOL_PINNED_P(sym) == 0)) {
VALUE fstr = RSYMBOL(sym)->fstr;
- sym = dsymbol_check(sym);
+
+ if (UNLIKELY(rb_objspace_garbage_object_p(sym))) {
+ rb_bug("attempted to pindown garbage sym");
+ }
FL_SET(sym, SYMBOL_PINNED);
/* make it permanent object */
@@ -525,6 +528,9 @@ rb_intern_cstr_without_pindown(const char *name, long len, rb_encoding *enc)
OBJ_FREEZE(str);
if (st_lookup(global_symbols.str_id, str, &id)) {
+ if (ID_DYNAMIC_SYM_P((ID)id)) {
+ return (ID)dsymbol_check((VALUE)id);
+ }
return (ID)id;
}
Updated by normalperson (Eric Wong) over 10 years ago
Eric Wong normalperson@yhbt.net wrote:
I think this may be a fix (still testing):
Nope. However, I think it takes longer in the test-all loop to
reproduce the problem.
--- a/symbol.c +++ b/symbol.c @@ -458,7 +458,10 @@ dsymbol_pindown(VALUE sym) if (UNLIKELY(SYMBOL_PINNED_P(sym) == 0)) { VALUE fstr = RSYMBOL(sym)->fstr; - sym = dsymbol_check(sym); + + if (UNLIKELY(rb_objspace_garbage_object_p(sym))) { + rb_bug("attempted to pindown garbage sym"); + }
I still hit this rb_bug (similar backtraces as before).
FL_SET(sym, SYMBOL_PINNED); /* make it permanent object */ @@ -525,6 +528,9 @@ rb_intern_cstr_without_pindown(const char *name, long len, rb_encoding *enc) OBJ_FREEZE(str); if (st_lookup(global_symbols.str_id, str, &id)) { + if (ID_DYNAMIC_SYM_P((ID)id)) { + return (ID)dsymbol_check((VALUE)id); + }
However, I think this dsymbol_check still is worthwhile.
return (ID)id; }
Updated by normalperson (Eric Wong) over 10 years ago
I'm looking into uses of intern_cstr_without_pindown in parse.y causing
garbage syms.
Unfortunately, I do not yet understand why we avoid pindown in parse.y
(or much of parse.y). I thought symbol GC was only to help users who
use String#to_sym too aggressively.
compile.c:
case TS_ID: /* ID */
generated_iseq[pos + 1 + j] = SYM2ID(operands[j]);
Updated by ko1 (Koichi Sasada) over 10 years ago
Unfortunately, I do not yet understand why we avoid pindown in parse.y
(or much of parse.y). I thought symbol GC was only to help users who
use String#to_sym too aggressively.
Exactlly. However, nobu wants to reduce immoratal symbols from parse.y.
(I'm strongly against for such optimization)
My proposal is to avoid such `without_pindown' functions completely.
Updated by nobu (Nobuyoshi Nakada) over 10 years ago
One problem about it is ripper.
The result of Ripper.parse
is transient, but symbols by its side-effect are permanent, right now.
So now I'm thinking the plan:
- make all IDs permanent, as ko1 claims
- isolate Symbols in ripper from IDs
Updated by normalperson (Eric Wong) over 10 years ago
nobu@ruby-lang.org wrote:
One problem about it is ripper.
The result ofRipper.parse
is transient, but symbols by its side-effect are permanent, right now.So now I'm thinking the plan:
- make all IDs permanent, as ko1 claims
- isolate Symbols in ripper from IDs
nobu: can you fix this in time for 2.2.0-preview1 release?
Otherwise, I propose the following temporary fix:
--- a/parse.y
+++ b/parse.y
@@ -285,7 +285,7 @@ struct parser_params {
#ifdef RIPPER
#define intern_cstr_without_pindown(n,l,en) rb_intern3(n,l,en)
#else
-#define intern_cstr_without_pindown(n,l,en) rb_intern_cstr_without_pindown(n,l,en)
+#define intern_cstr_without_pindown(n,l,en) rb_intern3(n,l,en)
#endif
#define STR_NEW(p,n) rb_enc_str_new((p),(n),current_enc)
Updated by nobu (Nobuyoshi Nakada) over 10 years ago
- Status changed from Open to Closed
- % Done changed from 0 to 100
Applied in changeset r47569.
parse.y: intern_cstr
- parse.y (intern_cstr): remove
_without_pindown
suffix and use
rb_intern3() as well as RIPPER, for the time being.
[ruby-core:65009] [Bug #10206]
Updated by ko1 (Koichi Sasada) over 10 years ago
(2014/09/13 7:59), Eric Wong wrote:
- make all IDs permanent, as ko1 claims
+1
--
// SASADA Koichi at atdot dot net