Project

General

Profile

Actions

Bug #10206

closed

garbage symbols crash symbol GC

Added by normalperson (Eric Wong) almost 10 years ago. Updated almost 10 years ago.

Status:
Closed
Target version:
ruby -v:
trunk
[ruby-core:64806]

Description

This is reproducible with just a test loop running for serveral minutes/hours:

while make test-all TESTS=-j8; do :; done

It looks like SYM2ID/rb_sym2id interacts badly with dsymbol_check
when it encounters garbage objects.

dsymbol_check replaces an invalid object and returns a new object
for the caller, but the original arg for SYM2ID remains usable
to the caller:

    id = SYM2ID(garbage_sym);
    do_something(garbage_sym); /* bad invalid object used */

Changing: rb_sym2id(VALUE) to rb_sym2id(VALUE *)
might solve the issue, but introduces many incompatibilities in existing
code:

    id = rb_sym2id(&garbage_sym);
    do_something(garbage_sym); /* id == garbage_sym, safe to use */

ref: ruby-core thread starting at [ruby-core:64671]
backtraces:
http://80x24.org/r35240/rb-dump.txt
http://80x24.org/r35240/gdb-bt.txt

Updated by nobu (Nobuyoshi Nakada) almost 10 years ago

Or revert dsymbol_check()?

Updated by normalperson (Eric Wong) almost 10 years ago

wrote:

Or revert dsymbol_check()?

But we need to replace with rb_gc_resurrect, right?

I tried http://80x24.org/10206/resurrect.patch but test-all loop
failed with:
http://80x24.org/10206/resurrect-gdb-bt.txt
http://80x24.org/10206/resurrect-dump.txt

Haven't had time to investigate. I'm also not sure what the problem was
with rb_gc_resurrect originally

Updated by ko1 (Koichi Sasada) almost 10 years ago

At first, Symbol is VALUE and it should be marked.

So that the following code should not be allowed.

id = SYM2ID(garbage_sym);

In this case, afeter sweeping, garbage_sym becomes freed VALUE.

What happen on it?

Updated by normalperson (Eric Wong) almost 10 years ago

wrote:

At first, Symbol is VALUE and it should be marked.

So that the following code should not be allowed.

id = SYM2ID(garbage_sym);

In this case, afeter sweeping, garbage_sym becomes freed VALUE.

What happen on it?

Looking at this more, we may run dsymbol_check too late in
dsymbol_pindown. I think we must run dsymbol_check immediately after
looking up dynamic syms from global_symbol.str_id, and not later.

I think this may be a fix (still testing):

--- a/symbol.c
+++ b/symbol.c
@@ -458,7 +458,10 @@ dsymbol_pindown(VALUE sym)
 
     if (UNLIKELY(SYMBOL_PINNED_P(sym) == 0)) {
 	VALUE fstr = RSYMBOL(sym)->fstr;
-	sym = dsymbol_check(sym);
+
+	if (UNLIKELY(rb_objspace_garbage_object_p(sym))) {
+	    rb_bug("attempted to pindown garbage sym");
+	}
 	FL_SET(sym, SYMBOL_PINNED);
 
 	/* make it permanent object */
@@ -525,6 +528,9 @@ rb_intern_cstr_without_pindown(const char *name, long len, rb_encoding *enc)
     OBJ_FREEZE(str);
 
     if (st_lookup(global_symbols.str_id, str, &id)) {
+	if (ID_DYNAMIC_SYM_P((ID)id)) {
+	    return (ID)dsymbol_check((VALUE)id);
+	}
 	return (ID)id;
     }

Updated by normalperson (Eric Wong) almost 10 years ago

Eric Wong wrote:

I think this may be a fix (still testing):

Nope. However, I think it takes longer in the test-all loop to
reproduce the problem.

--- a/symbol.c
+++ b/symbol.c
@@ -458,7 +458,10 @@ dsymbol_pindown(VALUE sym)
 
     if (UNLIKELY(SYMBOL_PINNED_P(sym) == 0)) {
 	VALUE fstr = RSYMBOL(sym)->fstr;
-	sym = dsymbol_check(sym);
+
+	if (UNLIKELY(rb_objspace_garbage_object_p(sym))) {
+	    rb_bug("attempted to pindown garbage sym");
+	}

I still hit this rb_bug (similar backtraces as before).

 	FL_SET(sym, SYMBOL_PINNED);
 
 	/* make it permanent object */
@@ -525,6 +528,9 @@ rb_intern_cstr_without_pindown(const char *name, long len, rb_encoding *enc)
     OBJ_FREEZE(str);
 
     if (st_lookup(global_symbols.str_id, str, &id)) {
+	if (ID_DYNAMIC_SYM_P((ID)id)) {
+	    return (ID)dsymbol_check((VALUE)id);
+	}

However, I think this dsymbol_check still is worthwhile.

 	return (ID)id;
     }

Updated by normalperson (Eric Wong) almost 10 years ago

I'm looking into uses of intern_cstr_without_pindown in parse.y causing
garbage syms.

Unfortunately, I do not yet understand why we avoid pindown in parse.y
(or much of parse.y). I thought symbol GC was only to help users who
use String#to_sym too aggressively.

http://80x24.org/r35240/gdb-bt.txt

compile.c:
case TS_ID: /* ID */
generated_iseq[pos + 1 + j] = SYM2ID(operands[j]);

Updated by ko1 (Koichi Sasada) almost 10 years ago

Unfortunately, I do not yet understand why we avoid pindown in parse.y
(or much of parse.y). I thought symbol GC was only to help users who
use String#to_sym too aggressively.

Exactlly. However, nobu wants to reduce immoratal symbols from parse.y.
(I'm strongly against for such optimization)

My proposal is to avoid such `without_pindown' functions completely.

Updated by nobu (Nobuyoshi Nakada) almost 10 years ago

One problem about it is ripper.
The result of Ripper.parse is transient, but symbols by its side-effect are permanent, right now.

So now I'm thinking the plan:

  1. make all IDs permanent, as ko1 claims
  2. isolate Symbols in ripper from IDs

Updated by normalperson (Eric Wong) almost 10 years ago

wrote:

One problem about it is ripper.
The result of Ripper.parse is transient, but symbols by its side-effect are permanent, right now.

So now I'm thinking the plan:

  1. make all IDs permanent, as ko1 claims
  2. isolate Symbols in ripper from IDs

nobu: can you fix this in time for 2.2.0-preview1 release?

Otherwise, I propose the following temporary fix:

--- a/parse.y
+++ b/parse.y
@@ -285,7 +285,7 @@ struct parser_params {
 #ifdef RIPPER
 #define intern_cstr_without_pindown(n,l,en) rb_intern3(n,l,en)
 #else
-#define intern_cstr_without_pindown(n,l,en) rb_intern_cstr_without_pindown(n,l,en)
+#define intern_cstr_without_pindown(n,l,en) rb_intern3(n,l,en)
 #endif
 
 #define STR_NEW(p,n) rb_enc_str_new((p),(n),current_enc)

Updated by nobu (Nobuyoshi Nakada) almost 10 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

Applied in changeset r47569.


parse.y: intern_cstr

  • parse.y (intern_cstr): remove _without_pindown suffix and use
    rb_intern3() as well as RIPPER, for the time being.
    [ruby-core:65009] [Bug #10206]

Updated by ko1 (Koichi Sasada) almost 10 years ago

(2014/09/13 7:59), Eric Wong wrote:

  1. make all IDs permanent, as ko1 claims

+1

--
// SASADA Koichi at atdot dot net

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0