Project

General

Profile

Bug #10206

garbage symbols crash symbol GC

Added by normalperson (Eric Wong) over 5 years ago. Updated over 5 years ago.

Status:
Closed
Priority:
Normal
Target version:
ruby -v:
trunk
[ruby-core:64806]

Description

This is reproducible with just a test loop running for serveral minutes/hours:

while make test-all TESTS=-j8; do :; done

It looks like SYM2ID/rb_sym2id interacts badly with dsymbol_check
when it encounters garbage objects.

dsymbol_check replaces an invalid object and returns a new object
for the caller, but the original arg for SYM2ID remains usable
to the caller:

    id = SYM2ID(garbage_sym);
    do_something(garbage_sym); /* bad invalid object used */

Changing: rb_sym2id(VALUE) to rb_sym2id(VALUE *)
might solve the issue, but introduces many incompatibilities in existing
code:

    id = rb_sym2id(&garbage_sym);
    do_something(garbage_sym); /* id == garbage_sym, safe to use */

ref: ruby-core thread starting at [ruby-core:64671]
backtraces:
http://80x24.org/r35240/rb-dump.txt
http://80x24.org/r35240/gdb-bt.txt

Associated revisions

Revision da7475b0
Added by nobu (Nobuyoshi Nakada) over 5 years ago

parse.y: intern_cstr

  • parse.y (intern_cstr): remove _without_pindown suffix and use rb_intern3() as well as RIPPER, for the time being. [ruby-core:65009] [Bug #10206]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@47569 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 47569
Added by nobu (Nobuyoshi Nakada) over 5 years ago

parse.y: intern_cstr

  • parse.y (intern_cstr): remove _without_pindown suffix and use rb_intern3() as well as RIPPER, for the time being. [ruby-core:65009] [Bug #10206]

Revision 47569
Added by nobu (Nobuyoshi Nakada) over 5 years ago

parse.y: intern_cstr

  • parse.y (intern_cstr): remove _without_pindown suffix and use rb_intern3() as well as RIPPER, for the time being. [ruby-core:65009] [Bug #10206]

Revision 47569
Added by nobu (Nobuyoshi Nakada) over 5 years ago

parse.y: intern_cstr

  • parse.y (intern_cstr): remove _without_pindown suffix and use rb_intern3() as well as RIPPER, for the time being. [ruby-core:65009] [Bug #10206]

Revision 47569
Added by nobu (Nobuyoshi Nakada) over 5 years ago

parse.y: intern_cstr

  • parse.y (intern_cstr): remove _without_pindown suffix and use rb_intern3() as well as RIPPER, for the time being. [ruby-core:65009] [Bug #10206]

Revision 47569
Added by nobu (Nobuyoshi Nakada) over 5 years ago

parse.y: intern_cstr

  • parse.y (intern_cstr): remove _without_pindown suffix and use rb_intern3() as well as RIPPER, for the time being. [ruby-core:65009] [Bug #10206]

Revision 47569
Added by nobu (Nobuyoshi Nakada) over 5 years ago

parse.y: intern_cstr

  • parse.y (intern_cstr): remove _without_pindown suffix and use rb_intern3() as well as RIPPER, for the time being. [ruby-core:65009] [Bug #10206]

History

Updated by nobu (Nobuyoshi Nakada) over 5 years ago

Or revert dsymbol_check()?

Updated by normalperson (Eric Wong) over 5 years ago

nobu@ruby-lang.org wrote:

Or revert dsymbol_check()?

But we need to replace with rb_gc_resurrect, right?

I tried http://80x24.org/10206/resurrect.patch but test-all loop
failed with:
http://80x24.org/10206/resurrect-gdb-bt.txt
http://80x24.org/10206/resurrect-dump.txt

Haven't had time to investigate. I'm also not sure what the problem was
with rb_gc_resurrect originally

Updated by ko1 (Koichi Sasada) over 5 years ago

At first, Symbol is VALUE and it should be marked.

So that the following code should not be allowed.

id = SYM2ID(garbage_sym);

In this case, afeter sweeping, garbage_sym becomes freed VALUE.

What happen on it?

Updated by normalperson (Eric Wong) over 5 years ago

ko1@atdot.net wrote:

At first, Symbol is VALUE and it should be marked.

So that the following code should not be allowed.

id = SYM2ID(garbage_sym);

In this case, afeter sweeping, garbage_sym becomes freed VALUE.

What happen on it?

Looking at this more, we may run dsymbol_check too late in
dsymbol_pindown. I think we must run dsymbol_check immediately after
looking up dynamic syms from global_symbol.str_id, and not later.

I think this may be a fix (still testing):

--- a/symbol.c
+++ b/symbol.c
@@ -458,7 +458,10 @@ dsymbol_pindown(VALUE sym)

     if (UNLIKELY(SYMBOL_PINNED_P(sym) == 0)) {
    VALUE fstr = RSYMBOL(sym)->fstr;
-   sym = dsymbol_check(sym);
+
+   if (UNLIKELY(rb_objspace_garbage_object_p(sym))) {
+       rb_bug("attempted to pindown garbage sym");
+   }
    FL_SET(sym, SYMBOL_PINNED);

    /* make it permanent object */
@@ -525,6 +528,9 @@ rb_intern_cstr_without_pindown(const char *name, long len, rb_encoding *enc)
     OBJ_FREEZE(str);

     if (st_lookup(global_symbols.str_id, str, &id)) {
+   if (ID_DYNAMIC_SYM_P((ID)id)) {
+       return (ID)dsymbol_check((VALUE)id);
+   }
    return (ID)id;
     }

Updated by normalperson (Eric Wong) over 5 years ago

Eric Wong normalperson@yhbt.net wrote:

I think this may be a fix (still testing):

Nope. However, I think it takes longer in the test-all loop to
reproduce the problem.

--- a/symbol.c
+++ b/symbol.c
@@ -458,7 +458,10 @@ dsymbol_pindown(VALUE sym)

     if (UNLIKELY(SYMBOL_PINNED_P(sym) == 0)) {
     VALUE fstr = RSYMBOL(sym)->fstr;
-    sym = dsymbol_check(sym);
+
+    if (UNLIKELY(rb_objspace_garbage_object_p(sym))) {
+        rb_bug("attempted to pindown garbage sym");
+    }

I still hit this rb_bug (similar backtraces as before).

     FL_SET(sym, SYMBOL_PINNED);

     /* make it permanent object */
@@ -525,6 +528,9 @@ rb_intern_cstr_without_pindown(const char *name, long len, rb_encoding *enc)
     OBJ_FREEZE(str);

     if (st_lookup(global_symbols.str_id, str, &id)) {
+    if (ID_DYNAMIC_SYM_P((ID)id)) {
+        return (ID)dsymbol_check((VALUE)id);
+    }

However, I think this dsymbol_check still is worthwhile.

     return (ID)id;
     }

Updated by normalperson (Eric Wong) over 5 years ago

I'm looking into uses of intern_cstr_without_pindown in parse.y causing
garbage syms.

Unfortunately, I do not yet understand why we avoid pindown in parse.y
(or much of parse.y). I thought symbol GC was only to help users who
use String#to_sym too aggressively.

http://80x24.org/r35240/gdb-bt.txt

compile.c:
case TS_ID: /* ID */
generated_iseq[pos + 1 + j] = SYM2ID(operands[j]);

Updated by ko1 (Koichi Sasada) over 5 years ago

Unfortunately, I do not yet understand why we avoid pindown in parse.y
(or much of parse.y). I thought symbol GC was only to help users who
use String#to_sym too aggressively.

Exactlly. However, nobu wants to reduce immoratal symbols from parse.y.
(I'm strongly against for such optimization)

My proposal is to avoid such `without_pindown' functions completely.

Updated by nobu (Nobuyoshi Nakada) over 5 years ago

One problem about it is ripper.
The result of Ripper.parse is transient, but symbols by its side-effect are permanent, right now.

So now I'm thinking the plan:

  1. make all IDs permanent, as ko1 claims
  2. isolate Symbols in ripper from IDs

Updated by normalperson (Eric Wong) over 5 years ago

nobu@ruby-lang.org wrote:

One problem about it is ripper.
The result of Ripper.parse is transient, but symbols by its side-effect are permanent, right now.

So now I'm thinking the plan:

  1. make all IDs permanent, as ko1 claims
  2. isolate Symbols in ripper from IDs

nobu: can you fix this in time for 2.2.0-preview1 release?

Otherwise, I propose the following temporary fix:

--- a/parse.y
+++ b/parse.y
@@ -285,7 +285,7 @@ struct parser_params {
 #ifdef RIPPER
 #define intern_cstr_without_pindown(n,l,en) rb_intern3(n,l,en)
 #else
-#define intern_cstr_without_pindown(n,l,en) rb_intern_cstr_without_pindown(n,l,en)
+#define intern_cstr_without_pindown(n,l,en) rb_intern3(n,l,en)
 #endif

 #define STR_NEW(p,n) rb_enc_str_new((p),(n),current_enc)

Updated by nobu (Nobuyoshi Nakada) over 5 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

Applied in changeset r47569.


parse.y: intern_cstr

  • parse.y (intern_cstr): remove _without_pindown suffix and use rb_intern3() as well as RIPPER, for the time being. [ruby-core:65009] [Bug #10206]

Updated by ko1 (Koichi Sasada) over 5 years ago

(2014/09/13 7:59), Eric Wong wrote:

  1. make all IDs permanent, as ko1 claims

+1

--
// SASADA Koichi at atdot dot net

Also available in: Atom PDF