Bug #4289

Timeouts in threads cause SEGV

Added by Motohiro KOSAKI over 3 years ago. Updated almost 3 years ago.

[ruby-core:34554]
Status:Closed
Priority:Normal
Assignee:Akira Tanaka
Category:core
Target version:1.9.3
ruby -v:ruby 1.9.3dev (2011-01-18 trunk 30591) [x86_64-linux] Backport:

Description

=begin
Derived from [Bug#4266]

Running deadlock_test.rb in [Bug#4266] on trunk makes segfault. git bisect indicate
first bad commit is below.


commit d295957957c828588a8ca3c7b8619c7a93be6b5c
Author: akr akr@b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Date: Tue Nov 2 22:37:08 2010 +0000

 * vm_method.c (rb_clear_cache_by_class): just return if the class has
   no method.  reported by Eric Wong.  


 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29673 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Plus, I've confirmed latest trunk + revert d2959579 doesn't makes segfault.
=end

0001-timeout.rb-avoid-introducing-new-class-for-every-tim.patch Magnifier (1.33 KB) Eric Wong, 04/06/2011 07:36 AM

0001-revert-r29673-optimization-which-caused-segfaults.patch Magnifier - fix segfaults due to dangling references in the cache (774 Bytes) Eric Wong, 04/07/2011 08:45 AM

0002-error.c-rb_mod_sys_fail-use-subclass-and-cache.patch Magnifier - cache rb_mod_sys_fail to avoid expensive cache clearing with non-blocking I/O (2.08 KB) Eric Wong, 04/07/2011 08:45 AM

0001-test-socket-test_unix-fix-test-failures-from-rb_mod_.patch Magnifier (1.25 KB) Eric Wong, 04/07/2011 11:44 AM

0002-introduce-ephemeral-class-flag-for-short-lived-class.patch Magnifier (737 Bytes) Eric Wong, 04/09/2011 03:54 PM

0003-vm_method.c-ephemeral-classes-do-not-write-expire-ca.patch Magnifier (980 Bytes) Eric Wong, 04/09/2011 03:54 PM

0003-vm_method.c-ephemeral-classes-do-not-write-expire-ca.patch Magnifier (980 Bytes) Eric Wong, 04/09/2011 03:54 PM


Related issues

Related to ruby-trunk - Bug #4266: Timeouts in threads cause "ThreadError: deadlock; recursi... Closed 01/12/2011
Related to ruby-trunk - Feature #3905: rb_clear_cache_by_class() called often during GC for non-... Closed 10/05/2010

Associated revisions

Revision 31378
Added by Motohiro KOSAKI almost 3 years ago

  • vmmethod.c (rbclearcacheby_class): Revert r29673. It made a segmentation fault regression. [Bug #4289].

History

#1 Updated by Eric Wong about 3 years ago

=begin
Hiding, but not fixing the issue is the attached patch:

[PATCH] timeout.rb: avoid introducing new class for every timeout

This is expensive because of clearing the method cache upon GC.

As a side effect, it also seems to pass the deadlocktest.rb[1]
for Bug #4266[2] and also the JRuby load
timeout.rb[3] test.
However, DO NOT consider this a fix for Bug #4266 or other
timeout-related issues. I believe this patch merely hides the
real bug and makes it hard to trigger from Ruby standard library.

[1] http://redmine.ruby-lang.org/attachments/download/1404/deadlock_test.rb
[2] http://redmine.ruby-lang.org/issues/4266
[3] https://github.com/jruby/jruby/raw/master/test/load/load_timeout.rb

=end

#2 Updated by Eric Wong about 3 years ago

=begin
also see
((<|URL:http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/35622>))

(redmine doesn't seem to handle mail replies correctly)

=end

#3 Updated by Eric Wong about 3 years ago

=begin
Motohiro KOSAKI wrote:

Running deadlock_test.rb in [Bug#4266] on trunk makes segfault. git bisect indicate
first bad commit is below.


commit d295957957c828588a8ca3c7b8619c7a93be6b5c
Author: akr akr@b2dd03c8-39d4-4d8f-98ff-823fe69b080e
Date: Tue Nov 2 22:37:08 2010 +0000

 * vm_method.c (rb_clear_cache_by_class): just return if the class has
   no method.  reported by Eric Wong.  


 git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@29673 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Plus, I've confirmed latest trunk + revert d2959579 doesn't makes segfault.

Yes, r29673 is bad, I think. The method cache caches for the subclass
even if the method belongs to a superclass. I confirmed it with the
following debug patch that writes to stderr whenever a method-less class
is cleared:

diff --git a/vmmethod.c b/vmmethod.c
index 278941a..021b703 100644
--- a/vmmethod.c
+++ b/vm
method.c
@@ -84,9 +84,10 @@ void
rbclearcachebyclass(VALUE klass)
{
struct cacheentry *ent, *end;
+ int nr
cleared = 0, check_empty = 0;

  if (RCLASS_M_TBL(klass)->num_entries == 0)
  • return;
  • check_empty = 1;

    rbvmchange_state();

    @@ -98,9 +99,12 @@ rbclearcachebyclass(VALUE klass)
    if (ent->klass == klass || (ent->me && ent->me->klass == klass)) {
    ent->me = 0;
    ent->mid = 0;

  •  ++nr_cleared;
    

    }
    ent++;
    }

  • if (checkempty && nrcleared)

  • fprintf(stderr, "cleared %d methods for method-less class\n", nr_cleared);
    }

    VALUE

    Eric Wong
    =end

#4 Updated by Eric Wong about 3 years ago

=begin
Attached are patches to revert r29673 and instate the rbmodsys_fail class cache I proposed in

=end

#5 Updated by Eric Wong about 3 years ago

=begin
I noticed 0002-error.c-rbmodsys_fail-use-subclass-and-cache.patch breaks on latest trunk, actually.
=end

#6 Updated by Eric Wong about 3 years ago

=begin
0002-error.c-rbmodsysfail-use-subclass-and-cache.patch breaks
existing test cases that use assert
raise/assert_raises. This
may be a dealbreaker for the patch, unfortunately...

Actual code that uses "rescue Errno::EAGAIN" is unaffected.
Anyways I have a patch to fix the failing test case I encountered if
breaking existing Test::Unit code is alright...

=end

#7 Updated by Eric Wong about 3 years ago

=begin
I found a way to fix the issue without breaking user-facing code and still keep
performance :D

I just use a flag to mark a singleton class as ephemeral and have the method
cache bypass caching (and expiry) of short-lived ephemeral classes.

The series should be:
* 0001-revert-r29673-optimization-which-caused-segfaults.patch
* 0002-introduce-ephemeral-class-flag-for-short-lived-class.patch
* 0003-vm_method.c-ephemeral-classes-do-not-write-expire-ca.patch
* 0004-IO-Wait-able-extended-singleton-classes-are-ephemera.patch

Please ignore the following as noise:
* 0002-error.c-rbmodsysfail-use-subclass-and-cache.patch
* 0001-test-socket-test
unix-fix-test-failures-from-rbmod.patch

If you use git, I am tracking this my "method-cache-clear" branch in my
repo: git pull git://bogomips.org/ruby method-cache-clear

I still think 0001-timeout.rb-avoid-introducing-new-class-for-every-tim.patch
will be useful, but less important. The Timeout::ExitException subclass cannot
be marked as ephemeral from Ruby code...

=end

#8 Updated by Eric Wong about 3 years ago

=begin
Eric Wong redmine@ruby-lang.org wrote:

The series should be:
* 0001-revert-r29673-optimization-which-caused-segfaults.patch

Can we get this reversion ASAP since it's confirmed to be
causing segfaults in trunk?

Take your time with reviewing the rest, they're low-priority
performance improvements (I'll split out to a new ticket if needed):

  • 0002-introduce-ephemeral-class-flag-for-short-lived-class.patch
  • 0003-vm_method.c-ephemeral-classes-do-not-write-expire-ca.patch
  • 0004-IO-Wait-able-extended-singleton-classes-are-ephemera.patch

Ignore the following sentence:

I still think 0001-timeout.rb-avoid-introducing-new-class-for-every-tim.patch
will be useful, but less important.

I realized creating a new class every time is needed to do nested
timeouts :>

--
Eric Wong
=end

#9 Updated by Motohiro KOSAKI almost 3 years ago

=begin
0001-revert-r29673-optimization-which-caused-segfaults.patch was commited as r31378.

I think other patch need to get Tanaka-san's review.
=end

#10 Updated by Akira Tanaka almost 3 years ago

  • Status changed from Assigned to Closed

Also available in: Atom PDF