Bug #8100

closed

Segfault in trunk

Added by judofyr (Magnus Holm) over 11 years ago. Updated over 11 years ago.

Status: Closed
Target version:
ruby -v: ruby 2.1.0dev (2013-03-18 trunk 39805) [x86_64-linux]
Backport:
[ruby-core:53439]

Description

=begin
The full backtrace (VM, C and Ruby) is attached and also available at https://travis-ci.org/rtomayko/tilt/jobs/5479138

I haven't been able to reproduce it (and thus I can't create a reduced test case).

This is the test that fails: https://github.com/rtomayko/tilt/blob/581230cbb3b314e88cf5ec9167a24ebb8acc7a93/test/tilt_compilesite_test.rb#L31

The code in question will do these steps in several threads at the same time:

The method does some funky class << self (to ensure that it gets evaluated under a proper constant scope). It also caches the methods, so it won't always define a new method, but might reuse an UnboundMethod from a previous compilation (which might have happened on a different thread).
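
For readers who don't know the Tilt code, here is a minimal sketch of the define/unbind/cache pattern described above. It is not Tilt's actual implementation: TemplateSketch, CACHE, LOCK and the __compiled_N method names are made up for illustration, and the class << self trick is left out.

```ruby
# Hypothetical sketch of "compile once, cache the UnboundMethod, reuse it from
# any thread". None of these names come from Tilt.
class TemplateSketch
  CACHE = {}        # source string => UnboundMethod
  LOCK  = Mutex.new # guards CACHE and the temporary method definition

  def initialize(source)
    @source = source
  end

  # Define a temporary method on Object, grab it as an UnboundMethod,
  # then remove it again so only the cached UnboundMethod keeps it alive.
  def compiled_method
    LOCK.synchronize do
      CACHE[@source] ||= begin
        name = "__compiled_#{CACHE.size}"
        Object.class_eval("def #{name}; #{@source}; end")
        unbound = Object.instance_method(name)
        Object.send(:remove_method, name)
        unbound
      end
    end
  end

  # Bind the cached method to a scope object and call it.
  def render(scope = Object.new)
    compiled_method.bind(scope).call
  end
end
```

Calling TemplateSketch.new('"hello" * 3').render from several threads compiles the source once and reuses the same UnboundMethod afterwards, which is the kind of removed-but-still-referenced method the description talks about.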

I know it's not much to go on, but at least the backtrace seems to suggest that the error happened in rb_ary_fill in array.c.

I've also had another report of segfault in Tilt + Ruby 2.0.0, but I don't have the full backtrace yet: https://github.com/rtomayko/tilt/issues/179. Might this be related?

Let me know if you need more details.
=end


Files

seglog.txt (104 KB), judofyr (Magnus Holm), 03/15/2013 08:58 PM
segfault_spec.tar.gz (3.01 KB), zzak (zzak _), 03/18/2013 10:51 AM
seg.txt (63.4 KB), DAddYE (Davide D'Agostino), 03/18/2013 04:14 PM
fail.rb (604 Bytes), Reduced script, judofyr (Magnus Holm), 03/22/2013 06:38 PM

Related issues: 4 (0 open, 4 closed)

Related to Backport193 - Backport #8163: Backport r39919 (Rejected, usa (Usaku NAKAMURA))
Has duplicate Ruby master - Bug #8336: Segfault in :=~ (Closed, 04/27/2013)
Has duplicate Ruby master - Bug #8353: segfault with puma-1.6.3 (Closed, 05/02/2013)
Has duplicate Ruby master - Bug #8056: Random segmentation faults in Tempfile (Closed, 03/09/2013)

Updated by zzak (zzak _) over 11 years ago

  • File segfault_spec.tar.gz added
  • Subject changed from Segfault in ruby-2.0.0p0 to Segfault in trunk
  • Target version set to 2.1.0
  • ruby -v changed from ruby 2.0.0p0 (2013-02-24 revision 39474) [x86_64-linux] to ruby 2.1.0dev (2013-03-18 trunk 39805) [x86_64-linux]

I've updated the description of this ticket, because I'm able to reproduce a similar bug. It's similar only in that we're using a lot of the same dependencies.

I also went ahead and created a reproducible script (as small as possible). Here are the instructions for reproducing the segfault:

  1. git clone git://github.com/zzak/segfault_spec.rb.git
  2. bundle install
  3. bundle exec rspec segfault_spec.rb
  4. Repeat step 3 until it segfaults. This may take a few tries.

I will also attach an archive of the script.

Updated by zzak (zzak _) over 11 years ago

Forgot to add a link to the repo on github: https://github.com/zzak/segfault_spec.rb

Updated by wardrop (Tom Wardrop) over 11 years ago

I'm also getting segfaults on Ruby 2.0.0. It seems to be related to threading or forking. Can't quite put my finger on it. All I can say is that I don't get it when running my web app in WEBrick on my Mac, but when running it on my CentOS server with Phusion Passenger using the smart spawn method, I get it all the time: about every 10th request segfaults. Setting Passenger to the conservative spawn method (one request per process) reduces the segfault rate considerably, but they still occur.

Here's a stack overflow thread about it, with a response I left on there with a bit more information about my experiences: http://stackoverflow.com/questions/15315809/segfault-error-in-sinatra-after-upgrading-to-ruby-2-0-beta/15492401#15492401

I also reported this to the Phusion Passenger Google Group before realising it's a problem with ruby 2.0.0: https://groups.google.com/forum/?fromgroups=#!topic/phusion-passenger/iEOE4shl_jE

Here's a log including numerous segfaults from my CentOS server running Phusion Passenger: https://gist.github.com/Wardrop/5179380

Either way, it looks like something common to web applications is causing this, or perhaps web application frameworks are so far the most common cases in which Ruby 2.0.0 is being used.

Updated by judofyr (Magnus Holm) over 11 years ago

I've managed to reduce the script down to 30 lines (with no dependencies) that segfaults in both 2.0.0-p0 and trunk (39875). It doesn't segfault every time though so if it takes more than a few seconds to run it, simply Ctrl-C and try again.

Updated by judofyr (Magnus Holm) over 11 years ago

Here's a backtrace I got in gdb: http://pastie.org/7064676. rb_gc_mark_unlinked_live_method_entries seems suspicious and related to what the script does.

Updated by wardrop (Tom Wardrop) over 11 years ago

They've obviously done work on the garbage collector for Ruby 2.0. This is likely a bug introduced as a result of that. Good work tracking it down, judofyr.

Updated by judofyr (Magnus Holm) over 11 years ago

After working with charliesome we've now found an even simpler test case:

http://eval.in/13339

This always segfaults for me on trunk.

Updated by Anonymous over 11 years ago

=begin
Magnus and I reduced this down to an even simpler^2 test case:

  loop do
    def x
      "hello" * 1000
    end

    method(:x).call
  end

http://eval.in/13344
=end

Updated by kosaki (Motohiro KOSAKI) over 11 years ago

  • Category set to core
  • Status changed from Open to Assigned
  • Assignee set to authorNari (Narihiro Nakamura)

Updated by nobu (Nobuyoshi Nakada) over 11 years ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r39883.
Magnus, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • KNOWNBUGS.rb: test for [Bug #8100].

Updated by nobu (Nobuyoshi Nakada) over 11 years ago

  • Status changed from Closed to Assigned
  • % Done changed from 100 to 0

Updated by Anonymous over 11 years ago

nobu-san, this will loop forever when the bug is fixed. Perhaps change it to 100_000.times?
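
A minimal sketch of that suggestion, assuming the test body in KNOWNBUGS.rb is the same loop as the reduced case above (the committed test itself is not shown in this thread):

```ruby
# Bounded variant of the reduced case: redefine x and call it through a
# Method object a fixed number of times, so the script terminates once the
# bug is fixed instead of looping forever.
100_000.times do
  def x
    "hello" * 1000
  end

  method(:x).call
end
```

On an affected build this may still finish without crashing if the segfault needs more iterations than that.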

Updated by wardrop (Tom Wardrop) over 11 years ago

I'd set it to a duration rather than a set number of iterations. I've seen it go for 2 seconds on my machine before segfaulting. 3 seconds should fail almost every time.

start_time = Time.now
while (Time.now - start_time) < 3
  def x
    "hello" * 1000
  end
  method(:x).call
end

Updated by nobu (Nobuyoshi Nakada) over 11 years ago

charliesome (Charlie Somerville) wrote:

nobu-san, this will loop forever when the bug is fixed. Perhaps change it to 100_000.times?

Sure, I forgot it before the commit.


Updated by naruse (Yui NARUSE) over 11 years ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r39894.
Magnus, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


Add timeout to infinite loop [Bug #8100]

On FreeBSD, it doesn't SEGV.
http://fbsd.rubyci.org/~chkbuild/ruby-trunk/log/20130323T170203Z.log.html.gz
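
For illustration, one way to put a time limit on the loop is the stdlib Timeout module. This is only a sketch of the idea, not the content of r39894: the stress_redefinition helper is hypothetical, and the 3-second figure in the note below comes from the suggestion earlier in this thread.

```ruby
require "timeout"

# Hypothetical helper: run the reduced case until the time limit is hit.
# Returns :timed_out if the loop survived for the whole duration.
def stress_redefinition(seconds)
  Timeout.timeout(seconds) do
    loop do
      def x
        "hello" * 1000
      end

      method(:x).call
    end
  end
rescue Timeout::Error
  :timed_out
end
```

Calling stress_redefinition(3) on a fixed build should return :timed_out; on an affected build the process would be expected to crash before the limit is reached.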

Updated by naruse (Yui NARUSE) over 11 years ago

  • Status changed from Closed to Assigned

Updated by authorNari (Narihiro Nakamura) over 11 years ago

  • Status changed from Assigned to Closed

This issue was solved with changeset r39919.
Magnus, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • proc.c (bm_free): need to clean up the mark flag of a free and
    unlinked method entry. [Bug #8100] [ruby-core:53439]

Updated by zzak (zzak _) over 11 years ago

Thank you nari-san and everyone who helped with this.

Should this be backported as well?

Updated by authorNari (Narihiro Nakamura) over 11 years ago

zzak (Zachary Scott) wrote:

Thank you nari-san and everyone who helped with this.

Should this be backported as well?

Yeah, this fix should be backported to 1.9.3 and 2.0.0.

Updated by wardrop (Tom Wardrop) over 11 years ago

Eagerly awaiting the backport. Can someone please leave a comment when it's back-ported to ruby-2.0.0 head?

Updated by authorNari (Narihiro Nakamura) over 11 years ago

wardrop (Tom Wardrop) wrote:

Eagerly awaiting the backport. Can someone please leave a comment when it's back-ported to ruby-2.0.0 head?

The backport request ticket is here.
https://bugs.ruby-lang.org/issues/8163
You might want to watch this ticket for your purpose.

Updated by wardrop (Tom Wardrop) over 11 years ago

Thanks for that. By the way, I've applied the patch to my production server. Write me down as another happy customer :-)
