Bug #8100
Closed
Segfault in trunk
Description
=begin
The full backtrace (VM, C and Ruby) is attached and also available at https://travis-ci.org/rtomayko/tilt/jobs/5479138
I haven't been able to reproduce it (and thus I can't create a reduced test case).
This is the test that fails: https://github.com/rtomayko/tilt/blob/581230cbb3b314e88cf5ec9167a24ebb8acc7a93/test/tilt_compilesite_test.rb#L31
The code in question will do these steps in several threads at the same time:
- https://github.com/rtomayko/tilt/blob/581230cbb3b314e88cf5ec9167a24ebb8acc7a93/lib/tilt/template.rb#L212
- Define a method called "tilt#{Thread.current.id.abs}" on Object
- Grab the UnboundMethod
- Undefine the method from Object
- https://github.com/rtomayko/tilt/blob/581230cbb3b314e88cf5ec9167a24ebb8acc7a93/lib/tilt/template.rb#L144
- Then it binds the UnboundMethod to an object and calls it
The method does some funky class << self (to ensure that it gets evaluated under a proper constant scope). It also caches the methods, so it won't always define a new method, but might reuse an UnboundMethod from a previous compilation (which might have happened on a different thread).
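The define / grab / undefine / bind steps above can be sketched roughly as follows. This is a minimal illustration under assumptions, not Tilt's actual code: the helper compile_unbound, the "__sketch_" method name prefix and the template body are all hypothetical.

```ruby
# Hypothetical sketch of the define / grab / undefine / bind pattern.
# It is not Tilt's implementation; every name here is made up.

# Compile `body` into a temporary method on Object, capture it as an
# UnboundMethod, then remove the temporary method again.
def compile_unbound(body)
  name = "__sketch_#{Thread.current.object_id.abs}"
  Object.class_eval("def #{name}(locals)\n#{body}\nend")
  unbound = Object.instance_method(name)
  Object.class_eval { remove_method(name) }
  unbound
end

# Several threads compile and call their own method at the same time.
threads = 4.times.map do |i|
  Thread.new do
    compiled = compile_unbound('"thread #{locals[:i]}"')
    # The UnboundMethod stays callable although Object no longer defines it.
    compiled.bind(Object.new).call({ i: i })
  end
end

results = threads.map(&:value)
```

Each UnboundMethod outlives the method definition it came from, which is the situation the rest of this report revolves around.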
I know it's not much to go on, but at least the backtrace seems to suggest that the error happened in rb_ary_fill in array.c.
I've also had another report of a segfault in Tilt + Ruby 2.0.0, but I don't have the full backtrace yet: https://github.com/rtomayko/tilt/issues/179. Might this be related?
Let me know if you need more details.
=end
Updated by zzak (zzak _) over 11 years ago
- File segfault_spec.tar.gz added
- Subject changed from Segfault in ruby-2.0.0p0 to Segfault in trunk
- Target version set to 2.1.0
- ruby -v changed from ruby 2.0.0p0 (2013-02-24 revision 39474) [x86_64-linux] to ruby 2.1.0dev (2013-03-18 trunk 39805) [x86_64-linux]
I've updated the description of this ticket, because I'm able to reproduce a similar bug. Only similar in that we're using a lot of the same dependencies.
I also went ahead and created a reproduction script (as small as possible). Here are the instructions for reproducing the segfault:
1. git clone git://github.com/zzak/segfault_spec.rb.git
2. bundle install
3. bundle exec rspec segfault_spec.rb
4. Repeat step 3 until it segfaults. This may take a few tries.
I will also attach an archive of the script.
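The repeat-until-segfault step can be automated. Below is a minimal sketch, assuming the crashed interpreter is terminated by a signal (which is how an aborted process normally shows up to its parent); run_until_crash and its max_attempts parameter are hypothetical names, not part of the attached script.

```ruby
require "rbconfig"

# Hypothetical helper: rerun a command until the child process is killed by
# a signal, or until max_attempts is reached.
# Returns the attempt number of the first crash, or nil if none was seen.
def run_until_crash(command, max_attempts: 20)
  1.upto(max_attempts) do |attempt|
    system(*command)
    status = $?
    return attempt if status && status.signaled?
  end
  nil
end

# Intended use from the cloned repository (not executed here):
#   run_until_crash(%w[bundle exec rspec segfault_spec.rb])
```

A nil result only means that no crash was observed within the attempt limit, not that the bug is absent.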
Updated by DAddYE (Davide D'Agostino) over 11 years ago
I got a similar one too, see here: https://github.com/padrino/padrino-framework/issues/1131
Updated by zzak (zzak _) over 11 years ago
Forgot to add a link to the repo on github: https://github.com/zzak/segfault_spec.rb
Updated by wardrop (Tom Wardrop) over 11 years ago
I'm also getting segfaults on Ruby 2.0.0. It seems to be related to threading or forking. Can't quite put my finger on it. All I can say is that I don't get it when running my web app in WEBrick on my Mac, but when running it on my CentOS server with Phusion Passenger using the smart spawn method, I get it all the time: about every 10th request segfaults. Setting Passenger to the conservative spawn method (one request per process) reduces the segfault rate considerably, but they still occur.
Here's a stack overflow thread about it, with a response I left on there with a bit more information about my experiences: http://stackoverflow.com/questions/15315809/segfault-error-in-sinatra-after-upgrading-to-ruby-2-0-beta/15492401#15492401
I also reported this to the Phusion Passenger Google Group before realising it's a problem with ruby 2.0.0: https://groups.google.com/forum/?fromgroups=#!topic/phusion-passenger/iEOE4shl_jE
Here's a log including numerous segfaults from my CentOS server running Phusion Passenger: https://gist.github.com/Wardrop/5179380
Either way, it looks like something common to web applications is causing this, or perhaps web application frameworks are so far the most common cases in which Ruby 2.0.0 is being used.
Updated by judofyr (Magnus Holm) over 11 years ago
I've managed to reduce the script down to 30 lines (with no dependencies) that segfaults in both 2.0.0-p0 and trunk (39875). It doesn't segfault every time, though, so if it runs for more than a few seconds, simply press Ctrl-C and try again.
Updated by judofyr (Magnus Holm) over 11 years ago
Here's a backtrace I got in gdb: http://pastie.org/7064676. rb_gc_mark_unlinked_live_method_entries seems suspicious and related to what the script does.
Updated by wardrop (Tom Wardrop) over 11 years ago
They've obviously done work on the garbage collector for Ruby 2.0. This is likely a bug introduced as a result of that. Good work tracking it down, judofyr.
Updated by judofyr (Magnus Holm) over 11 years ago
After working with charliesome we've now found an even simpler test case:
This always segfaults for me on trunk.
Updated by Anonymous over 11 years ago
=begin
Magnus and I reduced this down to an even simpler^2 test case:
  loop do
    def x
      "hello" * 1000
    end
    method(:x).call
  end
http://eval.in/13344
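A possible reading of why this loop matters, offered as an assumption based on the rb_gc_mark_unlinked_live_method_entries frame in the gdb backtrace: a Method object keeps referring to the definition it was created from, even after the method has been redefined, so the old definition is no longer linked from the class but must stay alive. The sketch below only shows the Ruby-visible behaviour, with hypothetical names; it says nothing definite about the interpreter internals.

```ruby
# Sketch only: demonstrates Ruby-level behaviour, not interpreter internals.
klass = Class.new do
  def x
    "first"
  end
end

obj  = klass.new
held = obj.method(:x) # Method object referring to the current definition

# Redefine x; the class now points at a new definition.
klass.class_eval do
  def x
    "second"
  end
end

old_result = held.call # still runs the definition captured earlier
new_result = obj.x     # normal dispatch sees the redefinition
```

Here old_result is "first" and new_result is "second": the held Method object and the class disagree about which definition of x is current.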
=end
Updated by kosaki (Motohiro KOSAKI) over 11 years ago
- Category set to core
- Status changed from Open to Assigned
- Assignee set to authorNari (Narihiro Nakamura)
Updated by nobu (Nobuyoshi Nakada) over 11 years ago
- Status changed from Assigned to Closed
- % Done changed from 0 to 100
This issue was solved with changeset r39883.
Magnus, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.
- KNOWNBUGS.rb: test for [Bug #8100].
Updated by nobu (Nobuyoshi Nakada) over 11 years ago
- Status changed from Closed to Assigned
- % Done changed from 100 to 0
Updated by Anonymous over 11 years ago
nobu-san, this will loop forever when the bug is fixed. Perhaps change it to 100_000.times?
Updated by wardrop (Tom Wardrop) over 11 years ago
I'd set it to a duration rather than a set number of iterations. I've seen it go for 2 seconds on my machine before segfaulting. 3 seconds should fail almost every time.
  start_time = Time.now
  while (Time.now - start_time) < 3
    def x
      "hello" * 1000
    end
    method(:x).call
  end
Updated by nobu (Nobuyoshi Nakada) over 11 years ago
charliesome (Charlie Somerville) wrote:
nobu-san, this will loop forever when the bug is fixed. Perhaps change it to 100_000.times?
Sure, I forgot that before the commit.
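The two suggestions in this thread, a fixed iteration count and a wall-clock limit, can also be combined. What follows is a minimal sketch under assumptions and not the test that was committed to KNOWNBUGS.rb; stress_redefinition and its parameters are hypothetical.

```ruby
# Hypothetical helper, not the committed KNOWNBUGS.rb test.
# Stops at whichever limit is hit first and returns the completed pass count.
def stress_redefinition(max_iterations: 100_000, max_seconds: 3)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + max_seconds
  count = 0
  while count < max_iterations &&
        Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
    # Redefine x on every pass, then call it through a fresh Method object.
    Object.class_eval do
      def x
        "hello" * 1000
      end
    end
    method(:x).call
    count += 1
  end
  count
end

completed = stress_redefinition(max_iterations: 10_000, max_seconds: 1)
```

On a fixed interpreter the call returns normally, so the check terminates either way; on an affected build the process would be expected to crash before returning.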
Updated by naruse (Yui NARUSE) over 11 years ago
- Status changed from Assigned to Closed
- % Done changed from 0 to 100
This issue was solved with changeset r39894.
Add timeout to infinite loop [Bug #8100]
On FreeBSD, it doesn't SEGV.
http://fbsd.rubyci.org/~chkbuild/ruby-trunk/log/20130323T170203Z.log.html.gz
Updated by naruse (Yui NARUSE) over 11 years ago
- Status changed from Closed to Assigned
Updated by authorNari (Narihiro Nakamura) over 11 years ago
- Status changed from Assigned to Closed
This issue was solved with changeset r39919.
- proc.c (bm_free): need to clean up the mark flag of a free and
unlinked method entry. [Bug #8100] [ruby-core:53439]
Updated by zzak (zzak _) over 11 years ago
Thank you nari-san and everyone who helped with this.
Should this be backported as well?
Updated by authorNari (Narihiro Nakamura) over 11 years ago
zzak (Zachary Scott) wrote:
Thank you nari-san and everyone who helped with this.
Should this be backported as well?
Yeah, this fix should be backported to 1.9.3 and 2.0.0.
Updated by wardrop (Tom Wardrop) over 11 years ago
Eagerly awaiting the backport. Can someone please leave a comment when it's back-ported to ruby-2.0.0 head?
Updated by authorNari (Narihiro Nakamura) over 11 years ago
wardrop (Tom Wardrop) wrote:
Eagerly awaiting the backport. Can someone please leave a comment when it's back-ported to ruby-2.0.0 head?
The backport request ticket is here.
https://bugs.ruby-lang.org/issues/8163
You might want to watch this ticket for your purpose.
Updated by wardrop (Tom Wardrop) over 11 years ago
Thanks for that. By the way, I've applied the patch to my production server. Write me down as another happy customer :-)
Updated by morgoth (Wojciech Wnętrzak) over 11 years ago
Might be related to https://bugs.ruby-lang.org/issues/8056