Bug #15325
Closed
Ruby 2.5.3 seg fault after find block returns
Description
In https://gitlab.com/gitlab-org/gitlab-ce/blob/233af8f1731734aaad7e5055af39f26c16608649/app/services/ci/register_job_service.rb#L48, we see a repeatable seg fault on both MacOS and Ubuntu with Rails 5.0.7 in a development environment. The seg fault appears to occur after the `find` returns:
builds.find do |build|
  next unless runner.can_pick?(build)
  begin
    # In case when 2 runners try to assign the same build, second runner will be declined
    # with StateMachines::InvalidTransition or StaleObjectError when doing run! or save method.
    if assign_runner!(build, params)
      register_success(build)
      return Result.new(build, true) # <--- SEG FAULT HAPPENS AFTER HERE
    end
  rescue StateMachines::InvalidTransition, ActiveRecord::StaleObjectError
The segfault shows some bad memory access:
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libsystem_kernel.dylib 0x00007fff5d0e8b86 __pthread_kill + 10
1 libsystem_pthread.dylib 0x00007fff5d19ec50 pthread_kill + 285
2 libsystem_c.dylib 0x00007fff5d0521c9 abort + 127
3 ruby 0x000000010f5ec6a9 die + 9
4 ruby 0x000000010f5ec908 rb_bug_context + 600
5 ruby 0x000000010f6db7a1 sigsegv + 81
6 libsystem_platform.dylib 0x00007fff5d193b3d _sigtramp + 29
7 ??? 000000000000000000 0 + 0
8 ruby 0x000000010f75461e vm_exec + 142
9 ruby 0x000000010f761f25 invoke_block_from_c_bh + 405
10 ruby 0x000000010f74f719 rb_yield + 153
11 ruby 0x000000010f5e33b9 find_i + 41
12 ruby 0x000000010f7620ca invoke_block_from_c_bh + 826
13 ruby 0x000000010f74f719 rb_yield + 153
14 ruby 0x000000010f57cce9 rb_ary_each + 41
15 ruby 0x000000010f759f51 vm_call_cfunc + 305
16 ruby 0x000000010f742a0d vm_exec_core + 9149
17 ruby 0x000000010f75461e vm_exec + 142
18 ruby 0x000000010f761d41 rb_call0 + 161
19 ruby 0x000000010f74fe54 iterate_method + 52
20 ruby 0x000000010f74fd9b rb_iterate0 + 347
21 ruby 0x000000010f74fe1a rb_block_call + 74
22 ruby 0x000000010f5e0518 enum_find + 104
23 ruby 0x000000010f759f51 vm_call_cfunc + 305
24 ruby 0x000000010f7436bd vm_exec_core + 12397
We do NOT see the problem after downgrading to 2.4.5.
Updated by stanhu (Stan Hu) about 6 years ago
- ruby -v set to ruby 2.5.3p105 (2018-10-18 revision 65156) [x86_64-darwin15]
Updated by stanhu (Stan Hu) about 6 years ago
Note that I've managed to remove the `return` statement inside the `find` block, and this appears to make the seg fault go away.
diff --git a/app/services/ci/register_job_service.rb b/app/services/ci/register_job_service.rb
index e06f1c05843..2abc4a67dd6 100644
--- a/app/services/ci/register_job_service.rb
+++ b/app/services/ci/register_job_service.rb
@@ -36,7 +36,7 @@ module Ci
builds = builds.with_any_tags
end
- builds.find do |build|
+ selection = builds.find do |build|
next unless runner.can_pick?(build)
begin
@@ -45,7 +45,7 @@ module Ci
if assign_runner!(build, params)
register_success(build)
- return Result.new(build, true) # rubocop:disable Cop/AvoidReturnFromBlocks
+ break build
end
rescue StateMachines::InvalidTransition, ActiveRecord::StaleObjectError
# We are looping to find another build that is not conflicting
@@ -61,6 +61,8 @@ module Ci
end
end
+ return Result.new(selection, true) if selection
+
register_failure
Result.new(nil, valid)
end
--
2.18.1
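To make the change in the patch easier to follow, here is a minimal sketch of the two shapes outside of Rails. The `Result` struct, the method names, and the `even?` check are hypothetical stand-ins for the service code, not the real GitLab classes. On an interpreter that is not affected by the bug, both variants should return the same result.

```ruby
# Hypothetical stand-in for the service's Result class.
Result = Struct.new(:build, :valid)

# Original shape: `return` inside the block exits the enclosing method directly.
def pick_with_return(builds)
  builds.find do |build|
    next unless build.even? # stand-in for runner.can_pick?(build)
    return Result.new(build, true)
  end
  Result.new(nil, false)
end

# Patched shape: `break build` stops `find` and makes it evaluate to `build`;
# the method then returns from its own body instead of from inside the block.
def pick_with_break(builds)
  selection = builds.find do |build|
    next unless build.even? # stand-in for runner.can_pick?(build)
    break build
  end
  return Result.new(selection, true) if selection

  Result.new(nil, false)
end

p pick_with_return([1, 3, 4, 5])
p pick_with_break([1, 3, 4, 5])
```

The second shape avoids the non-local `return` from the block, which is the construct the crash appeared to be tied to.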
Updated by stanhu (Stan Hu) about 6 years ago
Something is quite odd. I tried a number of variations:
- `break build` appears to work with Ruby 2.4.5 and 2.5.3.
- Instead of `break build`, use `true`: In Ruby 2.5.3, this by itself seems to cause selection to be nil. I got a segfault with Ruby 2.4.5 here in the garbage collector (`rb_gc_mark_node`).
- Instead of `break build`, use `break true`: selection is `nil` in both Ruby 2.4.5 and 2.5.3.
- Removing the begin/rescue clause entirely and testing this. The below did not work either:
selection = builds.find do |build|
  if assign_runner!(build, params)
    register_success(build)
    true
  else
    false
  end
end
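For comparison, this is what the variations above would normally be expected to evaluate to on an interpreter that is not hitting the bug. The `builds` array of symbols is a made-up stand-in for the real build records.

```ruby
# Made-up stand-in for the list of candidate builds.
builds = [:a, :b, :c]

# Block ends in a truthy value: find evaluates to the matching element itself.
by_truthy = builds.find { |b| b == :b }

# `break value` stops the iteration and find evaluates to that value.
by_break_elem = builds.find { |b| break b if b == :b }
by_break_true = builds.find { |b| break true if b == :b }

# No element matches: find evaluates to nil.
no_match = builds.find { |_b| false }

p by_truthy, by_break_elem, by_break_true, no_match
```

A `nil` selection with `true` or `break true` while a build was actually assigned therefore points at the interpreter rather than at the block logic.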
Updated by stanhu (Stan Hu) about 6 years ago
Ok, I think this bug is caused by https://bugs.ruby-lang.org/issues/15105. We were using the binding_of_caller gem, which calls `rb_debug_inspector_open`. The seg fault doesn't happen if we omit that call.
Updated by stanhu (Stan Hu) about 6 years ago
We can close this bug report in favor of https://bugs.ruby-lang.org/issues/15105. I've confirmed applying the patch in https://bugs.ruby-lang.org/projects/ruby-trunk/repository/revisions/64800 has made the seg fault go away.
Updated by duerst (Martin Dürst) about 6 years ago
- Is duplicate of Bug #15105: `rb_debug_inspector_open` breaks lazy proc optimization added
Updated by duerst (Martin Dürst) about 6 years ago
- Status changed from Open to Closed
stanhu (Stan Hu) wrote:
We can close this bug report in favor of https://bugs.ruby-lang.org/issues/15105. I've confirmed applying the patch in https://bugs.ruby-lang.org/projects/ruby-trunk/repository/revisions/64800 has made the seg fault go away.
Closed at request of original submitter.