Project

General

Profile

Bug #13412

Infinite recursion with define_method may cause silent SEGV or cfp consistency error

Added by wanabe (_ wanabe) almost 2 years ago. Updated over 1 year ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
ruby -v:
ruby 2.5.0dev (2017-04-09 trunk 58286) [x86_64-linux]
[ruby-core:80618]

Description

The script causes silent (no output [BUG]) SEGV or "cfp consistency error" on my environment.

define_method(:foo) { foo }

loop do
  1.times do
    1.times do
      begin
        foo
      rescue Exception
        nil
      end
    end
  end
end

I think this is related to #11430 (maybe same).


Files

stderr.log (6.96 KB) stderr.log wanabe (_ wanabe), 04/10/2017 11:15 PM
bug.rb (123 Bytes) bug.rb wanabe (_ wanabe), 04/16/2017 03:05 AM
bug.sh (775 Bytes) bug.sh wanabe (_ wanabe), 04/16/2017 03:05 AM
bug13412.r58331.patch (689 Bytes) bug13412.r58331.patch wanabe (_ wanabe), 04/16/2017 03:05 AM
bug13412.r58367.patch (711 Bytes) bug13412.r58367.patch wanabe (_ wanabe), 04/16/2017 03:05 AM
bug_stat.sh (296 Bytes) bug_stat.sh wanabe (_ wanabe), 04/16/2017 03:05 AM
bug.sh (786 Bytes) bug.sh wanabe (_ wanabe), 04/16/2017 05:34 AM
bug_stat.sh (468 Bytes) bug_stat.sh wanabe (_ wanabe), 04/16/2017 05:34 AM
bug13412.r58367.patch (1 KB) bug13412.r58367.patch wanabe (_ wanabe), 04/16/2017 05:34 AM
cfp_before_setjmp.patch (2.09 KB) cfp_before_setjmp.patch wanabe (_ wanabe), 04/16/2017 11:45 PM
ensure_stack.patch (850 Bytes) ensure_stack.patch wanabe (_ wanabe), 04/16/2017 11:55 PM
get_tagged_next_cfp.patch (3.33 KB) get_tagged_next_cfp.patch wanabe (_ wanabe), 04/17/2017 12:02 AM
13412.patch (1.06 KB) 13412.patch wanabe (_ wanabe), 09/05/2017 10:36 AM

Related issues

Related to Ruby trunk - Bug #13164: A second `SystemStackError` exception results in `Segmentation fault (core dumped)`OpenActions
Related to Ruby trunk - Bug #11430: Redefining a lazy-loaded variable in child context within RSpec spec causes crashFeedbackActions
Related to Ruby trunk - Bug #9505: Bug that should cause SystemStackError segfaults under Ruby 2.1FeedbackActions
Related to Ruby trunk - Bug #14387: Ruby 2.5 を Alpine Linux で実行すると比較的浅めで SystemStackError 例外になるOpenActions

Associated revisions

Revision 0afc8db9
Added by nobu (Nobuyoshi Nakada) over 1 year ago

vm_insnhelper.c: cfp error at stack overflow

  • vm_insnhelper.c (threadptr_stack_overflow): set stack overflow flag until handling execptions, to get rid of cfp consistency error when exec tag was rewound. [ruby-core:80618] [Bug #13412]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@59606 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 59606
Added by nobu (Nobuyoshi Nakada) over 1 year ago

vm_insnhelper.c: cfp error at stack overflow

  • vm_insnhelper.c (threadptr_stack_overflow): set stack overflow flag until handling execptions, to get rid of cfp consistency error when exec tag was rewound. [ruby-core:80618] [Bug #13412]

Revision 59606
Added by nobu (Nobuyoshi Nakada) over 1 year ago

vm_insnhelper.c: cfp error at stack overflow

  • vm_insnhelper.c (threadptr_stack_overflow): set stack overflow flag until handling execptions, to get rid of cfp consistency error when exec tag was rewound. [ruby-core:80618] [Bug #13412]

Revision 59606
Added by nobu (Nobuyoshi Nakada) over 1 year ago

vm_insnhelper.c: cfp error at stack overflow

  • vm_insnhelper.c (threadptr_stack_overflow): set stack overflow flag until handling execptions, to get rid of cfp consistency error when exec tag was rewound. [ruby-core:80618] [Bug #13412]

History

#1

Updated by wanabe (_ wanabe) almost 2 years ago

  • Related to Bug #11430: Redefining a lazy-loaded variable in child context within RSpec spec causes crash added
#2

Updated by nobu (Nobuyoshi Nakada) almost 2 years ago

  • Is duplicate of Bug #13164: A second `SystemStackError` exception results in `Segmentation fault (core dumped)` added

Updated by wanabe (_ wanabe) almost 2 years ago

Duplicates Bug #13164: A second SystemStackError exception results in Segmentation fault (core dumped) added

#13164 is very much like this issue, but not quite the same.
Because this issue doesn't occur without define_method.

The followning script works fine, loops infinitely.

def foo; foo; end

loop do
  1.times do
    1.times do
      begin
        foo
      rescue Exception
        nil
      end
    end
  end
end

And another following script may cause "cfp consistency error" in rare cases. (5% or lower ?)

define_method(:foo) { foo }

1.times do
  1.times do
    begin
      foo
    rescue Exception
      nil
    end
  end
end

Updated by wanabe (_ wanabe) almost 2 years ago

Looks like inconsistency longjmp().

I added debug print:

diff --git a/vm_insnhelper.c b/vm_insnhelper.c
index 21a358cb30..1a19ae397a 100644
--- a/vm_insnhelper.c
+++ b/vm_insnhelper.c
@@ -1766,7 +1766,9 @@ vm_call_cfunc_with_frame(rb_thread_t *th, rb_control_frame_t *reg_cfp, struct rb

     reg_cfp->sp -= argc + 1;
     VM_PROFILE_UP(R2C_CALL);
+    fprintf(stderr, ">> ccwf %x %x %x\n", &val, reg_cfp, th->cfp);
     val = (*cfunc->invoker)(cfunc->func, recv, argc, reg_cfp->sp + 1);
+    fprintf(stderr, "<< ccwf %x %x %x\n", &val, reg_cfp, th->cfp);

     if (reg_cfp != th->cfp + 1) {
    rb_bug("vm_call_cfunc - cfp consistency error");

normal case output:

$ ./miniruby bug.rb
>> ccwf de222b70 c68cbfb0 c68cbf80
<< ccwf de222b70 c68cbfb0 c68cbf80
>> ccwf de222b70 c68cbfb0 c68cbf80
>> ccwf de2216d0 c68cbf50 c68cbf20
<< ccwf de2216d0 c68cbf50 c68cbf20
<< ccwf de222b70 c68cbfb0 c68cbf80

[BUG] case output:

$ ./miniruby bug.rb
>> ccwf ca6fe2f0 53f7ffb0 53f7ff80
<< ccwf ca6fe2f0 53f7ffb0 53f7ff80
>> ccwf ca6fe2f0 53f7ffb0 53f7ff80
>> ccwf ca6fce50 53f7ff50 53f7ff20
<< ccwf ca6fe2f0 53f7ffb0 53f7ff20
bug.rb:4: [BUG] vm_call_cfunc - cfp consistency error
ruby 2.5.0dev (2017-04-09 trunk 58286) [x86_64-linux]

(snipped and attached)

I expected the last &val value should be "ca6fce50" but "ca6fe2f0".
This is the value of previous stack frame.

Updated by nobu (Nobuyoshi Nakada) almost 2 years ago

It's by check_stack_overflow() in signal.c:

    if ((uintptr_t)th->tag->buf / pagesize == sp_page) {
        /* drop the last tag if it is close to the fault,
         * otherwise it can cause stack overflow again at the same
         * place. */
        th->tag = th->tag->prev;
    }
#6

Updated by nobu (Nobuyoshi Nakada) almost 2 years ago

  • Related to deleted (Bug #11430: Redefining a lazy-loaded variable in child context within RSpec spec causes crash)
#7

Updated by nobu (Nobuyoshi Nakada) almost 2 years ago

  • Is duplicate of Bug #11430: Redefining a lazy-loaded variable in child context within RSpec spec causes crash added
#8

Updated by nobu (Nobuyoshi Nakada) almost 2 years ago

  • Is duplicate of deleted (Bug #13164: A second `SystemStackError` exception results in `Segmentation fault (core dumped)`)
#9

Updated by nobu (Nobuyoshi Nakada) almost 2 years ago

  • Related to Bug #13164: A second `SystemStackError` exception results in `Segmentation fault (core dumped)` added
#10

Updated by nobu (Nobuyoshi Nakada) almost 2 years ago

  • Is duplicate of deleted (Bug #11430: Redefining a lazy-loaded variable in child context within RSpec spec causes crash)
#11

Updated by nobu (Nobuyoshi Nakada) almost 2 years ago

  • Related to Bug #11430: Redefining a lazy-loaded variable in child context within RSpec spec causes crash added

Updated by nobu (Nobuyoshi Nakada) almost 2 years ago

  • Description updated (diff)

This ticket concerns two different issues, silent SEGV and cfp consistency error.

Updated by wanabe (_ wanabe) almost 2 years ago

I have checked the patterns with attached .patch and .sh and .rb.

This is the result at r58331.

ruby 2.5.0dev (2017-04-13 trunk 58331) [x86_64-linux]
73 bug.*.cfp.noprev.BUG.txt
     73 #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:58
35 bug.*.cfp.prev.BUG.txt
     35 #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:58
44 bug.*.nocfp.noprev.silent.txt
     20 #0  hook_before_rewind at ../../vm.c:1641
     10 #0  vm_exec at ../../vm.c:1764
      6 #0  VM_ENV_FLAGS at ../../vm_core.h:1019
      5 #0  VM_ENV_FLAGS (ep=0x0, flag=0) at ../../vm_core.h:1019
      3 #0  VM_FRAME_TYPE at ../../vm_core.h:1027

And this is the result at r58367.

ruby 2.5.0dev (2017-04-16 trunk 58367) [x86_64-linux]
83 bug.*.cfp.noprev.BUG.txt
     83 #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:58
797 bug.*.cfp.prev.BUG.txt
    797 #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:58

It seems like silent SEGVs are eliminated.
Thank you Nobu for your works. (r58353, r58354, r58363 and maybe r58328, r58334, r58352 and/or r58361?)

Updated by wanabe (_ wanabe) almost 2 years ago

Here is another stat for association between first SEGV point and process result.

ruby 2.5.0dev (2017-04-16 trunk 58367) [x86_64-linux]
71 bug.*.cfp.noprev.BUG.txt
     71 #6   in vm_exec () at ../../vm.c:1759, #7   in invoke_bmethod () at ../../vm.c:982
793 bug.*.cfp.prev.BUG.txt
    724 #6   in vm_exec_core () at ../../vm_exec.c:49, #7   in vm_exec () at ../../vm.c:1769
     18 #6   in vm_exec () at ../../vm.c:1769, #7   in invoke_bmethod () at ../../vm.c:982
     17 #6   in vm_call_bmethod_body () at ../../vm_insnhelper.c:1885, #7   in vm_call_bmethod () at ../../vm_insnhelper.c:1909
     12 #6   in vm_search_method () at ../../vm_insnhelper.c:1234, #7   in vm_exec_core () at ../../insns.def:1079
      7 #6   in vm_call_bmethod () at ../../vm_insnhelper.c:1906, #7   in vm_exec_core () at ../../insns.def:1080
      5 #6   in vm_call_bmethod_body () at ../../vm_insnhelper.c:1892, #7   in vm_call_bmethod () at ../../vm_insnhelper.c:1909
      4 #6   in vm_search_method () at ../../vm_insnhelper.c:1235, #7   in vm_exec_core () at ../../insns.def:1079
      3 #6   in vm_exec_core () at ../../insns.def:1079, #7   in vm_exec () at ../../vm.c:1769
      3 #6   in rb_class_of () at ../../include/ruby/ruby.h:1965, #7   in vm_search_method () at ../../vm_insnhelper.c:1235
135 bug.pass.*.nocore.silent.txt
     23 #6   in vm_yield_setup_args () at ../../vm_insnhelper.c:2560, #7   in invoke_iseq_block_from_c () at ../../vm.c:1007
     22 #6   in vm_callee_setup_block_arg () at ../../vm_insnhelper.c:2519, #7   in vm_yield_setup_args () at ../../vm_insnhelper.c:2571
     20 #6   in invoke_iseq_block_from_c () at ../../vm.c:992, #7   in invoke_block_from_c_unsplattable () at ../../vm.c:1099
     11 #6   in invoke_block_from_c_unsplattable () at ../../vm.c:1095, #7   in vm_invoke_bmethod () at ../../vm.c:1140
     10 #6   in vm_push_frame () at ../../vm_insnhelper.c:179, #7   in invoke_bmethod () at ../../vm.c:973
      9 #6   in vm_invoke_bmethod () at ../../vm.c:1139, #7   in vm_call_bmethod_body () at ../../vm_insnhelper.c:1892
      7 #6   in vm_invoke_bmethod () at ../../vm.c:1140, #7   in vm_call_bmethod_body () at ../../vm_insnhelper.c:1892
      7 #6   in vm_callee_setup_block_arg () at ../../vm_insnhelper.c:2520, #7   in vm_yield_setup_args () at ../../vm_insnhelper.c:2571
      6 #6   in invoke_block_from_c_unsplattable () at ../../vm.c:1097, #7   in vm_invoke_bmethod () at ../../vm.c:1140
      5 #6   in vm_block_type () at ../../vm_core.h:1279, #7   in invoke_block_from_c_unsplattable () at ../../vm.c:1097
      3 #6   in vm_yield_setup_args () at ../../vm_insnhelper.c:2571, #7   in invoke_iseq_block_from_c () at ../../vm.c:1007
      3 #6   in simple_iseq_p () at ../../vm_insnhelper.c:1459, #7   in vm_callee_setup_block_arg () at ../../vm_insnhelper.c:2520
      3 #6   in rb_iseq_check () at ../../vm_core.h:416, #7   in invoke_iseq_block_from_c () at ../../vm.c:993
      3 #6   in invoke_iseq_block_from_c () at ../../vm.c:993, #7   in invoke_block_from_c_unsplattable () at ../../vm.c:1099
      3 #6   in invoke_block_from_c_unsplattable () at ../../vm.c:1099, #7   in vm_invoke_bmethod () at ../../vm.c:1140

Updated by wanabe (_ wanabe) almost 2 years ago

There are some choices for this "cfp consistency error".
All patches are just for description and incomplete.

  1. Mark as WONTFIX
    I think this is most reasonable because the issue is edge case.

  2. Ensure enough stack before rb_vm_push_frame() or control SIGSEGV point
    ensure_stack.patch attached.
    Using large machine stack frame can check that there is enough stack frame.

  3. Rollback cfp when SEGV point is between rb_vm_push_frame() and TH_EXEC_TAG()
    cfp_before_setjmp.patch attached.
    setjmp() rolls back machine stack at previous TH_EXEC_TAG() point.
    So also th->cfp should be rolled back at that time.

  4. Rollback cfp at the moment of TH_EXEC_TAG()
    get_tagged_next_cfp.patch attached.
    This is like previous 3. pattern, but more precise and more wasteful.

5.Others

Updated by nobu (Nobuyoshi Nakada) almost 2 years ago

wanabe (_ wanabe) wrote:

  1. Ensure enough stack before rb_vm_push_frame() or control SIGSEGV point ensure_stack.patch attached. Using large machine stack frame can check that there is enough stack frame.

ALLOCA would be useful.

Updated by wanabe (_ wanabe) almost 2 years ago

nobu (Nobuyoshi Nakada) wrote:

ALLOCA would be useful.

Is it platform dependent?
__builtin_alloca() of gcc would be useful but alloca() in missing/alloca.c would not, I guess.

Updated by nobu (Nobuyoshi Nakada) almost 2 years ago

wanabe (_ wanabe) wrote:

nobu (Nobuyoshi Nakada) wrote:

ALLOCA would be useful.

Is it platform dependent?

Yes, of course.

__builtin_alloca() of gcc would be useful but alloca() in missing/alloca.c would not, I guess.

Only when C_ALLOCA is not defined.
See reserve_stack in thread_pthread.c.

Updated by mtsmfm (Fumiaki Matsushima) over 1 year ago

I reproduced this bug with 2.2.7, 2.3.4 and 2.4.1.

docker run ruby:2.2.7-alpine ruby -e 'define_method(:foo) { foo }; loop { 1.times { 1.times { begin; foo; rescue Exception; nil; end } } }'
docker run ruby:2.3.4-alpine ruby -e 'define_method(:foo) { foo }; loop { 1.times { 1.times { begin; foo; rescue Exception; nil; end } } }'
docker run ruby:2.4.1-alpine ruby -e 'define_method(:foo) { foo }; loop { 1.times { 1.times { begin; foo; rescue Exception; nil; end } } }'
#20

Updated by nobu (Nobuyoshi Nakada) over 1 year ago

  • Status changed from Open to Closed

Applied in changeset trunk|r59606.


vm_insnhelper.c: cfp error at stack overflow

  • vm_insnhelper.c (threadptr_stack_overflow): set stack overflow flag until handling execptions, to get rid of cfp consistency error when exec tag was rewound. [ruby-core:80618] [Bug #13412]
#21

Updated by nagachika (Tomoyuki Chikanaga) over 1 year ago

  • Backport changed from 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN to 2.2: REQUIRED, 2.3: REQUIRED, 2.4: REQUIRED

Updated by wanabe (_ wanabe) over 1 year ago

It seems to be fixed at trunk. Thank you.
But I guess that it is not due to r59606, but rather to r59630 and r59676.

$ git checkout $(git log --grep "trunk@59606" origin/trunk --format="%h") && git checkout -B work && 
  make miniruby && for i in `seq 1 1 10`; do
    ./miniruby -ve 'define_method(:foo) { foo }; 1.times { 1.times { 1.times { begin; foo; rescue Exception; nil; end } } } ' || break
  done
Note: checking out '0afc8db914'.
(snip)
linking miniruby
ruby 2.5.0dev (2017-08-16 work 59606) [x86_64-linux]
-e:1: [BUG] vm_call_cfunc: cfp consistency error (0x00007f622207bf50, 0x00007f622207bef0)
ruby 2.5.0dev (2017-08-16 work 59606) [x86_64-linux]

(snip)
$ git checkout $(git log --grep "trunk@59606" origin/trunk --format="%h")~ && git checkout -B work &&
  for r in 59630 59676; do git cherry-pick $(git log --grep "trunk@$r" origin/trunk --format="%h"); done &&
    make miniruby && for i in `seq 1 1 10`; do
      ./miniruby -ve 'define_method(:foo) { foo }; 1.times { 1.times { 1.times { begin; foo; rescue Exception; nil; end } } } ' || break
  done
Note: checking out '0afc8db914~'.
(snip)
linking miniruby
ruby 2.5.0dev (2017-08-16 work 59606) [x86_64-linux]
ruby 2.5.0dev (2017-08-16 work 59606) [x86_64-linux]
ruby 2.5.0dev (2017-08-16 work 59606) [x86_64-linux]
ruby 2.5.0dev (2017-08-16 work 59606) [x86_64-linux]
ruby 2.5.0dev (2017-08-16 work 59606) [x86_64-linux]
ruby 2.5.0dev (2017-08-16 work 59606) [x86_64-linux]
ruby 2.5.0dev (2017-08-16 work 59606) [x86_64-linux]
ruby 2.5.0dev (2017-08-16 work 59606) [x86_64-linux]
ruby 2.5.0dev (2017-08-16 work 59606) [x86_64-linux]
ruby 2.5.0dev (2017-08-16 work 59606) [x86_64-linux]

Updated by wanabe (_ wanabe) over 1 year ago

Hmm... r59630 seems to be too hard to backport.
I had to cherry-pick r58328, r58353, r58354, r58374, r58377 and r58379 before cherry-picking r59630.
They are too many.

$ git checkout origin/ruby_2_4 && git checkout -B work &&
  for r in 58328 58353 58354 58374 58377 58379 59630 59676; do 
    git cherry-pick $(git log --grep "trunk@$r" origin/trunk --format="%h") 
  done &&
  make miniruby -j4 && for i in `seq 1 1 10`; do 
   ./miniruby -ve 'define_method(:foo) { foo }; 1.times { 1.times { 1.times { begin; foo; rescue Exception; nil; end } } } ' || break
  done
Note: checking out 'origin/ruby_2_4'.
(snip)
linking miniruby
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]

Trimmed patch is here, but I really cannot say this is backport. (which commits correspond with the patch?)

$ git checkout origin/ruby_2_4 && git checkout -B work &&
  patch -d $(git rev-parse --show-toplevel) -p1 < 13412.patch &&
  make miniruby -j4 && for i in `seq 1 1 10`; do
   ./miniruby -ve 'define_method(:foo) { foo }; 1.times { 1.times { 1.times { begin; foo; rescue Exception; nil; end } } } ' || break
  done
Note: checking out 'origin/ruby_2_4'.
(snip)
linking miniruby
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]
ruby 2.4.2p181 (2017-08-05 revision 59606) [x86_64-linux]
#24

Updated by wanabe (_ wanabe) over 1 year ago

  • Related to Bug #9505: Bug that should cause SystemStackError segfaults under Ruby 2.1 added
#25

Updated by wanabe (_ wanabe) about 1 year ago

  • Related to Bug #14387: Ruby 2.5 を Alpine Linux で実行すると比較的浅めで SystemStackError 例外になる added

Also available in: Atom PDF