Project

General

Profile

Actions

Bug #14103

closed

Regexp absense operator has no chance to ^C

Added by shyouhei (Shyouhei Urabe) almost 5 years ago. Updated 6 months ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
ruby -v:
ruby 2.5.0dev (2017-11-09 trunk 60720) [x86_64-darwin15]
[ruby-core:83750]
Tags:

Description

Following script hangs, with no respond to ^C.

/(?<x> (?<! a ) a+ ){0}
 (?<y> (?~ \g<z> ) ){0}
 (?<z> (?<! a ) \k<x> (?! a ) ){0}
 \g<x> \g<y> \g<z>
/xo =~ (1..1024).map{|x| 'b' + 'a' * x }.join

Updated by jeremyevans0 (Jeremy Evans) 12 months ago

I submitted a pull request to fix this: https://github.com/ruby/ruby/pull/4960

The issue is unlikely to be specific to the absence operator, I think it affects any case where a regexp takes a long time due to backtracking. In addition to allowing interrupts, the pull request also allows yielding to other threads during a long regexp match (since checking for interrupts has that effect).

Actions #2

Updated by jeremyevans (Jeremy Evans) 7 months ago

  • Status changed from Open to Closed

Applied in changeset git|edc8576a65b7082597d45a694434261ec3ac0d9e.


Allow interrupting regexps that backtrack

Fixes [Bug #14103]

Co-authored-by: Nobuyoshi Nakada

Updated by mame (Yusuke Endoh) 6 months ago

  • Status changed from Closed to Open

This change degrades the performance of regular expression matching when frequent backtracking occurs.

Before edc8576a65b7082597d45a694434261ec3ac0d9e

$ time ./miniruby -ve '/^a*b?a*$/ =~ "a" * 20000 + "x"'
ruby 3.2.0dev (2022-03-10T19:06:33Z master edc8576a65) [x86_64-linux]

real    0m3.824s
user    0m3.820s
sys     0m0.004s

After edc8576a65b7082597d45a694434261ec3ac0d9e

$ time ./miniruby -ve '/^a*b?a*$/ =~ "a" * 20000 + "x"'
ruby 3.2.0dev (2022-03-10T19:06:33Z master edc8576a65) [x86_64-linux]

real    0m4.608s
user    0m4.588s
sys     0m0.016s

I have no idea if this may lead to any actual problem, but how about reducing the frequency of rb_thread_check_ints? This PR makes the check only once every 128 backtracks.

https://github.com/ruby/ruby/pull/5697

This restores the original performance.

$ time ./miniruby -ve '/^a*b?a*$/ =~ "a" * 20000 + "x"'
ruby 3.2.0dev (2022-03-23T14:55:49Z master 8f1c69f27c) [x86_64-linux]

real    0m3.702s
user    0m3.696s
sys     0m0.000s

Still, it allows immediate interrupts.

$ ./miniruby -e '/^a*b?a*$/ =~ "a" * 20000 + "x"'
^C-e:1:in `<main>': Interrupt

$ ./miniruby -e "/(?<x> (?<! a ) a+ ){0}
 (?<y> (?~ \g<z> ) ){0}
 (?<z> (?<! a ) \k<x> (?! a ) ){0}
 \g<x> \g<y> \g<z>
/xo =~ (1..1024).map{|x| 'b' + 'a' * x }.join"
^C-e:5:in `<main>': Interrupt

Updated by mame (Yusuke Endoh) 6 months ago

  • Status changed from Open to Closed

Fixed at 9112cf4ae7f7ea8ab33c282aa02eec812421aeab.

Actions

Also available in: Atom PDF