Bug #14101

Unreliable handling of groups nested within absent/absence operator of regex

Added by tom-lord (Tom Lord) 11 days ago. Updated 11 days ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
ruby -v: 2.5.0
[ruby-core:83743]

Description

The new absent/absence regex operator (?~...), added to Onigmo and bundled with Ruby since v2.4.1, supports nested groups such as:

"abb".match /(?~(a|b)b)/
 => #<MatchData "a" 1:"a">

However, under some scenarios (I haven't been able to determine the exact cause), the execution fails:

"abb".match /(?~(a|c)c)/
ArgumentError: negative string size (or size too big)
from (irb):1:in `scan'

Interestingly, when running the above in pry, we see that a malformed object is created:

"abb".match /(?~(a|c)c)/
#=> #<MatchData "abb" 1:#<MatchData:0x3fd47ec398d4>

"abb".scan /(?~(a|c)c)/
#=> ArgumentError: negative string size (or size too big)

I am unsure whether this bug belongs to the Ruby project or to Onigmo.
Documentation on the operator is still a work in progress (https://github.com/k-takata/Onigmo/issues/87); perhaps nested groups should not be allowed by the engine?
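
For readers unfamiliar with the operator, here is a minimal sketch of its intended behaviour when no capture groups are nested inside it: (?~X) matches any string, including the empty string, that does not contain X. The constant C_COMMENT and the helpers first_c_comment and free_of_ab? are hypothetical names used only for illustration; they are not part of Ruby or Onigmo.

```ruby
# (?~X) matches any string (including "") that does not contain X.
# Here the comment body is "anything that does not contain */".
C_COMMENT = %r{/\*(?~\*/)\*/}

# Hypothetical helper: returns the first C-style comment in +text+, or nil.
def first_c_comment(text)
  text[C_COMMENT]
end

# Hypothetical helper: true when +str+ contains no "ab" anywhere.
def free_of_ab?(str)
  /\A(?~ab)\z/.match?(str)
end

puts first_c_comment("/* first */ code /* second */")  # stops at the first "*/"
puts free_of_ab?("ba")    # "ba" has no "ab" in it
puts free_of_ab?("cab")   # "cab" contains "ab"
```

Neither pattern contains a capture group, so this sketch only illustrates the operator itself and is not expected to trigger the error reported above.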

Associated revisions

Revision 60755
Added by nobu (Nobuyoshi Nakada) 11 days ago

regexec.c: invalidate previously matched position

  • regexec.c (match_at): invalidate end position not yet matched when new start position is pushed, to dispose previously stored position. [Bug #14101]

History

#1 [ruby-core:83744] Updated by tom-lord (Tom Lord) 11 days ago

Here's a slightly more minimal reproduction example:

"abb".match /(?~(a)c)/
#=> ArgumentError: negative string size (or size too big)

My best guess is that the regexp engine is caught in an unexpected state, where the capture group still references an orphaned object?
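
The minimal reproduction can be wrapped in a small check to see whether a given interpreter is affected. This is only a sketch: absence_capture_bug? is a hypothetical helper name, and it assumes the failure surfaces as an ArgumentError when the captures are read, as in the irb output above.

```ruby
# Hypothetical helper (not part of Ruby): returns true when inspecting the
# MatchData from the minimal reproduction raises ArgumentError, false otherwise.
def absence_capture_bug?
  m = "abb".match(/(?~(a)c)/)
  # irb prints results via #inspect, which reads every capture group.
  m.inspect
  false
rescue ArgumentError
  true
end

puts "ruby #{RUBY_VERSION}: affected? #{absence_capture_bug?}"
```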

#2 Updated by nobu (Nobuyoshi Nakada) 11 days ago

  • Status changed from Open to Closed

Applied in changeset trunk|r60755.


