Bug #14101
Closed
Unreliable handling of groups nested within absent/absence operator of regex
Description
The new absent/absence regex operator, added to Onigmo and bundled with Ruby since v2.4.1, supports nested groups such as:
"abb".match /(?~(a|b)b)/
=> #<MatchData "a" 1:"a">
However, in some scenarios (I haven't been able to determine the exact cause), execution fails:
"abb".match /(?~(a|c)c)/
ArgumentError: negative string size (or size too big)
from (irb):1:in `scan'
Interestingly, when running the above in pry, we see a malformed object created:
"abb".match /(?~(a|c)c)/
#=> #<MatchData "abb" 1:#<MatchData:0x3fd47ec398d4>
"abb".scan /(?~(a|c)c)/
#=> ArgumentError: negative string size (or size too big)
I am unclear whether this bug belongs in the Ruby project or in Onigmo.
Documentation on the operator is still a work in progress (https://github.com/k-takata/Onigmo/issues/87); perhaps nested groups should not be allowed by the engine?
Updated by tom-lord (Tom Lord) about 7 years ago
Here's a slightly more minimal reproduction example:
"abb".match /(?~(a)c)/
#=> ArgumentError: negative string size (or size too big)
My best guess is that the regexp engine is caught in an unexpected state, where the capture group still references an orphaned object?
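For anyone wanting to check a particular Ruby build, the three patterns from this report can be run through a small probe. This is only a sketch: `probe_absence` is a hypothetical helper name, not something from Ruby or Onigmo, and it assumes that reading the captures is enough to trigger the error on an affected build. On a build that includes the fix, every pattern should report `:ok`.

```ruby
# Hypothetical helper (not part of Ruby or Onigmo): runs one pattern against
# an input string and reports whether an ArgumentError was raised.
def probe_absence(source, input = "abb")
  regex = Regexp.new(source)
  match = regex.match(input)
  # Reading the whole match and the captures makes Ruby build the substrings;
  # the assumption here is that this is what fails on an affected build.
  {
    pattern: source,
    status: :ok,
    whole: match && match[0],
    captures: match && match.captures
  }
rescue ArgumentError => e
  { pattern: source, status: :argument_error, message: e.message }
end

# The three patterns quoted in this report.
patterns = ["(?~(a|b)b)", "(?~(a|c)c)", "(?~(a)c)"]

patterns.each do |source|
  p probe_absence(source)
end
```

Building the regex from a string with `Regexp.new` keeps the probe loadable on a Ruby that predates the absence operator; there the call raises a `RegexpError`, which this sketch deliberately does not rescue.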
Updated by nobu (Nobuyoshi Nakada) about 7 years ago
- Status changed from Open to Closed
Applied in changeset trunk|r60755.
regexec.c: invalidate previously matched position
- regexec.c (match_at): invalidate end position not yet matched
when new start position is pushed, to dispose previously stored
position. [ruby-core:83743] [Bug #14101]