Bug #20083

closed

String#match? behaving inconsistently with Ruby 3.3.0

Added by jussikos (Jussi Koljonen) 4 months ago. Updated 3 months ago.

Status: Closed
Assignee: -
Target version: -
ruby -v: ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-darwin23]
[ruby-core:115888]

Description

In irb, String#match? returns inconsistent results for inputs that should all match:

pattern = /([\s]*ABC)$/i # or /(\s*ABC)/i

p "1ABC".match?(pattern) # => true
p "12ABC".match?(pattern) # => true
p "123ABC".match?(pattern) # => true
p "1231ABC".match?(pattern) # => true
p "12312ABC".match?(pattern) # => false
p "123123ABC".match?(pattern) # => false
p "1231231ABC".match?(pattern) # => true
p "12312312ABC".match?(pattern) # => true
p "123123123ABC".match?(pattern) # => false
p "1231231231ABC".match?(pattern) # => false
p "12312312312ABC".match?(pattern) # => true
p "123123123123ABC".match?(pattern) # => true
p "1231231231231ABC".match?(pattern) # => false

With ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-darwin22] and earlier versions (2.7.8 to 3.2.2), the return value is always true.

Update: the problem seems to be related to the /i option, as all of the above examples work correctly with /([\s]*ABC)$/.
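
The observations above can be condensed into one script that runs the case-insensitive and case-sensitive patterns over the same inputs. This is a minimal sketch: the helper names `report_inputs` and `unmatched` are introduced here for illustration and are not part of the original report.

```ruby
# Inputs from the report: 1 to 13 digits cycling through "123", followed by "ABC".
# (Hypothetical helper, named here for illustration only.)
def report_inputs
  (1..13).map { |n| ("123" * 5)[0, n] + "ABC" }
end

# Returns the inputs that the pattern fails to match.
def unmatched(pattern, inputs)
  inputs.reject { |s| s.match?(pattern) }
end

insensitive = /([\s]*ABC)$/i
sensitive   = /([\s]*ABC)$/

puts "ruby #{RUBY_VERSION}"
puts "unmatched with /i:    #{unmatched(insensitive, report_inputs).inspect}"
puts "unmatched without /i: #{unmatched(sensitive, report_inputs).inspect}"
```

Every input ends in "ABC", so both lists are expected to be empty. According to the results listed above, Ruby 3.3.0 reports several unmatched inputs for the /i pattern only.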


Related issues 1 (0 open, 1 closed)

Has duplicate Ruby master - Bug #20095: Regex lookahead behaving strangely in 3.3.0 (Closed)
Actions #1

Updated by jussikos (Jussi Koljonen) 4 months ago

  • Subject changed from Regexp#match? behaving inconsistently with Ruby 3.3.0 to String#match? behaving inconsistently with Ruby 3.3.0
Actions #2

Updated by jussikos (Jussi Koljonen) 4 months ago

  • Description updated (diff)
Actions #3

Updated by jussikos (Jussi Koljonen) 4 months ago

  • Description updated (diff)
Actions #4

Updated by jussikos (Jussi Koljonen) 4 months ago

  • Description updated (diff)

Updated by make_now_just (Hiroya Fujinami) 4 months ago

I created a PR for this bug (see https://github.com/ruby/ruby/pull/9367).
Thank you for the report!

The bug is caused by a combination of a regex optimization and a bug in the handling of atomic groups.
First, since \s and the following A (which is treated internally as [aA] under the i flag) are mutually disjoint, \s*ABC is optimized to (?>\s*)ABC.
Next, the match cache optimization for atomic groups is buggy in this case, so the matching results are wrong.

When the i flag is not given, a different optimization is applied and \s*ABC is optimized to (?:(?!A)\s)*ABC, so the bug does not occur.
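
The claim that both rewrites preserve the meaning of \s*ABC can be checked from Ruby by writing the rewritten forms out by hand. This is only an illustration of the semantics: the hand-written patterns are not necessarily compiled the same way as the engine's internal rewrite, and the helper names `equivalent_patterns` and `matched_text` are hypothetical.

```ruby
# Three spellings of the same pattern. The second and third are hand-written
# versions of the rewrites described above, not the engine's internal form.
def equivalent_patterns
  {
    plain:     /\s*ABC/i,
    atomic:    /(?>\s*)ABC/i,
    lookahead: /(?:(?!A)\s)*ABC/i
  }
end

# Returns the matched substring for each spelling, or nil when there is no match.
def matched_text(input)
  equivalent_patterns.transform_values { |re| input[re] }
end

["  ABC", "x \t abc", "AB", "  AB C"].each do |s|
  puts "#{s.inspect}: #{matched_text(s).inspect}"
end
```

Because whitespace can never match A (or a), giving up characters already consumed by \s* cannot help the following ABC succeed, so all three spellings should report the same substring for each input.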

Actions #6

Updated by naruse (Yui NARUSE) 4 months ago

  • Backport changed from 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN to 3.0: DONTNEED, 3.1: DONTNEED, 3.2: DONTNEED, 3.3: REQUIRED
Actions #7

Updated by naruse (Yui NARUSE) 4 months ago

  • Has duplicate Bug #20095: Regex lookahead behaving strangely in 3.3.0 added
Actions #8

Updated by naruse (Yui NARUSE) 3 months ago

  • Status changed from Open to Closed

Updated by naruse (Yui NARUSE) 3 months ago

  • Backport changed from 3.0: DONTNEED, 3.1: DONTNEED, 3.2: DONTNEED, 3.3: REQUIRED to 3.0: DONTNEED, 3.1: DONTNEED, 3.2: DONTNEED, 3.3: DONE

ruby_3_3 5f3dfa1c273c6fb9eae65ceca633b46f7e30f686 merged revision(s) d8702ddbfbe8cc7fc601a9a4d19842ef9c2b76c1.
