Bug #8151

Duplicate character class warning

Added by Aaron Patterson about 1 year ago. Updated about 1 year ago.

[ruby-core:53649]
Status:Closed
Priority:Normal
Assignee:Yui NARUSE
Category:-
Target version:-
ruby -v:ruby 1.9.3p374 (2013-01-15 revision 38858) [x86_64-darwin12.2.1]
Backport:

Description

I get a duplicate character class warning, but I think it's a bug. Here is the example code:

def embed exp, depth
  return exp if depth == 0
  embed(/#{exp}/, depth - 1)
end

3.times { |i|
  puts "DEPTH #{i + 1}"
  embed(/[a-z\u{7b}-\u{7d}]/, i + 1)
}

At depth 1 there is no warning, but at any depth greater than 1 I get a duplicate character class warning. I don't think the ranges in the character class overlap, so there should never be a warning.
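
For context, interpolating a Regexp into a regexp literal converts it with Regexp#to_s, so every extra level of embed goes through to_s once more. A minimal sketch of that step (the names inner and outer are illustrative, not from the report):

```ruby
inner = /[a-z]/
outer = /#{inner}/   # interpolation converts inner with Regexp#to_s

puts inner.to_s                  # the (?opts:source) form that gets embedded
puts outer.source                # source of the regexp built by interpolation
puts outer.source == inner.to_s  # true if interpolation used to_s
```

This only shows where to_s enters the picture; it does not by itself reproduce the warning.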

Associated revisions

Revision 40063
Added by Yui NARUSE about 1 year ago

  • re.c (rb_reg_to_s): suppress duplicated charclass warning. Regexp#to_s suppresses redundant options wrapping its whole regexp by calling onig_new with its source, but it doesn't call rb_reg_preprocess. Therefore its Unicode escapes (\u{XXXX}) are given as is, and this may cause a duplicated charclass warning, for example "[\u{33}]" (3 is duplicated) or "[\u{a}\u{b}]" (u is duplicated). [Bug #8151]
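
The two examples in the commit message can be illustrated with a deliberately simplified model. This is not Onigmo's real parser: it just assumes that an unpreprocessed \u is read as a literal "u" and that every other character of the class body stands for itself. The helper names naive_class_members and duplicated_members are hypothetical.

```ruby
# Simplified model (an assumption, not Onigmo's actual logic):
# an unrecognised `\u` counts as a literal "u", everything else as itself.
def naive_class_members(body)
  body.gsub("\\u", "u").chars
end

# Characters that would appear more than once in the class body.
def duplicated_members(body)
  naive_class_members(body).tally.select { |_, count| count > 1 }.keys
end

p duplicated_members('\u{33}')      # => ["3"]
p duplicated_members('\u{a}\u{b}')  # => ["u", "{", "}"]
```

Under this model "3" repeats in the first class and "u" is the first repeated character in the second, which matches the cases named in the commit message.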

History

#1 Updated by Yusuke Endoh about 1 year ago

  • Status changed from Open to Assigned
  • Assignee set to Akira Tanaka

Interestingly, this seems to be a bug in Regexp#to_s, not in regexp creation.

re = /(?:[\u{33}])/
p re #=> /(?:[\u{33}])/

puts re.to_s
#=> warning: character class has duplicated range: /[\u{33}]/
#=> (?-mix:[\u{33}])

As I recall, akr implemented the round-trip behavior of to_s.
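
To check a given Ruby build, the warning can be captured instead of read off the terminal. This is a rough sketch: warnings_from is a hypothetical helper, and it assumes the warning is written through $stderr while $VERBOSE is enabled.

```ruby
require "stringio"

# Hypothetical helper: run the block with warnings enabled and
# return whatever was written to $stderr in the meantime.
def warnings_from
  old_stderr, old_verbose = $stderr, $VERBOSE
  $stderr = StringIO.new
  $VERBOSE = true
  yield
  $stderr.string
ensure
  $stderr, $VERBOSE = old_stderr, old_verbose
end

re = /(?:[\u{33}])/
captured = warnings_from { re.to_s }

puts captured.empty? ? "no warning" : captured
# The string form should still round-trip to an equivalent regexp.
puts Regexp.new(re.to_s).match?("3")
```

On an affected build the captured text should contain the "character class has duplicated range" message; on a build that includes r40063 it is expected to be empty.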

Yusuke Endoh mame@tsg.ne.jp

#2 Updated by Akira Tanaka about 1 year ago

  • Assignee changed from Akira Tanaka to Yui NARUSE

The warning was proposed in Feature #1831 and committed by naruse.

That was seven years after my Regexp#to_s work [ruby-dev:16951].

It seems better to assign this issue to naruse.

#3 Updated by Yui NARUSE about 1 year ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r40063.
Aaron, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • re.c (rb_reg_to_s): suppress duplicated charclass warning. Regexp#to_s suppresses redundant options wrapping its whole regexp by calling onig_new with its source, but it doesn't call rb_reg_preprocess. Therefore its Unicode escapes (\u{XXXX}) are given as is, and this may cause a duplicated charclass warning, for example "[\u{33}]" (3 is duplicated) or "[\u{a}\u{b}]" (u is duplicated). [Bug #8151]
