Bug #5126

Unicode character classes interpolated into regex throw an exception

Added by Xavier Shay over 2 years ago. Updated over 2 years ago.

[ruby-core:38635]
Status:Closed
Priority:Normal
Assignee:-
Category:-
Target version:1.9.3
ruby -v:ruby 1.9.3dev (2011-07-31 revision 32789) [x86_64-darwin10.7.0]
Backport:

Description

The following script runs without error under 1.9.2-p290:

# encoding: UTF-8
letter = '\p{L}'
atext = "[#{letter}]"
/#{atext}/

Under 1.9.3-preview1 it raises an exception:

test.rb:6:in `<main>': invalid character property name {L}: /[\p{L}]/ (RegexpError)

The interpolation is necessary to reproduce this bug; Unicode character classes work fine when entered directly into the regex.
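
To make the contrast concrete, here is a minimal sketch comparing the direct and interpolated forms side by side. The variable names `direct` and `interpolated` are mine, not from the original script, and the encoding printout is only a diagnostic I am assuming is useful, given that the eventual fix mentions an ASCII-8BIT string. On an interpreter without the bug, both regexes should compile and behave identically; on 1.9.3-preview1 the interpolated line is the one expected to raise.

```ruby
# encoding: UTF-8
letter = '\p{L}'
atext  = "[#{letter}]"

# Property written directly in the regex literal: reported to work.
direct = /[\p{L}]/

# Same pattern built through interpolation: the form that raised
# RegexpError on 1.9.3-preview1.
interpolated = /#{atext}/

# Diagnostic only: show the encoding of the interpolated string.
puts atext.encoding

# Both regexes should have the same source and match a non-ASCII letter.
sample = "\u00e9"
puts direct.source == interpolated.source
puts !!(sample =~ direct)
puts !!(sample =~ interpolated)
```

If the interpolated regex raises while the direct one does not, the problem lies in how the interpolated string is assembled rather than in the regex engine's handling of `\p{L}` itself.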

JRuby has a similar bug: http://jira.codehaus.org/browse/JRUBY-5622

This technique is used in DataMapper to build a regex for matching email addresses:
https://github.com/datamapper/dm-validations/blob/master/lib/dm-validations/formats/email.rb


Related issues

Duplicated by Backport93 - Backport #5287: 1.9.3 - Interpolation in a string causes the string's enc... Closed 09/07/2011

Associated revisions

Revision 32791
Added by Yui NARUSE over 2 years ago

  • insns.def (concatstrings): don't use initial ASCII-8BIT string. [Bug #5126]

History

#1 Updated by Yui NARUSE over 2 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r32791.
Xavier, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


