Bug #8512

gsub() works incorrectly

Added by Oleg K 10 months ago. Updated 10 months ago.

[ruby-core:55430]
Status:Rejected
Priority:Normal
Assignee:-
Category:-
Target version:-
ruby -v:ruby 2.0.0p195 (2013-05-14) [i386-mingw32]
Backport:1.9.3: UNKNOWN, 2.0.0: UNKNOWN

Description

irb(main):005:0> "\\".gsub("\\", "\\\\").length
=> 1
irb(main):006:0> "\\".gsub("\\", "XX").length
=> 2

This bug is a duplicate of rejected bug #8511.

History

#1 Updated by Benoit Daloze 10 months ago

  • Status changed from Open to Rejected

This is due to Regexp replace syntax and literal strings.

In literal strings, you need two \ to produce one \ character (a single one starts an escape sequence such as \t, \n, ...).
And in Regexp replacement strings, you need to escape the \ as well (a single one begins a special replacement sequence such as \1, \&, ...).
So it takes 4 \ in a string literal to produce one \ in the replaced output:

"\\".gsub("\\", "\\\\").length
=> 1
"\\".gsub("\\", "\\\\\\\\")
=> "\\\\"

Not so nice, but definitely expected behavior.
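
To make the two layers of escaping visible, here is a minimal sketch that counts characters at each stage. The constant names are made up for illustration and are not part of the original report.

```ruby
# Layer 1: the Ruby parser turns each pair of backslashes in a literal into one character.
# Layer 2: gsub turns each pair of backslashes in the replacement into one output character.
SUBJECT       = "\\"           # literal has 2 backslashes, string holds 1
REPLACEMENT_4 = "\\\\"         # literal has 4 backslashes, string holds 2
REPLACEMENT_8 = "\\\\\\\\"     # literal has 8 backslashes, string holds 4

ONE_OUT = SUBJECT.gsub("\\", REPLACEMENT_4)  # gsub sees two backslashes, emits one
TWO_OUT = SUBJECT.gsub("\\", REPLACEMENT_8)  # gsub sees four backslashes, emits two

puts SUBJECT.length        # 1
puts REPLACEMENT_4.length  # 2
puts ONE_OUT.length        # 1
puts TWO_OUT.length        # 2
```

Checking `length` rather than the irb output avoids a third layer of confusion, since `inspect` doubles every backslash again when it prints the string.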

#2 Updated by Benoit Daloze 10 months ago

This should be documented in Regexp's overview though.

#3 Updated by Hans Mackowiak 10 months ago

=begin

the docs say:

((If replacement is a String it will be substituted for
the matched text. It may contain back-references to the pattern's capture
groups of the form \\d, where d is a group number, or \\k<n>, where n is
a group name. If it is a double-quoted string, both back-references must be
preceded by an additional backslash.
))

so you need more "\" in your string
=end
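
As a small illustration of the quoted documentation, the sketch below uses both back-reference forms with single-quoted and double-quoted replacement strings. The sample text and constant names are assumptions chosen for the example.

```ruby
NAME     = "John Smith"
NUMBERED = /(\w+) (\w+)/
NAMED    = /(?<first>\w+) (?<last>\w+)/

# Single-quoted: the backslash reaches gsub untouched, so one is enough.
BY_NUMBER_SINGLE = NAME.gsub(NUMBERED, '\2, \1')
BY_NAME_SINGLE   = NAME.gsub(NAMED, '\k<last>, \k<first>')

# Double-quoted: the parser consumes one backslash, so an additional one is required.
BY_NUMBER_DOUBLE = NAME.gsub(NUMBERED, "\\2, \\1")
BY_NAME_DOUBLE   = NAME.gsub(NAMED, "\\k<last>, \\k<first>")

# Without the extra backslash, "\2" and "\1" are octal escapes for control
# characters, so gsub never sees a back-reference at all.
BROKEN = NAME.gsub(NUMBERED, "\2, \1")

puts BY_NUMBER_SINGLE  # Smith, John
puts BY_NAME_SINGLE    # Smith, John
puts BY_NUMBER_DOUBLE  # Smith, John
puts BY_NAME_DOUBLE    # Smith, John
p BROKEN
```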

#4 Updated by Oleg K 10 months ago

Hanmac (Hans Mackowiak) wrote:

=begin

the docs say:

((If replacement is a String it will be substituted for
the matched text. It may contain back-references to the pattern's capture
groups of the form \\d, where d is a group number, or \\k<n>, where n is
a group name. If it is a double-quoted string, both back-references must be
preceded by an additional backslash.
))

so you need more "\" in your string
=end

Thanks for the explanation.

#5 Updated by Matthew Kerwin 10 months ago

For the record, you can also use the hash or block replacement forms, because those don't use regexp back-references:

"\\".gsub("\\", "\\"=>"\\\\") #=> "\\\\"
"\\".gsub("\\") { "\\\\" } #=> "\\\\"

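The difference between the three forms is easier to see with a replacement that contains a back-reference. This is a rough sketch with an assumed sample word, not code from the thread:

```ruby
ANIMAL = "cat"

# String form: \0 is interpreted as "the whole match".
STRING_FORM = ANIMAL.gsub(/cat/, '<\0>')

# Block form: the block's return value is inserted verbatim.
BLOCK_FORM = ANIMAL.gsub(/cat/) { '<\0>' }

# Hash form: the value looked up for the matched text is inserted verbatim.
HASH_FORM = ANIMAL.gsub(/cat/, "cat" => '<\0>')

puts STRING_FORM  # <cat>
puts BLOCK_FORM   # <\0>
puts HASH_FORM    # <\0>
```

So with the block or hash form only the string-literal layer of escaping remains, which is why two backslashes in the literal are enough there.
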