Bug #4340
Closed
Encoding of result string for String#gsub is not consistent
Description
=begin
Depending upon where the replacement occurs, the encoding of the result of String#gsub is not consistent.
When the replacement happens at the beginning of the string the encoding of the result is the encoding of the replacement string.
When the replacement happens elsewhere in the string the encoding of the result is the encoding of the original string.
With String#sub the encoding of the result is the encoding of the original string always.
$ cat t.rb
puts 'using gsub'
hello_world = 'Hello World!'
hello_world.force_encoding Encoding::UTF_8
everybody = 'Everybody'
everybody.force_encoding Encoding::US_ASCII
hello_everybody = hello_world.gsub(/World/, 'Everybody')
p hello_everybody
p hello_everybody.encoding
hi = 'Hi'
hi.force_encoding Encoding::US_ASCII
hi_world = hello_world.gsub(/Hello/, 'Hi')
p hi_world
p hi_world.encoding
puts 'using sub'
hello_world = 'Hello World!'
hello_world.force_encoding Encoding::UTF_8
everybody = 'Everybody'
everybody.force_encoding Encoding::US_ASCII
hello_everybody = hello_world.sub(/World/, 'Everybody')
p hello_everybody
p hello_everybody.encoding
hi = 'Hi'
hi.force_encoding Encoding::US_ASCII
hi_world = hello_world.sub(/Hello/, 'Hi')
p hi_world
p hi_world.encoding
$ ruby19 -v t.rb
ruby 1.9.3dev (2011-01-26 trunk 30659) [x86_64-darwin10.6.0]
using gsub
"Hello Everybody!"
#<Encoding:UTF-8>
"Hi World!"
#<Encoding:US-ASCII>
using sub
"Hello Everybody!"
#<Encoding:UTF-8>
"Hi World!"
#<Encoding:UTF-8>
=end
Updated by headius (Charles Nutter) about 13 years ago
=begin
Your beginning-of-string substitutions don't use the "hi" variable in either case. It doesn't affect the result, though.
JRuby behaves differently, apparently using the pattern's encoding in gsub and the original's encoding in sub (and our pattern's encoding is wrong due to other issues).
~/projects/jruby ➔ jruby --1.9 t.rb
using gsub
"Hello Everybody!"
#<Encoding:ASCII-8BIT>
"Hi World!"
#<Encoding:ASCII-8BIT>
using sub
"Hello Everybody!"
#<Encoding:UTF-8>
"Hi World!"
#<Encoding:UTF-8>
Filed: http://jira.codehaus.org/browse/JRUBY-5437
=end
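As the comment above points out, the script in the report passes string literals to the substitutions instead of the `everybody` and `hi` variables. The following is a minimal sketch of a variant that does pass the US-ASCII variables. `result_encodings` is a hypothetical helper introduced here for illustration, not part of the report, and `dup` is used only so the literals can be re-tagged on interpreters that freeze or chill string literals.

```ruby
# Sketch only: result_encodings is a hypothetical helper, not from the report.
# It runs the given substitution method (:gsub or :sub) once in the middle of
# the string and once at the start, and returns the encodings of the results.
def result_encodings(method_name)
  hello_world = 'Hello World!'.dup.force_encoding(Encoding::UTF_8)
  everybody   = 'Everybody'.dup.force_encoding(Encoding::US_ASCII)
  hi          = 'Hi'.dup.force_encoding(Encoding::US_ASCII)

  {
    middle: hello_world.public_send(method_name, /World/, everybody).encoding,
    start:  hello_world.public_send(method_name, /Hello/, hi).encoding
  }
end

[:gsub, :sub].each do |method_name|
  puts "using #{method_name}"
  p result_encodings(method_name)
end
```

On an interpreter that includes the fix, both positions should report the same encoding for `gsub`, matching what `sub` already did in the report.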
Updated by drbrain (Eric Hodel) about 13 years ago
=begin
The attached patch fixes this problem. May I commit?
=end
Updated by naruse (Yui NARUSE) about 13 years ago
=begin
Yes, you can; please commit it with a test.
=end
Updated by meta (mathew murphy) about 13 years ago
Updated by drbrain (Eric Hodel) about 13 years ago
- Status changed from Open to Closed
- Assignee set to drbrain (Eric Hodel)
=begin
Fixed by r30806 (with test)
=end
Updated by meta (mathew murphy) about 13 years ago
=begin
On Fri, Feb 4, 2011 at 10:37, mathew <meta@pobox.com> wrote:
> Can I ask why regexps are not affected by
> # encoding: UTF-8
> declarations?
Nobody?
I still can't think of a reason, so what am I missing?
mathew
=end
Updated by nobu (Nobuyoshi Nakada) about 13 years ago
=begin
Hi,
At Wed, 9 Feb 2011 04:41:14 +0900,
mathew wrote in [ruby-core:35154]:
> > Can I ask why regexps are not affected by
> > # encoding: UTF-8
> > declarations?
> Nobody?
> I still can't think of a reason, so what am I missing?
It does affect them.
$ ruby -e '#encoding:utf-8' -e 'p /\u3042/.encoding'
#<Encoding:UTF-8>
$ ruby -e '#encoding:cp932' -e 'p /\x81\x42/.encoding'
#<Encoding:Windows-31J>
$ ruby -e '#encoding:euc-jp' -e 'p /\xa1\xa2/.encoding'
#<Encoding:EUC-JP>
--
Nobu Nakada
=end
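The one-liners above all use patterns containing non-ASCII bytes or a `\u` escape. One possible source of the original confusion, offered here as an assumption rather than something stated in the thread, is that a regexp literal containing only ASCII characters reports US-ASCII and is not fixed to any encoding, whatever the source encoding declaration says. A minimal sketch follows; `describe_regexp` is a hypothetical helper and the patterns are illustrative.

```ruby
# Sketch only: describe_regexp is a hypothetical helper, not from the thread.
# It returns the regexp's encoding and whether that encoding is fixed.
def describe_regexp(regexp)
  [regexp.encoding, regexp.fixed_encoding?]
end

p describe_regexp(/World/)   # ASCII-only pattern => [#<Encoding:US-ASCII>, false]
p describe_regexp(/World/u)  # u flag fixes UTF-8 => [#<Encoding:UTF-8>, true]
p describe_regexp(/\u3042/)  # \u escape          => [#<Encoding:UTF-8>, true]

# A regexp without a fixed encoding can still match a UTF-8 string
# that contains non-ASCII characters.
p(/World/ =~ "Hello World! \u3042")  # => 6
```

The `Regexp#fixed_encoding?` check is what distinguishes the two cases: a non-fixed ASCII-only regexp can be matched against strings in any ASCII-compatible encoding.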
Updated by meta (mathew murphy) about 13 years ago
=begin
On Tue, Feb 8, 2011 at 16:27, Eric Hodel <drbrain@segment7.net> wrote:
> You're asking this on a thread attached to a bug on redmine that has
> nothing to do with regular expressions. Try making a new bug or thread.
http://redmine.ruby-lang.org/projects/ruby/issues/new reports an error:
"No tracker is associated to this project. Please check the Project settings."
mathew
=end
Updated by sorah (Sorah Fukumori) about 13 years ago
=begin
Hi,
On Thu, Feb 10, 2011 at 12:27 AM, mathew <meta@pobox.com> wrote:
> http://redmine.ruby-lang.org/projects/ruby/issues/new reports an error:
> "No tracker is associated to this project. Please check the Project settings."
Look here:
http://redmine.ruby-lang.org/wiki/ruby/HowtoReport
Users can't create new tickets in the ruby project.
Please create one in the ruby1.9 or ruby1.8 project.
Thanks,
--
Shota Fukumori a.k.a. @sora_h - http://codnote.net/
=end
Updated by meta (mathew murphy) about 13 years ago
=begin
On Wed, Feb 9, 2011 at 10:08, Shota Fukumori (sora_h) <sorah@tubusu.net> wrote:
> Look here:
> http://redmine.ruby-lang.org/wiki/ruby/HowtoReport
Can a link to that be added to the "My Page" template?
mathew
=end