Backport #4028

closed

substring selection and utf8 encoding problem

Added by barcala (Fco. Mario Barcala Rodríguez) almost 12 years ago. Updated about 3 years ago.

Status:
Closed
Priority:
Normal
[ruby-core:33072]

Description

=begin
Substring selection does not work with some UTF-8 encoded strings. Below is an example: the first substring is extracted correctly, but the second is not (strange characters appear at the end of the substring).

It seems to occur when the string includes letters with umlauts, accents, etc.

$ irb

ruby-1.9.1-p378 > word = "Ábaco"
=> "Ábaco"
ruby-1.9.1-p378 > substr = word[word.length-1,word.length]
=> "o"
ruby-1.9.1-p378 > word = "Coordinador de ONG's do País Valenciano"
=> "Coordinador de ONG's do País Valenciano"
ruby-1.9.1-p378 > substr = word[word.length-1,word.length]
=> "o\x00\x00\x01\x00\x01\x00\x00\x00"
=end
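
For comparison, here is a minimal sketch of what the expression from the report is expected to return on a Ruby where the bug is fixed. The helper name `overlong_tail` is hypothetical and exists only for this illustration; it assumes the source file is saved as UTF-8.

```ruby
# encoding: utf-8

# Hypothetical helper: repeats the expression from the report, where the
# length argument (word.length) runs far past the end of the string.
def overlong_tail(word)
  # String#[start, length] is expected to stop at the end of the string,
  # so only the last character should come back.
  word[word.length - 1, word.length]
end

word = "Coordinador de ONG's do País Valenciano"
substr = overlong_tail(word)

puts substr.inspect          # expected "o" on a fixed Ruby
puts substr.valid_encoding?  # a correct result is valid UTF-8
# "í" takes two bytes in UTF-8, so bytesize is one more than length here.
puts word.bytesize - word.length
```

On an affected 1.9.1 build the first line would instead show the trailing `\x00` bytes seen above.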


Related issues 1 (0 open, 1 closed)

Is duplicate of Ruby master - Bug #2379: String#[] returns invalid values for short multibyte strings | Closed | naruse (Yui NARUSE) | 11/18/2009
Actions #1

Updated by barcala (Fco. Mario Barcala Rodríguez) almost 12 years ago

=begin
The same error occurs in ruby-1.9.1-p430
=end

Actions #2

Updated by barcala (Fco. Mario Barcala Rodríguez) almost 12 years ago

=begin
It seems to be fixed in ruby-1.9.2-p0; I can't reproduce the error with that version.
=end

Actions #3

Updated by barcala (Fco. Mario Barcala Rodríguez) almost 12 years ago

=begin
The example shown above uses substring selection incorrectly. It should be:

ruby-1.9.1-p378 > word = "Ábaco"
=> "Ábaco"
ruby-1.9.1-p378 > substr = word[word.length-1,1]
=> "o"
ruby-1.9.1-p378 > word = "Coordinador de ONG's do País Valenciano"
=> "Coordinador de ONG's do País Valenciano"
ruby-1.9.1-p378 > substr = word[word.length-1,1]
=> "o"

This new example works fine, so the problem arises only when the second argument of the substring selection (the length) extends past the end of the string.
=end
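
As a possible workaround on affected versions, the last character can be requested without a length that runs past the end of the string. The sketch below lists a few standard `String#[]` forms that do this; the method name `last_char_variants` is hypothetical, and whether each form avoids the bug on 1.9.1 has not been checked here.

```ruby
# encoding: utf-8

# Hypothetical helper: several ways to take the last character of a string
# without passing a length larger than the remaining characters.
def last_char_variants(word)
  [
    word[word.length - 1, 1],    # corrected form from the comment above
    word[-1, 1],                 # negative start counts from the end
    word[-1],                    # single-index form
    word[word.length - 1..-1]    # range ending at the last character
  ]
end

word = "Coordinador de ONG's do País Valenciano"
last_char_variants(word).each { |s| puts s.inspect }
```

Each form indexes by character rather than by byte, so the multibyte "í" earlier in the string should not shift the result.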

Actions #4

Updated by naruse (Yui NARUSE) almost 12 years ago

  • Status changed from Open to Assigned
  • Assignee set to yugui (Yuki Sonoda)
  • Priority changed from 5 to Normal

=begin
Confirmed:
ruby 1.9.1p430 (2010-08-16 revision 28997) [x86_64-freebsd8.1]
ruby-1.9.1-p378 > word = "Coordinador de ONG's do País Valenciano"
=> "Coordinador de ONG's do País Valenciano"
ruby-1.9.1-p378 > substr = word[word.length-1,word.length]
=> "o\x00\x00\x01\x00\x01\x00\x00\x00"
=end

Actions #5

Updated by jeremyevans0 (Jeremy Evans) about 3 years ago

  • Status changed from Assigned to Closed
  • Description updated (diff)