Bug #4366

Substring extraction on a UTF-8 string sometimes returns trailing garbage

Added by kosaki (Motohiro KOSAKI) almost 9 years ago. Updated over 8 years ago.

Status: Closed
Priority: Normal
Assignee: -
Target version:
ruby -v: ruby 1.9.3dev (2011-02-04 trunk 30761) [x86_64-linux]
Backport:
[ruby-dev:43170]

Description

=begin
test.rb


# coding: utf-8

str="あいうえお"

p str[2,17]


Result

% ./ruby -v test.rb
ruby 1.9.3dev (2011-02-04 trunk 30761) [x86_64-linux]
"うえお\u0000"

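For reference, the expected answer can be worked out by walking the bytes one at a time. The following is a minimal C sketch, not Ruby's implementation: `is_lead_byte` and `utf8_offset_of_char` are hypothetical helper names, and the input is assumed to be valid UTF-8 that starts on a character boundary.

```c
#include <stddef.h>

/* A byte starts a character unless it is a continuation byte (10xxxxxx). */
int is_lead_byte(unsigned char c)
{
    return (c & 0xC0) != 0x80;
}

/* Byte offset of the nth character in buf[0..len), or len when the
 * string holds fewer than nth characters. Never reads buf[len]. */
size_t utf8_offset_of_char(const char *buf, size_t len, long nth)
{
    size_t i = 0;
    while (i < len && nth > 0) {
        i++;                                   /* step over the lead byte */
        while (i < len && !is_lead_byte((unsigned char)buf[i]))
            i++;                               /* skip continuation bytes */
        nth--;
    }
    return i;
}
```

"あいうえお" is 15 bytes (five 3-byte characters), so character 2 starts at byte 6. Asking for 17 characters from there should stop at byte 15, the end of the string, which gives the 9 bytes of "うえお" and nothing more.
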
As for the analysis:

static char *
str_utf8_nth(const char *p, const char *e, long *nthp)
{
long nth = *nthp;

 if ((int)SIZEOF_VALUE < e - p && (int)SIZEOF_VALUE * 2 < nth) {

             ↑ e - p, i.e. the string-length check, compares against sizeof(VALUE), not sizeof(VALUE)*2 (1)

     do {
         nth -= count_utf8_lead_bytes_with_word(s);
         s++;
     } while (s < t && (int)sizeof(VALUE) <= nth);

              ↑ this is a do loop, not a while loop (2)

So because of (1), s == t is possible, and in that case, because of (2),
count_utf8_lead_bytes_with_word() appears to be called on memory outside the string.
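
The interaction of (1) and (2) can be modeled without touching real memory. The sketch below works on byte offsets instead of pointers. It is an illustration under stated assumptions, not the code in string.c: `WORD` is fixed at 8 to stand in for sizeof(VALUE) on x86_64, the buffer is assumed to start on a word boundary at offset 0, the `nth` half of the loop condition is left out, and `align_up`, `align_down` and `bytes_read_past_end` are hypothetical names.

```c
enum { WORD = 8 };  /* assumed sizeof(VALUE) on x86_64 */

/* Round an offset up or down to a word boundary, as the optimized
 * path does when it computes s and t from p and e. */
long align_up(long off)   { return (off + WORD - 1) / WORD * WORD; }
long align_down(long off) { return off / WORD * WORD; }

/* Count the bytes at or beyond offset e that a word-at-a-time loop
 * over [p, e) would read.
 * guard_first == 0 models the do-while loop from the report;
 * guard_first == 1 models a loop that tests s < t before the first read. */
long bytes_read_past_end(long p, long e, int guard_first)
{
    long s = align_up(p);
    long t = align_down(e);
    long past = 0;

    if (guard_first && !(s < t))
        return 0;                       /* no whole word fits: read nothing */
    do {
        long word_end = s + WORD;       /* the real code dereferences s here */
        if (word_end > e)
            past += word_end - e;
        s += WORD;
    } while (s < t);
    return past;
}
```

With p = 6 and e = 15 (the "うえお" tail of the test string under the alignment assumption above), both s and t come out as 8, so the do-while form still reads the word covering bytes 8 to 15 and touches one byte past the end. If that byte is a NUL, it is not a continuation byte and would be counted as a lead byte, which is consistent with the stray "\u0000" in the output. Requiring e - p to exceed 2 * WORD, as annotation (1) suggests, guarantees s < t, so the first read always stays inside the string.
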
=end

Associated revisions

Revision e0d1e245
Added by kosaki (Motohiro KOSAKI) almost 9 years ago

  • string.c (str_utf8_nth): fixed a conditon of optimized lead byte counting. [Bug #4366][ruby-dev:43170]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@30779 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 30779
Added by kosaki (Motohiro KOSAKI) almost 9 years ago

  • string.c (str_utf8_nth): fixed a conditon of optimized lead byte counting. [Bug #4366][ruby-dev:43170]

Revision 0ca24ab2
Added by yugui (Yuki Sonoda) over 8 years ago

merges r30779 from trunk into ruby_1_9_2.

    * string.c (str_utf8_nth): fixed a conditon of optimized lead
      byte counting. [Bug #4366][ruby-dev:43170]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_1_9_2@31401 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

History

#1

Updated by kosaki (Motohiro KOSAKI) almost 9 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

=begin
This issue was solved with changeset r30779.
Motohiro, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • string.c (str_utf8_nth): fixed a conditon of optimized lead byte counting. [Bug #4366][ruby-dev:43170]
=end
