Bug #13119

String#scrub ignores the block if the string encoding is not ASCII-compatible

Added by Eregon (Benoit Daloze) almost 3 years ago. Updated over 2 years ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
[ruby-core:79038]

Description

String#scrub completely ignores the block if the string encoding is not ASCII-compatible.
This does not seem intended and is counter-intuitive, since strings in ASCII-compatible encodings do honor the block.

"\x00\xD8\x42\x30".force_encoding(Encoding::UTF_16LE).scrub { |e| p e; "?".encode(Encoding::UTF_16LE) }

Gives

"\uFFFD\u3042"

But it should be

"\x00\xD8"
"?\u3042"

Moreover, there is a bug in the String to be yielded to the block, string.c:9399:

repl = rb_yield(rb_enc_str_new(p, e-p, enc));

should be

repl = rb_yield(rb_enc_str_new(p, clen, enc));

With that change, the block receives only the invalid part rather than the whole remainder of the string.
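A minimal sketch of the behaviour requested above, assuming an interpreter that already contains the fixes (trunk r57302/r57303 or one of the backports listed below). The variable names `broken`, `yielded` and `result` are only illustrative. It rebuilds the bytes from the report and records what the block receives:

```ruby
# Bytes from the report: an unpaired high surrogate (00 D8) followed by
# U+3042 (42 30), tagged as UTF-16LE. Array#pack returns a mutable binary string.
broken = [0x00, 0xD8, 0x42, 0x30].pack("C*").force_encoding(Encoding::UTF_16LE)

yielded = []
result = broken.scrub do |invalid|
  yielded << invalid               # record what the block is given
  "?".encode(Encoding::UTF_16LE)   # replacement in the receiver's encoding, as in the report
end

p yielded.map(&:bytes)   # fixed interpreter: [[0, 216]], only the invalid code unit
p result.bytes           # fixed interpreter: [63, 0, 66, 48], i.e. "?\u3042" in UTF-16LE
```

On an unfixed interpreter the block is never called, so `yielded` stays empty and the result is "\uFFFD\u3042".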

And finally, it should probably be an error if both a block and a replacement string are given.
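A small sketch of that last point, assuming an interpreter that includes r57304 or a backport of it. The report does not name the exception class; `ArgumentError` is my assumption, based on what Ruby conventionally raises for conflicting arguments, and the names `replacement` and `outcome` are illustrative only:

```ruby
broken = [0x00, 0xD8, 0x42, 0x30].pack("C*").force_encoding(Encoding::UTF_16LE)
replacement = "?".encode(Encoding::UTF_16LE)

outcome =
  begin
    # Passing both a replacement string and a block is the ambiguous call.
    broken.scrub(replacement) { |_invalid| replacement }
    :accepted
  rescue ArgumentError => e   # assumed exception class, see the note above
    e.class
  end

p outcome   # ArgumentError if the call is rejected, :accepted if it is silently allowed
```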

Associated revisions

Revision 57302
Added by nobu (Nobuyoshi Nakada) almost 3 years ago

string.c: block for scrub with ASCII-incompatible

  • string.c (rb_enc_str_scrub): honor the given block with ASCII-incompatible encoding. [ruby-core:79039] [Bug #13120]

Revision 57303
Added by nobu (Nobuyoshi Nakada) almost 3 years ago

string.c: yield invalid part

  • string.c (rb_enc_str_scrub): yield the invalid part only with ASCII-incompatible. [ruby-core:79039] [Bug #13120]

Revision 90294641
Added by nobu (Nobuyoshi Nakada) almost 3 years ago

string.c: replacement and block

  • string.c (rb_enc_str_scrub): only one of replacement and block is allowed. [ruby-core:79038] [Bug #13119]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57304 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 57304
Added by nobu (Nobuyoshi Nakada) almost 3 years ago

string.c: replacement and block

  • string.c (rb_enc_str_scrub): only one of replacement and block is allowed. [ruby-core:79038] [Bug #13119]

Revision 9a663128
Added by naruse (Yui NARUSE) over 2 years ago

merge revision(s) 57302,57303,57304: [Backport #13119]

    string.c: block for scrub with ASCII-incompatible

    * string.c (rb_enc_str_scrub): honor the given block with
      ASCII-incompatible encoding.  [ruby-core:79039] [Bug #13120]
    string.c: yield invalid part

    * string.c (rb_enc_str_scrub): yield the invalid part only with
      ASCII-incompatible.  [ruby-core:79039] [Bug #13120]
    string.c: replacement and block

    * string.c (rb_enc_str_scrub): only one of replacement and block
      is allowed.  [ruby-core:79038] [Bug #13119]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_4@57855 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 57855
Added by naruse (Yui NARUSE) over 2 years ago

merge revision(s) 57302,57303,57304: [Backport #13119]

string.c: block for scrub with ASCII-incompatible

* string.c (rb_enc_str_scrub): honor the given block with
  ASCII-incompatible encoding.  [ruby-core:79039] [Bug #13120]
string.c: yield invalid part

* string.c (rb_enc_str_scrub): yield the invalid part only with
  ASCII-incompatible.  [ruby-core:79039] [Bug #13120]
string.c: replacement and block

* string.c (rb_enc_str_scrub): only one of replacement and block
  is allowed.  [ruby-core:79038] [Bug #13119]

Revision 8dc3d3fd
Added by usa (Usaku NAKAMURA) over 2 years ago

merge revision(s) 57302,57303,57304: [Backport #13119]

    string.c: block for scrub with ASCII-incompatible

    * string.c (rb_enc_str_scrub): honor the given block with
      ASCII-incompatible encoding.  [ruby-core:79039] [Bug #13120]
    string.c: yield invalid part

    * string.c (rb_enc_str_scrub): yield the invalid part only with
      ASCII-incompatible.  [ruby-core:79039] [Bug #13120]
    string.c: replacement and block

    * string.c (rb_enc_str_scrub): only one of replacement and block
      is allowed.  [ruby-core:79038] [Bug #13119]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_2@58091 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 58091
Added by usa (Usaku NAKAMURA) over 2 years ago

merge revision(s) 57302,57303,57304: [Backport #13119]

string.c: block for scrub with ASCII-incompatible

* string.c (rb_enc_str_scrub): honor the given block with
  ASCII-incompatible encoding.  [ruby-core:79039] [Bug #13120]
string.c: yield invalid part

* string.c (rb_enc_str_scrub): yield the invalid part only with
  ASCII-incompatible.  [ruby-core:79039] [Bug #13120]
string.c: replacement and block

* string.c (rb_enc_str_scrub): only one of replacement and block
  is allowed.  [ruby-core:79038] [Bug #13119]

Revision 39ee1e95
Added by nagachika (Tomoyuki Chikanaga) over 2 years ago

merge revision(s) 57302,57303,57304: [Backport #13119]

    string.c: block for scrub with ASCII-incompatible

    * string.c (rb_enc_str_scrub): honor the given block with
      ASCII-incompatible encoding.  [ruby-core:79039] [Bug #13120]
    string.c: yield invalid part

    * string.c (rb_enc_str_scrub): yield the invalid part only with
      ASCII-incompatible.  [ruby-core:79039] [Bug #13120]
    string.c: replacement and block

    * string.c (rb_enc_str_scrub): only one of replacement and block
      is allowed.  [ruby-core:79038] [Bug #13119]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_3@58175 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 58175
Added by nagachika (Tomoyuki Chikanaga) over 2 years ago

merge revision(s) 57302,57303,57304: [Backport #13119]

string.c: block for scrub with ASCII-incompatible

* string.c (rb_enc_str_scrub): honor the given block with
  ASCII-incompatible encoding.  [ruby-core:79039] [Bug #13120]
string.c: yield invalid part

* string.c (rb_enc_str_scrub): yield the invalid part only with
  ASCII-incompatible.  [ruby-core:79039] [Bug #13120]
string.c: replacement and block

* string.c (rb_enc_str_scrub): only one of replacement and block
  is allowed.  [ruby-core:79038] [Bug #13119]

History

Updated by nobu (Nobuyoshi Nakada) almost 3 years ago

  • Backport changed from 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN to 2.1: REQUIRED, 2.2: REQUIRED, 2.3: REQUIRED
  • Status changed from Open to Closed

Updated by Eregon (Benoit Daloze) almost 3 years ago

Thanks nobu for the amazingly quick fix!

Updated by naruse (Yui NARUSE) over 2 years ago

  • Backport changed from 2.1: REQUIRED, 2.2: REQUIRED, 2.3: REQUIRED to 2.1: REQUIRED, 2.2: REQUIRED, 2.3: REQUIRED, 2.4: DONE

ruby_2_4 r57855 merged revision(s) 57302,57303,57304.

Updated by usa (Usaku NAKAMURA) over 2 years ago

  • Backport changed from 2.1: REQUIRED, 2.2: REQUIRED, 2.3: REQUIRED, 2.4: DONE to 2.1: REQUIRED, 2.2: DONE, 2.3: REQUIRED, 2.4: DONE

ruby_2_2 r58091 merged revision(s) 57302,57303,57304.

Updated by nagachika (Tomoyuki Chikanaga) over 2 years ago

  • Backport changed from 2.1: REQUIRED, 2.2: DONE, 2.3: REQUIRED, 2.4: DONE to 2.1: REQUIRED, 2.2: DONE, 2.3: DONE, 2.4: DONE

ruby_2_3 r58175 merged revision(s) 57302,57303,57304.
