Bug #13119

String#scrub ignores the block if the string encoding is not ASCII-compatible

Added by Eregon (Benoit Daloze) over 3 years ago. Updated over 3 years ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
[ruby-core:79038]

Description

String#scrub completely ignores the block if the string encoding is not ASCII-compatible.
This does not seem intended and is counter-intuitive, since the block is called for ASCII-compatible strings.

"\x00\xD8\x42\x30".force_encoding(Encoding::UTF_16LE).scrub { |e| p e; "?".encode(Encoding::UTF_16LE) }

Gives

"\uFFFD\u3042"

But it should be

"\x00\xD8"
"?\u3042"
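
The expected behavior can be checked with a short script. This is only a sketch and assumes a Ruby build that already contains the fix; the variable names (`utf16`, `seen`, `scrubbed`, `ascii_compatible`) are illustrative and not part of the report. The last lines show the ASCII-compatible (UTF-8) case for comparison, where the block is called.

```ruby
# Sketch only: assumes a Ruby build that includes the fix for this bug.
# Build the bytes with pack so the string is mutable regardless of
# frozen-string-literal settings.
utf16 = [0x00, 0xD8, 0x42, 0x30].pack("C*").force_encoding(Encoding::UTF_16LE)

seen = [] # invalid chunks handed to the block
scrubbed = utf16.scrub do |bad|
  seen << bad
  "?".encode(Encoding::UTF_16LE)
end

p seen.map(&:bytes)                 # bytes of each chunk the block received
p scrubbed.encode(Encoding::UTF_8)  # transcoded to UTF-8 for readability

# Baseline: an ASCII-compatible (UTF-8) string with a truncated trailing character.
ascii_compatible = "abc\u3042\xE3\x80".scrub { |bad| "<#{bad.unpack1('H*')}>" }
p ascii_compatible
```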

Moreover, there is a bug in the String to be yielded to the block, string.c:9399:

repl = rb_yield(rb_enc_str_new(p, e-p, enc));

should be

repl = rb_yield(rb_enc_str_new(p, clen, enc));

That way the block receives only the invalid part, not the entire remainder of the string.
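
To illustrate the difference between `e-p` and `clen`, the following sketch scrubs a string containing two separate invalid sequences and records what the block receives each time. It assumes a Ruby build with the fix applied; `bytes`, `input`, `chunks`, and `output` are hypothetical names chosen for this example.

```ruby
# Sketch only: assumes a Ruby build that includes the fix.
# UTF-16LE code units: lone high surrogate (D800), U+3042,
# lone low surrogate (DC00), U+3044.
bytes = [0x00, 0xD8, 0x42, 0x30, 0x00, 0xDC, 0x44, 0x30]
input = bytes.pack("C*").force_encoding(Encoding::UTF_16LE)

chunks = [] # byte arrays of every chunk yielded to the block
output = input.scrub do |bad|
  chunks << bad.bytes
  "?".encode(Encoding::UTF_16LE)
end

# Each yielded chunk should be a single 2-byte invalid code unit,
# not everything from the invalid position to the end of the string.
p chunks
p output.encode(Encoding::UTF_8)
```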

And finally, it should probably be an error if both a block and a replacement string are given.

Updated by nobu (Nobuyoshi Nakada) over 3 years ago

  • Backport changed from 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN to 2.1: REQUIRED, 2.2: REQUIRED, 2.3: REQUIRED
  • Status changed from Open to Closed

Updated by Eregon (Benoit Daloze) over 3 years ago

Thanks nobu for the amazingly quick fix!

Updated by naruse (Yui NARUSE) over 3 years ago

  • Backport changed from 2.1: REQUIRED, 2.2: REQUIRED, 2.3: REQUIRED to 2.1: REQUIRED, 2.2: REQUIRED, 2.3: REQUIRED, 2.4: DONE

ruby_2_4 r57855 merged revision(s) 57302,57303,57304.

Updated by usa (Usaku NAKAMURA) over 3 years ago

  • Backport changed from 2.1: REQUIRED, 2.2: REQUIRED, 2.3: REQUIRED, 2.4: DONE to 2.1: REQUIRED, 2.2: DONE, 2.3: REQUIRED, 2.4: DONE

ruby_2_2 r58091 merged revision(s) 57302,57303,57304.

Updated by nagachika (Tomoyuki Chikanaga) over 3 years ago

  • Backport changed from 2.1: REQUIRED, 2.2: DONE, 2.3: REQUIRED, 2.4: DONE to 2.1: REQUIRED, 2.2: DONE, 2.3: DONE, 2.4: DONE

ruby_2_3 r58175 merged revision(s) 57302,57303,57304.
