Bug #13119

String#scrub ignores the block if the string encoding is not ASCII-compatible

Added by Eregon (Benoit Daloze) over 3 years ago. Updated over 3 years ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
[ruby-core:79038]

Description

String#scrub completely ignores the block if the string's encoding is not ASCII-compatible.
This does not seem intended, and it is counter-intuitive because the block is used for strings in ASCII-compatible encodings.

"\x00\xD8\x42\x30".force_encoding(Encoding::UTF_16LE).scrub { |e| p e; "?".encode(Encoding::UTF_16LE) }

Gives

"\uFFFD\u3042"

But it should print the yielded invalid part and return the string with the block's replacement:

"\x00\xD8"
"?\u3042"

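The expected behavior can be sketched as a small check. This is only an illustration, assuming a Ruby in which the block is honored for UTF-16LE strings; the variable names are made up for the example. The UTF-8 case is included for comparison, since the block is already used there.

```ruby
# Illustrative sketch, not part of the original report.
# Assumes a Ruby where String#scrub calls the block for UTF-16LE strings.

# UTF-16LE: 0xD800 (a lone high surrogate) followed by U+3042.
utf16_yielded = []
utf16 = "\x00\xD8\x42\x30".dup.force_encoding(Encoding::UTF_16LE)
utf16_result = utf16.scrub do |bad|
  utf16_yielded << bad              # record what the block receives
  "?".encode(Encoding::UTF_16LE)    # replacement in the same encoding
end

# UTF-8 (ASCII-compatible): \xFF is an invalid byte.
utf8_yielded = []
utf8_result = "a\xFFb".scrub do |bad|
  utf8_yielded << bad
  "?"
end

p utf16_yielded.map(&:bytes)
p utf16_result
p utf8_yielded.map(&:bytes)
p utf8_result
```

In both cases the block should receive only the invalid bytes, and its return value should take the place of those bytes in the result.
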
Moreover, there is a bug in the String to be yielded to the block, string.c:9399:

repl = rb_yield(rb_enc_str_new(p, e-p, enc));

should be

repl = rb_yield(rb_enc_str_new(p, clen, enc));

That way the block is yielded only the invalid part, not the whole rest of the string.
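
The difference between the two lengths can be illustrated in Ruby with plain byte offsets. This is a hypothetical sketch, not the C code: `p_off`, `e_off` and `clen` stand in for the C variables `p`, `e` and `clen`, and `clen` is assumed to be 2 (one UTF-16 code unit) for the example string above.

```ruby
# Hypothetical illustration of the two slice lengths; not the actual string.c logic.
bytes = "\x00\xD8\x42\x30".b   # raw bytes of the example string

p_off = 0                 # offset of the invalid sequence (stands in for p)
e_off = bytes.bytesize    # end of the string (stands in for e)
clen  = 2                 # assumed length of the invalid part

# Length e - p: everything from the invalid sequence to the end of the string.
remainder = bytes.byteslice(p_off, e_off - p_off)

# Length clen: only the invalid part.
invalid_only = bytes.byteslice(p_off, clen)

p remainder.bytes
p invalid_only.bytes
```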

And finally, it should probably be an error if both a block and a replacement string are given.
