Bug #7646
invalid byte sequence in String#each_line
Description
=begin
When a separator is passed to String#each_line, a string containing non-ASCII characters raises "invalid byte sequence".
$ ruby -ve '"\n\u0100".each_line("\n") {|l| p l }'
ruby 2.0.0dev (2013-01-02 trunk 38676) [i686-linux]
"\n"
-e:1:in `each_line': invalid byte sequence in UTF-8 (ArgumentError)
	from -e:1:in `<main>'
This appears to be a bug introduced by the changes around r38616.
--- string.c.org 2012-12-27 21:57:07.000000000 +0900
+++ string.c 2013-01-02 23:36:47.000000000 +0900
@@ -6199,14 +6199,14 @@
 	if (c == newline &&
 	    (rslen <= 1 ||
 	     (pend - p >= rslen && memcmp(RSTRING_PTR(rs), p, rslen) == 0))) {
-	    p += (rslen ? rslen : n);
-	    line = rb_str_subseq(str, s - ptr, p - s);
+	    const char *pp = p + (rslen ? rslen : n);
+	    line = rb_str_subseq(str, s - ptr, pp - s);
 	    if (wantarray)
 		rb_ary_push(ary, line);
 	    else
 		rb_yield(line);
 	    str_mod_check(str, ptr, len);
-	    s = p;
+	    s = pp;
 	}
 	p += n;
     }
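
A minimal regression check for the case above, assuming a build where each_line handles the separator correctly (for example, one with this patch applied). The helper name lines_with_separator is hypothetical and exists only for this sketch; it is not part of string.c or of the Ruby test suite.

```ruby
# Hypothetical helper for illustration: collect the lines that
# String#each_line yields for an explicit separator.
def lines_with_separator(str, sep)
  lines = []
  str.each_line(sep) { |l| lines << l }
  lines
end

# The reported case: a newline followed by a non-ASCII character (U+0100).
# On an affected build this raises ArgumentError instead of returning.
reported = lines_with_separator("\n\u0100", "\n")
# expected: ["\n", "\u0100"]

# A multi-byte separator followed by a non-ASCII character.
crlf = lines_with_separator("a\r\n\u0100b", "\r\n")
# expected: ["a\r\n", "\u0100b"]

# Every yielded line should remain valid UTF-8.
all_valid = (reported + crlf).all?(&:valid_encoding?)
```

On an affected build the first call is expected to fail with the ArgumentError shown in the report, so the check distinguishes the two behaviors.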
=end