Bug #15642

closed

IO#readline with the chomp: true option fails to strip the line separator correctly in some cases

Added by tomog105 (Tomohiro Ogoke) about 5 years ago. Updated about 5 years ago.

Status: Closed
Assignee: -
Target version: -
ruby -v: ruby 2.7.0dev (2019-03-06 trunk 67174) [x86_64-darwin18]
[ruby-core:91692]

Description

Summary

When a string is read with IO#readline and the chomp: true option,
a line that ends at a particular offset from the start of the input (specifically, n × 8,192 + 1 bytes)
has only "\n" removed, even though its line separator is "\r\n".
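
The offsets involved can be checked without touching IO at all. The following is a minimal sketch, assuming that 8,192 is the size of the read buffer (the report only states the multiple); `BUF_SIZE`, `sample_data`, and `separator_offsets` are hypothetical names introduced for illustration. It shows which 8,192-byte chunk holds the final "\r" and which holds the final "\n" for the data used in the reproduction script.

```ruby
# Assumption: 8,192 is the read buffer size; the report only quotes the multiple.
BUF_SIZE = 8192

# Same data as the reproduction script, built in memory (hypothetical helper).
def sample_data(i)
  "a" * ((BUF_SIZE * i) - 4) + "\r\n" + "a\r\n"
end

# Reports the total size and the index of the chunk that contains
# the last "\r" and the last "\n" (hypothetical helper).
def separator_offsets(data)
  cr = data.rindex("\r")
  lf = data.rindex("\n")
  { size: data.bytesize, cr_chunk: cr / BUF_SIZE, lf_chunk: lf / BUF_SIZE }
end

(1..3).each { |i| p separator_offsets(sample_data(i)) }
```

For every `i`, the last "\r" is the final byte of one chunk and the "\n" is the first byte of the next, which matches the n × 8,192 + 1 sizes listed in the results.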

The same behavior occurs with IO#each_line,
but it does not occur when "\r\n" is passed as rs, the first argument.
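
As a rough illustration of the rs variant described above, the sketch below passes "\r\n" explicitly to IO#each_line. `read_crlf_lines` is a hypothetical helper name, and the sample data is the i = 1 case of the reproduction script; this is one possible way to write it, not a recommended fix.

```ruby
require 'tempfile'

# Hypothetical helper: reads every line of `path`, passing "\r\n"
# explicitly as the record separator together with chomp: true.
def read_crlf_lines(path)
  File.open(path, "rb") do |f|
    f.each_line("\r\n", chomp: true).to_a
  end
end

Tempfile.open do |tmp|
  tmp.write("a" * (8192 - 4) + "\r\n" + "a\r\n")
  tmp.flush
  p read_crlf_lines(tmp.path).last
end
```

With an explicit separator, chomp removes that separator as a whole, so the split between "\r" and "\n" should not matter here.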

Reproduction code

require 'tempfile'

(1..10).each do |i|
  Tempfile.open do |tmp|
    tmp.write("a" * ((8192 * i) - 4) + "\r\n" + "a\r\n")
    tmp.flush
    p "size: #{tmp.size} result: " + File.open(tmp, "rb").readlines(chomp: true).last
  end
end

Results

I expect every run to return result: a, but
trunk (2.7.0dev), 2.6.1, 2.5.3, and 2.4.5 all return result: a\r.

$ RBENV_VERSION=2.7.0-dev ruby -v test_chomp.rb
ruby 2.7.0dev (2019-03-06 trunk 67174) [x86_64-darwin18]
"size: 8193 result: a\r"
"size: 16385 result: a\r"
"size: 24577 result: a\r"
"size: 32769 result: a\r"
"size: 40961 result: a\r"
"size: 49153 result: a\r"
"size: 57345 result: a\r"
"size: 65537 result: a\r"
"size: 73729 result: a\r"
"size: 81921 result: a\r"

$ RBENV_VERSION=2.6.1 ruby -v test_chomp.rb
ruby 2.6.1p33 (2019-01-30 revision 66950) [x86_64-darwin18]
"size: 8193 result: a\r"
"size: 16385 result: a\r"
"size: 24577 result: a\r"
"size: 32769 result: a\r"
"size: 40961 result: a\r"
"size: 49153 result: a\r"
"size: 57345 result: a\r"
"size: 65537 result: a\r"
"size: 73729 result: a\r"
"size: 81921 result: a\r"

$ RBENV_VERSION=2.5.3 ruby -v test_chomp.rb
ruby 2.5.3p105 (2018-10-18 revision 65156) [x86_64-darwin18]
"size: 8193 result: a\r"
"size: 16385 result: a\r"
"size: 24577 result: a\r"
"size: 32769 result: a\r"
"size: 40961 result: a\r"
"size: 49153 result: a\r"
"size: 57345 result: a\r"
"size: 65537 result: a\r"
"size: 73729 result: a\r"
"size: 81921 result: a\r"

$ RBENV_VERSION=2.4.5 ruby -v test_chomp.rb
ruby 2.4.5p335 (2018-10-18 revision 65137) [x86_64-darwin18]
"size: 8193 result: a\r"
"size: 16385 result: a\r"
"size: 24577 result: a\r"
"size: 32769 result: a\r"
"size: 40961 result: a\r"
"size: 49153 result: a\r"
"size: 57345 result: a\r"
"size: 65537 result: a\r"
"size: 73729 result: a\r"
"size: 81921 result: a\r"

Updated by alanwu (Alan Wu) about 5 years ago

I ran the repro script on my Mac and I get

"size: 8194 result: a"
"size: 16386 result: a"
"size: 24578 result: a"
"size: 32770 result: a"
"size: 40962 result: a"
"size: 49154 result: a"
"size: 57346 result: a"
"size: 65538 result: a"
"size: 73730 result: a"
"size: 81922 result: a"

This might be caused by IO#write not flushing to the file. The following might get you the desired output:

--- a/repro.rb
+++ b/repro1.rb
@@ -3,6 +3,7 @@ require 'tempfile'
 (1..10).each do |i|
   Tempfile.open do |tmp|
     tmp.write("a" * ((8192 * i) - 3) + "\r\n" + "a\r\n")
+    tmp.flush
     p "size: #{tmp.size} result: " + File.open(tmp, "rb").readlines(chomp: true).last
   end
 end

Updated by tomog105 (Tomohiro Ogoke) about 5 years ago

@alanwu (Alan Wu), thanks for your feedback.

Sorry, there was an error in the repro script.
The correct repro script, incorporating your feedback, is as follows.

repro script:

require 'tempfile'

(1..10).each do |i|
  Tempfile.open do |tmp|
    tmp.write("a" * ((8192 * i) - 4) + "\r\n" + "a\r\n")
    tmp.flush
    p "size: #{tmp.size} result: " + File.open(tmp, "rb").readlines(chomp: true).last
  end
end

results:

"size: 8193 result: a\r"
"size: 16385 result: a\r"
"size: 24577 result: a\r"
"size: 32769 result: a\r"
"size: 40961 result: a\r"
"size: 49153 result: a\r"
"size: 57345 result: a\r"
"size: 65537 result: a\r"
"size: 73729 result: a\r"
"size: 81921 result: a\r"

Updated by tomog105 (Tomohiro Ogoke) about 5 years ago

  • Description updated (diff)

Updated by nobu (Nobuyoshi Nakada) about 5 years ago

  • Backport changed from 2.4: UNKNOWN, 2.5: UNKNOWN, 2.6: UNKNOWN to 2.4: REQUIRED, 2.5: REQUIRED, 2.6: REQUIRED

Updated by nobu (Nobuyoshi Nakada) about 5 years ago

  • Status changed from Open to Closed

Applied in changeset trunk|r67188.


io.c: chomp CR at the end of read buffer

  • io.c (rb_io_getline_fast): chomp CR followed by LF but separated
    by the read buffer boundary. [ruby-core:91707] [Bug #15642]
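
The boundary case named in the commit message can be modelled in plain Ruby. The following is an illustrative sketch only, not the io.c change: `chunked_lines` and its `fix_boundary_cr` flag are hypothetical, and the 8,192-byte buffer size is an assumption taken from the offsets in the report. It reads fixed-size chunks and chomps each line, optionally handling a "\r" that ended the previous chunk.

```ruby
require 'stringio'

# Illustrative model of chunked line reading with chomp (not the io.c code).
# fix_boundary_cr toggles handling of a CR that is the last byte of one
# chunk when the matching LF is the first byte of the next chunk.
def chunked_lines(io, buf_size:, fix_boundary_cr:)
  lines = []
  line = String.new                 # bytes carried over from earlier chunks
  while (buf = io.read(buf_size))
    until buf.empty?
      nl = buf.index("\n")
      if nl.nil?
        line << buf                 # no LF in the rest of this chunk
        buf = ""
        next
      end
      piece = buf[0, nl]            # bytes before the LF in this chunk
      if piece.end_with?("\r")
        piece = piece.chomp("\r")   # CR and LF are in the same chunk
      elsif piece.empty? && fix_boundary_cr && line.end_with?("\r")
        line.chomp!("\r")           # CR ended the previous chunk
      end
      lines << line + piece
      line = String.new
      buf = buf[(nl + 1)..-1]
    end
  end
  lines << line unless line.empty?
  lines
end

data = "a" * (8192 - 4) + "\r\n" + "a\r\n"
p chunked_lines(StringIO.new(data), buf_size: 8192, fix_boundary_cr: false).last
p chunked_lines(StringIO.new(data), buf_size: 8192, fix_boundary_cr: true).last
```

In this model, the run without the boundary handling leaves the trailing "\r" on the last line, as in the reported results, and the run with it returns the line without the "\r".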

Updated by nagachika (Tomoyuki Chikanaga) about 5 years ago

  • Backport changed from 2.4: REQUIRED, 2.5: REQUIRED, 2.6: REQUIRED to 2.4: REQUIRED, 2.5: DONE, 2.6: REQUIRED

ruby_2_5 r67191 merged revision(s) 67188.

Updated by tomog105 (Tomohiro Ogoke) about 5 years ago

I confirmed that it was fixed, thanks.

$ RBENV_VERSION=2.7.0-dev ruby -v test_chomp.rb
ruby 2.7.0dev (2019-03-08 trunk 67194) [x86_64-darwin18]
"size: 8193 result: a"
"size: 16385 result: a"
"size: 24577 result: a"
"size: 32769 result: a"
"size: 40961 result: a"
"size: 49153 result: a"
"size: 57345 result: a"
"size: 65537 result: a"
"size: 73729 result: a"
"size: 81921 result: a"

Updated by naruse (Yui NARUSE) about 5 years ago

  • Backport changed from 2.4: REQUIRED, 2.5: DONE, 2.6: REQUIRED to 2.4: REQUIRED, 2.5: DONE, 2.6: DONE

ruby_2_6 r67207 merged revision(s) 67188.

Updated by usa (Usaku NAKAMURA) about 5 years ago

  • Backport changed from 2.4: REQUIRED, 2.5: DONE, 2.6: DONE to 2.4: DONE, 2.5: DONE, 2.6: DONE

ruby_2_4 r67391 merged revision(s) 67188.
