Project

General

Profile

Actions

Bug #18768

open

Inconsistent behavior of IO, StringIO and String each_line methods when return paragraph and chomp: true passed

Added by andrykonchin (Andrew Konchin) about 2 months ago. Updated 17 days ago.

Status:
Open
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:108500]

Description

In IO, StringIO and String methods that return lines (e.g. each, each_line, gets, readline, readlines) behave in a different way when the following parameters are passed

  • separator is "" and
  • chomp is true.

They truncate the new line characters between paragraphs and the trailing one differently:

"a\n\nb\n\nc\n".each_line("", chomp: true).to_a
#=> ["a\n", "b\n", "c\n"]

StringIO.new("a\n\nb\n\nc\n").each_line("", chomp: true).to_a
#=> ["a\n", "b\n", "c"]

File.open('chomp.txt').each_line("", chomp: true).to_a
#=> ["a", "b", "c\n"]

The text file content is the same:

File.read('chomp.txt')
#=> "a\n\nb\n\nc\n"

Expected behavior - they return the same result.


Related issues 1 (1 open0 closed)

Related to Ruby master - Bug #18770: Inconsistent behavior of IO/StringIO's each methods when called with nil as a separator, limit and chomp: trueOpenActions
Actions #1

Updated by andrykonchin (Andrew Konchin) about 2 months ago

  • Description updated (diff)
Actions #2

Updated by jeremyevans0 (Jeremy Evans) about 1 month ago

  • Related to Bug #18770: Inconsistent behavior of IO/StringIO's each methods when called with nil as a separator, limit and chomp: true added

Updated by jeremyevans0 (Jeremy Evans) about 1 month ago

I agree that the behavior for all of these methods should be the same. I think the IO behavior makes the most sense, and String and StringIO should be changed to match it.

I submitted pull requests to Ruby (https://github.com/ruby/ruby/pull/5960) and StringIO (https://github.com/ruby/stringio/pull/29) to fix this.

Actions #4

Updated by jeremyevans (Jeremy Evans) 28 days ago

  • Status changed from Open to Closed

Applied in changeset git|609d73e8925f807786686caf635178bb1de74256.


[ruby/stringio] Fix handling of chomp with paragraph separator

Try to mirror IO behavior, where chomp takes out the entire paragraph
separators between entries, but does not chomp a single line separator
at the end of the string.

Partially Fixes [Bug #18768]

https://github.com/ruby/stringio/commit/a83ddbb7f0

Actions #5

Updated by jeremyevans0 (Jeremy Evans) 21 days ago

  • Status changed from Closed to Open

Updated by mame (Yusuke Endoh) 18 days ago

I have no strong opinion about this issue and am never against Jeremy's change, but I'm just curious. I thought that each_line(..., chomp: true) {|s| ... } was equal to each_line(...) {|s| s = s.chomp; ... }. In the above examples, StringIO's result (["a\n", "b\n", "c"]) was the simplest to me. @jeremyevans0 (Jeremy Evans) How did you think the IO behavior made the most sense?

Updated by jeremyevans0 (Jeremy Evans) 18 days ago

mame (Yusuke Endoh) wrote in #note-6:

I have no strong opinion about this issue and am never against Jeremy's change, but I'm just curious. I thought that each_line(..., chomp: true) {|s| ... } was equal to each_line(...) {|s| s = s.chomp; ... }. In the above examples, StringIO's result (["a\n", "b\n", "c"]) was the simplest to me. @jeremyevans0 (Jeremy Evans) How did you think the IO behavior made the most sense?

My thought here is that chomp: true, does not mean chomp with no arguments, it means chomp with the separator used for the each_line. So when you call each_line with '' (paragraph separator), it only chomps the paragraph separator. That's what the IO behavior already was, and I thought it best to make String/StringIO consistent with that.

Updated by mame (Yusuke Endoh) 17 days ago

@jeremyevans0 (Jeremy Evans) Thank you for your explanation, I understand. The keyword name "chomp" might be a bit confusing.

Updated by jeremyevans0 (Jeremy Evans) 17 days ago

mame (Yusuke Endoh) wrote in #note-8:

@jeremyevans0 (Jeremy Evans) Thank you for your explanation, I understand. The keyword name "chomp" might be a bit confusing.

I agree, the behavior is not obvious from the name chomp. However, it is obvious that chomp is supposed to remove the separator and not the generic newline. You can see this when providing an explicit line ending character:

'abc'.each_line('b', chomp: true).to_a
# => ["a", "c"]

require 'stringio'
StringIO.new('abc').each_line('b', chomp: true).to_a
# => ["a", "c"]

r, w = IO.pipe
w.write('abc')
w.close
r.each_line('b', chomp: true).to_a
# => ["a", "c"]
Actions

Also available in: Atom PDF