Feature #15588

String#each_chunk and #chunks

Added by Glass_saga (Masaki Matsushita) 3 months ago. Updated 3 months ago.

Status:
Open
Priority:
Normal
Assignee:
-
Target version:
[ruby-core:91414]

Description

String#each_chunk iterates over chunks of the specified size in a String.
String#chunks is a shorthand for str.each_chunk(n).to_a.

present:

str = <<EOS
20190101 20190102
20190103 20190104
EOS

str.scan(/.{1,9}/m) do |chunk|
  p chunk #=> "20190101 "
end

str.scan(/.{1,9}/m) do |chunk|
  chunk.strip!
  p chunk #=> "20190101"
end

str.scan(/.{1,9}/m) #=> ["20190101 ", "20190102\n", "20190103 ", "20190104\n"]
str.scan(/.{1,9}/m).map(&:strip) #=> ["20190101", "20190102", "20190103", "20190104"]

proposal:

str = <<EOS
20190101 20190102
20190103 20190104
EOS

str.each_chunk(9) do |chunk|
  p chunk #=> "20190101 "
end

str.each_chunk(9, strip: true) do |chunk|
  p chunk #=> "20190101"
end

str.chunks(9) #=> ["20190101 ", "20190102\n", "20190103 ", "20190104\n"]
str.chunks(9, strip: true) #=> ["20190101", "20190102", "20190103", "20190104"]
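
The proposed behaviour can be approximated in plain Ruby. The following is only a sketch: the module name ChunkSketch and its function-style signatures are hypothetical, chosen to avoid reopening String, and it does not reproduce the attached patch.diff.

```ruby
# Hypothetical pure-Ruby approximation of the proposed methods.
# ChunkSketch is an illustrative name, not part of Ruby or of the patch.
module ChunkSketch
  module_function

  # Yields successive substrings of at most +size+ characters.
  # Returns an Enumerator when no block is given.
  def each_chunk(str, size, strip: false)
    raise ArgumentError, "size must be positive" unless size.positive?

    enum = Enumerator.new do |yielder|
      # Walk the character offsets 0, size, 2 * size, ...
      (0...str.length).step(size) do |offset|
        chunk = str[offset, size]
        yielder << (strip ? chunk.strip : chunk)
      end
    end
    return enum unless block_given?

    enum.each { |chunk| yield chunk }
    str
  end

  # Shorthand for each_chunk(...).to_a, mirroring the proposed String#chunks.
  def chunks(str, size, strip: false)
    each_chunk(str, size, strip: strip).to_a
  end
end
```

With the str from the description, ChunkSketch.chunks(str, 9) gives the same four 9-character pieces as the scan version, and the last chunk is simply shorter when the length is not a multiple of the size.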

Files

patch.diff (6.56 KB) patch.diff Glass_saga (Masaki Matsushita), 02/06/2019 01:35 AM

History

Updated by shyouhei (Shyouhei Urabe) 3 months ago

Why is the String#scan example you showed not suitable for you? Tell us what would make you happier with the proposal.

Updated by mame (Yusuke Endoh) 3 months ago

I like the proposal itself. I don't think that chunks is a good name, though.

To take every n characters, I often write str.scan(/.{1,#{ n }}/m), but it looks a bit cryptic. In this case str.chunks(n) is simpler.

I dislike strip: true. It is too ad hoc. Does it also support lstrip: true, rstrip: true, chop: true, chomp: true, etc.? In principle, one method should do one thing, IMO.
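
To illustrate the "one method does one thing" point with methods that already exist: chunking and trimming can be kept as separate steps, so any of strip, rstrip, chomp and so on composes without a dedicated keyword. This is a sketch using each_char and each_slice as a stand-in for the proposed chunking method.

```ruby
str = "20190101 20190102\n20190103 20190104\n"

# Step 1: split into pieces of at most 9 characters.
pieces = str.each_char.each_slice(9).map(&:join)
#=> ["20190101 ", "20190102\n", "20190103 ", "20190104\n"]

# Step 2: apply whichever trimming method is wanted.
stripped = pieces.map(&:strip)
#=> ["20190101", "20190102", "20190103", "20190104"]

chomped = pieces.map(&:chomp)
#=> ["20190101 ", "20190102", "20190103 ", "20190104"]
```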

Updated by sawa (Tsuyoshi Sawada) 3 months ago

I am also not so sure whether this feature is needed. But if I wanted such a feature, I would ask to let String#scan take arguments similar to those of String#[]. That is, let the first argument point to the starting position, and let an optional second argument be the length. Since we want to capture multiple matches, unlike with [], passing a single index as the first argument does not make much sense, but now we have Enumerator::ArithmeticSequence. So we should be able to do:

str.scan((0..).step(9)) #=> ["20190101 ", "20190102\n", "20190103 ", "20190104\n"]
str.scan((0..).step(9), 8) #=> ["20190101", "20190102", "20190103", "20190104"]
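
String#scan does not accept such arguments today, but the idea can be tried out with a small helper. The name slices_at is hypothetical, and defaulting the length to the sequence's step is an assumption made here so that the first example above yields 9-character pieces.

```ruby
# Hypothetical helper emulating the suggested scan(offsets, length) form.
# offsets is an Enumerator::ArithmeticSequence such as (0..).step(9).
def slices_at(str, offsets, length = offsets.step)
  result = []
  offsets.each do |offset|
    # Stop once the (possibly endless) sequence runs past the string.
    break if offset >= str.length

    result << str[offset, length]
  end
  result
end

str = "20190101 20190102\n20190103 20190104\n"

slices_at(str, (0..).step(9))
#=> ["20190101 ", "20190102\n", "20190103 ", "20190104\n"]
slices_at(str, (0..).step(9), 8)
#=> ["20190101", "20190102", "20190103", "20190104"]
```

Note that the second call drops the separators by slicing 8 characters out of every 9 rather than by stripping, so it only fits input with fixed-width fields.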

Updated by naruse (Yui NARUSE) 3 months ago

This requires a more concrete real-world example.