Feature #4615: Add IO#split and iterative form of String#split - Ruby - Ruby Issue Tracking System

Feature #4615

closed

Add IO#split and iterative form of String#split

Added by yimutang (Joey Zhou) over 14 years ago. Updated over 14 years ago.

Status: Rejected
Assignee: matz (Yukihiro Matsumoto)
Target version:

[ruby-core:35896]

Description

=begin
file.each_line(sep=$/) {|line| ... } can be used to iterate on lines.

But the separator can only be a string or nil; it cannot be a regexp.

Sometimes I may want to iterate over "sentences", which are strings separated by (simply put) the punctuation characters ".;!?".

So if I can write it like this:

file.split(/[.;!?]/) {|sentence| ... }

I think it will be very convenient.

You may say I can write it like this:

file.gets(nil).split(/[.;!?]/).each {|sentence| ... }

But this code will (1) slurp in the whole file and (2) create a temporary array. If the file is a big one, those two steps seem both expensive and unnecessary.

So I suggest a flexible IO#split: (also available for File and ARGF)

io.split(pattern=$/) {|field|...} -> io # default pattern is $/, not $;
io.split(pattern=$/) -> enumerator # not array

(I think adding a new method is better, rather than modifying the IO#each_line, making it accept regexp as argument.)

Well, String#split has only one form:

str.split(pattern=$;, limit=0) -> array

Maybe add an iterative form, used when a block is given:

str.split(pattern=$;, limit=0) {|field| ... } -> str

Joey Zhou
=end
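A minimal sketch of how the proposed streaming behavior might be approximated in plain Ruby, without slurping the whole input. The helper name `each_split` is hypothetical (it is not an existing Ruby method), and the sketch assumes that the input contains line breaks so `IO#each_line` can deliver it in pieces, and that the pattern consumes at least one character.

```ruby
require "stringio"

# Hypothetical helper, not part of Ruby: yields the fields of +io+ that are
# separated by the regexp +pattern+, buffering only the unfinished field.
def each_split(io, pattern)
  buffer = +""
  io.each_line do |line|
    buffer << line
    while (m = pattern.match(buffer))
      # A match touching the end of the buffer might still grow once more
      # input arrives, so wait for the next line before emitting it.
      break if m.end(0) == buffer.length
      yield m.pre_match
      buffer = m.post_match
    end
  end
  # At end of input, split whatever is left in the buffer.
  buffer.split(pattern).each { |field| yield field }
  io
end

fields = []
each_split(StringIO.new("One. Two;\nThree! Four"), /[.;!?]/) { |f| fields << f }
```

For this input the collected fields match what `String#split` returns for the same text, but only one unfinished field is held in memory at a time.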

Related issues 1 (0 open — 1 closed)

#1 [ruby-core:35898]

Updated by naruse (Yui NARUSE) over 14 years ago

Status changed from Open to Assigned
Assignee set to matz (Yukihiro Matsumoto)

#2 [ruby-core:35901]

Updated by naruse (Yui NARUSE) over 14 years ago

=begin
Just a thought,

String#split drops the separator.
In this use case, do you want to drop the separator?

Anyway, on 1.9.2, StringScanner#scan_until seems to be what you want.
=end
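As an illustration of the StringScanner suggestion, here is a small sketch that walks a string sentence by sentence and keeps the terminating punctuation. The sample text and variable names are made up for the example; it operates on an in-memory string, not on an IO.

```ruby
require "strscan"

text = "First sentence. Second one! A third?"
scanner = StringScanner.new(text)

sentences = []
# scan_until returns the text from the current position up to and including
# the match, or nil when the pattern is not found in the rest of the string.
while (piece = scanner.scan_until(/[.;!?]/))
  sentences << piece.strip
end
# Keep any trailing text that has no terminating punctuation.
sentences << scanner.rest.strip unless scanner.eos?
```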

#3 [ruby-core:35918]

Updated by yimutang (Joey Zhou) over 14 years ago

=begin
Yes, I've made a mistake. The split regexp should be /(?<=[.;!?])/ if I want to iterate on "sentences".

Well, the key points here are: (1) a more flexible separator; (2) an iterative idiom.

Ruby is just like Perl. $/ in Perl is just a string too. I saw in perldoc that "the value of $/ is a string, not a regex. awk has to be better for something." Maybe awk can set the record separator to a regexp?

So, if there were an IO#split, or an IO#each that could take a regexp as separator, I think it would be powerful.

The IO and String classes already share a few methods (#each_line, #each_char, #each_byte), so maybe a #split for IO is OK.

And I think (({str.split(pattern) {|field| ...}})) is a pure Ruby idiom:)

Thank you for telling me that StringScanner has such a method. I'm not familiar with the standard libs. Thank you:)
=end
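The difference between the two regexps discussed above can be seen in a short example. The sample text is made up; the point is only that a plain character class drops the punctuation, while the zero-width lookbehind splits after it and so keeps it.

```ruby
text = "Hello there. How are you? Fine!"

# Splitting on the punctuation itself drops the separators.
dropped = text.split(/[.;!?]/)

# Splitting on the empty position right after a punctuation mark
# leaves each mark attached to its sentence.
kept = text.split(/(?<=[.;!?])/)
```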

#4 [ruby-core:35919]

Updated by nobu (Nobuyoshi Nakada) over 14 years ago

=begin
Use scanf.rb.
=end

#5 [ruby-core:35929]

Updated by matz (Yukihiro Matsumoto) over 14 years ago

Status changed from Assigned to Rejected

=begin
Use scanf, or read then split. Besides, File#split does not describe the method's behavior (read, then split) well. It makes me feel that it splits the file contents into several files.

matz.

=end

#6 [ruby-core:35930]

Updated by yimutang (Joey Zhou) over 14 years ago

=begin
Well, how about (({string.split {|field| ... }})) ?
=end

#7 [ruby-core:35931]

Updated by nobu (Nobuyoshi Nakada) over 14 years ago

=begin
It should be another feature.
=end
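For reference, the iterative form of String#split asked about above can be sketched without building a temporary array. The helper name `each_field` is hypothetical, and the sketch assumes a pattern that consumes at least one character. Like String#split with the default limit, it does not yield a trailing empty field. To my knowledge, later Ruby releases (2.6 and newer) accept a block on String#split directly, so a helper like this is mainly of interest on older versions.

```ruby
# Hypothetical helper, not part of Ruby: yields each field of +str+
# separated by the regexp +pattern+, then returns +str+.
def each_field(str, pattern)
  pos = 0
  while (m = pattern.match(str, pos))
    break if m.end(0) == m.begin(0) # zero-width matches are not handled here
    yield str[pos...m.begin(0)]
    pos = m.end(0)
  end
  # Yield the remainder unless the string ended with a separator.
  yield str[pos..-1] if pos < str.length
  str
end

collected = []
result = each_field("a,b,,c,", /,/) { |field| collected << field }
```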
