Feature #6802

String#scan should have equivalent yielding MatchData

Added by Ilya Vorontsov over 3 years ago. Updated about 3 years ago.

Status: Assigned
Priority: Normal
[ruby-core:46801]

Description

Ruby should have a method that returns not an array of arrays but an array of MatchData objects. This would help with accessing named groups:

pattern = /x: (?<x>\d+) y:(?<y>\d+)/
polygon = []
text.scan_for_pattern(pattern){|m| polygon << Point.new(m[:x], m[:y]) }

To avoid breaking existing code, we need a distinct name. Ideas? Maybe #each_match.
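
For illustration, here is a small sketch of the gap being described. The sample text and the spacing of the pattern are borrowed from a later comment in this thread, not from the description itself, so treat them as assumptions.

```ruby
# Sample input and pattern taken from a later comment in the thread (assumption).
text = "x:1 y:12 ; x:33 y:2"
pattern = /x:(?<x>\d+) y:(?<y>\d+)/

# String#scan returns one plain Array of captures per match,
# so the groups can only be addressed by position.
rows = text.scan(pattern)
first_x = rows.first[0]

# String#match does return MatchData with named access,
# but only for the first match in the string.
m = text.match(pattern)
named_x = m[:x]
```

Here `rows` holds one capture array per match, while `m` gives named access for a single match only; the request is for a method that combines the two.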


Related issues

Related to Ruby trunk - Feature #5749: new method String#match_all needed Assigned 12/12/2011
Related to Ruby trunk - Feature #5606: String#each_match(regexp) Feedback 11/10/2011

History

#1 [ruby-core:46802] Updated by Ilya Vorontsov over 3 years ago

Simple implementation:

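class String
  def each_match(pattern)
    # Return an enumerator when no block is given.
    return enum_for(:each_match, pattern) unless block_given?
    text = self
    while (m = text.match(pattern))
      yield m
      # An empty match at offset 0 would leave the text unchanged and loop forever.
      break if m.end(0).zero?
      # Continue searching in the text after this match.
      text = text[m.end(0)..-1]
    end
  end
end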
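
One side effect of slicing the string is that every MatchData reports offsets relative to the remaining substring, and anchors or lookbehinds see only that substring. A possible variant, sketched here under the assumption that whole-string offsets are wanted, passes a start position to String#match instead. The helper name `each_match_with_offsets` is hypothetical and does not come from the thread.

```ruby
# Hypothetical helper (name is an assumption): yields each MatchData with
# offsets measured against the whole string rather than a sliced remainder.
def each_match_with_offsets(text, pattern)
  pos = 0
  # String#match accepts a start position as its second argument.
  while (m = text.match(pattern, pos))
    yield m
    # Step one character past an empty match so the loop always advances.
    pos = m.end(0) == m.begin(0) ? m.end(0) + 1 : m.end(0)
  end
end

starts = []
each_match_with_offsets("x:1 y:12 ; x:33 y:2", /x:(?<x>\d+)/) do |m|
  starts << m.begin(0)
end
```

With this variant `m.begin(0)`, `m.pre_match` and similar accessors refer to the original string, which is usually what a caller expects from a scan-like method.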

#2 [ruby-core:46806] Updated by Benoit Daloze over 3 years ago

You can use String#scan with the block form and $~ (as well as other Regexp-related globals) for this:

> text="x:1 y:12 ; x:33 y:2"
> text.scan(/x:(?<x>\d+) y:(?<y>\d+)/) { p [$~[:x],$~[:y]] }
["1", "12"]
["33", "2"]

Please check your Regexp and give an example of text next time.
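
The same block form can also collect the MatchData objects themselves rather than printing from them. This is a minimal sketch built on the example above; the variable names are illustrative only.

```ruby
text = "x:1 y:12 ; x:33 y:2"
pattern = /x:(?<x>\d+) y:(?<y>\d+)/

# Inside the block, $~ holds the MatchData for the current match.
matches = []
text.scan(pattern) { matches << $~ }

# Named groups and match positions are both available afterwards.
coords = matches.map { |m| [m[:x], m[:y]] }
starts = matches.map { |m| m.begin(0) }
```

Each collected object is a full MatchData, so offsets and pre/post-match text remain available in addition to the named groups.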

#3 [ruby-core:46852] Updated by Ilya Vorontsov over 3 years ago

Thank you for the solution! I always forget about the regexp global variables. Though I suggest that using a special method here is more clear. So what'd you say about String#each_match and Regexp#each_match?
Yes, the implementation is as simple as:

class String
  def each_match(pat)
    scan(pat){ yield $~ }
  end
end

and similar for Regexp.
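
A possible shape for the Regexp counterpart is sketched below. Regexp#each_match is the name proposed in this thread, not an existing core method, and returning an enumerator when no block is given is an added assumption.

```ruby
class Regexp
  # Hypothetical Regexp#each_match as proposed in this thread (not core Ruby).
  def each_match(str)
    # Assumption: behave like other iterators and return an enumerator
    # when called without a block.
    return enum_for(:each_match, str) unless block_given?

    # Delegate to String#scan and hand the current MatchData to the caller.
    str.scan(self) { yield $~ }
    self
  end
end

numbers = /\d+/.each_match("a1b22").map { |m| m[0] }
```

Because the method returns an enumerator without a block, it composes with `map`, `select` and the other Enumerable methods.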

#4 [ruby-core:46858] Updated by Thomas Sawyer over 3 years ago

+1 I have definitely used this before (as Facets' #mscan).

#5 [ruby-core:46861] Updated by Benoit Daloze over 3 years ago

prijutme4ty (Ilya Vorontsov) wrote:

Though I suggest that using a special method here is more clear.
So what'd you say about String#each_match and Regexp#each_match

I did indeed somewhat expect String#scan to yield a MatchData object instead of $~.captures.
I'm in favor of String#each_match: it might be a nice addition and the name is clear. However, the naming differs from the usual regexp methods on String, and it might not be worth adding a method (I agree $~ is not the prettiest thing around).

I think Regexp#each_match does not convey well what it does though.

#6 [ruby-core:47059] Updated by Tomoaki Nishiyama over 3 years ago

+1 to have a method to return MatchData.
This is related to (or duplicate of) #5749 and #5606.

Even though the implementation is simple, I think it is worth establishing a standard
name and specification.

#7 [ruby-core:49758] Updated by Yusuke Endoh about 3 years ago

  • Target version set to next minor
  • Assignee set to Yukihiro Matsumoto
  • Status changed from Open to Assigned
