Feature #13890: Allow a regexp as an argument to 'count', to count more interesting things than single characters - Ruby - Ruby Issue Tracking System

Feature #13890

Allow a regexp as an argument to 'count', to count more interesting things than single characters

Added by duerst (Martin Dürst) almost 8 years ago. Updated over 2 years ago.

Status: Open
Assignee:
Target version:

[ruby-core:82743]

Description

Currently, String#count only accepts strings, which it treats as sets of characters, and counts the characters in the receiver that belong to those sets.
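
As a reminder of the existing behavior, here is a minimal sketch (the sample strings are arbitrary). The argument is read as a set of characters, with `a-k` style ranges and `^` negation, so a multi-character argument is not treated as a substring:

```ruby
s = "hello world"

counts = {
  set:     s.count("lo"),  # characters that are 'l' or 'o'
  range:   s.count("a-k"), # characters in the range a..k ('h', 'e', 'd')
  negated: s.count("^l")   # every character except 'l'
}

# "an" is a character set here, so this counts every 'a' and every 'n',
# not the occurrences of the substring "an".
set_count       = "banana".count("an")
substring_count = "banana".scan("an").size
```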

However, I have repeatedly met the situation where I wanted to count more interesting things in strings.
These 'interesting things' can easily be expressed with regular expressions.

Here is a quick-and-dirty Ruby-level implementation:

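class String
  alias old_count count

  # Accepts a character-set String (the existing behavior) or a Regexp.
  def count(what)
    case what
    when String
      old_count(what)
    when Regexp
      pos = -1
      n = 0
      # Restart the search one character after each match start,
      # so overlapping occurrences are counted too.
      n += 1 while (pos = index(what, pos + 1))
      n
    end
  end
end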

Please note that the implementation counts overlapping occurrences; maybe there is room for an option like overlap: :no.

Related issues 1 (0 open — 1 closed)

#1 [ruby-core:82745]

Updated by Eregon (Benoit Daloze) almost 8 years ago

Should it behave the same as str.scan(regexp).size?

I think the default should be no overlap, and increment the position by the length of the match.
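
A rough sketch of what "increment the position by the length of the match" could look like at the Ruby level. The helper name `count_matches` is hypothetical and only for illustration; it is not an existing or proposed API. The extra one-character step after an empty match is an assumption made here so that the loop terminates and the result lines up with `scan`:

```ruby
# Hypothetical helper: counts non-overlapping matches of re in str.
def count_matches(str, re)
  pos = 0
  n = 0
  while (m = re.match(str, pos))
    n += 1
    # Resume right after the match; after an empty match, step one
    # character further so the same position is not matched again.
    pos = m.end(0) == m.begin(0) ? m.end(0) + 1 : m.end(0)
  end
  n
end
```

For inputs such as `count_matches("aaaa", /aa/)` this should agree with `"aaaa".scan(/aa/).size`, without building the intermediate Array.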

#2 [ruby-core:82746]

Updated by duerst (Martin Dürst) almost 8 years ago

Eregon (Benoit Daloze) wrote:

I think the default should be no overlap, and increment the position by the length of the match.

That would be fine by me, too.

#3 [ruby-core:84559]

Updated by duerst (Martin Dürst) over 7 years ago

Python allows counting substrings, as follows:

str.count(sub[, start[, end]])
Return the number of non-overlapping occurrences of substring sub in the range [start, end]. Optional arguments start and end are interpreted as in slice notation.

Updated by duerst (Martin Dürst) over 6 years ago

Related to Feature #12698 ("Method to delete a substring by regex match") added

#5 [ruby-core:111366]

Updated by shan (Shannon Skipper) over 2 years ago

I'd love to have this feature. str.count(regexp) is something I see folks trying fairly often. It would also avoid the intermediate Array of str.scan(regexp).size and the back-bending of str.enum_for(:scan, regexp).count.
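
For comparison, a small sketch of the workarounds available today (the sample string and regexp are arbitrary). All three produce the same count; only the first allocates an Array holding every match:

```ruby
s  = "abcdefghij"
re = /.{3}/

# Builds an Array of all matches, then asks for its size.
via_array = s.scan(re).size

# Wraps scan in an Enumerator and counts the yielded matches.
via_enum = s.enum_for(:scan, re).count

# Block form of scan: no result Array, but the counter is managed by hand.
via_block = 0
s.scan(re) { via_block += 1 }
```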

#6 [ruby-core:111897]

Updated by matz (Yukihiro Matsumoto) over 2 years ago

If str.count(re) works as str.scan(re).size (besides efficiency), it's acceptable. But if someone needs overlapping, they need to explain their use-case.

Matz.

#7 [ruby-core:111903]

Updated by sawa (Tsuyoshi Sawada) over 2 years ago

Overlapping can be realized by putting the original regexp within a look-ahead.

s = "abcdefghij"
re = /.{3}/

Non-overlapping count:

s.scan(re).count # => 3
s.count(re) # => Expect 3

Overlapping count:

s.scan(/(?=#{re})/).count # => 8
s.count(/(?=#{re})/) # => Expect 8

So I do not think there is any particular need to implement overlapping as a feature of this method.
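
To make the comparison easy to try, here is a sketch that wraps both counts in a single helper. The name `count_re` and the `overlap:` keyword are hypothetical, used only for illustration, and the helper is built on `scan` rather than on any new String#count behavior. Note that the look-ahead form counts at most one match per starting position:

```ruby
# Hypothetical helper: counts matches of re in str via scan.
# With overlap: true, re is wrapped in a look-ahead so every starting
# position where re matches is counted.
def count_re(str, re, overlap: false)
  pattern = overlap ? /(?=#{re})/ : re
  str.scan(pattern).size
end

s  = "abcdefghij"
re = /.{3}/

non_overlapping = count_re(s, re)
overlapping     = count_re(s, re, overlap: true)
```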
