Feature #15496

Extract between string as standard String api

Added by macdevign (Macdevign mac) almost 6 years ago. Updated almost 6 years ago.

Status:
Open
Assignee:
-
Target version:
-
[ruby-core:90851]

Description

I could not find a simple String API to extract the text between two strings, and I notice that many people face the same issue and end up rolling their own solutions (e.g. https://stackoverflow.com/questions/9661478/how-to-return-the-substring-of-a-string-between-two-strings-in-ruby).

Given that string "between" extraction is such a common operation, would adding a focused, simple String method make coding more pleasant?

This is my solution, but someone can probably provide a better and more efficient implementation.

  # self: String instance
  # from: String -> first occurrence of the 'from' delimiter
  # to: String -> first occurrence of the 'to' delimiter found after 'from'
  # ignore_case: Boolean -> true for a case-insensitive search
  # from_index: Integer -> position at which to start searching for 'from'
  # Returns the text between 'from' and 'to' (delimiters excluded), or nil if either is not found.

    def between(from, to, ignore_case: false, from_index: 0)
      # Search in a downcased copy, but slice the original so the result keeps its case.
      # Assumes downcase preserves character positions (true for ASCII text).
      haystack = self
      haystack, from, to = downcase, from.downcase, to.downcase if ignore_case

      from_idx = haystack.index(from, from_index)&.+(from.length)
      if from_idx
        to_idx = haystack.index(to, from_idx)
        return self[from_idx...to_idx] if to_idx
      end
      nil
    end

Test case

"Hello world".between("ell", "ld") => "o wor"
"Testing 123".between("Te", "123") => "sting "
"Testing 123".between("te", "123") => nil
"Testing 123".between("te", "123", ignore_case: true) => "sting "
"Testing 123".between("te", "123", from_index: 3) => nil

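For comparison, the core behaviour (without the ignore_case and from_index options) can also be sketched with String#partition, which avoids the index arithmetic. The helper name between_via_partition is hypothetical and used only for illustration, and the sketch assumes both delimiters are non-empty.

```ruby
# Hypothetical helper, not part of Ruby: a sketch of "between" built on String#partition.
# Assumes both delimiters are non-empty strings.
def between_via_partition(str, from, to)
  # partition returns [before, separator, after]; separator is "" when nothing matched.
  _, from_sep, rest = str.partition(from)
  return nil if from_sep.empty?

  middle, to_sep, _ = rest.partition(to)
  to_sep.empty? ? nil : middle
end

between_via_partition("Hello world", "ell", "ld")  # => "o wor"
between_via_partition("Testing 123", "te", "123")  # => nil
```
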
Updated by macdevign (Macdevign mac) almost 6 years ago

  • Description updated (diff)

Updated by shevegen (Robert A. Heiler) almost 6 years ago

Hmmm. I don't have a preference here since I can find arguments both for and against
this. It is fairly easy to use a regex, so .between (or any other name to
extract the substring between two boundaries) is not strictly necessary; on the other hand,
it may be simpler to use an "officially approved" method.

If you feel strongly about your proposal you could propose it for mention during a
developer meeting at:

https://bugs.ruby-lang.org/issues/15462

(In about a week from now.)

Then you may get matz' opinion about the proposal, both feature-wise and API/name wise.

Updated by duerst (Martin Dürst) almost 6 years ago

macdevign (Macdevign mac) wrote:

Given that string "between" extraction is such a common operation,

Can you back that up with some additional information/data? For example, do you know of other programming languages that have such a function/method?

I haven't had the need for such a method, and the use cases I can quickly think of need additional or different parameters (such as the xth occurrence, matching between (Regexp) patterns rather than simple strings, before or after a string, ...), which suggests to me that this specific problem is better solved with the general regular expression functionality already in Ruby.
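As a rough illustration of those variations, here is a minimal sketch that handles both an "nth occurrence" parameter and Regexp delimiters using only existing regex functionality. The helper name between_nth and its parameters are hypothetical, and the sketch assumes that Regexp delimiters contain no capture groups of their own.

```ruby
# Hypothetical helper: returns the text between the nth (0-based) from/to pair.
# Delimiters may be Strings (matched literally) or Regexps without capture groups.
def between_nth(str, from, to, n = 0)
  # Regexp.union escapes a String argument and passes a Regexp through unchanged.
  from_re = Regexp.union(from)
  to_re   = Regexp.union(to)
  # Non-greedy capture so each match stops at the first 'to' after its 'from'.
  matches = str.scan(/#{from_re}(.*?)#{to_re}/m)
  matches[n]&.first
end

between_nth("a=[1] b=[2] c=[3]", "[", "]", 1)   # => "2"
between_nth("id: 42; id: 7;", /id:\s*/, ";", 1) # => "7"
```
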

Updated by macdevign (Macdevign mac) almost 6 years ago

duerst (Martin Dürst) wrote:

macdevign (Macdevign mac) wrote:

Given that string "between" extraction is such a common operation,

Can you back that up with some additional information/data? For example, do you know of other programming languages that have such a function/method?

I haven't had the need for such a method, and the use cases I can quickly think of need additional or different parameters (such as the xth occurrence, matching between (Regexp) patterns rather than simple strings, before or after a string, ...), which suggests to me that this specific problem is better solved with the general regular expression functionality already in Ruby.

Yes, I agree that a regex can do the job in Ruby.

I use this quite often in data scraping, and I often wonder why I need to resort to a regex and worry about escaping regex metacharacters.
There are many ways to do the extraction (e.g. regex), but if there were a standard, convenient way, why would we each reinvent our own? I like how Ruby gives us the freedom to solve problems creatively, but such a simple, "common" task should be simpler :}
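To make the escaping concern concrete, here is a small sketch of what the regex route looks like when the delimiters contain metacharacters. Regexp.escape is the standard way to match a string literally; the sample text and variable names are made up for illustration.

```ruby
# Sketch: literal-delimiter extraction via a regex, with Regexp.escape doing the escaping.
text = "total (USD): 19.99 [final]"
from = "(USD): "  # contains regex metacharacters: ( )
to   = " ["       # contains a regex metacharacter: [

# Non-greedy capture stops at the first occurrence of the closing delimiter.
pattern   = /#{Regexp.escape(from)}(.*?)#{Regexp.escape(to)}/m
extracted = text[pattern, 1]  # => "19.99"
```

The non-greedy `.*?` mirrors the proposed method, which stops at the first 'to' found after 'from'.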

Compared with other languages, I believe Ruby and Python are among the scripting languages most commonly used for data scraping, and a simple method to facilitate such extraction would help.

Just googling "extract between" for languages such as Python, JavaScript, Java, etc. shows that this is a common question.

If other languages don't provide it, perhaps Ruby can be the catalyst for other languages to include it?

Readability-wise, extraction through a method seems easier on the eyes, but that could be just me :}

"The quick brown fox jumps over the lazy dog"[/jumps(.+)dog/,1] # regex
"The quick brown fox jumps over the lazy dog".between("jumps", "dog") # method

Thanks

Updated by macdevign (Macdevign mac) almost 6 years ago

  • Description updated (diff)