Feature #15496
Extract between string as standard String API
Description
I could not find a simple String API to extract the string between two strings, and I notice that many people face the same issue and end up rolling their own solutions (e.g. https://stackoverflow.com/questions/9661478/how-to-return-the-substring-of-a-string-between-two-strings-in-ruby).
Given that string "between" extraction is such a common operation, would adding a focused, simple String method make coding more pleasant?
This is my solution, but someone can probably provide a better and more efficient implementation.
class String
  # self: String instance
  # from: String -> first 'from' string
  # to: String -> first 'to' string found after 'from'
  # ignore_case: Boolean -> true for a case-insensitive search
  # from_index: Integer -> position at which to start searching for 'from'
  # Returns the string between 'from' and 'to', excluding both; nil otherwise.
  def between(from, to, ignore_case: false, from_index: 0)
    str = self
    str, from, to = str.downcase, from.downcase, to.downcase if ignore_case
    from_idx = str.index(from, from_index)&.+(from.length)
    if from_idx
      to_idx = str.index(to, from_idx)
      # Slice the receiver, not the downcased copy, so the original case is kept.
      return self[from_idx...to_idx] if to_idx
    end
    nil
  end
end
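For comparison, the core of the search (without `ignore_case` and `from_index`) could also be sketched with the existing `String#partition`. The helper name `between_via_partition` is made up for illustration and is not part of Ruby; the sketch assumes `from` and `to` are non-empty strings.

```ruby
# Hypothetical helper, not part of Ruby's String API.
# Assumes `from` and `to` are non-empty strings.
def between_via_partition(str, from, to)
  # partition returns [before, match, after]; match is "" when nothing is found.
  _before, from_match, rest = str.partition(from)
  return nil if from_match.empty?        # `from` not found

  middle, to_match, _after = rest.partition(to)
  to_match.empty? ? nil : middle         # nil when `to` is missing after `from`
end

between_via_partition("Hello world", "ell", "ld")  # => "o wor"
between_via_partition("Testing 123", "te", "123")  # => nil
```

This stays case-sensitive and always starts from the beginning of the string, so it covers only the simplest form of the proposal.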
Test case
"Hello world".between("ell", "ld") => "o wor"
"Testing 123".between("Te", "123") => "sting "
"Testing 123".between("te", "123") => nil
"Testing 123".between("te", "123", ignore_case: true) => "sting "
"Testing 123".between("te", "123", from_index: 3) => nil
Updated by macdevign (Macdevign mac) almost 6 years ago
- Description updated (diff)
Updated by shevegen (Robert A. Heiler) almost 6 years ago
Hmmm. I don't have a preference here since I can find arguments in favour or against
this. It's fairly easily possible to use a regex so .between (or any other name to
extract the substring between two boundaries) is not that necessary; on the other hand
it may be simpler to use an "officially approved" method.
If you feel strongly about your proposal you could propose it for mention during a
developer meeting at:
https://bugs.ruby-lang.org/issues/15462
(In ~a week or so from now on.)
Then you may get matz' opinion about the proposal, both feature-wise and API/name wise.
Updated by duerst (Martin Dürst) almost 6 years ago
macdevign (Macdevign mac) wrote:
> Given that string "between" extraction is such a common operation,
Can you back that up with some additional information/data? For example, do you know other programming languages that have such a function/method?
I haven't had the need for such a method, and the use cases I can quickly think of need additional or different parameters (such as the xth occurrence, between (Regexp) patterns rather than simple strings, before or after a string, ...), which suggests to me that this specific problem is better solved with the general regular expression functionality already in Ruby.
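To illustrate the variations mentioned above (xth occurrence, Regexp delimiters), here is a rough sketch built on the regular expression functionality Ruby already has. The helper name `nth_between` and its `occurrence:` keyword are assumptions made for this example, and it assumes that any Regexp passed in contains no capture groups of its own.

```ruby
# Hypothetical helper for illustration; name and parameters are assumptions.
# `from` and `to` may each be a String (matched literally) or a Regexp
# (assumed to contain no capture groups). `occurrence` is 1-based.
def nth_between(str, from, to, occurrence: 1)
  from_re = from.is_a?(Regexp) ? from : Regexp.new(Regexp.escape(from))
  to_re   = to.is_a?(Regexp)   ? to   : Regexp.new(Regexp.escape(to))
  # Lazy (.*?) stops at the first `to` after each `from`; /m lets it span newlines.
  pattern = /#{from_re}(.*?)#{to_re}/m
  match = str.scan(pattern)[occurrence - 1]
  match&.first
end

nth_between("a=[1] b=[2] c=[3]", "[", "]", occurrence: 2)  # => "2"
nth_between("id: 42;", /id:\s*/, ";")                      # => "42"
nth_between("no delimiters", "[", "]")                     # => nil
```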
Updated by macdevign (Macdevign mac) almost 6 years ago
duerst (Martin Dürst) wrote:
> macdevign (Macdevign mac) wrote:
>> Given that string "between" extraction is such a common operation,
> Can you back that up with some additional information/data? For example, do you know other programming languages that have such a function/method?
> I haven't had the need for such a method, and the use cases I can quickly think of need additional or different parameters (such as the xth occurrence, between (Regexp) patterns rather than simple strings, before or after a string, ...), which suggests to me that this specific problem is better solved with the general regular expression functionality already in Ruby.
Yes, I agree that a regex can do the work in Ruby.
I use this quite often in data scraping and often wonder why I need to resort to a regex and worry about escaping regex characters.
There are many ways to do the extraction (e.g. regex), but if there were a standard and convenient way, why reinvent our own? I like the way Ruby gives us the creativity to solve problems, but such a simple, "common" task should be simpler :}
Compared to other languages, I believe Ruby and Python are among the scripting languages most commonly used for data scraping, and a simple method to facilitate such extraction would help.
Just googling "extract between" for languages such as Python, JavaScript, Java, etc., one can see that this is a common question.
If other languages don't provide it, perhaps Ruby can be the catalyst for other languages to include it?
Readability-wise, extraction through a method seems easier on the eyes, but that could be just me :}
"The quick brown fox jumps over the lazy dog"[/jumps(.+)dog/,1] # regex
"The quick brown fox jumps over the lazy dog".between("jumps", "dog") # method
Thanks
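One caveat on the regex line compared above: `(.+)` is greedy, so it runs to the last "dog" in the string, while the proposed `between` stops at the first `to` found after `from`. A closer regex equivalent would use a lazy quantifier, and literal delimiters containing regex metacharacters would need `Regexp.escape`. The sample strings below are made up for illustration.

```ruby
text = "The quick brown fox jumps over the lazy dog and another dog"

greedy = text[/jumps(.+)dog/, 1]   # => " over the lazy dog and another "
lazy   = text[/jumps(.+?)dog/, 1]  # => " over the lazy "

# Delimiters containing regex metacharacters must be escaped first.
price = "Total: $(19.99) USD"
from, to = "$(", ")"
amount = price[/#{Regexp.escape(from)}(.*?)#{Regexp.escape(to)}/m, 1]  # => "19.99"
```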
Updated by macdevign (Macdevign mac) almost 6 years ago
- Description updated (diff)