Feature #15899

String#before and String#after

Added by kke (Kimmo Lehto) 14 days ago. Updated 4 days ago.

Status: Open
Priority: Normal
Assignee: -
Target version: -
[ruby-core:92972]

Description

There seem to be no methods for getting the substring before or after a marker.

Too often I see and have to resort to variations of:

str[/(.+?);/, 1]
str.split(';').first
substr, _ = str.split(';', 2)
str.sub(/.*;/, '')
str[0...str.index(';')]

These create intermediate objects and/or are ugly.

String#delete_suffix and String#delete_prefix do not accept regexps and thus can only be used if you first figure out the full prefix or suffix.

For this reason, I suggest something like:

> str = 'application/json; charset=utf-8'
> str.before(';')
=> "application/json"
> str.after(';')
=> " charset=utf-8"

What should happen if the marker isn't found? In my opinion, before should return the full string and after an empty string.
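The semantics proposed above can be sketched as follows. This is only an illustration under stated assumptions: the `Marker` module and its `before`/`after` functions are hypothetical names, written as plain module functions rather than as methods on String, and they treat the marker as either a String or a Regexp.

```ruby
# Hypothetical helpers illustrating the proposal; not part of Ruby's String API.
module Marker
  module_function

  # Text before the first occurrence of marker (String or Regexp).
  # Returns a copy of the whole string when the marker is not found.
  def before(str, marker)
    idx = str.index(marker)
    idx ? str[0...idx] : str.dup
  end

  # Text after the first occurrence of marker (String or Regexp).
  # Returns an empty string when the marker is not found.
  def after(str, marker)
    if marker.is_a?(Regexp)
      m = marker.match(str)
      m ? m.post_match : ""
    else
      idx = str.index(marker)
      idx ? str[(idx + marker.length)..-1] : ""
    end
  end
end

str = 'application/json; charset=utf-8'
Marker.before(str, ';')     # => "application/json"
Marker.after(str, ';')      # => " charset=utf-8"
Marker.after(str, /;\s*/)   # => "charset=utf-8"
Marker.before('abc', ';')   # => "abc"
Marker.after('abc', ';')    # => ""
```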

History

Updated by sawa (Tsuyoshi Sawada) 14 days ago

Since you mention that String#delete_suffix and String#delete_prefix do not accept regexps, and that this is a weak point, it would be better to use regexps in the examples illustrating your proposal.

Updated by sawa (Tsuyoshi Sawada) 14 days ago

Using partition looks reasonable, and it can accept regexes.

str = 'application/json; charset=utf-8'
before, _, after = str.partition(/; /)
before # => "application/json"
after # => "charset=utf-8"

Updated by shevegen (Robert A. Heiler) 13 days ago

I can see where it may be useful, since it could shorten code like this:

first_part = "hello world!".split(' ').first

To:

first_part = "hello world!".before(' ')

It is not a huge improvement in my opinion, though. (My comment here has
not yet addressed the other part about using regexes - see a bit later for
that.)

I am not a big fan of the names, though. I somehow associate #before and #after
more with time-based operations; and rack/sinatra middleware (route) filters.

I do not have a better or alternative suggestion, although since we already have
delete_prefix, perhaps we could have some methods that return the desired prefix
instead (or suffix).

As for the lack of regex support, I think sawa already pointed out that it may be
better to argue for changing delete_prefix and delete_suffix instead. That way
your demonstrated use case could be simplified as well.

Updated by kke (Kimmo Lehto) 13 days ago

sawa wrote: "Using partition looks reasonable, and it can accept regexes."

It also has the problem of creating extra objects that you need to discard with _ or assign and just leave unused.

shevegen wrote: "I am not a big fan of the names, though. I somehow associate #before and #after
more with time-based operations; and rack/sinatra middleware (route) filters."

How about str.preceding(';') and str.following(';')?

Perhaps str.prior_to(';') and str.behind(';')?

The possibility of the opposite reading direction could make these problematic.

str.left_from(';'), str.right_from(';')? Sounds a bit clunky.

Head and tail could be the unixy choice and more versatile for other use cases.

class String
  def head(count = 10, separator = "\n")
    ...
  end

  def tail(count = 10, separator = "\n")
    ...
  end
end

For my example use case, it would become:

str = "application/json; charset=utf-8"
mime = str.head(1, ';')
labels = str.tail(1, ';')

And to emulate something like $ curl xttp://x.example.com | head you would use response.body.head
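A rough sketch of what the elided method bodies might do, assuming "count" means a number of separator-delimited segments and the separator is a String. The `TextSlice` module is a hypothetical name, and splitting and re-joining is just one possible implementation; note that it allocates an intermediate array.

```ruby
# Hypothetical module illustrating head/tail semantics; an assumption,
# not an existing Ruby API.
module TextSlice
  module_function

  # First `count` segments of str, re-joined with the separator.
  def head(str, count = 10, separator = "\n")
    str.split(separator).first(count).join(separator)
  end

  # Last `count` segments of str, re-joined with the separator.
  def tail(str, count = 10, separator = "\n")
    str.split(separator).last(count).join(separator)
  end
end

str = "application/json; charset=utf-8"
TextSlice.head(str, 1, ';')        # => "application/json"
TextSlice.tail(str, 1, ';')        # => " charset=utf-8"
TextSlice.head("a\nb\nc\n", 2)     # => "a\nb"
```

Because String#split drops trailing empty segments by default, a trailing newline does not count as an extra empty line here, which roughly matches how the Unix tools treat line-terminated input.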

Updated by kke (Kimmo Lehto) 5 days ago

How about first and last?

'hello world'.first(2)
 => 'he'
'hello world'.last(2)
 => 'ld'
'hello world'.first
 => 'h'
'hello world'.last
 => 'd'
'hello world'.first(1, ' ')
 => 'hello'
'hello world'.last(1, ' ')
 => 'world'
'application/json; charset=utf-8'.first(1, ';')
 => 'application/json'
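The outputs listed above suggest two modes: counting characters when no separator is given, and counting segments when one is. A minimal sketch under that assumption, with the hypothetical module name `Edges` standing in for methods on String:

```ruby
# Hypothetical module illustrating the first/last idea; an assumption,
# not an existing Ruby API.
module Edges
  module_function

  # Without a separator: the first `count` characters.
  # With a String separator: the first `count` segments, re-joined.
  def first(str, count = 1, separator = nil)
    return str[0, count] if separator.nil?
    str.split(separator).first(count).join(separator)
  end

  # Without a separator: the last `count` characters.
  # With a String separator: the last `count` segments, re-joined.
  def last(str, count = 1, separator = nil)
    if separator.nil?
      start = [str.length - count, 0].max
      return str[start..-1]
    end
    str.split(separator).last(count).join(separator)
  end
end

Edges.first('hello world', 2)       # => "he"
Edges.last('hello world', 2)        # => "ld"
Edges.first('hello world', 1, ' ')  # => "hello"
Edges.last('hello world', 1, ' ')   # => "world"
```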

Updated by marcandre (Marc-Andre Lafortune) 4 days ago

sawa is right. Just use partition and rpartition.
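For reference, a short illustration of how the two built-in methods behave, including the case where the marker is missing. The sample strings other than the one from the description are made up for the example.

```ruby
str = 'application/json; charset=utf-8'

# partition splits around the first match and returns [before, match, after].
str.partition(';')         # => ["application/json", ";", " charset=utf-8"]
str.partition(/;\s*/)      # => ["application/json", "; ", "charset=utf-8"]

# rpartition splits around the last match instead.
'a.b.c'.rpartition('.')    # => ["a.b", ".", "c"]

# When the marker is not found, partition puts the whole string first,
# while rpartition puts it last.
'abc'.partition(';')       # => ["abc", "", ""]
'abc'.rpartition(';')      # => ["", "", "abc"]
```

So `str.partition(marker).first` and `str.partition(marker).last` already give the "full string" and "empty string" fallbacks asked for in the description, at the cost of the intermediate array.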
