Feature #12969

Allow optional parameter in String#strip and related

Added by herwinw (Herwin Quarantainenet) over 2 years ago. Updated about 2 years ago.

Status: Open
Priority: Normal
Assignee: -
Target version: -
[ruby-core:78251]

Description

String#strip and related methods have a hardcoded match on whitespace, defined as "null, horizontal tab, line feed, vertical tab, form feed, carriage return, space". It would be nice to allow a parameter to specify the characters you want to strip.

This would result in a kind of marriage between String#delete and String#chomp. The first is an example of how to specify character classes; the second is an example of how to pass an optional parameter.

Examples of how this would work:

"hellooo  ".rstrip          #=> "hellooo"
"hellooo  ".rstrip(" ")     #=> "hellooo"
"hellooo  ".rstrip(" o")    #=> "hell"
"hellooo  ".rstrip("o ")    #=> "hell"
"hellooo  ".rstrip("o")     #=> "hellooo  "
"hellooo  ".chomp(" ")      #=> "hellooo ", only removes one character, so not a sensible alternative

The same behaviour can be achieved by using String#sub and the correct anchors, but I think an optional parameter to String#strip is cleaner to read. It's probably faster too, since we don't have to initialize a regex state machine.
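For comparison, the String#sub workaround mentioned above could be wrapped roughly as follows. This is only a sketch: `rstrip_chars` is a hypothetical helper name, not an existing or proposed Ruby method, and it treats every character of the argument literally rather than supporting the range and negation syntax of String#delete.

```ruby
# Hypothetical helper (not part of Ruby): emulates the proposed
# rstrip(chars) with String#sub and an end-of-string anchor.
def rstrip_chars(str, chars)
  # Regexp.union escapes each character and joins them into an alternation.
  # With an empty list it yields a pattern that never matches.
  any_char = Regexp.union(chars.chars)
  # \z anchors at the very end of the string (unlike $, which also
  # matches before a newline).
  str.sub(/(?:#{any_char})+\z/, '')
end

rstrip_chars("hellooo  ", " ")   #=> "hellooo"
rstrip_chars("hellooo  ", " o")  #=> "hell"
rstrip_chars("hellooo  ", "o")   #=> "hellooo  "
```

The helper still builds a regular expression on every call, so it illustrates the readability argument only, not the performance one.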

History

Updated by shyouhei (Shyouhei Urabe) over 2 years ago

I think it's possible. Just wonder if there are cases where it is useful.

Updated by herwinw (Herwin Quarantainenet) over 2 years ago

The concrete use case I had was that I wanted to remove all trailing spaces, but leave tabs, newlines, etc. untouched. The current code looks like this:

newstr = str.sub(/ +$/, '')

I tried to see if there was a more suitable method in String to do this. Both String#rstrip and String#chomp came close to what I needed here, but no cigar.
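One detail worth noting about the snippet above: in Ruby, `$` matches at the end of each line, while `\z` matches only at the end of the string, so the two anchors behave differently on multi-line input. A minimal sketch of the difference, assuming the intent is to trim only the end of the whole string (the method names are hypothetical, chosen for illustration):

```ruby
# Hypothetical helpers illustrating the two anchors.
def strip_spaces_at_line_end(str)
  # $ matches before a newline as well as at the end of the string;
  # sub replaces only the first such run of spaces.
  str.sub(/ +$/, '')
end

def strip_spaces_at_string_end(str)
  # \z matches only at the very end of the string.
  str.sub(/ +\z/, '')
end

text = "first  \nsecond  "
strip_spaces_at_line_end(text)    #=> "first\nsecond  "
strip_spaces_at_string_end(text)  #=> "first  \nsecond"

# Tabs are left alone by both, unlike String#rstrip:
strip_spaces_at_string_end("a\t  ")  #=> "a\t"
"a\t  ".rstrip                       #=> "a"
```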

Updated by matz (Yukihiro Matsumoto) about 2 years ago

The pattern to remove may not be a set of single characters, or may be complex.
Considering that, using a regular expression is the best way, I think.

Matz.

Updated by naruse (Yui NARUSE) about 2 years ago

In the Unicode age, the list of characters is becoming more and more complex.
Using a string as a list of characters is not a good idea now.

Though String#tr already exists, adding a new one is not a good idea.

Updated by duerst (Martin Dürst) about 2 years ago

Yukihiro Matsumoto wrote:

The pattern to remove may not be a set of single characters, or may be complex.
Considering that, using a regular expression is the best way, I think.

I agree. It would also allow specifying character ranges and character properties.
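As a rough illustration of that point, a regular expression can express ranges and Unicode-aware classes that a plain list of characters cannot. The method names below are hypothetical and only serve the example:

```ruby
# Hypothetical helper: strips a trailing run of ASCII digits
# using a character range.
def rstrip_digits(str)
  str.sub(/[0-9]+\z/, '')
end

# Hypothetical helper: strips trailing Unicode whitespace. The POSIX
# bracket [[:space:]] also matches characters such as U+00A0 (no-break
# space) and U+3000 (ideographic space) in a UTF-8 string.
def rstrip_unicode_space(str)
  str.sub(/[[:space:]]+\z/, '')
end

rstrip_digits("value123")                 #=> "value"
rstrip_unicode_space("text\u00A0\u3000")  #=> "text"

# String#rstrip only knows the hardcoded ASCII set, so the
# no-break space survives:
"text\u00A0".rstrip                       #=> "text\u00A0"
```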
