Feature #16295

Chainable aliases for String#-@ and String#+@

Added by byroot (Jean Boussier) 12 days ago. Updated 5 days ago.

Status:
Open
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:95702]

Description

Original discussion https://bugs.ruby-lang.org/issues/16150?next_issue_id=16147&prev_issue_id=16153#note-40

In #16150, headius (Charles Nutter) raised the following concern about String#-@ and String#+@:

headius (Charles Nutter) wrote:

Not exactly, -@ and +@ makes this much simpler

I do like the unary operators, but they also have some precedence oddities:

>> -"foo".size
=> -3
>> (-"foo").size
=> 3

And it doesn't work at all if you're chaining method calls:

>> +ary.to_s.frozen?
NoMethodError: undefined method `+@' for false:FalseClass
  from (irb):8
  from /usr/bin/irb:11:in `<main>'

But you are right, instead of the explicit dup with possible freeze you could use - or + on the result of to_s. However it's still not safe to modify it since it would modify the original string too.

After working with those for quite a while, I have to say I agree. They very often force you to use parentheses, which is annoying, and an indication that regular methods would be preferable to unary operators.
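The precedence behaviour quoted above can be reproduced with a short sketch. The variable names are only illustrative; the expressions are the ones from the quoted irb session.

```ruby
# A method call binds tighter than a unary operator in front of its receiver,
# so the operator is applied to the result of the whole chain.
str = "foo"

negated_size = -str.size       # parsed as -(str.size)
dedup_size   = (-str).size     # parentheses apply -@ to the string first

ary = [1, 2, 3]
chain_error =
  begin
    +ary.to_s.frozen?          # parsed as +(ary.to_s.frozen?)
  rescue NoMethodError => e
    e
  end

puts negated_size              # prints -3
puts dedup_size                # prints 3
puts chain_error.class         # prints NoMethodError
puts((+ary.to_s).frozen?)      # prints false
```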

In response, matz (Yukihiro Matsumoto) proposed aliasing them as String#+ and String#- without arguments:

How about making String#+ and #- without argument behave like #+@ and #-@ respectively, so that we can write:

 "foo".-.size
 ary.to_s.+.frozen?

My personal opinion is that descriptive method names would be preferable to +/-:

IMHO .- and .+ is not very elegant. Proper method names explaining the intent would be preferable.

  • -@ could be dedup, or deduplicate.
  • +@ could be mutable or mut.
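As a rough sketch of what named aliases would look like in use, the snippet below defines two of the names proposed above by reopening String. The names dedup and mutable are just the proposals from this issue, treated here as hypothetical, and the guards skip the definition if a method with that name already exists.

```ruby
# Hypothetical aliases using the names proposed above. The guards avoid
# redefining a method of the same name if one is already present.
class String
  alias_method :dedup, :-@ unless method_defined?(:dedup)
  alias_method :mutable, :+@ unless method_defined?(:mutable)
end

ary = [1, 2, 3]

# Regular method names chain left to right without parentheses.
size_after_dedup = "foo".dup.dedup.size
still_frozen     = ary.to_s.freeze.mutable.frozen?
dedup_is_frozen  = ary.to_s.dedup.frozen?

puts size_after_dedup   # prints 3
puts still_frozen       # prints false
puts dedup_is_frozen    # prints true
```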

History

Updated by shevegen (Robert A. Heiler) 12 days ago

I agree that + and - are not very elegant, as names. They are not very meaningful (as names).

On the other hand they are short, so from this point of view, useful in a practical manner,
but this is actually the main reason why I prefer the much longer .dup instead, and don't
use + and - at all. That leads to longer ruby code, but I just prefer it if the code I
look at makes "sense" to me, which is a very subjective criterion to apply, I am aware
of that.

I often prefer short english words/names in general, within reason. It is always a trade-off
of course. Ruby often allows "both" styles, where you can use a shorter or longer
variant. .map versus .collect is an example, although matz added this to make a
transition into ruby easier for people who are used to e. g. the .collect idiom.

I myself only use .map though - and one reason is that it is shorter. :)

.append and .prepend on objects could be thought of as the same though; e. g.
I always remember << as "append". And it reminds me a bit of C++ too, even though
<< is not really "append" per se or a corresponding method that may have to exist.
I just like to remember it that way.

They very often force you to use parentheses, which is annoying

I agree in general. Being able to omit parens is great. I personally use parens in
method definitions if there is at least one argument; other ruby users omit the parens
completely, which I can understand, even if I don't use that style. But more importantly
I agree that being able to decide whether to use parens or not is GREAT. In python you
are forced to use them, and I find this annoying. (I really think ruby is better than
python in many ways.)

To the suggestion itself for the names:

I think all of dedup, deduplicate, mutable or mut are a bit ... clumsy.

IF the question were SOLELY between:

dedup versus deduplicate

and

mut versus mutable

Then I think the shorter names would be a tiny bit better. But .dedup is not a great name,
and .mut is a bit confusing. .deduplicate seems too long, I actually typoed when I tried
to write it just now :) - .mutable is ... hmm. The name seems a bit more like .mutable?
to me, as a query method.

I am not sure that these names are great.

Perhaps we can come up with names that describe the behaviour, without
having to focus on + or -.

If I understand the problem correctly then the primary issue is to find good name
candidates? If so perhaps people can give some suggestions.

Perhaps some name with .freeze_* or something like that, or .unfreeze (not sure
here, I think we can not unfreeze, only freeze, so such a name may cause
confusion).

Actually we already have .dup which I assume is short for .duplicate. So perhaps
the methods could be centered around .dup.

.de_dup
.un_dup
.dup+
.dup-       # ok ok that does not work but ...
.dup_plus
.dup_minus  # clumsy too ...
.chain_dup  # uhm ...
.dup_chain  # sounds like a music song
.freeze_dup # no idea why this even came up ...
.duppity    # just sounds good

Well - short break from finding silly names ...

If we look at the documentation, we have:


+str → str (mutable)

If the string is frozen, then return duplicated mutable string.

If the string is not frozen, then return the string itself.

-str → str (frozen)

Returns a frozen, possibly pre-existing copy of the string.

The string will be deduplicated as long as it is not tainted, or has any instance variables set on it.
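The documented behaviour quoted above can be checked with a small sketch. The variable names are illustrative, and the strings are built with dup so the result does not depend on whether string literals are frozen in the file.

```ruby
frozen  = "abc".dup.freeze
mutable = "abc".dup

thawed = +frozen    # frozen receiver: a mutable duplicate is returned
same   = +mutable   # unfrozen receiver: the string itself is returned
iced   = -mutable   # always returns a frozen string

puts thawed.frozen?          # prints false
puts thawed.equal?(frozen)   # prints false
puts same.equal?(mutable)    # prints true
puts iced.frozen?            # prints true
```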


So how about ...

.frozen_copy
.frozen_or_copy

Actually, reading the documentation, .dedup seems to be ok:

.dedup

Even if the name is not perfect, it may be better than not
having an alternative.

I can't really think of a great name though. Perhaps others can
give some more ideas.

Updated by phluid61 (Matthew Kerwin) 11 days ago

It doesn't exactly fit the way messages are named in Ruby, but how about:

alias -@ frozen
alias +@ thawed

Updated by Eregon (Benoit Daloze) 11 days ago

I like #dedup for String#-@, partly for the relation with #dup.

For String#+@, I'd propose #buffer like buf = ''.buffer.
I don't like mut.

Updated by byroot (Jean Boussier) 11 days ago

phluid61 (Matthew Kerwin) wrote:

It doesn't exactly fit the way messages are named in Ruby, but how about:

alias -@ frozen
alias +@ thawed

-@ does more than freeze the string: it also looks up the fstring table and may return a pre-existing instance, potentially deduplicating equal strings. I believe the alias name should reflect this intent, otherwise people might confuse it with a simple alias for freeze.
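To make the difference from freeze concrete, here is a minimal sketch. The string contents and variable names are made up for illustration, and the identity check reflects the deduplication behaviour described above as I understand it on CRuby.

```ruby
a = "config_key".dup
b = "config_key".dup

# freeze only marks each object as frozen; the two objects stay distinct.
frozen_a = a.dup.freeze
frozen_b = b.dup.freeze

# -@ goes through the fstring table, so equal strings map to one instance.
dedup_a = -a
dedup_b = -b

puts frozen_a.equal?(frozen_b)   # prints false
puts dedup_a.equal?(dedup_b)     # prints true
puts dedup_a.frozen?             # prints true
```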

Eregon (Benoit Daloze) wrote:

For String#+@, I'd propose #buffer like buf = ''.buffer.
I don't like mut.

I'm of two minds on that one.

I like buffer as well, but when I read it I'm thinking about an actual buffer for network reads etc, and String#b is already the proper method for such a use case.

But I agree that mut / mutable isn't great as a name.

Updated by phluid61 (Matthew Kerwin) 10 days ago

byroot (Jean Boussier) wrote:

phluid61 (Matthew Kerwin) wrote:

It doesn't exactly fit the way messages are named in Ruby, but how about:

alias -@ frozen
alias +@ thawed

-@ does more than freeze the string: it also looks up the fstring table and may return a pre-existing instance, potentially deduplicating equal strings. I believe the alias name should reflect this intent, otherwise people might confuse it with a simple alias for freeze.

I think most of that functionality amounts to an implementation detail, as far as String itself is concerned. Deduplication is a concern of the ObjectSpace.

If it's important, document it in the rdoc. The method name doesn't have to describe everything the method does.

Also: why is something like "dedup" any better? It sounds like a simple alias for intern (which, incidentally, returns a deduplicated, frozen instance..)

Updated by alanwu (Alan Wu) 10 days ago

I like dedup too. -@ was introduced to expose deduplication in the first place.
Usages I've seen all have to do with memory concerns. You wouldn't call it just to get a frozen string, you care far more that it can deduplicate.

Updated by phluid61 (Matthew Kerwin) 10 days ago

alanwu (Alan Wu) wrote:

I like dedup too. -@ was introduced to expose deduplication in the first place.

#11782 :

Specification:

  • +'foo' returns modifiable string.
  • -'foo' returns frozen string (because water will freeze below 0 degrees Celsius).

The optimisations aren't part of the original specification. In fact, it was all about adding +@, because at the time all string literals were intended to be frozen (and -@ was meant to do nothing.)

The deduplication came in #13077, and it was retrofitted onto -@ specifically because there was no better name for the method. fstring was the original proposal, because it invokes rb_fstring. The 'f' stands for 'frozen', by the way.

Usages I've seen all have to do with memory concerns. You wouldn't call it just to get a frozen string, you care far more that it can deduplicate.

I use -"string" because it's easier to type than "string".freeze, and both -@ and +@ are nice, clear signals of intention when I initialise a string; one is frozen, one is thawed. Deduplication is nice, but not my primary concern.

Updated by alanwu (Alan Wu) 10 days ago

phluid61 (Matthew Kerwin) Sorry about that. I should have checked the history before posting my misleading comment!

Updated by phluid61 (Matthew Kerwin) 10 days ago

For what it's worth, I'm not against #dedup per se. -@ is great for signalling a frozen literal, but in the context at hand the method is more likely to be used to deduplicate a derived value.

What about adding a parameter to an existing method? some_str.freeze(dedup: true)
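A rough sketch of how that could behave, written as a standalone helper because String#freeze itself takes no arguments. The helper name freeze_string and its dedup: keyword are hypothetical and exist only for this illustration.

```ruby
# Hypothetical helper standing in for some_str.freeze(dedup: true).
def freeze_string(str, dedup: false)
  dedup ? -str : str.freeze
end

key_a = freeze_string("user" + "_id", dedup: true)
key_b = freeze_string("user" + "_id", dedup: true)
plain = "user_id".dup

puts key_a.frozen?                        # prints true
puts key_a.equal?(key_b)                  # prints true
puts freeze_string(plain).equal?(plain)   # prints true
```

Without dedup: true the helper freezes and returns the receiver, as freeze does; with it, the result comes from -@ and may be a different, pre-existing object.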

Updated by Dan0042 (Daniel DeLorme) 10 days ago

  • Description updated (diff)

It would be nice to see some real-world examples where chaining of these methods makes sense. "foo".-.size (always 3) and ary.to_s.+.frozen? (always false) are not very convincing. In my code I don't think I've ever wished to use these operations in the middle of a chain.

Updated by byroot (Jean Boussier) 5 days ago

Dan0042 (Daniel DeLorme)

Based on the gems I had to fix for #16150, this diff would be a typical use case: https://github.com/grpc/grpc/pull/20417/files

It's broken up into multiple lines, so it's fine.

I also have this one from our private code base:

(+number.dup.to_s).force_encoding(Encoding::UTF_8).unicode_normalize(:nfkd)
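For comparison, here is a sketch of the same chain written without the wrapping parentheses, using public_send(:+@), which already works today. The value of number is made up for illustration.

```ruby
number = 12_345   # illustrative value only

# Current form: parentheses are needed so that +@ applies to the string.
with_parens = (+number.dup.to_s).force_encoding(Encoding::UTF_8).unicode_normalize(:nfkd)

# Chainable form: +@ is invoked as an ordinary method call in the middle.
chained = number.dup.to_s.public_send(:+@).force_encoding(Encoding::UTF_8).unicode_normalize(:nfkd)

puts with_parens            # prints 12345
puts chained == with_parens # prints true
puts chained.encoding       # prints UTF-8
```

A named alias would read the same way as the second form, with the alias in place of public_send(:+@).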
