Project

General

Profile

Feature #16150

Add a way to request a frozen string from to_s

Added by headius (Charles Nutter) about 1 month ago. Updated 21 days ago.

Status:
Closed
Priority:
Normal
Target version:
-
[ruby-core:94815]

Description

Much of the time when a user calls to_s, they are just looking for a simple string representation to display or to interpolate into another string. In my brief exploration, the result of to_s is rarely mutated directly.

It seems that we could save a lot of objects by providing a way to explicitly request a frozen string.

For purposes of discussion I will call this to_frozen_string, which is a terrible name.

This would reduce string allocations dramatically when applied to many common to_s calls:

  • Symbol#to_frozen_string could always return the same cached String representation. This method is heavily used by almost all Ruby code that intermingles Symbols and Strings.
  • nil, true, false, and any other singleton values in the system could similarly cache and return the same String object.
  • The strings coming from core types could also be in the fstring cache and deduplicated as a result.
  • User-provided to_s implementations could opt-in to caching and returning the same frozen String object when the author knows that the result will always be the same.

A few ideas for what to call this:

  • to_fstring or fstring reflects internal the "fstring" cache but is perhaps not obvious for most users.
  • to_s(frozen: true) is clean but there will be many cases when the kwargs hash doesn't get eliminated, making matters worse.
  • def to_s(frozen = false) would be mostly free but may not be compatible with existing to_s params (like Integer#to_s(radix)

This idea was inspired by schneems (Richard Schneeman)'s talk at RubyConf Thailand, where he showed significant overhead in ActiveRecord from Symbol#to_s allocation.

Associated revisions

Revision 6ffc045a
Added by Eregon (Benoit Daloze) 24 days ago

[EXPERIMENTAL] Make Symbol#to_s return a frozen String

  • Always the same frozen String for a given Symbol.
  • Avoids extra allocations whenever calling Symbol#to_s.
  • See [Feature #16150]

Revision 9d0866c7
Added by byroot (Jean Boussier) 24 days ago

[EXPERIMENTAL] Make Module#name return a frozen String

* Always the same frozen String for a given Module or Class.
* Avoids extra allocations whenever calling Module#name.
* See [Feature #16150]

Revision eff15a26
Added by byroot (Jean Boussier) 24 days ago

[EXPERIMENTAL] Make NilClass#to_s, TrueClass#to_s and FalseClass#to_s return a frozen String

* Always the same frozen String for each of these values.
* Avoids extra allocations whenever calling these 3 methods.
* See [Feature #16150]

Revision 4d3502d5
Added by nobu (Nobuyoshi Nakada) 23 days ago

More tests for [Feature #16150]

History

Updated by schneems (Richard Schneeman) about 1 month ago

Thank you for writing up the proposal. This would certainly helped my optimization case. Here’s the PR I referenced in my talk where I had to work around the allocations from calling to_s on a symbol: https://github.com/rails/rails/pull/34197

Updated by headius (Charles Nutter) about 1 month ago

Another idea: #to_z

Updated by Eregon (Benoit Daloze) about 1 month ago

Another possibility: make/let #to_s methods of core types return frozen strings.
The general contract of to_s would become: "return a String representing the object, which might be frozen. Use .to_s.dup if you need a mutable string".

That would be less compatible but would have the advantage of existing code already being able to use those savings without changes and complications.

My impression is to_s does already not guarantee to return a new String.
For example, String#to_s returns itself and mutating sth.to_s is therefore not safe if sth can be a String.
Also, to_s definitions for user classes seem likely to already return a frozen String simply because they might use a String literal (interpolated or not) for it and frozen-string-literal: true.

Updated by shevegen (Robert A. Heiler) about 1 month ago

Just a short comment - I originally had to write more, but I think my statement would
be too diluted.

Also, to_s definitions for user classes seem likely to already return
a frozen String simply because they might use a String literal
(interpolated or not) for it and frozen-string-literal: true.

IMO, it currently can not be assumed to hold true because there may also
be frozen-string-literal: false settings, and frozen strings are not yet
the default nor will be in ruby 3.0, so I would not assume this the case
right now. This may change in the future, but right now I think it would
confuse people if to_s and to_str do NOT return a string or string-like
object; and if a string is returned, it may be frozen, despite frozen-string
literals set to false. The latter would qualify as a bug in my opinion.

I have no specific opinion about the suggested APIs, but the names are a
bit strange; or a bit cumbersome. "fstring" reads a bit like "formatted
string".

Oddly enough, I think #to_z is a bit better simply because it is shorter,
but people may still ask why the "z" is there. I guess to_f was not
offered as suggestion for frozen, due to it already meaning to a float
representation. :)

But in general, I don't really have any pro or con opinion either way. (Most
of the suggestion seems to be more inspired about speed/efficiency and less
about specific end-user use.)

Updated by Dan0042 (Daniel DeLorme) about 1 month ago

Great idea.

This definitely has to be a new method because using to_s(frozen: true) would require to know if the object supports the frozen keyword for its to_s method.

As a method, the default implementation would be to_s(*a,**o).freeze and interpolations can automatically use that. Then it's just a matter of changing various classes (such as Symbol) so that to_frozen_string is the master implementation and to_s becomes to_frozen_string.dup

Maybe it's possible for the parser to lexically convert the expressions -expr.to_s and << expr.to_s to use to_frozen_string

Naming is hard. Short is good because we want to encourage using that method as much as possible. to_fstr, to_s!, to_fs, to_fzs

Updated by headius (Charles Nutter) about 1 month ago

For example, String#to_s returns itself and mutating sth.to_s is therefore not safe if sth can be a String.

This is a very good point. I'd love to see all immutable core types start returning frozen string but I didn't want to push too hard. It would mean immediate benefits without any changes in existing apps and libraries.

Naming is hard. Short is good because we want to encourage using that method as much as possible. to_fstr, to_s!, to_fs, to_fzs

Oh, to_s! is pretty good.

Updated by Dan0042 (Daniel DeLorme) about 1 month ago

For example, String#to_s returns itself and mutating sth.to_s is therefore not safe if sth can be a String.

String#to_s is of course a special case. But that means if a string is currently not frozen, to_frozen_string would have to do dup.freeze... doesn't that ruin the entire point of this proposal, which is about efficiency? It would mean "a #{str}" creates a duplicate of str because interpolation uses the supposedly efficient to_frozen_string. So maybe this proposal really should be called to_efficient_string, which doesn't guarantee a frozen string but rather attempts to minimize allocations. In that context, to_s! makes even more sense because you don't know what kind of string you'll get.

Updated by Eregon (Benoit Daloze) about 1 month ago

How about specifically making Symbol#to_s return a frozen String?

I tried this trivial patch:

diff --git a/string.c b/string.c
index 05ce0ed8d6..1a0fa48a6a 100644
--- a/string.c
+++ b/string.c
@@ -10866,7 +10866,9 @@ sym_inspect(VALUE sym)
 VALUE
 rb_sym_to_s(VALUE sym)
 {
-    return str_new_shared(rb_cString, rb_sym2str(sym));
+    VALUE str = str_new_shared(rb_cString, rb_sym2str(sym));
+    OBJ_FREEZE(str);
+    return str;
 }


And there are 0 test-spec failures and 0 test-all failures.
So, let's make Symbol#to_s frozen?

I think in general it makes a lot of sense that immutable classes return a frozen String for #to_s.

Making #to_s return a frozen String for mutable core classes like Array or Hash is likely much less interesting, as one cannot cache a single String instance but must check the object's contents everytime (and the #to_s result depends on other objects' #to_s which makes it a lot more complicated).

Updated by Eregon (Benoit Daloze) about 1 month ago

I also tried this version which always returns the same String for a given Symbol:

diff --git a/string.c b/string.c
index 05ce0ed8d6..f306c905bc 100644
--- a/string.c
+++ b/string.c
@@ -10866,7 +10866,7 @@ sym_inspect(VALUE sym)
 VALUE
 rb_sym_to_s(VALUE sym)
 {
-    return str_new_shared(rb_cString, rb_sym2str(sym));
+    return rb_sym2str(sym);
 }


And that passes both test-spec and test-all, except this test which probably needs to be adapted:

[5/8] Test_StringCapacity#test_capacity_shared = 0.00 s            
  1) Failure:
Test_StringCapacity#test_capacity_shared [/home/eregon/code/ruby/test/-ext-/string/test_capacity.rb:17]:
<0> expected but was
<26>.

I'll make a PR.

Updated by ashmaroli (Ashwin Maroli) about 1 month ago

Similar to Symbol#to_s, even nil.to_s allocates a new string.

irb(main):001:0> 5.times { p nil.to_s.__id__ }
21118460
21118340
21118180
21118100
21118000

IMO, since nil is a singleton, nil.to_s should also return the same empty string on every call.

Updated by sawa (Tsuyoshi Sawada) about 1 month ago

For methods that convert object to a string, we already have three:

  • to_s
  • to_str
  • inspect

Adding yet another method (or an option for an existing method) would make the picture too complicated. It might be a good time to reconsider the division of labor between the existing three methods.

It is suspicious whether the distinction between to_s vs. to_str, to_h vs. to_hash, to_a vs. to_ary, or to_i vs. to_int etc. is nowadays used in the way it was intended at the beginning. Note that, while the single splat * on an object implicitly calls the explicit to_a, the double splat ** implicitly calls the implicit to_hash. Also note that there is no explicit counterpart for the implicit method to_proc, which is used as in Symbol#to_proc, Method#to_proc, Hash#to_proc, where the (missing) explicit counterpart should be used.

Since the implicit vs. explicit distinction is messed up, but we are not having a big problem with it, there is not much motivation to try to maintain such distinction.

Personally, I am not that sure why we need to_str just to show that an object is string-like. For example, we could have the argument of Array#join undergo to_s (with a fallback to to_str) instead of (just) to_str. I don't see any problem with that.

As an example of rearranging the division of labour, one method out of the three mentioned above could be used for creating a new string instance, and another method for referencing a cached representation.

But that is just an example. There may be other ways that can make more sense. The point is, we can perhaps make use of the existing methods rather than adding a new one.

Updated by Eregon (Benoit Daloze) about 1 month ago

ashmaroli (Ashwin Maroli) wrote:

Similar to Symbol#to_s, even nil.to_s allocates a new string.

Right, as mentioned in the description, nil, true and false#to_s could all be frozen and cached the same way as for Symbol#to_s.
Although I'd think calling #to_s on those is already less frequent and performance-sensitive than Symbol#to_s.

Are there other immutable types where it would be worthwhile to cache the #to_s result?
I can't find any now, at least it doesn't seem worthwhile for numeric types, and most other core types are not immutable.

Updated by Dan0042 (Daniel DeLorme) about 1 month ago

So, let's make Symbol#to_s frozen?
I think in general it makes a lot of sense that immutable classes return a frozen String for #to_s.

That makes sense, and I agree that's the cleanest solution. There's really nothing in the contract that says to_s is supposed to return a non-frozen string.
But I can't agree to a backward-incompatible change without a proper deprecation period.
So I went and added Feature #16153 for a way to allow a phase-in period for frozen Symbol#to_s.

Updated by Eregon (Benoit Daloze) about 1 month ago

Dan0042 (Daniel DeLorme) wrote:

But I can't agree to a backward-incompatible change without a proper deprecation period.
So I went and added Feature #16153 for a way to allow a phase-in period for frozen Symbol#to_s.

I'm unsure if such troubles are necessary, because the practical incompatibility might be extremely low, such as illustrated by in-repository specs and tests.

Do we have an example of real code breaking because of Symbol#to_s returning a frozen String?
Could someone try my PR on their app and see if it breaks anything?

Updated by Eregon (Benoit Daloze) about 1 month ago

I tried running ActiveSupport tests from Rails master on Ruby 2.6.2 + my PR, and found that one change is needed:

Error:
ConfigurableActiveSupport#test_configuration_accessors_are_not_available_on_instance:
FrozenError: can't modify frozen String
    /home/eregon/code/rails/activesupport/lib/active_support/ordered_options.rb:43:in `chomp!'
    /home/eregon/code/rails/activesupport/lib/active_support/ordered_options.rb:43:in `method_missing'
...

https://github.com/rails/rails/blob/7af44f49dbf221c3b7f0b0d476913a74b6a1d0e4/activesupport/lib/active_support/ordered_options.rb#L42-L43
A simple fix is to use name_string = +name.to_s, or to refactor to use start_with? + name_string[0..-2] instead of chomp!.
Then all ActiveSupport tests pass.

Updated by byroot (Jean Boussier) about 1 month ago

Eregon (Benoit Daloze), for context Matz recently refused a very similar change https://bugs.ruby-lang.org/issues/15836

Updated by matz (Yukihiro Matsumoto) about 1 month ago

For frozen Symbol#to_s, I see a clear benefit. But I worry a little bit for incompatibility.
So how about making an experiment by the next preview(2) to see how big the incompatibility is?
String#to_s etc. is a different story. we need to discuss further.

Matz.

Updated by byroot (Jean Boussier) 25 days ago

matz (Yukihiro Matsumoto) Great to hear.

I get you are worried about String#to_s, but what about others that have been mentioned here, namely:

  • NilClass#to_s
  • TrueClass/FalseClass#to_s
  • Module#name

Updated by Eregon (Benoit Daloze) 25 days ago

So how about making an experiment by the next preview(2) to see how big the incompatibility is?

Thank you, I will merge https://github.com/ruby/ruby/pull/2437 and mark the change as experimental.

String#to_s etc. is a different story.

I agree String#to_s should remain as-is.
Returning a frozen String for String#to_s would just add more allocations, since the current String#to_s just returns self.

what about others that have been mentioned here

I think nil/true/false.to_s and Module#name could be frozen, much like Symbol#to_s.
There might be some incompatible usages but I would expect very rare, and the FrozenError would make it pretty clear how to deal with it
(just add String#+@ or #dup, we could even adapt the FrozenError message for Strings to mention this if desired).

Updated by matz (Yukihiro Matsumoto) 25 days ago

byroot (Jean Boussier) wrote:

I get you are worried about String#to_s, but what about others that have been mentioned here, namely:

  • NilClass#to_s
  • TrueClass/FalseClass#to_s
  • Module#name

Ok for the experiment.

Matz.

#22

Updated by Eregon (Benoit Daloze) 24 days ago

  • Status changed from Open to Closed

Applied in changeset git|6ffc045a817fbdf04a6945d3c260b55b0fa1fd1e.


[EXPERIMENTAL] Make Symbol#to_s return a frozen String

  • Always the same frozen String for a given Symbol.
  • Avoids extra allocations whenever calling Symbol#to_s.
  • See [Feature #16150]

Updated by Eregon (Benoit Daloze) 24 days ago

  • Assignee set to Eregon (Benoit Daloze)
  • Status changed from Closed to Assigned

Reopening, the issue was accidentally closed as I merged the Symbol#to_s change.
I will make a PR for nil/true/false.to_s and Module#name to return a frozen String.

Updated by byroot (Jean Boussier) 24 days ago

I got the Module#name one ready here: https://github.com/ruby/ruby/pull/2487

For nil/true/false it is a bit trickier because the returned string are ASCII encoded, so I'm not quite sure how to statically create ASCII fstrings with the existing APIs.

#25

Updated by byroot (Jean Boussier) 24 days ago

  • Status changed from Assigned to Closed

Applied in changeset git|9d0866c7d7b9cbe36a851744a37806e747e0e7a8.


[EXPERIMENTAL] Make Module#name return a frozen String

* Always the same frozen String for a given Module or Class.
* Avoids extra allocations whenever calling Module#name.
* See [Feature #16150]

Updated by Eregon (Benoit Daloze) 24 days ago

  • Status changed from Closed to Assigned

byroot (Jean Boussier) Great, I merged your PR.
Sorry, I should have asked first if you wanted to do it.

Do you want to do the PR for nil/true/false as well then?

Updated by byroot (Jean Boussier) 24 days ago

Do you want to do the PR for nil/true/false as well then?

I tried but I'm quite unsure how to do it, so I don't mind if you handle it.

On a side note I'm currently running our test suite with the following monkey patch:

module FreezeName
  def name
    name = super
    name = -name if name
    name
  end
end
Module.singleton_class.prepend(FreezeName)

module FreezeToS
  def to_s
    -super
  end
end

Symbol.prepend(FreezeToS)
NilClass.prepend(FreezeToS)
TrueClass.prepend(FreezeToS)
FalseClass.prepend(FreezeToS)

This way I should be able to identify some breakage across a large amount of gems.

Updated by nobu (Nobuyoshi Nakada) 24 days ago

byroot (Jean Boussier) wrote:

For nil/true/false it is a bit trickier because the returned string are ASCII encoded, so I'm not quite sure how to statically create ASCII fstrings with the existing APIs.

static VALUE
true_to_s(VALUE obj)
{
    return rb_fstring_enc_lit("true", rb_usascii_encoding());
}

Updated by byroot (Jean Boussier) 24 days ago

Ok, so I just finished updating our application to be compatible with both Symbol#to_s and Module#name returning frozen strings.

Over the 502 gems used in the repo, I found 5 impacted gems, all because of Symbol#to_s, and all in very minor ways:

Inside our repo of over 1 million lines of code, I found 4 impacted methods, all were trivially fixed. Again they were all caused by Symbol#to_s, and very similar in nature to the public ones I listed above.

Overall I was able to trivially update a gigantic app to be ready for this change, all that in under an hour.

Updated by byroot (Jean Boussier) 24 days ago

nobu (Nobuyoshi Nakada) Yes that's what I had in mind, but then from what I understand it means we have to lookup the fstring table every time rather than just return a existing pointer. So it's not really an optimisation, is it?

Updated by mame (Yusuke Endoh) 24 days ago

byroot (Jean Boussier) wrote:

Over the 502 gems used in the repo, I found 5 impacted gems, all because of Symbol#to_s, and all in very minor ways:

A great report. For the record, I add another case I've heard:

Currently, seven impacted cases are found.

Updated by nobu (Nobuyoshi Nakada) 24 days ago

Then store those objects in global variables, and rb_gc_register_mark_object.

Updated by headius (Charles Nutter) 24 days ago

I'm glad to see some smaller parts of this are moving forward. A few updates from my end (JRuby):

  • Frozen strings from Symbol etc:

We are on board with making Symbol, nil, true, false, and Module all return frozen strings, and have an open PR to do this for Symbol right now: https://github.com/jruby/jruby/pull/5868

  • String#to_s concerns:

Right now since it returns self, it may be frozen or unfrozen, so you are NEVER safe mutating the result of to_s directly. With the changes above, even more types will start returning frozen strings. I feel like we need a way to audit the mutation of to_s results we want to make frozen in the future, similar to the --debug:frozen-string-literal feature that prints out where a frozen string came from if you try to mutate it.

I guess my logic boils down to this right now:

  • to_s often returns a frozen string right now.
  • to_s will return more frozen strings in the future if these PRs make it into 2.7.
  • Both now and in the future, it is not safe to mutate the results of a to_s.
  • So, let's give people a way to find these mutations as part of 2.7, so we can help eliminate such cases.

Updated by byroot (Jean Boussier) 24 days ago

headius (Charles Nutter) I totally get your opinion on String#to_s, however on a very practical standpoint I agree with Eregon (Benoit Daloze):

Returning a frozen String for String#to_s would just add more allocations, since the current String#to_s just returns self.

If we were to freeze the return of String#to_s I wouldn't be surprised if we ended up causing much more allocation that we're saving with Symbol#to_s and Module#name.

But granted that this now cause this somewhat weird situation where to_s might or might not return a frozen string. that being said it's already kinda the case:

>> "foo".freeze.to_s.frozen?
=> true

Updated by headius (Charles Nutter) 24 days ago

But granted that this now cause this somewhat weird situation where to_s might or might not return a frozen string. that being said it's already kinda the case

Yeah, this is precisely the problem. It is made even worse by the large number of libraries that already set frozen-string-literal.

One thing we should all be able to agree on at this point: it is not safe to blindly mutate the result of String#to_s, since more and more cases will produce frozen strings. With the experimental changes accepted above, an additional set of to_s results will also be frozen. So, by extension, it is not now and will never be safe to blindly mutate the result of calling any to_s.

I'd argue it's actually NEVER safe to modify the result of calling to_s, because for mutable source strings you'd be mutating the original contents! This is probably not desired behavior in the large majority of existing code. With that condition in mind, it would actually be much safer for us to start returning frozen strings from String#to_s as soon as possible.

I know this is a difficult pill to swallow. The original purpose of this issue was to consider adding a way to request frozen strings from to_s, since that would at least be an opt-in. Hence the suggestions of a to_z or similar new mechanism.

In the 2.7 timeframe, I'd like to see some combination of the following:

  • A way to explicitly request a frozen string from any object, such as to_z, so code could start migrating toward frozen strings today.

This would be an extension of existing "str".freeze and -"str" logic for explicitly requesting a single frozen string literal, but would allow making this request for any to_s result.

  • A debug mode that would warn if code attempts to mutate the result of String#to_s (e.g. --debug:frozen-string-to_s), since based on the above conditions this is almost never advisable.

This will help us audit existing code and start "fixing" it right now without a hard break. I'd like to see the above warning all the time, but that would require tracking source file and line for all literal strings (as in the current --debug:frozen-string-literal).

  • A pragma to explicitly set a file as having mutable-string-literal, as a local escape hatch for future default frozen-string-literal.
  • A command-line flag to force mutable-string-literal when no pragma exists, as a global escape hatch for future default frozen-string-literal.

These are a simple way to guarantee a soft landing if (when?) we default to frozen-string-literal globally in 3.0. All cases that would break with default frozen-string-literal could at least be made to work with the flag.

Updated by headius (Charles Nutter) 24 days ago

Oh I should point out another problem with the current situation...

some_str.to_s.freeze

The above code is also unsafe right now, because since to_s returns self, we're now freezing the original string!

In order to safely get a frozen string from an object, you must ALWAYS do some_str.to_s.dup.freeze. In order to safely get a mutable string from an object, you must ALWAYS do some_str.to_s.dup.

So what are we saving by leaving the current unsafe behavior in place?

#37

Updated by byroot (Jean Boussier) 24 days ago

  • Status changed from Assigned to Closed

Applied in changeset git|eff15a269fdc37d2b09cf1dfe8c1b1bf6e377a32.


[EXPERIMENTAL] Make NilClass#to_s, TrueClass#to_s and FalseClass#to_s return a frozen String

* Always the same frozen String for each of these values.
* Avoids extra allocations whenever calling these 3 methods.
* See [Feature #16150]

Updated by byroot (Jean Boussier) 23 days ago

In order to safely get a frozen string from an object, you must ALWAYS do some_str.to_s.dup.freeze. In order to safely get a mutable string from an object, you must ALWAYS do some_str.to_s.dup.

Not exactly, -@ and +@ makes this much simpler.

-@ will return self if the string is already frozen.

+@ will return self if the string is already mutable.

So these two allows you to safely get either a frozen or mutable string with the minimum amount of allocations.

Updated by headius (Charles Nutter) 23 days ago

Not exactly, -@ and +@ makes this much simpler

I do like the unary operators, but they also have some precedence oddities:

>> -"foo".size
=> -3
>> (-"foo").size
=> 3

And it doesn't work at all if you're chaining method calls:

>> +ary.to_s.frozen?
NoMethodError: undefined method `+@' for false:FalseClass
    from (irb):8
    from /usr/bin/irb:11:in `<main>'

But you are right, instead of the explicit dup with possible freeze you could use - or + on the result of to_s. However it's still not safe to modify it since it would modify the original string too.

Updated by matz (Yukihiro Matsumoto) 23 days ago

Hi,

In message "Re: [ruby-core:95142] [Ruby master Feature#16150] Add a way to request a frozen string from to_s"
on Sat, 28 Sep 2019 04:33:32 +0000 (UTC), headius@headius.com writes:

Issue #16150 has been updated by headius (Charles Nutter).

Not exactly, -@ and +@ makes this much simpler

I do like the unary operators, but they also have some precedence oddities:

>> -"foo".size
=> -3
>> (-"foo").size
=> 3

And it doesn't work at all if you're chaining method calls:

How about making String#+ and #- without argument behave like #+@ and #-@ respectively, so that we can write:

 "foo".-.size
 ary.+.to_s.frozen?
                        matz.

Updated by byroot (Jean Boussier) 21 days ago

However it's still not safe to modify it since it would modify the original string too.

IMHO that's two different use cases. Either:

  • You want to be sure not to mutate the string, then you use .dup
  • You don't care about mutating the original string, you just want to minimize allocations, then you use +@.

It's similar with freeze and -@. Either:

  • You simply want to prevent the string from being mutated, you use .freeze
  • You know you're going to hold on that string and want to optimize memory retention, you use -@.

How about making String#+ and #- without argument behave like #+@ and #-@ respectively, so that we can write:

IMHO .- and .+ is not very elegant. Proper method names explaining the intent would be preferable.

  • -@ could be dedup, or deduplicate.
  • +@ could be mutable or mut.

Updated by jonathanhefner (Jonathan Hefner) 21 days ago

  • -@ could be dedup, or deduplicate.

dedup is an interesting choice because it implies the opposite of dup.

Although you might run into naming collisions if / when you extend this functionality to other types. For example, I would expect Array#dedup to have a very different meaning, entirely unrelated to freezing.

Another alternative might be something like intern_s, since String#intern already exists. (Although this might also be awkward to extend to other types. For example, should it be Array#intern_a for consistency or simply Array#intern?)

Updated by Eregon (Benoit Daloze) 21 days ago

I'll get to the point because I think we're starting to discuss unrelated things in this issue.

headius (Charles Nutter) wrote:

I'd argue it's actually NEVER safe to modify the result of calling to_s [...]

Yes, and it has never been.
String#to_s has always returned itself, including for frozen strings, so of course to_s means "convert to a String if you're not already" and gives no guarantee to return a new String ready to be mutated.
It is the same for all other to_* methods, isn't it?

I think this is already well understood by Rubyists, and the changes for Symbol#to_s, Module#name, true/false/nil.to_s just make an existing behavior a bit more common.
So, I think there is nothing to change about String#to_s, except maybe improving the documentation of Kernel#to_s in general.

I know this is a difficult pill to swallow. The original purpose of this issue was to consider adding a way to request frozen strings from to_s, since that would at least be an opt-in. Hence the suggestions of a to_z or similar new mechanism.

I think there is no need for yet another "convert to String" method.
We changed the methods that used to allocate on every call instead of returning an internal frozen String (I don't think there are others, are there?).

Making String#to_s not return self is just worse for allocations, and would be inconsistent with other to_* methods.
Is there any advantage to make String#to_s not return self?

  • A way to explicitly request a frozen string from any object, such as to_z, so code could start migrating toward frozen strings today.

Why are String#-@ and str.dup.freeze not enough?

  • A pragma to explicitly set a file as having mutable-string-literal, as a local escape hatch for future default frozen-string-literal.
  • A command-line flag to force mutable-string-literal when no pragma exists, as a global escape hatch for future default frozen-string-literal.

That's about frozen string literals, #11473, please comment there, I don't think we should mix both issues.

In fact, I think this issue is done and is already correctly closed.
We addressed the cases that were allocating too much, without the need for a new method for converting to strings.

Also available in: Atom PDF