Project

General

Profile

Actions

Feature #20205

open

Enable `frozen_string_literal` by default

Added by byroot (Jean Boussier) 12 months ago. Updated about 1 month ago.

Status:
Assigned
Target version:
-
[ruby-core:116382]

Description

Context

The frozen_string_literal: true pragma was introduced in Ruby 2.3, and as far as I'm aware the plan was initially to make it the default for Ruby 3.0, but this plan was abandoned because it would be too much of a breaking change without any real further notice.

According to Matz, he still wishes to enable frozen_string_literal by default in the future, but a reasonable migration plan is required.

The main issue is backward compatibility, flipping the switch immediately would break a lot of code, so there must be some deprecation period.

The usual the path forward for this kind of change is to emit deprecation warnings one of multiple versions in advance.

One example of that was the Ruby 2.7 keyword argument deprecation. It was quite verbose, and some users were initially annoyed, but I think the community pulled through it and I don't seem to hear much about it anymore.

So for frozen string literals, the first step would be to start warning when a string that would be frozen in the future is mutated.

Deprecation Warning Implementation

I implemented a quick proof of concept with @etienne in https://github.com/Shopify/ruby/pull/549

In short:

  • Files with # frozen_string_literal: true or # frozen_string_literal: false don't change in behavior at all.
  • Files with no # frozen_string_literal comment are compiled to use putchilledstring opcode instead of regular putstring.
  • This opcode mark the string with a user flag, when these strings are mutated, a warning is issued.

Currently the proof of concept issue the warning at the mutation location, which in some case can make locating where the string was allocated a bit hard.

But it is possible to improve it so the message also include the location at which the literal string was allocated, and learning from the keyword argument warning experience,
we can record which warnings were already issued to avoid spamming users with duplicated warnings.

As currently implemented, there is almost no overhead. If we modify the implementation to record the literal location,
we'd incur a small memory overhead for each literal string in a file without an explicit frozen_string_literal pragma.

But I believe we could do it in a way that has no overhead if Warning[:deprecated] = false.

Timeline

The migration would happen in 3 steps, each step can potentially last multiple releases. e.g. R0 could be 3.4, R1 be 3.7 and R2 be 4.0.
I don't have a strong opinion on the pace.

  • Release R0: introduce the deprecation warning (only if deprecation warnings enabled).
  • Release R1: make the deprecation warning show up regardless of verbosity level.
  • Release R2: make string literals frozen by default.

Impact

Given that rubocop is quite popular in the community and it has enforced the usage of # frozen_string_literal: true for years now,
I suspect a large part of the actively maintained codebases in the wild wouldn't see any warnings.

And with recent versions of minitest enabling deprecation warnings by default (and potentially RSpec too),
the few that didn't migrate will likely be made compatible quickly.

The real problem of course are the less actively developed libraries and applications. For such cases, any codebase can remain compatible by setting RUBYOPT="--disable=frozen_string_literal",
and so even after R2 release. The flag would never be removed any legacy codebase can continue upgrading Ruby without changing a single line of cod by just flipping this flag.

Workflow for library maintainers

As a library maintainer, fixing the deprecation warnings can be as simple as prepending # frozen_string_literal: false at the top of all their source files, and this will keep working forever.

Alternatively they can of course make their code compatible with frozen string literals.

Code that is frozen string literal compatible doesn't need to explicitly declare it. Only code that need it turned of need to do so.

Workflow for application owners

For application owners, the workflow is the same than for libraries.

However if they depend on a gem that hasn't updated, or that they can't upgrade it, they can run their application with RUBYOPT="--disable=frozen_string_literal" and it will keep working forever.

Any user running into an incompatibility issue can set RUBYOPT="--disable=frozen_string_literal" forever, even in 4.x, the only thing changing is the default value.

And any application for which all dependencies have been made fully frozen string literal compatible can set RUBYOPT="--enable=frozen_string_literal" and start immediately removing magic comment from their codebase.


Related issues 2 (0 open2 closed)

Related to Ruby master - Feature #11473: Immutable String literal in Ruby 3Closedmatz (Yukihiro Matsumoto)Actions
Related to Ruby master - Feature #20390: Issue with StringIO and chilled stringsClosedbyroot (Jean Boussier)Actions

Updated by matheusrich (Matheus Richard) 12 months ago

Given that rubocop is quite popular in the community and it has enforced the usage of # frozen_string_literal: true for years now,
I suspect a large part of the actively maintained codebases in the wild wouldn't see any warnings.

That's true, but because standardrb doesn't enforce it (and many folks have been defaulting to that), I've seen several codebases not consistently using the pragma.

Updated by mame (Yusuke Endoh) 12 months ago

I think we should evaluate the value of # frozen_string_literal: true before making it the default.

The purpose of # frozen_string_literal: true is to make Ruby code fast and memory-saving. When it was introduced, no quantitative evaluation was available except for micro-benchmarks, because most code did not support frozen_string_literal.

Now that there are many gems supporting # frozen_string_literal: true. So we can evaluate it. For example, how much would the performance degrade if we removed # frozen_string_literal: true from all code used in yjit-bench?

If the degradation is large enough, then making # frozen_string_literal: true the default may be worth serious consideration. If the degradation is negligible, it would be reasonable for Rubocop to stop enforcing # frozen_string_literal: true and remove the magic comments from existing gems as well.

Updated by zverok (Victor Shepelev) 12 months ago

The purpose of # frozen_string_literal: true is to make Ruby code fast and memory-saving. When it was introduced, no quantitative evaluation was available except for micro-benchmarks, because most code did not support frozen_string_literal.

I believe that whatever the initial intention, the "frozen string literals" concept being adopted by many codebases is also a cultural thing, not only performance-related.

The "string literals are frozen by default" changes the way we program, if even slightly. That’s actually a long-standing topic to have more impactful freezing (of the constants, for example), but frozen string literals at the very least prevent trivial errors like

HEADER = "<html><body>"

def generate
  output = HEADER # no `.dup`
  output << "<p>test</p>" # actually changed HEADER
  # ...
end

Updated by byroot (Jean Boussier) 12 months ago

I think we should evaluate the value of # frozen_string_literal: true before making it the default.

So it's not 100% reliable because I ran it locally rather than on a benchmarking server, as evidenced by some strange effect on benchmarks that don't normally deal with strings (e.g. setivar_object), but here are the results:

mutable: ruby 3.4.0dev (2024-01-24T08:24:16Z disable-frozen-str.. a39d5eae1e) [arm64-darwin23]
frozen: ruby 3.4.0dev (2024-01-24T08:24:16Z disable-frozen-str.. bb0cee8dab) [arm64-darwin23]

--------------  ------------  ----------  -----------  ----------  --------------  --------------
bench           mutable (ms)  stddev (%)  frozen (ms)  stddev (%)  frozen 1st itr  mutable/frozen
activerecord    32.9          4.0         32.7         5.3         1.06            1.01          
chunky-png      578.0         1.1         555.2        1.0         1.01            1.04          
erubi-rails     1124.2        2.4         1158.7       1.7         0.93            0.97          
hexapdf         1703.9        2.7         1670.2       2.2         1.04            1.02          
liquid-c        34.3          5.7         34.1         5.0         1.02            1.01          
liquid-compile  39.8          4.2         38.0         5.3         0.99            1.05          
liquid-render   97.2          3.4         97.7         3.5         1.01            0.99          
lobsters        637.8         5.2         614.9        4.2         1.11            1.04          
mail            82.7          3.7         82.1         3.7         1.00            1.01          
psych-load      1500.8        0.8         1487.6       0.5         0.99            1.01          
railsbench      1116.3        2.1         1099.8       1.6         1.02            1.01          
rubocop         112.0         3.8         111.2        3.7         1.03            1.01          
ruby-lsp        79.5          2.6         80.3         2.4         0.92            0.99          
sequel          36.4          4.4         36.5         3.3         0.95            1.00          
binarytrees     238.4         1.9         238.8        1.7         0.99            1.00          
blurhash        281.9         1.4         282.5        1.7         0.99            1.00          
erubi           166.1         2.0         171.5        2.5         0.97            0.97          
etanni          198.7         2.7         200.5        1.8         1.04            0.99          
fannkuchredux   2065.3        0.7         2078.5       0.5         0.99            0.99          
fluentd         1192.0        1.0         1202.7       0.7         1.03            0.99          
graphql         2323.6        0.7         2340.5       0.5         0.98            0.99          
graphql-native  385.3         1.5         384.2        2.1         1.00            1.00          
lee             739.3         1.8         750.1        1.2         0.98            0.99          
matmul          1494.2        0.8         1497.8       0.9         1.01            1.00          
nbody           71.1          2.3         71.3         5.1         1.00            1.00          
nqueens         169.8         1.6         167.6        2.0         1.00            1.01          
optcarrot       4202.7        0.6         4216.8       0.5         1.00            1.00          
rack            56.0          3.3         53.9         3.9         1.06            1.04          
ruby-json       1976.3        0.9         1992.4       0.3         1.00            0.99          
rubykon         6971.3        0.8         7053.4       0.4         1.00            0.99          
sudoku          1836.1        0.4         1836.3       0.3         1.00            1.00          
tinygql         441.4         1.0         447.4        1.2         0.96            0.99          
30k_ifelse      1459.0        8.6         1429.1       4.6         0.96            1.02          
30k_methods     3331.8        3.7         3264.0       1.1         1.03            1.02          
cfunc_itself    81.5          1.8         81.2         2.9         1.02            1.00          
fib             187.0         1.6         188.8        1.5         0.94            0.99          
getivar         65.1          2.2         65.7         2.4         1.01            0.99          
keyword_args    141.6         2.3         140.9        2.1         1.00            1.01          
respond_to      184.9         1.9         185.7        1.1         0.96            1.00          
setivar         39.2          2.9         39.2         2.4         1.03            1.00          
setivar_object  72.7          2.5         77.5         2.1         0.95            0.94          
setivar_young   72.5          1.9         77.5         0.9         0.93            0.94          
str_concat      69.7          2.5         69.9         1.9         1.02            1.00          
throw           14.9          5.0         14.8         4.5         1.04            1.01          
--------------  ------------  ----------  -----------  ----------  --------------  --------------
Legend:
- frozen 1st itr: ratio of mutable/frozen time for the first benchmarking iteration.
- mutable/frozen: ratio of mutable/frozen time. Higher is better for frozen. Above 1 represents a speedup.

I used the following patch to disable frozen string literals globally: https://github.com/ruby/ruby/compare/master...Shopify:ruby:disable-frozen-string-literal, and ran the suite with ruby run_benchmarks.rb --chruby 'mutable::head-mutable-strings;frozen::head'.

Most of these benchmarks don't really deal with string, but one that I think is the most close to reality and to Ruby bread and butter is lobsters and it seem to be quite positive.

I also didn't enable YJIT, I suspect the difference would be bigger if I did, as the extra GC pressure would be relatively bigger.

I'll see about using the yjit-perf benchmark server to run these more scientifically.

I also want to note that I have to explictly freeze RUBY_DESCRIPTION in my patch, otherwise it would cause a Ractor issue, which is another argument for frozen strings, as they ease the necessary work necessary to make code Ractor compatible.

Updated by byroot (Jean Boussier) 12 months ago

but because standardrb doesn't enforce it (and many folks have been defaulting to that), I've seen several codebases not consistently using the pragma.

Yes, I didn't extend on my motivations for bringing this now, but It's in big part influenced by standardrb and some other communities starting to discourage the use of frozen string literals. I suspect many people are starting to be fed up with that magic comment and we probably reached peak usage of it.

Updated by duerst (Martin Dürst) 12 months ago

frozen_string_literal can lead to more efficient code, but it's also part of a programming style (usually called functional programming). Functional programming often leads to cleaner code, but it may require some additional programming effort. There's also a fundamental conflict between object-oriented programming (objects are generally mutable) and functional programming, although Ruby is pretty good at integrating these concepts.

My main issue with this proposal is that I think it's probably the right thing for most big code bases, but it may not be the right thing for quick-and-dirty small scripts. And Ruby is used, and should continue to be usable, for both kinds of code.

Maybe what could help is a declaration on a higher level, e.g. per gem or so rather than per source file. (That's just an idea, with many open questions: Where would the setting go? How would the interpreter pick it up? How would people become aware of it? ...)

Updated by Eregon (Benoit Daloze) 12 months ago

+1 to this issue proposal, it sounds like a good plan.
And agreed with @zverok (Victor Shepelev), having frozen literals by default is not only faster and less wasteful in memory usage, but also a safer model, where "String used like buffers/mutated Strings" are more explicit, which certainly seems good for readability and avoiding to accidentally mutate a String which should not be mutated.

Updated by byroot (Jean Boussier) 12 months ago

Maybe what could help is a declaration on a higher level, e.g. per gem or so rather than per source file

That's my fallback proposal if this one doesn't go through. Devise a way to set compile options for all files inside a directory, and then integrate with Rubygems to enable frozen string literals on a per gem basis.

But I'd prefer to just flip the default if possible.

Updated by mame (Yusuke Endoh) 12 months ago

@byroot (Jean Boussier) Thanks for the quick benchmark! It depends on further benchmarking, but currently, I don't see enough advantage over compatibility. Do you think it is worth the risk of compatibility?

@zverok (Victor Shepelev) Ah, that's exactly why I was against the introduction of frozen_string_literal, because people misunderstand it like you. This feature must not be mixed up with "immutability". I can understand if all String objects were frozen by default, but freezing only string "literals" makes no sense at all (except in terms of performance). Consider, just adding .upcase or .b or something to a String literal makes it mutable. I don't find "a cultural thing" in such a fragile thing.

Updated by byroot (Jean Boussier) 12 months ago

I'm still trying to get a hold onto the benchmarking server, but the 5% figure on lobsters seems to be quite consistent on my machine, so I think I can use it for my argumentation.

It depends on further benchmarking, but currently, I don't see enough advantage over compatibility. Do you think it is worth the risk of compatibility?

Alright, so I fear my answer to this will be longer than expected.

So I'm of course biased because the type of workload I work with are very heavily string based (Web), so for me, a 5% improvement (to be confirmed) is quite significant.

And also because while we have many dependencies (over 700 transitive gems in the monolith), we do make sure to prune or replace the abandoned ones, and have the habit to contribute to them regularly. So I'm quite confident we can get all our dependencies ready for this change without much work and in short order, It will certainly be much less work than the Ruby 2.7 keyword argument change was.

Now of course, for Ruby users that don't generally deal with Strings much, this is just yet another annoying change that don't give them anything substantial.

But also retaining compatibility for them is really trivial. If you are running a legacy code base or outdated dependency, but yet are upgrading to a newer Ruby, is it really that much work to just set RUBYOPT="--disable=frozen_string_literal" an move on? Especially after multiple versions warning you that you'll have to do it at some point? Is that legacy code even still working on 3.x after the keyword argument change?

Maybe my bias cause me to downplay the compatibility concern, but it really doesn't seem significant to me assuming a reasonably long timeline so the ecosystem have time to prepare and catch up.

Updated by jeremyevans0 (Jeremy Evans) 12 months ago

byroot (Jean Boussier) wrote in #note-10:

Is that legacy code even still working on 3.x after the keyword argument change?

It depends on what you consider legacy code. Legacy code designed for Ruby 1.8 or 1.9, which never used keyword arguments, was unaffected by the keyword argument changes in Ruby 3.

Updated by Dan0042 (Daniel DeLorme) 12 months ago

I think the community pulled through it and I don't seem to hear much about it anymore.

Careful; the community "pulled through" because there was already a lot of accumulated good will, and the 2.7 migration burned through some of that reserve. Yet another migration might result in "not again!" syndrome and the community not pulling through nearly as well. It depends on how annoying the migration is, and the perceived benefit.

I should note that since dynamic string literals are no longer frozen since 3.0, the disruption would be that much smaller.

  • Files with no # frozen_string_literal comment are compiled to use putchilledstring opcode instead of regular putstring.
  • This opcode mark the string with a user flag, when these strings are mutated, a warning is issued.

That's a lot like #16153 so I like it. I would also like to have # frozen_string_literal: chilled to enable this behavior on my own terms, without impacting gems over which I have no control.

learning from the keyword argument warning experience

I think another important lesson from that experience is that gems are different from app code. If I want to optimize my app to use frozen string literals, I have to enable this warning at the global level, and then if warnings from gems are mixed in it makes my job a lot more annoying. For all kinds of reasons I do not want to update my apps and gems at the same time.

Maybe what could help is a declaration on a higher level, e.g. per gem or so rather than per source file

It's a bit similar to #17156 so I like it. Actually I would much prefer this than changing the default; it allows every app and gem to upgrade on their own terms, without enforcing a one-size-fits-all default, and without the noise of a pragma in every file. Especially if combined with frozen_string_literal: chilled it would be very empowering.

But also retaining compatibility for them is really trivial. If you are running a legacy code base or outdated dependency, but yet are upgrading to a newer Ruby, is it really that much work to just set RUBYOPT="--disable=frozen_string_literal" an move on?

Conversely let me ask: Is it really that much work to just set RUBYOPT="--enable=frozen_string_literal" in your application instead of forcing a new default on everyone else?

Updated by rubyFeedback (robert heiler) 12 months ago

I think this was discussed or suggested before. Personally
I have all my .rb files with one header line being:

# frozen_string_literal: true

so a change towards frozen string literals would not affect
me, as I already use that by default, since many years
actually.

So the question, then, is, how this may affect other ruby
users and developers.

One example of that was the Ruby 2.7 keyword argument
deprecation. It was quite verbose, and some users were
initially annoyed, but I think the community pulled
through it and I don't seem to hear much about it anymore.

This is not quite how I remember it; I think there were tons
of warnings initially, in particular from rails code bases,
and some changes were made to the warning situation.

zverok wrote:

I believe that whatever the initial intention, the
"frozen string literals" concept being adopted by
many codebases is also a cultural thing, not only
performance-related.

Well, I mostly added it for consistency reasons, and also
to make it easier to transition and upgrade to new ruby
releases. So for me, it just was easier to add it to all
my .rb files, via a standard header I keep on re-using
in my .rb files. Not sure if it is a cultural thing,
but for me it is more a habit.

zverok wrote:

The "string literals are frozen by default" changes
the way we program, if even slightly.

Yep, that is very true.

In oldschool ruby I could just do:

x = 'foo'
x << 'bar'

I prefer the oldschool ruby variant. But I understand the performance
reasons for frozen Strings, so it's a trade off. That's why I think
being able to tell ruby to not use frozen strings would be nice, if
only of reason of nostalgia, as I liked the oldschool behaviour more.

The way I solve this is actually solely via .frozen? checks and then
.dup. There are other shorter ways (nobu showed that), but for some
odd reason I like the more explicit check. Even then, naturally, I
prefer the oldschool ruby way without having to check on the frozen
status. This is just nostalgia, though - I moved towards every String
object being frozen by default in my own ruby code.

some other communities starting to discourage the use of frozen
string literals. I suspect many people are starting to be fed
up with that magic comment and we probably reached peak usage of it.

I am not fed up with the comment. I'd actually be more fed up with
it if it were removed, and frozen_string_literal: false no longer
being possible. Even though I no longer use it myself. :P

There's also a fundamental conflict between object-oriented
programming (objects are generally mutable) and functional
programming, although Ruby is pretty good at integrating
these concepts.

I don't see that distinction really. Personally I follow more of
the object-definition used by Alan Kay, who in turn used a more
closer model inspired from molecular biology (and, extending this,
Erlang's model could be to have objects everywhere, even though
Erlang is not an OOP language).

What is an OOP language? What is a functional language? Ruby kind
of is mostly OOP but there-is-more-than-one-way-to-do-it, so ruby
is a bit of a hybrid language, even if I'd say it is heavily leaning
towards OOP. So what is the ideal functional language? Haskell?
Is a monad an object? Can it walk on a moebius strip without falling
down?

martin wrote:

My main issue with this proposal is that I think it's probably
the right thing for most big code bases, but it may not be the
right thing for quick-and-dirty small scripts. And Ruby is
used, and should continue to be usable, for both kinds of code.

Well, I don't disagree necessarily. But if frozen_string_literal
is honoured, people can just use frozen_string_literal: false,
right? Even some commandline shorthand notation for that could
be used, such as --fsf (frozen string false) or so. Or people
can put that frozen string literal: false in the header of the
.rb file. I don't think this should be the primary rationale
for not making frozen string literals true NOT the default,
though.

martin wrote:

Maybe what could help is a declaration on a higher level,
e.g. per gem or so rather than per source file. (That's
just an idea, with many open questions: Where would the
setting go? How would the interpreter pick it up? How
would people become aware of it? ...)

Not a bad idea in itself; I often wanted to have more fine-tuned
control over gems I'd publish. More recommendations by default;
people can then ignore that if they want to of course. But I
think the question about frozen string literals is a language
design question, more than one of gem authors or small-scripts
use. So matz should decide on that.

jeremy wrote:

It depends on what you consider legacy code. Legacy code
designed for Ruby 1.8 or 1.9, which never used keyword
arguments, was unaffected by the keyword argument
changes in Ruby 3.

Yeah. I had in the back of my mind remembering more issues than
byroot may recall. I am glad I am not the only one; I mix up
things a lot these days.

Daniel wrote:

Careful; the community "pulled through" because there was already
a lot of accumulated good will, and the 2.7 migration burned
through some of that reserve. Yet another migration might result
in "not again!" syndrome and the community not pulling through
nearly as well. It depends on how annoying the migration is,
and the perceived benefit.

Agreed. Although often the perceived benefit is small for many
folks. In my own code, having frozen Strings by default probably
led to a noticeable speed-up, but initially it was quite some
work to adjust my code base to it. Even then I still prefer the
oldschool ruby behaviour :D - imagine if there would be no speed
penalty when String objects would be mutable at all times.

Daniel wrote:

Actually I would much prefer this than changing the default;
it allows every app and gem to upgrade on their own terms,
without enforcing a one-size-fits-all default, and without
the noise of a pragma in every file. Especially if combined
with frozen_string_literal: chilled it would be very empowering.

Also understandable. Although, I find "the noise of a pragma
in every file" actually less annoying than having to do an
object.frozen? check. ;)

Even then I still think this is a language design decision to
be made (either way how it goes). The flexibility situation
you refer to, be it "chilled" or anything else, as well as
more fine-tuned control over gems, and what not, is a secondary
question. First should come the "enable frozen strings by
default yes/no", as a language design decision, IMO.

Anyway, on the suggestion itself, I am +1, but if I may suggest:

  • There should be a clearly documented transition path, e. g.
    in "2025 this will happen, in 2026 that will happen, in 2027
    it is completed" (or something like that) if decided on it.
    This may also help for those people who want to adjust towards
    the frozen string false/true situation, and for whatever reason
    have not yet. At some point in time even slow movers should be
    able to adjust (unless they have some reason not to, but in
    this case they could comment here, if they are made aware of
    the discussion here).

  • The impact on "legacy" ruby code should be evaluated as objectively
    as possible, and to also get people involved who are using, for
    instance, frozen string literal: false.

Updated by byroot (Jean Boussier) 12 months ago

Conversely let me ask: Is it really that much work to just set RUBYOPT="--enable=frozen_string_literal" in your application instead of forcing a new default on everyone else?

The thing is, I can't.

Code that is written for frozen_string_literal: true can generally run with frozen_string_literal: false, but the opposite is not true.

Updated by palkan (Vladimir Dementyev) 12 months ago

byroot (Jean Boussier) wrote in #note-8:

Devise a way to set compile options for all files inside a directory

Here you go: https://github.com/ruby-next/freezolite 🙂

integrate with Rubygems to enable frozen string literals on a per gem basis

Smth like spec.frozen_string_literals = true? And during gem registration (setting up a load path), RubyGems can add the path to the frozen list, so upon load we can set the compile option. That should work, I think.

Updated by byroot (Jean Boussier) 12 months ago

Here you go: https://github.com/ruby-next/freezolite 🙂

As mentioned on Reddit when you first published that gem, it's a nice Hack, but I don't think it's quite robust enough. If changing the default isn't accepted and instead we try to make it a per gem configuration, I think Ruby will need to expose a better API to do this in a more reliable and clean way.

Updated by tenderlovemaking (Aaron Patterson) 12 months ago

mame (Yusuke Endoh) wrote in #note-9:

I can understand if all String objects were frozen by default, but freezing only string "literals" makes no sense at all (except in terms of performance).

For me, freezing string literals is a useful way to catch bugs. When I mutate strings, I am trying to be intentional, purposeful, and isolated.

buf = "".b # I intend to mutate this string
name = "Aaron" # I don't intend to mutate this, if I do, it's a bug

The places in code where I intend to mutate a string are infrequent and isolated so I am happy to pay a ".dup" tax in order to avoid bugs in other parts of my code. Maybe we could freeze all strings by default in the future, but when we're on the subject of compatibility, I think doing that right now would be far too extreme. Freezing string literals buys us performance (as byroot says, even 5% is great), and some safety. The trade is compatibility, but I don't think the trade is worthwhile and not very extreme.

Updated by tenderlovemaking (Aaron Patterson) 12 months ago

The trade is compatibility, but I don't think the trade is worthwhile and not very extreme.

Sorry, I made a typo. I do think the trade is worthwhile, and I don't think it's very extreme.

Updated by palkan (Vladimir Dementyev) 12 months ago

byroot (Jean Boussier) wrote in #note-16:

Here you go: https://github.com/ruby-next/freezolite 🙂

As mentioned on Reddit when you first published that gem, it's a nice Hack, but I don't think it's quite robust enough. If changing the default isn't accepted and instead we try to make it a per gem configuration, I think Ruby will need to expose a better API to do this in a more reliable and clean way.

Sure, it must be a part of MRI (and other implementations). Consider it a PoC (though, quite robust and battle-tested in production) and example of how to approach path-based compilation settings. The most important thing here is an API to wrap code loading so we can adjust settings on-the-fly (smth like require-hooks).

Updated by palkan (Vladimir Dementyev) 12 months ago

Files with # frozen_string_literal: true or # frozen_string_literal: false don't change in behavior at all.

There is one use case in which having # frozen_string_literal: true differs from RUBYOPT=--enable=frozen_string_literal today:

# frozen_string_literal: true

class B
  attr_reader :name

  def initialize
    @name = "B"
  end

  def eval_name
    instance_eval '@name = "C"'
  end
end

b = B.new

puts b.name.frozen?

b.eval_name

puts b.name.frozen?

Results in:

$ ruby -v
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [arm64-darwin21]

$ ruby b.rb

true
false

$ RUBYOPT='--enable=frozen_string_literal' ruby b.rb

true
true

Updated by kddnewton (Kevin Newton) 12 months ago

There is one use case in which having # frozen_string_literal: true differs from RUBYOPT=--enable=frozen_string_literal today

I don't think that's a difference, you don't have the magic comment in the eval. Changing it to:

instance_eval "# frozen_string_literal: true\n@name = \"C\""

makes them both true.

Updated by byroot (Jean Boussier) 12 months ago

That's by design, each eval call is its own file, and it doesn't inherit the pragma from the file where it's invoked, which is logical if you think about it.

But you are right that it is uncommon for users to put # frozen_string_literal: pragmas in evaled code, so this could indeed require some small adjustment for codebases that already define a pragma, but it should be very rare.

Updated by palkan (Vladimir Dementyev) 12 months ago

byroot (Jean Boussier) wrote in #note-22:

But you are right that it is uncommon for users to put # frozen_string_literal: pragmas in evaled code, so this could indeed require some small adjustment for codebases that already define a pragma, but it should be very rare.

Yeah, that's what I meant. Even if users put # frozen_string_literal: in every file (but not within eval) they still might need to adjust their code. So, I'd suggest covering this edge case in the migration guide (or whatever).

Updated by Dan0042 (Daniel DeLorme) 12 months ago

byroot (Jean Boussier) wrote in #note-14:

Conversely let me ask: Is it really that much work to just set RUBYOPT="--enable=frozen_string_literal" in your application instead of forcing a new default on everyone else?

The thing is, I can't.

Sorry, that didn't make sense to me... with a new ruby version having frozen_string_literal enabled by default you are "quite confident we can get all our dependencies ready", and yet with --enable=frozen_string_literal which has exactly the same effect, you "can't" ??? It seems to me the ideal course here is you test your 700 gems with --enable=frozen_string_literal then submit any fixes to the gems' maintainers, and possiblity convince them to switch to frozen_string_literal: false style, and then you can run your app with --enable=frozen_string_literal and everyone's happy. Doesn't seem to be any need to change the default of the ruby interpreter for that.

Code that is written for frozen_string_literal: true can generally run with frozen_string_literal: false, but the opposite is not true.

Indeed, and that's why changing the current default of false to true runs the risk of incompatibility. If you consider that a problem that prevents you from using --enable=frozen_string_literal, why would it not be a problem when changing the interpreter default?

I mean, of course what's missing in the above is the "chilled string" of the proposal, which provides a feasible migration path. But having these chilled strings is orthogonal to changing the default. You could run your app with e.g. --enable=CHILLED_string_literal and achieve your goal without having to change the default.

Updated by byroot (Jean Boussier) 12 months ago

and possiblity convince them to switch to frozen_string_literal: false style

You are missing the social aspect of it.

As of today running ruby with --enable=frozen_string_literal is a little known feature that basically no-one is doing. It's hard, if not impossible to convince maintainers to take in such changes. Since it's not the default, it basically comes down to personal preference (cf standardrb).

And any new gem created from today will likely not test with --enable=frozen_string_literal, so when they get added on our apps, they won't work. So it will be a never ending task.

Now if the Ruby project states that in the future the default will flip, even if it's a long time from now, it becomes easy to convince maintainers.

I'm not asking to change the default because I'm too lazy to fix some gems. I'm more than happy to help gems upgrade, I fixed over a hundred gems to be compatible with Ruby 3.0..., and will likely fix many more to be compatible with frozen string literals whenever Ruby decide to make the switch.

This is just me recognizing I can't realistically make this change alone in my corner of the ecosystem, no matter how much effort I put in it.

But also, I'm not getting this out of thin air, unless I misunderstood Matz, he stated he wishes to enable frozen string literals by default at some point, and that the only reason it wasn't done yet is the lack of a proper migration plan. So I'm not the one to suggest to flip the default in the first place, I'm merely proposing a migration plan.

Updated by jeremyevans0 (Jeremy Evans) 12 months ago

byroot (Jean Boussier) wrote in #note-25:

As of today running ruby with --enable=frozen_string_literal is a little known feature that basically no-one is doing. It's hard, if not impossible to convince maintainers to take in such changes.

As a counterpoint, I've been testing Sequel, Roda, and Rodauth since the release of Ruby 2.3 with --enable-frozen-string-literal. When starting out, many of the dependencies (direct and transitive) broke internally, and I had to submit pull requests to fix them. It took a few years, but I eventually got all related pull requests merged upstream (or alternative fixes implemented), and for years the tests have been clean with --enable-frozen-string-literal.

In most cases, it was easy to convince maintainers to fix any breakage encountered when using --enable-frozen-string-literal, and in many cases, maintainers took it upon themselves to fix other issues I didn't encounter.

And any new gem created from today will likely not test with --enable=frozen_string_literal, so when they get added on our apps, they won't work. So it will be a never ending task.

It's been years since I've had to submit a pull request upstream related to --enable-frozen-string-literal, so I don't think it necessarily has to be a never ending task.

However, I'm not dealing with 700 transitive dependencies, maybe not even 70. And when you are using --enable-frozen-string-literal, you need to have everything fixed for things to work. So maybe at Shopify scale, the problem really is intractable.

Now if the Ruby project states that in the future the default will flip, even if it's a long time from now, it becomes easy to convince maintainers.

Certainly it becomes easier, but my experience is that it is already easy. I would expect you are more likely to run into a dependency that isn't maintained, versus a dependency that is maintained but the maintainer is against fixing --enable-frozen-string-literal issues.

But also, I'm not getting this out of thin air, unless I misunderstood Matz, he stated he wishes to enable frozen string literals by default at some point, and that the only reason it wasn't done yet is the lack of a proper migration plan. So I'm not the one to suggest to flip the default in the first place, I'm merely proposing a migration plan.

I am in favor of switching to frozen static string literals by default with the migration plan proposed.

Updated by byroot (Jean Boussier) 12 months ago

when you are using --enable-frozen-string-literal, you need to have everything fixed for things to work. So maybe at Shopify scale, the problem really is intractable.

Yes you are right, I should have said, it's hard if not impossible to convince all the maintainers of my transitive dependencies.

Updated by Eregon (Benoit Daloze) 12 months ago

mame (Yusuke Endoh) wrote in #note-9:

@zverok (Victor Shepelev) Ah, that's exactly why I was against the introduction of frozen_string_literal, because people misunderstand it like you. This feature must not be mixed up with "immutability". I can understand if all String objects were frozen by default, but freezing only string "literals" makes no sense at all (except in terms of performance). Consider, just adding .upcase or .b or something to a String literal makes it mutable. I don't find "a cultural thing" in such a fragile thing.

It is not full immutability of all Strings, true.
But it is full immutability of all (static) String literals. And that is valuable. With the new default it becomes impossible to accidentally mutate a String literal, which is a nice error category to remove.
It also saves some memory because the same String literal in different places is the same String object.
If not frozen, the bytes can be shared but not the String object itself.

I would say 5% on lobsters is a huge gain.
Very few optimizations can give that much (and as an extra it's fairly simple and well understood semantically).


My impression is most Rubyists are aware that basically all new code should use # frozen_string_literal: true semantics.
The default of false is basically deprecated in practice and has almost no value, except compatibility for existing code.

As an example very very few files use # frozen_string_literal: false.
And those that do typically only do it because they have no been migrated to # frozen_string_literal: true and to not break under --enable-frozen-string-literal.

Updated by byroot (Jean Boussier) 12 months ago

I would say 5% on lobsters is a huge gain.

Still trying to get hold on our benchmarking server...

But just to note, lobsters itself doesn't use frozen_string_literal: true, and it's probably the case of at least some of its dependencies too.

When I have some extra time I'd also like to make it --enable=frozen_string_liteal compatible to see if a couple more % could be squeezed out of it.

Updated by Dan0042 (Daniel DeLorme) 12 months ago

byroot (Jean Boussier) wrote in #note-25:

You are missing the social aspect of it.

Thank you for the explanation. I'm not sure I fully agree but it makes a lot more sense now.


Eregon (Benoit Daloze) wrote in #note-28:

I would say 5% on lobsters is a huge gain.

Let's not cherry-pick here. 5% is the best case. The worst case is -6% on setivar_object. And something more relevant to a rails application: erubi has -3%. The average of all tests is roughly 0%. So it's not like the performance benefit is clearly compelling. At least based on these preliminary benchmarks.

Updated by byroot (Jean Boussier) 12 months ago

The worst case is -6% on setivar_object

This is a fluke caused by random slowdown on my development machine. Here's the benchmark source: https://github.com/Shopify/yjit-bench/blob/3774b4bc320519f8b560eb23bdea48f549cf8b30/benchmarks/setivar_object.rb

There is absolutely nothing in there influenced by frozen strings. I'm somewhat confident on the 5% figure for lobsters because I ran it alone about 10 times and always got 5%. But running the full suite on my machine takes way too long to do that for all benchmarks.

I shouldn't have ran the micro-benchmarks anyway, only the headline benchmarks make sense.

What is certain is that turning on frozen_string_literal cannot possibly have a negative performance impact. Only null or positive. So don't lean to much on these preliminary results.

Updated by Dan0042 (Daniel DeLorme) 12 months ago

byroot (Jean Boussier) wrote in #note-31:

I shouldn't have ran the micro-benchmarks anyway, only the headline benchmarks make sense.

I'll freely admit I have no idea which benchmarks are micro and which are headline. What about erubi and erubi-rails?

What is certain is that turning on frozen_string_literal cannot possibly have a negative performance impact.

I wouldn't affirm "cannot possibly", but I tend to agree; though deduplication has an overhead, I'd be surprised if it was measurable. So the erubi -3% result stands out. Was that a benchmark glitch?

Updated by byroot (Jean Boussier) 12 months ago

I have no idea which benchmarks are micro and which are headline. What about erubi and erubi-rails?

Each yjit-bench suite has a category: https://github.com/Shopify/yjit-bench/blob/3774b4bc320519f8b560eb23bdea48f549cf8b30/benchmarks.yml#L21

The more relevant ones are "headline", they generally are more sizeable and varied, so mor erepresentative of actually production workloads.

erubi and erubi-rails are both in the headline category. But frozen_string_literal doesn't matter one bit for eruby because eruby already compile into code that use frozen literals regardless.

though deduplication has an overhead

frozen_string_literal: true doesn't incur any overhead even if it means you sometimes need to dup.

With frozen_string_literal: false, "foo" is strictly equivalent to "foo".dup except that the dup is done as part of putstring instead of a second instruction. But we could even eliminate that if we wanted by compiling "literal".dup into putstring. It is very very unlikely to make any measurable difference though.

So yes, if we want to be extremely pedantic it's possible to generate a micro-benchmark that would suffer from frozen_string_literal: true, but in reality the impact can't reasonably be negative.

Updated by byroot (Jean Boussier) 12 months ago

Alright, I finally got hold of the benchmarking server. For the record it's an AWS c5n.metal, with various tuning like disabling frequency scaling etc to hopefully get more stable results.

Both rubyes are ran with YJIT enabled. mutable has MUTABLE_STRINGS=1 which makes the frozen_string_literal: true comment become essentially a noop.

I ran headline benchmark twice for good measures, most of them seem to be consistent with 1% between the two runs, except for ruby-lsp which has widely inconsistent results (+/-12%??), and hexapdf too to some extent (+/-4%).

mutable: ruby 3.4.0dev (2024-01-24T08:24:16Z mutable-strings bb0cee8dab) +YJIT dev [x86_64-linux]
frozen: ruby 3.4.0dev (2024-01-24T08:24:16Z mutable-strings bb0cee8dab) +YJIT dev [x86_64-linux]

--------------  ------------  ----------  -----------  ----------  --------------  --------------
bench           mutable (ms)  stddev (%)  frozen (ms)  stddev (%)  frozen 1st itr  mutable/frozen
activerecord    72.8          1.2         70.8         1.2         1.03            1.03          
chunky-png      1666.3        0.1         1672.7       0.2         1.00            1.00          
erubi-rails     2330.1        0.3         2312.2       0.2         1.01            1.01          
hexapdf         3960.1        8.9         3614.7       7.3         1.04            1.10          
liquid-c        107.1         1.3         101.3        1.4         0.95            1.06          
liquid-compile  100.0         2.8         98.5         3.1         1.00            1.02          
liquid-render   153.0         1.1         137.8        1.1         0.97            1.11          
lobsters        1346.6        8.5         1239.2       7.6         1.00            1.09          
mail            202.8         0.8         198.9        1.0         1.02            1.02          
psych-load      3723.2        0.1         3661.3       0.1         1.01            1.02          
railsbench      2669.8        0.1         2519.6       0.2         1.02            1.06          
rubocop         278.6         5.9         272.5        6.1         1.00            1.02          
ruby-lsp        217.5         8.7         250.7        10.4        0.98            0.87          
sequel          111.8         0.7         111.8        0.8         1.01            1.00          
--------------  ------------  ----------  -----------  ----------  --------------  --------------
Legend:
- frozen 1st itr: ratio of mutable/frozen time for the first benchmarking iteration.
- mutable/frozen: ratio of mutable/frozen time. Higher is better for frozen. Above 1 represents a speedup.



mutable: ruby 3.4.0dev (2024-01-24T08:24:16Z mutable-strings bb0cee8dab) +YJIT dev [x86_64-linux]
frozen: ruby 3.4.0dev (2024-01-24T08:24:16Z mutable-strings bb0cee8dab) +YJIT dev [x86_64-linux]

--------------  ------------  ----------  -----------  ----------  --------------  --------------
bench           mutable (ms)  stddev (%)  frozen (ms)  stddev (%)  frozen 1st itr  mutable/frozen
activerecord    72.8          1.2         70.8         1.3         1.00            1.03          
chunky-png      1664.1        0.2         1672.5       0.2         1.01            0.99          
erubi-rails     2351.9        0.2         2311.4       0.2         1.01            1.02          
hexapdf         3759.9        7.0         3696.5       5.8         1.04            1.02          
liquid-c        108.0         1.1         101.3        1.4         0.97            1.07          
liquid-compile  100.5         3.1         98.5         3.1         0.99            1.02          
liquid-render   152.8         1.0         138.0        1.1         0.97            1.11          
lobsters        1339.0        9.8         1244.2       7.9         1.00            1.08          
mail            202.6         0.7         198.9        1.0         1.00            1.02          
psych-load      3627.0        0.1         3659.6       0.0         0.99            0.99          
railsbench      2670.7        0.1         2560.3       1.7         1.02            1.04          
rubocop         279.5         6.3         271.4        6.1         1.00            1.03          
ruby-lsp        274.2         9.5         245.0        9.8         0.98            1.12          
sequel          111.9         0.7         112.0        1.3         1.01            1.00          
--------------  ------------  ----------  -----------  ----------  --------------  --------------
Legend:
- frozen 1st itr: ratio of mutable/frozen time for the first benchmarking iteration.
- mutable/frozen: ratio of mutable/frozen time. Higher is better for frozen. Above 1 represents a speedup.

Overall the performance benefit seem much more important than on my initial benchmark, likely in part because YJIT is enabled and also likely in part because of the different architecture (X86_64 vs ARM64).

Updated by matz (Yukihiro Matsumoto) 11 months ago

I agree with the proposal. It seems a well-thought process to migrate. The performance improvement was not as great as I had hoped for. But since I feel that the style of individually freezing strings when setting them to constants is not beautiful, and since I feel that magic comment is not a good style. I feel that making string literals frozen is the right direction to go in the long run.

Matz.

Updated by byroot (Jean Boussier) 11 months ago

Thank you Matz.

In that case I'll work with @etienne into getting the proof of concept into a mergeable feature over the next few weeks.

Updated by Dan0042 (Daniel DeLorme) 11 months ago

Question: what is the effect of "chilled string" on #frozen?
Does chilled_string.frozen? return true?
Personally I think it should, as I remarked in #16153#note-11

Updated by byroot (Jean Boussier) 11 months ago

Question: what is the effect of "chilled string" on #frozen?
Does chilled_string.frozen? return true?

That's the current idea yes.

"chilled string".frozen? # => true
+"chilled string" # returns a new, mutable string (without chilled flag)
-"chilled string" # returns a different, frozen string
"chilled string".freeze # the chilled string become frozen for real.
"chilled string" << "foo" # emit a warning and clear the "chilled" status.

This way as you point out in the other issues, the common dup if frozen? idiom works as expected.

Actions #39

Updated by etienne (Étienne Barrié) 10 months ago

  • Status changed from Open to Closed

Applied in changeset git|12be40ae6be78ac41e8e3f3c313cc6f63e7fa6c4.


Implement chilled strings

[Feature #20205]

As a path toward enabling frozen string literals by default in the future,
this commit introduce "chilled strings". From a user perspective chilled
strings pretend to be frozen, but on the first attempt to mutate them,
they lose their frozen status and emit a warning rather than to raise a
FrozenError.

Implementation wise, rb_compile_option_struct.frozen_string_literal is
no longer a boolean but a tri-state of enabled/disabled/unset.

When code is compiled with frozen string literals neither explictly enabled
or disabled, string literals are compiled with a new putchilledstring
instruction. This instruction is identical to putstring except it marks
the String with the STR_CHILLED (FL_USER3) and FL_FREEZE flags.

Chilled strings have the FL_FREEZE flag as to minimize the need to check
for chilled strings across the codebase, and to improve compatibility with
C extensions.

Notes:

  • String#freeze: clears the chilled flag.
  • String#-@: acts as if the string was mutable.
  • String#+@: acts as if the string was mutable.
  • String#clone: copies the chilled flag.

Co-authored-by: Jean Boussier

Updated by Dan0042 (Daniel DeLorme) 10 months ago · Edited

Thank you for this great feature!

Would it be possible to have a ruby API for chilled strings? Something like str.chill or String.chill(str) ...

  • String#+@: acts as if the string was mutable.

Nitpick: actually, String#+@ creates a dup as if the string was immutable.

Updated by byroot (Jean Boussier) 10 months ago

Would it be possible to have a ruby API for chilled strings? Something like str.chill or String.chill(str)

It's technically very easy to implement, yes. Would need to be a separate feature request though.

Also one drawback is that right now chilled string are an internal concept that we'll be able to get rid of and cleanup in the future. If we expose it to users, we'll have to keep it forever. So it's debatable whether the benefit outweigh the maintenance burden long term.

Updated by Dan0042 (Daniel DeLorme) 10 months ago

it's debatable whether the benefit outweigh the maintenance burden long term.

Yeah, good point. And if truly necessary we could get a chilled string with eval(str.inspect) anyway.

Updated by headius (Charles Nutter) 9 months ago

I am a bit late to the party but nobody seems to have raised a concern I have.

If a chilled string appears to be frozen? then a consumer may proceed to use the string expecting it to remain frozen, such as for a cache key. If that string can later become unfrozen and be modified, warning or not, they may now have a broken cache with an unexpected mutable key.

I don't know of a specific case for this, but the fact that chilled strings masquerade as frozen when they are not really frozen seems like a major issue to me.

Updated by byroot (Jean Boussier) 9 months ago

While your concern is absolutely valid, I don't think it's much of a problem in practice. There is pretty much an infinite amount of code out there, so I don't think there is any way really to discount it, but after having migrated a gigantic codebase to 3.4-dev no such issue appeared.

And generally speaking, code that want to hold on hold on a frozen string like you mention tend to use either str.dup.freeze, or str.freeze or -str. str.frozen? ? str : str.dup.freeze isn't a very common pattern.

So yes it could happen, but you'd really need many stars to align for that.

Updated by byroot (Jean Boussier) 8 months ago

I thought about this more at Kaigi, maybe avoiding the false positive on the str.dup if str.frozen? pattern isn't worth the possible confusion.

I'll experiment with chilled strings frozen? method returning false.

Actions #46

Updated by byroot (Jean Boussier) 8 months ago

Updated by Dan0042 (Daniel DeLorme) 8 months ago

@byroot (Jean Boussier) How is that experiment going? I'm all for experimenting, but just as in #15554 I believe we should reduce false positives to a minimum.

Also I spent some time on this, but I'm having a hard time coming up with a non-contrived example case where returning false for #frozen? is beneficial.

Updated by byroot (Jean Boussier) 8 months ago

I'm still on my way back from Kaigi, so I haven't started working on this. But I had a quick chat with Matz, and it wasn't clear to him that we went with frozen? -> true, and he was clear he expects frozen? -> false.

Also based on the reception I saw of the release notes in various places, it seems to have created quite a bit of confusion.

So yes, this will introduce some false positive, which is unfortunate, but I'll definitely change the behavior soon.

Updated by Dan0042 (Daniel DeLorme) 8 months ago

This is quite unfortunate, as there is not a single useful case for frozen? -> false
Oh well :-/

Updated by Eregon (Benoit Daloze) 8 months ago

avoiding the false positive on the str.dup if str.frozen? pattern

+str seems a good replacement for that pattern.
Besides, I would think it's pretty rare that it's OK to use that pattern, because it mutates a String that is not "owned" if it happens to not be frozen (e.g. it's typically not OK if it's an argument, and if intended then no need to check if frozen).

IOW, returning false for frozen? for chilled strings seems safer and I think will cause very few false positive warnings (in comparison to true positive chilled strings warnings).

Updated by Eregon (Benoit Daloze) 8 months ago

@Dan0042 @headius (Charles Nutter) mentioned a few above.
It also seems pretty bad that an object could frozen? => true and then become unfrozen, I think that alone could cause very tricky and serious bugs (e.g. it would break custom hash tables).

Updated by Eregon (Benoit Daloze) 8 months ago

To give a concrete example, Hash would be broken if it calls frozen? for string keys. That's the case on Rubinius: https://github.com/rubinius/rubinius/blob/84368419a49767ef9549a5778812e5f54b6c6223/core/hash.rb#L54-L56
So the pattern of "safe frozen copy" str = str.dup.freeze unless str.frozen? would be broken if frozen? would return true for chilled strings (str would still be mutable).

Updated by Dan0042 (Daniel DeLorme) 8 months ago

I was going to say this is MRI, not Rubinius, but it turns out chilled strings actually have a bug in MRI when used as Hash keys

k = "key"
h = {}
h[k] = 42
k << "!"  #warning: literal string will be frozen in the future
p h       #{"key!"=>42}

The hash key is definitely supposed to be frozen here, not just chilled.

Actions #54

Updated by byroot (Jean Boussier) 8 months ago

  • Related to Feature #20390: Issue with StringIO and chilled strings added

Updated by hartator (Julien Khaleghy) 8 months ago · Edited

I think making # frozen_string_literal: true the default is a bad idea.

If the main point is to make Ruby faster, IRL benchmarks so far have shown the reverse. byroot's own initial benchmarks show regular strings being faster or as fast as frozen strings in 62.2% of libs that has been tested. I would be interested to see if removing # frozen_string_literal: true will actually make some libs faster (edit: super tiny benchmarks on pry repo running its tests on 3 runs after 1 warmup and can be just noise, but 35.052s, 34.469s, 34.948s with # frozen_string_literal: true vs. 34.525s, 34.577s, 34.714s without # frozen_string_literal: true) .

If the main point is to avoid some kind of bugs, the reverse is also true. # frozen_string_literal: true can be a misdirection and introduces its own kind of bugs. IRL this has introduces stubble bugs in our SerpApi codebase, consider this file:

# frozen_string_literal: true

class Converter
  LANGUADE_CODES = {
    en: "English",
    es: "Spanish",
    jp: "Japanese"
  }
  def lower_case_version
    LANGUADE_CODES.transform_values!(&:downcase)
  end
end

The initial quick assessment is to feel safe that LANGUADE_CODES can't be modified and lower_case_version would have one of these behaviors:

  • Raise an error
  • Log a warning
  • Code works, but LANGUADE_CODES is unmodified.
  • Code works, but LANGUADE_CODES is modified for only this object instance.

However, none of the above is true.

Surprisingly for both junior and senior engineers, a #lower_case_version method call will silently modify LANGUADE_CODES for all Ruby threads:

[7] pry(main)> Converter::LANGUADE_CODES
=> {:en=>"english", :es=>"spanish", :jp=>"japanese"}

I am afraid we are making a more complex and inflexible Ruby (I do agree that string << "a story," appending style is lovely and super useful) for unfortunately no tangible gain in code explicitness and correctness, and now even have introducing surprising new behaviors. I think it's easier to just assume all String objects are mutable and just allows flexible coding.

Updated by hsbt (Hiroshi SHIBATA) 3 months ago

  • Status changed from Closed to Open
  • Assignee set to matz (Yukihiro Matsumoto)

I and @yhonda try to fix this warnings at httpclient.

https://github.com/nahi/httpclient/pull/462

I surprisedly faced stdlib have this warning like:

ruby-dev/lib/ruby/3.4.0+0/open-uri.rb:455: warning: literal string will be frozen in the future
ruby-dev/lib/ruby/3.4.0+0/logger/log_device.rb:45: warning: literal string will be frozen in the future

But above warnings caused by httpclient usage like:

def set_body_encoding
  if type = self.content_type
    OpenURI::Meta.init(o = '')
    o.meta_add_field('content-type', type)
    @body_encoding = o.encoding
  end
end

It's hard to found above code from open-uri warning.

Updated by mame (Yusuke Endoh) 3 months ago

I did quick investigation on the warnings that @hsbt (Hiroshi SHIBATA) said.

ruby-dev/lib/ruby/3.4.0+0/open-uri.rb:455: warning: literal string will be frozen in the future

This warning is due to the fact that httpclient calls OpenURI::Meta.init(o = '') and OpenURI::Meta.init destructively modifies the argument by Object#extend.

ruby-dev/lib/ruby/3.4.0+0/logger/log_device.rb:45: warning: literal string will be frozen in the future

This is probably a warning in StringIO#write. Passing a frozen string literal like StringIO.new(“foo”) is not warned, but warned after StringIO#write.

$ ruby -rstringio -w -e 'StringIO.new("")'

$ ruby -rstringio -w -e 'StringIO.new("").write("foo")'
-e:1: warning: literal string will be frozen in the future

Note that we can't issue a warning at StringIO.new("") because it could be a false positive if you use the StringIO read-only.

Updated by mame (Yusuke Endoh) 3 months ago

--debug-frozen-string-literal does not work well for StringIO.new("") case.

$ ruby --enable-frozen-string-literal --debug-frozen-string-literal -rstringio -w -e 'io = StringIO.new("")
io.write("foo")'
-e:2:in 'StringIO#write': not opened for writing (IOError)
        from -e:2:in '<main>'

The message failed to spot Line 1 where the string is allocated.

Updated by byroot (Jean Boussier) 3 months ago

--debug-frozen-string-literal does not work well for StringIO.new("") case.

Yes, for code that have a String#frozen? conditional, the source of the issue can be harder to find, that is why the initial implementation had String#frozen? return true.

@hsbt (Hiroshi SHIBATA) since you re-opened, what would you consider a condition for closing again?

Updated by mame (Yusuke Endoh) 3 months ago

byroot (Jean Boussier) wrote in #note-59:

Yes, for code that have a String#frozen? conditional, the source of the issue can be harder to find, that is why the initial implementation had String#frozen? return true.

The problem is that the warning does not point where you are doing StringIO.new(""). Even if #frozen? method returns true on a chilled string, I don't think that problem will be solved at all.

In this case, I think --debug-frozen-string-literal could work by stopping the #frozen? check and making StringIO#write attempt to actually modify the string unless read-only mode is explicitly specified.

But what can I say, do we have to go so far to change it? I totally agree with hartator #note-55.

Updated by byroot (Jean Boussier) 3 months ago

The problem is that the warning does not point where you are doing StringIO.new("")

We can improve that by displaying where the String was allocated when you are running with --debug-frozen-string-literal. I'll work on that Monday with @etienne.

I get that in a few cases it can be hard to track down, but having done the work to get Shopify monolith run with --enable-frozen-string-literal, hence having fixed issue in a bunch of our dependencies, it really wasn't that much work.

And in the rare case of old gems like httpclient that are no longer actively developed, just slapping # frozen_string_literal: false is a simple way out: https://github.com/Shopify/httpclient/commit/19790ac5bef02613b368ad7f3443767c8d481ec4

If any gem is causing you trouble, let me know and I'll happily fix it for you.

Updated by byroot (Jean Boussier) 3 months ago

We just submitted a PR with @etienne: https://github.com/ruby/ruby/pull/11893

Now the default warning is: warning: literal string will be frozen in the future (run with --debug-frozen-string-literal for more information)

And if you run with --debug-frozen-string-literal, the warning will be:

test.rb:3: warning: literal string will be frozen in the future
test.rb:1: the string was created here

Which should make it much easier to find the source of the issue in cases like: https://github.com/nahi/httpclient/pull/462

Updated by pdfrod (Pedro Rodrigues) 3 months ago

hartator (Julien Khaleghy) wrote in #note-55:

If the main point is to avoid some kind of bugs, the reverse is also true. # frozen_string_literal: true can be a misdirection and introduces its own kind of bugs. IRL this has introduces stubble bugs in our SerpApi codebase, consider this file:

# frozen_string_literal: true

class Converter
  LANGUADE_CODES = {
    en: "English",
    es: "Spanish",
    jp: "Japanese"
  }
  def lower_case_version
    LANGUADE_CODES.transform_values!(&:downcase)
  end
end

I don't think this example can be used to argue for or against frozen string literals, because there's no attempt to mutate strings here at all (downcase never modifies the original string; it always returns a copy). The real problem in this code is that it's using transform_values!, which mutates the hash. The code would have issues even if the values were integers. For example:

class Converter
  LANGUADE_CODES = {
    en: 1,
    es: 2,
    jp: 3
  }
  def lower_case_version
    LANGUADE_CODES.transform_values! { _1 * 100 }
  end
end
irb> Converter::LANGUADE_CODES
=> {:en=>1, :es=>2, :jp=>3}

irb> Converter.new.lower_case_version
=> {:en=>100, :es=>200, :jp=>300}

irb> Converter::LANGUADE_CODES
=> {:en=>100, :es=>200, :jp=>300}  # Oops, Converter::LANGUADE_CODES got mutated

Updated by byroot (Jean Boussier) 3 months ago

  • Status changed from Open to Closed

Updated by mame (Yusuke Endoh) about 1 month ago

I checked the usage of # frozen-string-literal: true among the public gems. (@ko1 (Koichi Sasada) gave me this idea)

Number of gems that have `# frozen-string-literal: true` in all .rb files

  among all public gems: 14254 / 175170 (8.14%)
  per-file basis: 445132 / 3051922 (14.59%)

  among public gems with the latest version released in 2015-present: 14208 / 101904 (13.94%)
  per-file basis: 443460 / 1952225 (22.72%)

  among public gems with the latest version released in 2020-present: 11974 / 41205 (29.06%)
  per-file basis: 389559 / 1136445 (34.28%)

  among public gems with the latest version released in 2022-present: 9121 / 26721 (34.13%)
  per-file basis: 329742 / 848061 (38.88%)

I used these scripts: https://gist.github.com/mame/d4a9be41acf3d25cc2667a6959c4d37d

Less than 10% of all gems are fully frozen-string-literal: true compliant. Even among active gems released in the last three years, only 34% are fully frozen-string-literal: true.

Incidentally, when ko1 asked matz about his prediction before showing these results, he replied, “About 90% of files are using frozen-string-literal: true". However, the actual rate was about 39% even on a per-file basis.

This suggests that the decision was based on a major misunderstanding of the current situation. @matz (Yukihiro Matsumoto), are you sure you don't want to rethink this?

Updated by byroot (Jean Boussier) about 1 month ago

Note that you don't necessarily need to set # frozen_string_literal: true to be compatible.

Lots of code is compatible without the comment just by the virtue of not mutating literals, and some other code is compatible because it sets #frozen_string_literal: false.

Updated by kddnewton (Kevin Newton) about 1 month ago

I think to be more accurate your script would probably want to check for string literals that will or will not be frozen.

Updated by ko1 (Koichi Sasada) about 1 month ago

some other code is compatible because it sets #frozen_string_literal: false.

FYI

only a few. survey which pragma is used on my collected ruby scripts from rubygems:

[[:FILES, 2945839],
 [["frozen_string_literal", "true"], 372224],
 [["frozen_string_literal", "false"], 4108],

all usages (it contains not pragma, just a comment):
https://gist.github.com/ko1/801a55069673aa3b45bf5c9c9be26e42

Updated by hsbt (Hiroshi SHIBATA) about 1 month ago

  • Status changed from Closed to Assigned

I re-opened this ticket for further discussion with Matz.

Updated by austin (Austin Ziegler) about 1 month ago

byroot (Jean Boussier) wrote in #note-66:

Note that you don't necessarily need to set # frozen_string_literal: true to be compatible.

Lots of code is compatible without the comment just by the virtue of not mutating literals, and some other code is compatible because it sets #frozen_string_literal: false.

It is worth noting that standardrb (the variant of Rubocop that I use) turns off # frozen_string_literal: true checking (in part because of hints that frozen strings would never be enforced, which RuboCop has on by default. I believe that I have deleted a whole bunch of cases where I used it in some of my gems (mime-types, diff-lcs), but I try to be very careful about allocations &c in all of my gems.

Updated by byroot (Jean Boussier) about 1 month ago

be very careful about allocations &c in all of my gems.

If you wish not to include the comment, but remain compatible a good way is to add --enable-frozen-string-literal in your CI matrix.

Updated by byroot (Jean Boussier) about 1 month ago

To put the number in perspective, I adapted @mame's script to count the same thing but for a specific application and all its dependencies: https://gist.github.com/byroot/87235e9c2553aacd36c31b1f16e81b84

For https://github.com/lobsters/lobsters

$ ruby /tmp/frozen.rb . $(bundle show --paths)
total files: 8037
files with comment: 5424
rate 67.5%

And yet, lobsters is compatible: https://github.com/lobsters/lobsters/blob/96cf0b32ee81bb1bd7e6e9cb8c7030631145f902/.github/workflows/check.yml#L46

Updated by byroot (Jean Boussier) about 1 month ago

Also it probably doesn't make a huge difference, but your regexp is a bit too strict:

/^# frozen[-_]string[-_]literal: true$/

Should be something like:

/^#\s*frozen[-_]string[-_]literal:\s*true\s*$/

Updated by Dan0042 (Daniel DeLorme) about 1 month ago

I watched this Ruby Core Developers Q&A video recently, and at 20:30 on the topic of frozen string literals by default everyone was complaining about the annoyance of having to put the magic comment at the top of every file. But if that's the main justification, there's a very easy fix that doesn't involve changing the global default: just set a default frozen_string_literal per directory, so that all files loaded afterward within a gem or app directory will use that default.

And even if the global default is changed to frozen_string_literal: true (which I am in favor of), having the ability to set a per-gem default would make it MUCH easier to add frozen_string_literals: false compatibility to older gems.

Actions

Also available in: Atom PDF

Like10
Like0Like1Like2Like0Like0Like0Like0Like0Like1Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like1Like1Like0Like0Like0Like0Like0Like0Like0Like0Like3Like0Like0Like0Like0Like0Like1Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like2Like0Like0Like0Like0Like0Like0Like0Like1Like0Like0Like0Like1Like0Like0Like0Like0Like0Like0Like0