Feature #8976

file-scope freeze_string directive

Added by Akira Tanaka 7 months ago. Updated 3 months ago.

[ruby-core:57574]
Status:Open
Priority:Normal
Assignee:-
Category:-
Target version:current: 2.2.0

Description

Yesterday, we had a face-to-face developer meeting.
https://bugs.ruby-lang.org/projects/ruby/wiki/DevelopersMeeting20131001Japan
Several committers attended.
matz didn't attend, though. (This means this issue is not concluded.)

We believe we found a better way to freeze static string literals to
reduce GC pressure.
A "static string literal" is a string literal without a dynamic expression.

Currently, f-suffix, "..."f, is used to freeze a string literal to avoid
String object allocation.

There are several problems for f-suffix:

  • The notation is ugly.
  • Syntax error on Ruby 2.0. We cannot use the feature in version independent libraries. So, it is difficult to deploy.
  • Need to modify for each string literal. This is cumbersome.

The new way we found is a file-scope directive, as follows:

# freeze_string: true

The above comment at the top of a file changes the semantics of
static string literals in the file.
The static string literals will be frozen and will always return the same object.
(The semantics of dynamic string literals are not changed.)
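
As a rough illustration of the semantics described above: the freeze_string spelling is only a proposal in this ticket, so the sketch below substitutes the frozen_string_literal magic comment, which released Rubies (2.3 and later) recognise, and evaluates a small source string as if it were a separate file. The method name greeting and the variable names are made up for the example.

```ruby
# Assumption: "frozen_string_literal" (Ruby 2.3+) stands in for the proposed
# "freeze_string" directive, which released Rubies do not recognise.
source = <<~'RUBY'
  # frozen_string_literal: true
  def greeting
    "hello"            # static literal: frozen under the directive
  end
RUBY

eval(source)           # the magic comment applies to this source text only

first  = greeting
second = greeting

puts first.frozen?         # is the literal frozen?
puts first.equal?(second)  # does each call return the same object?

begin
  first << " world"        # in-place modification of a frozen string
rescue RuntimeError => e   # FrozenError (2.5+) is a RuntimeError subclass
  puts "rejected: #{e.class}"
end
```

Evaluating a string keeps the example self-contained; in a real library the comment would simply be the first line of the file.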

This approach has the following benefits:

  • No ugly f-suffix.
  • No syntax error on older Ruby.
  • We need only one line per file.

We can write a version-independent library using frozen static string literals as follows.

  • Use the directive at the top of the file: # freeze_string: true. Older Ruby ignores this as a comment.
  • Use "...".dup for strings to be modified. Older Ruby has a small disadvantage: a useless dup is called.
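
A minimal sketch of the second point, written so that it behaves the same whether or not string literals are frozen. The helper name csv_line is hypothetical and only serves the example.

```ruby
# Hypothetical helper: joins fields into one comma-separated line.
def csv_line(fields)
  line = "".dup            # explicit mutable copy; a redundant dup on older Ruby
  fields.each_with_index do |field, index|
    line << "," unless index.zero?   # "," is never modified, so no dup is needed
    line << field
  end
  line
end

puts csv_line(%w[a b c])
```

Only the buffer that is appended to needs the dup; every literal that is merely read can stay as a plain "...".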

Note that the directive affects all static string literals regardless of
form: single quotes, double quotes, %q-strings, %Q-strings and here documents.
The reason the directive is not limited to single quotes is that
we want to use escape sequences such as \n in frozen string literals.
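
To make the escape-sequence point concrete, the sketch below compares the literal forms listed above. It only inspects their contents and does not depend on any directive; the method name literal_forms is made up for the example.

```ruby
# Returns the same source text, a\n, written in each literal form.
def literal_forms
  heredoc = <<~EOS
    a
  EOS
  {
    single:  'a\n',      # backslash and "n" stay as two separate characters
    double:  "a\n",      # the escape sequence becomes one newline character
    q_lower: %q(a\n),    # behaves like single quotes
    q_upper: %Q(a\n),    # behaves like double quotes
    heredoc: heredoc,    # a line of text followed by a newline
  }
end

literal_forms.each do |name, value|
  puts "#{name}: #{value.length} characters"
end
```

A directive limited to single-quoted forms could therefore never produce a frozen string containing a real newline.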

Also note that a similar directive already exists:

% ruby -w -e '
def m
  end
'
-e:3: warning: mismatched indentations at 'end' with 'def' at 2
% ruby -w -e '# -*- warn_indent: false -*-
def m
  end
'

The directive, warn_indent: false, disables "mismatched indentations" warning.

nobu implemented this feature in the meeting.
Please attach the patch, nobu.


Related issues

Related to ruby-trunk - Feature #8977: String#frozen that takes advantage of the deduping Assigned 10/02/2013
Related to ruby-trunk - Feature #8579: Frozen string syntax Closed 06/29/2013
Duplicated by ruby-trunk - Feature #9278: Magic comment "immutable: string" makes "literal".freeze ... Open 12/22/2013

History

#2 Updated by Sam Saffron 7 months ago

Coupled with this, I strongly feel we need a more usable way of using the deduping elsewhere.

Currently String#freeze will only affect the current string. If we had a String#frozen, we could have it return a deduped frozen copy. From memory profiling, the largest leak in Ruby gems is strings that really should be deduped using a mechanism like it.

I raised a separate issue on this: http://bugs.ruby-lang.org/issues/8977
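
A pure-Ruby sketch of the idea, assuming a Hash used as an interning pool. The helper deduped_frozen is hypothetical and is not an existing String method; an implementation inside the VM would presumably use its internal table instead.

```ruby
# Hypothetical helper: returns one shared frozen copy per distinct content.
def deduped_frozen(str, pool)
  pool[str] ||= str.dup.freeze
end

pool = {}
a = deduped_frozen(["user", "name"].join("_"), pool)  # built at run time
b = deduped_frozen(["user", "name"].join("_"), pool)  # equal content, new object

puts a.frozen?
puts a.equal?(b)   # do both calls share one object?

# Plain String#freeze freezes each receiver separately.
c = ["user", "name"].join("_").freeze
d = ["user", "name"].join("_").freeze
puts c.equal?(d)
```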

#3 Updated by Sam Saffron 7 months ago

Can we also have a global switch to enable this everywhere (for debugging)? It would make it simple to isolate the spots where it would fall over.

#4 Updated by Koichi Sasada 7 months ago

(2013/10/02 13:18), sam.saffron (Sam Saffron) wrote:

Can we also have a global switch to enable this everywhere (for debugging), it can make it simple to isolate the spots where it would fall over.

+1. It should be another ticket.

--
// SASADA Koichi at atdot dot net

#5 Updated by Akira Tanaka 7 months ago

akr (Akira Tanaka) wrote:

There are several problems for f-suffix:

  • The notation is ugly.

I forgot to mention Akira Matsuda's presentation at RubyShift 2013:
http://sssslide.com/speakerdeck.com/a_matsuda/rails-engines-from-the-bottom-up
(The presentation shows code snippets with f-suffix extensively.)

He said that the response of audience for f-suffix was negative.

#6 Updated by Charlie Somerville 7 months ago

I forgot to mention Akira Matsuda's presentation at RubyShift 2013:
http://sssslide.com/speakerdeck.com/a_matsuda/rails-engines-from-the-bottom-up
(The presentation shows code snippets with f-suffix extensively.)

He said that the response of audience for f-suffix was negative.

To be fair, you wouldn't really use f-suffix strings in all the places they were used in the presentation. They're only really useful in tight loops, or in code where aesthetics do not matter (e.g. code generated from ERB).
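
A rough way to see the tight-loop effect is to count allocations around a loop. This sketch assumes CRuby, where GC.stat(:total_allocated_objects) is available, and assumes the file is run without any frozen-literal option; the helper names are made up, and the exact counts vary by Ruby version.

```ruby
# Counts objects allocated while the block runs (CRuby-specific GC.stat key).
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# A plain literal inside the loop body is evaluated on every iteration.
def literal_in_loop(n)
  allocations { n.times { key = "name"; key.length } }
end

# A frozen string created once outside the loop is reused on every iteration.
def frozen_in_loop(n)
  shared = "name".freeze
  allocations { n.times { key = shared; key.length } }
end

puts "literal: #{literal_in_loop(10_000)} objects"
puts "frozen:  #{frozen_in_loop(10_000)} objects"
```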

#7 Updated by Gabriel Sobrinho 7 months ago

Maybe I'm too late but why not use the same object when calling String#freeze?

I mean, currently this:

"something".freeze.object_id
=> 70273877530260
"something".freeze.object_id
=> 70273877536840

And change the compiler to do this:

"something".freeze.object_id
=> 70273877530260
"something".freeze.object_id
=> 70273877530260

I'm not sure how much work would be needed, or even whether it's possible, but it would maintain syntax compatibility with legacy versions of Ruby.

I'm not against the "something"f syntax, even though it looks really strange for idiomatic Ruby; I'm concerned about libraries that need to run on legacy Rubies.
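
A small probe for the behaviour asked about in this comment. Whether the two results are the same object depends on the Ruby version and implementation in use, so the script only reports what it observes; the method name frozen_label is made up for the example.

```ruby
def frozen_label
  "something".freeze   # candidate for the compiler change suggested above
end

a = frozen_label
b = frozen_label

puts a.frozen?
puts a.equal?(b)       # true only if both calls returned the same object
puts a.object_id == b.object_id
```

Because the source is valid on every Ruby version, a library can ship it unchanged and benefit wherever the optimisation exists.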

#8 Updated by Thomas Enebo 7 months ago

I think having a pragma at the top of the file will be much more error prone than the f-syntax. As a file grows, the ability to notice you are in a frozen string file goes down. It would have been great if Ruby had started immutable strings by default but that ship has sailed, I think having some files be immutable will be confusing.

Are we sure we cannot find a nicer syntax for frozen strings: %f{hello, I am frozen}?

#9 Updated by Charles Nutter 7 months ago

I agree with Tom here. I think it's going to be almost useless to have a full-file "freeze-string" directive.

  • From file to file, the meaning of a literal string would change. This would be confusing for everyone dealing with a project. They'd get frozen string errors in one file and not in another for exactly the same code.

  • Users would be forced to create new mutable strings with String.new. There would be no other way. So in one file you could create a new mutable string with "" and in another you'd have to use String.new.

It would be a very bad idea to have a directive that completely changes the meaning of code from one file to another.
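
For reference, a sketch of the two ways mentioned in this thread to obtain a mutable string when literals are frozen. An explicit .freeze stands in for a literal frozen by the directive, and the method name mutable_candidates is made up for the example.

```ruby
def mutable_candidates
  frozen = "".freeze               # stands in for a literal frozen by the directive
  {
    "String.new" => String.new,    # fresh empty string (binary encoding)
    "dup"        => frozen.dup,    # unfrozen copy of the frozen literal
  }
end

mutable_candidates.each do |how, str|
  str << "appended"                # works because neither string is frozen
  puts "#{how}: frozen=#{str.frozen?} encoding=#{str.encoding}"
end
```

Note that the two are not fully interchangeable: String.new without arguments returns a binary (ASCII-8BIT) string, while dup keeps the encoding of the literal.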

#10 Updated by Brian Shirai 7 months ago

It would be a very bad idea to have a directive that completely changes the meaning of code from one file to another.

For consistency sake, it should be noted that, in fact, this is exactly what the existing encoding pragma does, and it's also the express purpose of refinements.

Hence, a more nuanced argument than this broad stroke of "very bad idea" may be needed.

#11 Updated by Charles Nutter 7 months ago

brixen (Brian Shirai) wrote:

For consistency sake, it should be noted that, in fact, this is exactly what the existing encoding pragma does, and it's also the express purpose of refinements.

The encoding directive changes the interpretation of the bytes within strings, but does not change their behavior. If m17n is working properly, you may never even see a difference in code, since even strings with different encodings can be negotiated into combining, matching regexp, and converting to other encodings.

Refinements change the meaning of code within a lexical scope...not within an entire file (unless it is the file's scope that is being refined). This is more analogous to instance_eval on a block, which changes what "self" methods are called against. You are correct that they do change the meaning of code within their scope, but whether that's a good feature or not is beyond the scope of this discussion. I do not particularly like refinements.

A frozen string directive would actually change the behavior of the strings in that file, making operations that worked before fail to work under the directive. Encoding does not make some methods on string start to raise errors, except where you may have differing encodings (which can happen without an encoding directive too).
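
The behavioural difference described here can be sketched by applying the same operations to a mutable and a frozen string. The helper name operation_results is hypothetical, and an explicit .freeze stands in for a literal frozen by the directive.

```ruby
# Applies a few String operations to str and records which ones raise.
def operation_results(str)
  operations = {
    "upcase"  => lambda { |s| s.upcase },    # returns a new string
    "+"       => lambda { |s| s + "!" },     # returns a new string
    "<<"      => lambda { |s| s << "!" },    # modifies the receiver
    "upcase!" => lambda { |s| s.upcase! },   # modifies the receiver
  }
  results = {}
  operations.each do |name, operation|
    begin
      operation.call(str)
      results[name] = :ok
    rescue RuntimeError   # FrozenError (Ruby 2.5+) is a RuntimeError subclass
      results[name] = :raises
    end
  end
  results
end

p operation_results("abc".dup)      # mutable string
p operation_results("abc".freeze)   # frozen string
```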

Hence, a more nuanced argument than this broad stroke of "very bad idea" may be needed.

I'm not sure this is the place to have a meta-argument about how to argue for or against this proposal. But since you suggest a more nuanced argument, I suggest you look at the original points in my comment that explain why it would be a bad idea.

Do you have any arguments to make for or against this proposal?

#12 Updated by Thomas Enebo 7 months ago

Brian, since I have been able to infer that you dislike both M17n and refinements, can I take it that you agree with Charlie and me that this particular pragma might not be an idea you endorse? Perhaps you can elucidate a better argument against it?

#13 Updated by Yui NARUSE 7 months ago

enebo (Thomas Enebo) wrote:

I think having a pragma at the top of the file will be much more error prone than the f-syntax. As a file grows, the ability to notice you are in a frozen string file goes down. It would have been great if Ruby had started immutable strings by default but that ship has sailed, I think having some files be immutable will be confusing.

Enhance your IDE.

Are we sure we cannot find a nicer syntax for frozen strings: %f{hello, I am frozen}?

Almost every idea that adds new syntax breaks 2.0 compatibility, which slows the adoption of frozen strings.
That is the motivation for this suggestion.

#14 Updated by Martin Dürst 7 months ago

On 2013/10/03 2:27, brixen (Brian Shirai) wrote:

Issue #8976 has been updated by brixen (Brian Shirai).

It would be a very bad idea to have a directive that completely changes the meaning of code from one file to another.

For consistency sake, it should be noted that, in fact, this is exactly what the existing encoding pragma does,

The reason why there is an encoding pragma, and why it's per file, is
because text editors deal with one encoding per file. Doing something
like an encoding pragma e.g. on a block basis would not work well
together with editors.

I agree with Charles and others that a file-based directive isn't a good
idea for frozen/fixed strings.

From a more general perspective, it feels to me that introducing all
these frozen options will increase performance, but at the cost of
programmer effort. That would be the case also e.g. for something like
type hints,..., but that's not Ruby style.

Regards, Martin.


#15 Updated by Charles Nutter 7 months ago

duerst (Martin Dürst) wrote:

From a more general perspective, it feels to me that introducing all
these frozen options will increase performance, but at the cost of
programmer effort. That would be the case also e.g. for something like
type hints,..., but that's not Ruby style.

Personally, I think the more important benefit of having instantly-frozen literal strings, arrays, and hashes is for safer concurrency and data integrity.

  • If I pass you a reference to a string, I can create that string frozen and be sure you don't modify it. Same goes for arrays and hashes.
  • If I am initializing a global array or hash that should never be modified, I can create it frozen immediately.

Of course these can all be done by calling .freeze on the object as well, but creating immutable structures right away avoids any mistakes. And then the potential VM optimizations are a bonus on top of that.
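
A sketch of the "create it frozen immediately" pattern using explicit .freeze calls, which is what current syntax allows. The method name default_settings and its contents are made up for the example.

```ruby
# Builds a settings table that cannot be modified after creation.
# freeze is shallow, so each nested object is frozen separately.
def default_settings
  {
    "host"  => "localhost".freeze,
    "ports" => [80, 443].freeze,
  }.freeze
end

settings = default_settings

begin
  settings["host"] = "example.org"   # the Hash itself is frozen
rescue RuntimeError => e             # FrozenError (2.5+) is a RuntimeError subclass
  puts "hash: #{e.class}"
end

begin
  settings["ports"] << 8080          # the nested Array is frozen too
rescue RuntimeError => e
  puts "array: #{e.class}"
end
```

Forgetting one of the inner .freeze calls leaves that object mutable, which is the kind of mistake the comment above refers to.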

But I stand by my opinion that a global pragma for frozen literal strings is a bad idea, because it makes all literal strings in the file start raising errors for half their methods, and it makes it impossible to copy/paste from one file to another without code potentially breaking.

#16 Updated by Robert A. Heiler 7 months ago

I am mildly in favour of it so +1

As it is compatible with older ruby I see little harm in it. But please don't forget proper documentation, if this is given the thumbs up by matz!

#17 Updated by Thomas Enebo 7 months ago

naruse (Yui NARUSE) wrote:

enebo (Thomas Enebo) wrote:

I think having a pragma at the top of the file will be much more error prone than the f-syntax. As a file grows, the ability to notice you are in a frozen string file goes down. It would have been great if Ruby had started immutable strings by default but that ship has sailed, I think having some files be immutable will be confusing.

Enhance your IDE.

It is an answer but one I think is not acceptable (obviously that is only my opinion).

Are we sure we cannot find a nicer syntax for frozen strings: %f{hello, I am frozen}?

Almost all of the idea to add new syntax break 2.0 compatibility, and it makes adoption of frozen strings slow.
It is the motivation of this suggestion.

Yeah. ko1 talked to me yesterday about this. I have been trying to think of other ways to not break backwards compatibility and have not thought of anything.

It is possible to put out a patch level (PL) release of 2.0 that adds the syntax, but obviously a decision like that is a big one. Older 2.0 libraries will not be able to read it. OTOH, MRI has been periodically putting out security fixes, so perhaps a syntax error that forces an upgrade is not a bad thing?

#18 Updated by Thomas Enebo 7 months ago

enebo (Thomas Enebo) wrote:

naruse (Yui NARUSE) wrote:

enebo (Thomas Enebo) wrote:

I think having a pragma at the top of the file will be much more error prone than the f-syntax. As a file grows, the ability to notice you are in a frozen string file goes down. It would have been great if Ruby had started immutable strings by default but that ship has sailed, I think having some files be immutable will be confusing.

Enhance your IDE.

It is an answer but one I think is not acceptable (obviously that is only my opinion).

Are we sure we cannot find a nicer syntax for frozen strings: %f{hello, I am frozen}?

Almost all of the idea to add new syntax break 2.0 compatibility, and it makes adoption of frozen strings slow.
It is the motivation of this suggestion.

Yeah. ko1 talked to me yesterday about this. I have been trying to think of other ways to not break backwards compatibility and have not thought of anything.

It is possible to put out a PL to 2.0 to add the syntax but obviously a decision like that is a big one. Older 2.0 libraries will not be able to read it. OTOH, MRI has been periodically putting out security fixes so perhaps a syntax error to force an upgrade is not a bad thing?

Read "Older 2.0 libraries will not be able to read it" as "Older MRI 2.0 implementations will not be able to read libraries which use this new syntax."

#19 Updated by Akira Tanaka 7 months ago

2013/10/2 enebo (Thomas Enebo) tom.enebo@gmail.com:

Issue #8976 has been updated by enebo (Thomas Enebo).

I think having a pragma at the top of the file will be much more error prone than the f-syntax. As a file grows, the ability to notice you are in a frozen string file goes down. It would have been great if Ruby had started immutable strings by default but that ship has sailed, I think having some files be immutable will be confusing.

I think it is not a big problem because
we don't need to be aware of whether the directive exists if we use
"..." for strings that are not modified and "...".dup for strings to be modified.
--
Tanaka Akira

#20 Updated by Yusuke Endoh 7 months ago

"...".dup looks too verbose to me.
How about using "..." for a mutable string and '...' for an immutable?

I'm not so keen on a file-scope directive itself, though...

Yusuke Endoh mame@tsg.ne.jp

#21 Updated by Akira Tanaka 7 months ago

2013/10/8 mame (Yusuke Endoh) mame@tsg.ne.jp:

Issue #8976 has been updated by mame (Yusuke Endoh).

"...".dup looks too verbose to me.
How about using "..." for a mutable string and '...' for an immutable?

I considered it at first.
But we (at the meeting) abandoned it because
we want to use escape sequences, such as \n, in immutable strings.

I described this concern as follows.

| Note that the directive effects all static string literals regardless of
| single quotes, double quotes, %q-string, %qq-string and here documents.
| The reason that the directive is effective not only single quotes is
| we want to use escape sequences such as \n in frozen string literals.
--
Tanaka Akira

#22 Updated by Boris Stitnicky 7 months ago

"..."f might be mildly ugly, but is hard to beat.
5 minutes of my thinking did not yield any better idea.
I share mame's feeling about the file-scope directive.

#23 Updated by Charles Nutter 7 months ago

boris_stitnicky (Boris Stitnicky) wrote:

"..."f might be mildly ugly, but is hard to beat.
5 minutes of my thinking did not yield any better idea.
I share mame's feeling about the file-scope directive.

I still don't like file-scope directive since it changes behavior rather than just content. In other words, a file that gains a frozen string directive suddenly creates strings that are only half functional.

See also #8992 that might address all issues by simply making the compiler and #freeze methods smarter.

  • Compiler would see through "literal".freeze and do what "literal"f does now.
  • String#freeze could be adapted to use the fstring cache internally, so all frozen strings would be interned (in the Java sense).
  • No new backward-incompatible syntax.
  • Easy expansion to other literal syntaxes like arrays and hashes.

I think we need to kill off the "literal"f syntax and a file-scope directive is not the right way to do it.

#24 Updated by Akira Tanaka 7 months ago

2013/10/10 headius (Charles Nutter) headius@headius.com:

Issue #8976 has been updated by headius (Charles Nutter).

See also #8992 that might address all issues by simply making the compiler and #freeze methods smarter.

  • Compiler would see through "literal".freeze and do what "literal"f does now.
  • String#freeze could be adapted to use the fstring cache internally, so all frozen strings would be interned (in the Java sense).
  • No new backward-incompatible syntax.
  • Easy expansion to other literal syntaxes like arrays and hashes.

#8992 doesn't address the following problem.

| * Need to modify for each string literal.
| This is cumbersome.
--
Tanaka Akira

#25 Updated by Benoit Daloze 7 months ago

akr (Akira Tanaka) wrote:

2013/10/10 headius (Charles Nutter) headius@headius.com:

Issue #8976 has been updated by headius (Charles Nutter).

See also #8992 that might address all issues by simply making the compiler and #freeze methods smarter.

  • Compiler would see through "literal".freeze and do what "literal"f does now.
  • String#freeze could be adapted to use the fstring cache internally, so all frozen strings would be interned (in the Java sense).
  • No new backward-incompatible syntax.
  • Easy expansion to other literal syntaxes like arrays and hashes.

#8992 doesn't address the problem follows.

| * Need to modify for each string literal.
| This is cumbersome.
--
Tanaka Akira

I think freezing these literals by adding ".freeze" to each of them is appropriate; these literals should already (in the current version) be frozen to prevent any modification.

Changing semantics on a per-file basis is much too dangerous. And calling #dup just to get a mutable String from a static literal makes little sense; it is just weird.

#26 Updated by Hiroshi SHIBATA 3 months ago

  • Target version changed from 2.1.0 to current: 2.2.0
