Feature #8579

Frozen string syntax

Added by Charlie Somerville 10 months ago. Updated 6 months ago.

[ruby-core:55699]
Status:Closed
Priority:Normal
Assignee:Charlie Somerville
Category:syntax
Target version:2.1.0

Description

I'd like to propose a new type of string literal - %f().

Because Ruby strings are mutable, a new String object must be allocated (duped from the literal) every time a string literal is evaluated.

It's quite common to see code that stores a frozen String object into a constant which is then reused for performance reasons. Example: https://github.com/rack/rack/blob/master/lib/rack/methodoverride.rb
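
The constant-hoisting idiom described above can be sketched as follows. This is a minimal illustration, not Rack's actual code: the `OverrideSketch` module, its constant, and both method names are hypothetical.

```ruby
# Hypothetical sketch of the constant-hoisting idiom; names are illustrative
# and not taken from Rack.
module OverrideSketch
  # Allocated and frozen once, when the module body is evaluated.
  PARAM_KEY = "_method".freeze

  # Reuses the single frozen String on every call.
  def self.key_from_constant
    PARAM_KEY
  end

  # Evaluates a string literal on every call. Without frozen string
  # literals enabled, each evaluation yields a fresh String object.
  def self.key_from_literal
    "_method"
  end
end

a = OverrideSketch.key_from_constant
b = OverrideSketch.key_from_constant
puts a.equal?(b)   # identity check: is it the same object both times?
puts a.frozen?
```

The cost is the one the proposal points out: the string now lives away from the code that uses it.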

A new %f() string literal would instead evaluate to the same frozen String object every time. The benefit of this syntax is that it removes the need to pull string literals away from where they are used.

Here's an example of the proposed %f() syntax in action:

def foo
  ["bar".object_id, %f(bar).object_id]
end

p foo # might print "[123, 456]"

p foo # might print "[789, 456]"

These string literals could also be stored in a global refcounted table for deduplication across the entire program, further reducing memory usage.

If this proposal is accepted, I can handle implementation work.

FrozenStringSyntax.pdf (41 KB) Charlie Somerville, 08/20/2013 08:57 AM

FrozenStringSyntax_2.pdf (33.7 KB) Charlie Somerville, 08/20/2013 11:29 AM


Related issues

Related to ruby-trunk - Feature #8923: Frozen nil/true/false Open 09/19/2013
Related to ruby-trunk - Feature #8906: Freeze Symbols Closed 09/14/2013
Related to ruby-trunk - Feature #8909: Expand "f" frozen suffix to literal arrays and hashes Rejected 09/14/2013
Related to ruby-trunk - Feature #8992: Use String#freeze and compiler tricks to replace "str"f s... Closed 10/08/2013
Related to ruby-trunk - Feature #8976: file-scope freeze_string directive Open 10/02/2013

Associated revisions

Revision 42773
Added by Charlie Somerville 8 months ago

  • NEWS: Add note about frozen string literals

  • compile.c (case_when_optimizable_literal): optimize NODE_LIT strings
    in when clauses of case statements

  • ext/ripper/eventids2.c: add tSTRING_SUFFIX

  • parse.y: add 'f' suffix on string literals for frozen strings

  • test/ripper/test_scanner_events.rb: add scanner tests

  • test/ruby/test_string.rb: add frozen string tests

[Feature #8579]

History

#1 Updated by Magnus Holm 10 months ago

+1.

What about interpolation? %F() would be useful although they can't be
deduplicated.

// Magnus Holm

#2 Updated by Yusuke Endoh 10 months ago

I'm neutral on the proposal itself.
Instead of a new kind of %-notation, it would be better to introduce a modifier like the ones regexp literals take:

%r(foo)o
%q(foo)o

Yusuke Endoh mame@tsg.ne.jp

#3 Updated by Boris Stitnicky 10 months ago

+1 to mame's proposal. Literals already take a long time to learn for us average Joe users; only operator precedence is worse.

#4 Updated by Charlie Somerville 10 months ago

mame, is there a precedent of using modifiers on non-regexp literals?

I'm not against your proposal, but it would be odd for this particular feature to introduce a new syntax.

Also, if we used a modifier, how would that affect other types of percent literals like %w or %i?

#5 Updated by Nobuyoshi Nakada 10 months ago

charliesome (Charlie Somerville) wrote:

Also, if we used a modifier, how would that affect other types of percent literals like %w or %i?

As for %i, it doesn't make sense to freeze symbols, so the modifier would freeze the resulting array if it were introduced.
And not every modifier has to apply to every kind of literal.

#6 Updated by Charlie Somerville 8 months ago

ko1 - please see attached a slide for the upcoming developer meeting in Japan.

#7 Updated by Koichi Sasada 8 months ago

(2013/08/20 8:57), charliesome (Charlie Somerville) wrote:

ko1 - please see attached a slide for the upcoming developer meeting in Japan.

I'm against introducing the new %f() syntax.

I like the suffix that mame proposed.

There are two dimensions:
(1) frozen.
(2) once.

mame-san proposed feature (2). The `once' feature doesn't return a frozen
object, but it returns the same object every time.

Try the once suffix with a regexp.

##
3.times{|i|
  r = /foo/o
  p [r.object_id, r.frozen?]
  r.instance_variable_set(:@foo, i)
  p r.instance_variable_get(:@foo)
}
#=>
[23212812, false]
0
[23212812, false]
1
[23212812, false]
2
##

I believe you (charliesome) don't care about syntax, but you want a
feature to introduce frozen, or same string literal. right?

My idea is to introduce frozen suffix for (1) and once suffix for (2).

frozen_str = "foo"f
once_str = "foo"o
3.times{|i|
  once_dynamic_str = "#{i}"o #=> every time it returns "0".
}
frozen_and_once_str = "foo"of
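
The two dimensions can be approximated in plain Ruby. The sketch below is only an emulation under stated assumptions: neither suffix is used, and `frozen_only` and `once_only` are hypothetical helper names.

```ruby
# Hypothetical emulation of the two dimensions; no "f" or "o" suffix is used.

# (1) frozen: a new object on every call, but each one is immutable.
def frozen_only
  String.new("foo").freeze
end

# (2) once: the first result is cached and returned on every later call;
# the cached object itself is not frozen by this helper.
def once_only(i)
  @once_only ||= "#{i}"
end

p frozen_only.equal?(frozen_only)     # compares two separately built objects
p 3.times.map { |i| once_only(i) }    # the value from the first call is reused
```

Combining both, as in the "of" example, would mean caching the first result and freezing it.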

I think %f() is not good syntax.
... "f" and "o" suffix are also bad? :P

--
// SASADA Koichi at atdot dot net

#8 Updated by Koichi Sasada 8 months ago

(2013/08/20 10:01), SASADA Koichi wrote:

My idea is to introduce frozen suffix for (1) and once suffix for (2).

Because we already introduced "i" (imaginary number literal) and "r"
(rational number) suffixes.

overuse? :)

--
// SASADA Koichi at atdot dot net

#9 Updated by Charlie Somerville 8 months ago

I believe you (charliesome) don't care about syntax, but you want a
feature to introduce frozen, or same string literal. right?

Correct. I have a preference toward %f (for consistency with other string types), but I am happy as long as the same string literal feature is accepted.

My idea is to introduce frozen suffix for (1) and once suffix for (2).

I'm ok with frozen suffix, but I'm not so sure about once suffix. I believe matz is negative towards once:

17:10 matz: I don't recommend /re/o behavior
17:11 matz: because it requires more complex cache mechanism

Anyway, I will revise my slide and post it here later.

#10 Updated by Charlie Somerville 8 months ago

Updated slide with f suffix syntax

#11 Updated by Alexey Muranov 8 months ago

Just to put in my 2 cents: I suggested in a comment on #7791 having an intermediate class between String and Symbol, but maybe frozen strings are better. What would you say about a new symbol-like literal syntax, for example,

|'this is a frozen string'

?
Since frozen strings are likely to be used as hash keys, in simple cases the quotes could be dropped:

|this_is_a_frozen_string

Just a thought.

#12 Updated by Charlie Somerville 8 months ago

I have found a problem with using f-suffix syntax.

What should happen in this case?

"hello "f "world"

Note that this is not a problem when using %f syntax as it is a syntax error to have adjacent percent-strings like this:

%f(hello ) %f(world)
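
The question arises because Ruby already joins adjacent string literals into a single string. A small sketch of that existing behavior, using plain literals only (the constant names are illustrative):

```ruby
# Adjacent string literals are joined into one String by the parser.
JOINED = "hello " "world"
MIXED  = 'single ' "double"

puts JOINED
puts MIXED
```

With a suffix on only one of the two literals, it is unclear whether the joined result should be frozen.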

#13 Updated by Kurt Stephens 8 months ago

How about something more generic? A prefix operator that memoizes and freezes any expression result in a thread-safe manner on first eval:

%f'a frozen string'                 #
%f"a frozen #{interpolated} string" # 
%f{a: 'frozen', hash: 'value'}      # A frozen, inline memoized Hash.
%f[:a, 'frozen', :array, 'value']   # A frozen, inline, memoized Array.
%f(some(:method, 'call'))           # You get the idea.

#14 Updated by Yura Sokolov 8 months ago

On 24.08.2013 23:55, "kstephens (Kurt Stephens)" <redmine@ruby-lang.org> wrote:

Issue #8579 has been updated by kstephens (Kurt Stephens).

How about something more generic? A prefix operator that memoizes and
freezes any expression result in a thread-safe manner on first eval:

%f'a frozen string'                 #
%f"a frozen #{interpolated} string" #
%f{a: 'frozen', hash: 'value'}      # A frozen, inline memoized Hash.
%f[:a, 'frozen', :array, 'value']   # A frozen, inline, memoized Array.
%f(some(:method, 'call'))           # You get the idea.

Looks pretty! +1



#15 Updated by Yukihiro Matsumoto 8 months ago

I accept the suffix idea ("frozen string"f), and string concatenation for frozen strings will be invalid ("foo"f "bar" would be syntax error).

Matz.

#16 Updated by Charlie Somerville 8 months ago

  • Assignee changed from Koichi Sasada to Charlie Somerville

#17 Updated by Charlie Somerville 8 months ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r42773.
Charlie, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.



#18 Updated by Koichi Sasada 8 months ago

(2013/08/31 15:21), matz (Yukihiro Matsumoto) wrote:

I accept the suffix idea ("frozen string"f), and string concatenation for frozen strings will be invalid ("foo"f "bar" would be syntax error).

There are comments on this change. For example, the character 'f' is
confusing because it is reminiscent of "Float".

e.g.: "1.0"f is not 1.0 (Float).

--
// SASADA Koichi at atdot dot net

#19 Updated by Koichi Sasada 6 months ago

(2013/08/31 16:20), charliesome (Charlie Somerville) wrote:

Feature #8579: Frozen string syntax

Just another syntax idea:

FROZEN{ 'foo' }

Advantages:
* Can be implemented at the Ruby level in 2.0 or before
* Can be extended to other literals such as Array
* Similar to BEGIN{ ... } / END{ ... }

Disadvantages:
* Long
* Not in Ruby style?
* A few compatibility issues for programs that use a FROZEN() method

--
// SASADA Koichi at atdot dot net
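
A Ruby-level version of the FROZEN{ ... } idea could look roughly like the sketch below. It is one possible emulation, not the implementation proposed in the thread: the cache, the lock, and the `greeting` method are hypothetical, and keying on the block's source location is an assumption made here for illustration.

```ruby
# Hypothetical Ruby-level emulation of FROZEN{ ... }. The cache is keyed on
# the block's source location, so each call site is evaluated at most once.
FROZEN_CACHE = {}
FROZEN_LOCK = Mutex.new

def FROZEN(&block)
  key = block.source_location
  # The lock keeps two threads from evaluating the same call site at once.
  FROZEN_LOCK.synchronize do
    FROZEN_CACHE.fetch(key) { FROZEN_CACHE[key] = block.call.freeze }
  end
end

def greeting
  FROZEN { 'foo' }
end

p greeting.equal?(greeting)   # identity check across calls
p greeting.frozen?
```

This sketch has limits: two FROZEN blocks on the same line may share a cache key, a FROZEN block nested inside another would try to take the non-reentrant lock twice, and every call still pays for a block object and a hash lookup.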

#20 Updated by Charles Nutter 6 months ago

See also http://bugs.ruby-lang.org/issues/8992 which proposes just making "literal string".freeze do the right thing in the compiler.

FROZEN { } is not terrible syntax, but it's the longest one suggested.
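
The approach in #8992 keeps ordinary syntax: the compiler recognizes `.freeze` called directly on a string literal. A minimal sketch, assuming a CRuby version that performs this optimization (2.1 or later); the method name `frozen_bar` is hypothetical.

```ruby
# Assumes CRuby 2.1+, where "literal".freeze is optimized by the compiler.
def frozen_bar
  "bar".freeze
end

p frozen_bar.frozen?
p frozen_bar.equal?(frozen_bar)   # identity check across calls
```

On such versions both calls return the same frozen object, so no hoisted constant is needed.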
