Bug #14475
String de-duplication is broken in files with frozen_string_literal: true
Description
Create 2 files:
test.rb
$LOAD_PATH.unshift(File.dirname(__FILE__)) require 'frozen' puts frozen.object_id puts frozen.object_id
frozen.rb
# frozen_string_literal: true A = "a" def frozen -"#{A}" end
Run test.rb
% ruby test.rb 70189973179400 70189973179260
Change to # frozen_string_literal: false
And you get
70189973181360 70189973181360
So something is over-optimising here, fix should be backported to 2.5.0 imo.
Related issues
Updated by sam.saffron (Sam Saffron) about 3 years ago
Simpler repro:
x = "#{1}".freeze p x.object_id x = -x p x.object_id x = "#{1}".freeze p x.object_id x = -x p x.object_id
70210401536460 70210401536460 70210401536240 70210401536240
Updated by sam.saffron (Sam Saffron) about 3 years ago
Workaround ... whenever de-duping strings always use -+
x = "#{1}".freeze # this is always properly de-duped x = -+x
Any objections to changing:
static VALUE str_uminus(VALUE str) { if (OBJ_FROZEN(str)) { return str; } else { return rb_fstring(str); } }
To
static VALUE str_uminus(VALUE str) { return rb_fstring(str); }
Updated by normalperson (Eric Wong) about 3 years ago
sam.saffron@gmail.com wrote:
Workaround ... whenever de-duping strings always use
-+
Yes, String#-@ documentation states:
"If the string is frozen, then return the string itself."
So I guess it is spec...
However, it seems compile-time can skip freezestring instructions
for -@ like it does with +@ since r62039
diff --git a/compile.c b/compile.c index 5bca0dca99..7d065b25d3 100644 --- a/compile.c +++ b/compile.c @@ -2851,11 +2851,11 @@ iseq_peephole_optimize(rb_iseq_t *iseq, LINK_ELEMENT *list, const int do_tailcal struct rb_call_info *ci = (struct rb_call_info *)OPERAND_AT(niobj, 0); /* * freezestring debug_info - * send <:+@, 0, ARG_SIMPLE> + * send <:+@, 0, ARG_SIMPLE> # :-@, too * => - * send <:+@, 0, ARG_SIMPLE> + * send <:+@, 0, ARG_SIMPLE> # :-@, too */ - if (ci->mid == idUPlus && + if ((ci->mid == idUPlus || ci->mid == idUMinus) && (ci->flag & VM_CALL_ARGS_SIMPLE) && ci->orig_argc == 0) { ELEM_REMOVE(list); diff --git a/test/ruby/test_optimization.rb b/test/ruby/test_optimization.rb index 6e463e1863..55e7acbbf0 100644 --- a/test/ruby/test_optimization.rb +++ b/test/ruby/test_optimization.rb @@ -567,6 +567,21 @@ def test_peephole_string_literal_range end end + def test_peephole_dstr + code = "#{<<~'begin;'}\n#{<<~'end;'}" + begin; + exp = (-'a').object_id + z = 'a' + exp == (-"#{z}").object_id + end; + [ false, true ].each do |fsl| + iseq = RubyVM::InstructionSequence.compile(code, + frozen_string_literal: fsl) + assert_equal(true, iseq.eval, + "[ruby-core:85542] [Bug #14475] fsl: #{fsl}") + end + end + def test_branch_condition_backquote bug = '[ruby-core:80740] [Bug #13444] redefined backquote should be called' class << self
Updated by sam.saffron (Sam Saffron) about 3 years ago
"If the string is frozen, then return the string itself."
Yeah I do not agree with this documentation, I think it should be changed.
Trouble is that there is no simple way to de-duplicate unconditionally without inefficiency:
Say x
is an arbitrary string (either frozen or unfrozen) :
x = -x # may return a non fstring x = -+x # will return fstring, but makes an unneeded copy x = -x.dup # fstring again, uneeded copy x = x.frozen? ? -+x : -x # too verbose, uneeded copy
Instead why not change it so -
is deduped unconditionally?
Updated by Anonymous about 3 years ago
- Status changed from Open to Closed
Applied in changeset trunk|r62407.
compile.c: drop freezestring insn on String#-@
Followup to r62039 and remove the redundant freezestring
insn which was preventing deduplication from String#-@
- compile.c (iseq_peephole_optimize): drop freezestring insn on String#-@ [ruby-core:85542] [Bug #14475]
Updated by sam.saffron (Sam Saffron) about 3 years ago
Opened https://bugs.ruby-lang.org/issues/14478 to discuss change to uminus
Updated by normalperson (Eric Wong) about 3 years ago
sam.saffron@gmail.com wrote:
"If the string is frozen, then return the string itself."
Yeah I do not agree with this documentation, I think it should be changed.
Trouble is that there is no simple way to de-duplicate
unconditionally without either inefficiency or side effects:
Yeah, I originally proposed a new method back in the day but
compromised with String#-@ https://bugs.ruby-lang.org/issues/13077
Say
x
is an arbitrary string (either frozen or unfrozen) :x = -x # may return a non fstring x = -+x # will return fstring, but makes an unneeded copy x = -x.dup # fstring again, uneeded copy
Maybe these can be optimized in the VM (still ugly, and
redefinition checks aren't free, either).
x = x.frozen? ? -+x : -x # too verbose
Instead why not change it so `-` deduped unconditionally?
At this point, it may break existing code :<
Updated by sam.saffron (Sam Saffron) about 3 years ago
I created a new topic for this at: https://bugs.ruby-lang.org/issues/14478
At this point, it may break existing code :<
I think it is worth the risk cause adding #fstring here seem overkill and almost nobody would be using uminus now anyway cause they do not know about it.
Updated by duerst (Martin Dürst) about 3 years ago
- Related to Feature #14478: String #uminus should de-dupe unconditionally added