Project

General

Profile

Actions

Bug #14475

closed

String de-duplication is broken in files with frozen_string_literal: true

Added by sam.saffron (Sam Saffron) over 4 years ago. Updated over 4 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
ruby -v:
2.5.0, 2.6.0
[ruby-core:85542]

Description

Create 2 files:

test.rb

$LOAD_PATH.unshift(File.dirname(__FILE__))
require 'frozen'

puts frozen.object_id
puts frozen.object_id

frozen.rb

# frozen_string_literal: true

A = "a"
def frozen
  -"#{A}"
end

Run test.rb

% ruby test.rb
70189973179400
70189973179260

Change to # frozen_string_literal: false

And you get

70189973181360
70189973181360

So something is over-optimising here, fix should be backported to 2.5.0 imo.


Related issues 1 (0 open1 closed)

Related to Ruby master - Feature #14478: String #uminus should de-dupe unconditionallyClosedActions

Updated by sam.saffron (Sam Saffron) over 4 years ago

Simpler repro:

x = "#{1}".freeze
p x.object_id
x = -x
p x.object_id

x = "#{1}".freeze
p x.object_id
x = -x
p x.object_id

70210401536460
70210401536460
70210401536240
70210401536240

Updated by sam.saffron (Sam Saffron) over 4 years ago

Workaround ... whenever de-duping strings always use -+

x = "#{1}".freeze
# this is always properly de-duped
x = -+x

Any objections to changing:

static VALUE
str_uminus(VALUE str)
{
    if (OBJ_FROZEN(str)) {
	return str;
    }
    else {
	return rb_fstring(str);
    }
}

To

static VALUE
str_uminus(VALUE str)
{
  return rb_fstring(str);  
}

Updated by normalperson (Eric Wong) over 4 years ago

wrote:

Workaround ... whenever de-duping strings always use -+

Yes, String#-@ documentation states:

"If the string is frozen, then return the string itself."

So I guess it is spec...

However, it seems compile-time can skip freezestring instructions
for -@ like it does with +@ since r62039

diff --git a/compile.c b/compile.c
index 5bca0dca99..7d065b25d3 100644
--- a/compile.c
+++ b/compile.c
@@ -2851,11 +2851,11 @@ iseq_peephole_optimize(rb_iseq_t *iseq, LINK_ELEMENT *list, const int do_tailcal
	struct rb_call_info *ci = (struct rb_call_info *)OPERAND_AT(niobj, 0);
	/*
	 *  freezestring debug_info
-	 *  send <:+@, 0, ARG_SIMPLE>
+	 *  send <:+@, 0, ARG_SIMPLE>  # :-@, too
	 * =>
-	 *  send <:+@, 0, ARG_SIMPLE>
+	 *  send <:+@, 0, ARG_SIMPLE>  # :-@, too
	 */
-	if (ci->mid == idUPlus &&
+	if ((ci->mid == idUPlus || ci->mid == idUMinus) &&
	    (ci->flag & VM_CALL_ARGS_SIMPLE) &&
	    ci->orig_argc == 0) {
	    ELEM_REMOVE(list);
diff --git a/test/ruby/test_optimization.rb b/test/ruby/test_optimization.rb
index 6e463e1863..55e7acbbf0 100644
--- a/test/ruby/test_optimization.rb
+++ b/test/ruby/test_optimization.rb
@@ -567,6 +567,21 @@ def test_peephole_string_literal_range
end
end

+  def test_peephole_dstr
+    code = "#{<<~'begin;'}\n#{<<~'end;'}"
+    begin;
+      exp = (-'a').object_id
+      z = 'a'
+      exp == (-"#{z}").object_id
+    end;
+    [ false, true ].each do |fsl|
+      iseq = RubyVM::InstructionSequence.compile(code,
+                                                 frozen_string_literal: fsl)
+      assert_equal(true, iseq.eval,
+                  "[ruby-core:85542] [Bug #14475] fsl: #{fsl}")
+    end
+  end
+
def test_branch_condition_backquote
bug = '[ruby-core:80740] [Bug #13444] redefined backquote should be called'
class << self

Updated by sam.saffron (Sam Saffron) over 4 years ago

"If the string is frozen, then return the string itself."

Yeah I do not agree with this documentation, I think it should be changed.

Trouble is that there is no simple way to de-duplicate unconditionally without inefficiency:

Say x is an arbitrary string (either frozen or unfrozen) :

x = -x # may return a non fstring

x = -+x # will return fstring, but makes an unneeded copy

x = -x.dup # fstring again, uneeded copy 

x = x.frozen? ? -+x : -x # too verbose, uneeded copy

Instead why not change it so - is deduped unconditionally?

Actions #5

Updated by Anonymous over 4 years ago

  • Status changed from Open to Closed

Applied in changeset trunk|r62407.


compile.c: drop freezestring insn on String#-@

Followup to r62039 and remove the redundant freezestring
insn which was preventing deduplication from String#-@

Updated by normalperson (Eric Wong) over 4 years ago

wrote:

"If the string is frozen, then return the string itself."

Yeah I do not agree with this documentation, I think it should be changed.

Trouble is that there is no simple way to de-duplicate
unconditionally without either inefficiency or side effects:

Yeah, I originally proposed a new method back in the day but
compromised with String#-@ https://bugs.ruby-lang.org/issues/13077

Say x is an arbitrary string (either frozen or unfrozen) :

x = -x # may return a non fstring

x = -+x # will return fstring, but makes an unneeded copy

x = -x.dup # fstring again, uneeded copy 

Maybe these can be optimized in the VM (still ugly, and
redefinition checks aren't free, either).

x = x.frozen? ? -+x : -x # too verbose


Instead why not change it so `-` deduped unconditionally? 

At this point, it may break existing code :<

Updated by sam.saffron (Sam Saffron) over 4 years ago

I created a new topic for this at: https://bugs.ruby-lang.org/issues/14478

At this point, it may break existing code :<

I think it is worth the risk cause adding #fstring here seem overkill and almost nobody would be using uminus now anyway cause they do not know about it.

Actions #9

Updated by duerst (Martin Dürst) over 4 years ago

  • Related to Feature #14478: String #uminus should de-dupe unconditionally added
Actions

Also available in: Atom PDF