Project

General

Profile

Bug #14475

String de-duplication is broken in files with frozen_string_literal: true

Added by sam.saffron (Sam Saffron) 3 months ago. Updated 3 months ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
ruby -v:
2.5.0, 2.6.0
[ruby-core:85542]

Description

Create 2 files:

test.rb

$LOAD_PATH.unshift(File.dirname(__FILE__))
require 'frozen'

puts frozen.object_id
puts frozen.object_id

frozen.rb

# frozen_string_literal: true

A = "a"
def frozen
  -"#{A}"
end

Run test.rb

% ruby test.rb
70189973179400
70189973179260

Change to # frozen_string_literal: false

And you get

70189973181360
70189973181360

So something is over-optimising here, fix should be backported to 2.5.0 imo.


Related issues

Related to Ruby trunk - Feature #14478: String #uminus should de-dupe unconditionallyOpen

Associated revisions

Revision 7606806c
Added by normal 3 months ago

compile.c: drop freezestring insn on String#-@

Followup to r62039 and remove the redundant freezestring
insn which was preventing deduplication from String#-@

  • compile.c (iseq_peephole_optimize): drop freezestring insn on String#-@ [Bug #14475]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62407 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 62407
Added by normalperson (Eric Wong) 3 months ago

compile.c: drop freezestring insn on String#-@

Followup to r62039 and remove the redundant freezestring
insn which was preventing deduplication from String#-@

  • compile.c (iseq_peephole_optimize): drop freezestring insn on String#-@ [Bug #14475]

Revision 86de3e41
Added by nobu (Nobuyoshi Nakada) 3 months ago

compile.c: keep debug info

  • compile.c (iseq_peephole_optimize): keep freezestring insn with debug info. [Bug #14475]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62418 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 62418
Added by nobu (Nobuyoshi Nakada) 3 months ago

compile.c: keep debug info

  • compile.c (iseq_peephole_optimize): keep freezestring insn with debug info. [Bug #14475]

History

#1 [ruby-core:85556] Updated by sam.saffron (Sam Saffron) 3 months ago

Simpler repro:

x = "#{1}".freeze
p x.object_id
x = -x
p x.object_id

x = "#{1}".freeze
p x.object_id
x = -x
p x.object_id

70210401536460
70210401536460
70210401536240
70210401536240

#2 [ruby-core:85557] Updated by sam.saffron (Sam Saffron) 3 months ago

Workaround ... whenever de-duping strings always use -+

x = "#{1}".freeze
# this is always properly de-duped
x = -+x

Any objections to changing:

static VALUE
str_uminus(VALUE str)
{
    if (OBJ_FROZEN(str)) {
    return str;
    }
    else {
    return rb_fstring(str);
    }
}

To

static VALUE
str_uminus(VALUE str)
{
  return rb_fstring(str);  
}

#3 [ruby-core:85558] Updated by normalperson (Eric Wong) 3 months ago

sam.saffron@gmail.com wrote:

Workaround ... whenever de-duping strings always use -+

Yes, String#-@ documentation states:

"If the string is frozen, then return the string itself."

So I guess it is spec...

However, it seems compile-time can skip freezestring instructions
for -@ like it does with +@ since r62039

 diff --git a/compile.c b/compile.c
 index 5bca0dca99..7d065b25d3 100644
 --- a/compile.c
 +++ b/compile.c
 @@ -2851,11 +2851,11 @@ iseq_peephole_optimize(rb_iseq_t *iseq, LINK_ELEMENT *list, const int do_tailcal
    struct rb_call_info *ci = (struct rb_call_info *)OPERAND_AT(niobj, 0);
    /*
     *  freezestring debug_info
 -   *  send <:+@, 0, ARG_SIMPLE>
 +   *  send <:+@, 0, ARG_SIMPLE>  # :-@, too
     * =>
 -   *  send <:+@, 0, ARG_SIMPLE>
 +   *  send <:+@, 0, ARG_SIMPLE>  # :-@, too
     */
 -  if (ci->mid == idUPlus &&
 +  if ((ci->mid == idUPlus || ci->mid == idUMinus) &&
        (ci->flag & VM_CALL_ARGS_SIMPLE) &&
        ci->orig_argc == 0) {
        ELEM_REMOVE(list);
 diff --git a/test/ruby/test_optimization.rb b/test/ruby/test_optimization.rb
 index 6e463e1863..55e7acbbf0 100644
 --- a/test/ruby/test_optimization.rb
 +++ b/test/ruby/test_optimization.rb
 @@ -567,6 +567,21 @@ def test_peephole_string_literal_range
 end
 end

 +  def test_peephole_dstr
 +    code = "#{<<~'begin;'}\n#{<<~'end;'}"
 +    begin;
 +      exp = (-'a').object_id
 +      z = 'a'
 +      exp == (-"#{z}").object_id
 +    end;
 +    [ false, true ].each do |fsl|
 +      iseq = RubyVM::InstructionSequence.compile(code,
 +                                                 frozen_string_literal: fsl)
 +      assert_equal(true, iseq.eval,
 +                  " [Bug #14475] fsl: #{fsl}")
 +    end
 +  end
 +
 def test_branch_condition_backquote
 bug = ' [Bug #13444] redefined backquote should be called'
 class << self

#4 [ruby-core:85560] Updated by sam.saffron (Sam Saffron) 3 months ago

"If the string is frozen, then return the string itself."

Yeah I do not agree with this documentation, I think it should be changed.

Trouble is that there is no simple way to de-duplicate unconditionally without inefficiency:

Say x is an arbitrary string (either frozen or unfrozen) :

x = -x # may return a non fstring

x = -+x # will return fstring, but makes an unneeded copy

x = -x.dup # fstring again, uneeded copy 

x = x.frozen? ? -+x : -x # too verbose, uneeded copy

Instead why not change it so - is deduped unconditionally?

#5 Updated by Anonymous 3 months ago

  • Status changed from Open to Closed

Applied in changeset trunk|r62407.


compile.c: drop freezestring insn on String#-@

Followup to r62039 and remove the redundant freezestring
insn which was preventing deduplication from String#-@

  • compile.c (iseq_peephole_optimize): drop freezestring insn on String#-@ [Bug #14475]

#7 [ruby-core:85564] Updated by normalperson (Eric Wong) 3 months ago

sam.saffron@gmail.com wrote:

"If the string is frozen, then return the string itself."

Yeah I do not agree with this documentation, I think it should be changed.

Trouble is that there is no simple way to de-duplicate
unconditionally without either inefficiency or side effects:

Yeah, I originally proposed a new method back in the day but
compromised with String#-@ https://bugs.ruby-lang.org/issues/13077

Say x is an arbitrary string (either frozen or unfrozen) :

x = -x # may return a non fstring

x = -+x # will return fstring, but makes an unneeded copy

x = -x.dup # fstring again, uneeded copy 

Maybe these can be optimized in the VM (still ugly, and
redefinition checks aren't free, either).

x = x.frozen? ? -+x : -x # too verbose

Instead why not change it so `-` deduped unconditionally? 

At this point, it may break existing code :<

#8 [ruby-core:85567] Updated by sam.saffron (Sam Saffron) 3 months ago

I created a new topic for this at: https://bugs.ruby-lang.org/issues/14478

At this point, it may break existing code :<

I think it is worth the risk cause adding #fstring here seem overkill and almost nobody would be using uminus now anyway cause they do not know about it.

#9 Updated by duerst (Martin Dürst) 3 months ago

  • Related to Feature #14478: String #uminus should de-dupe unconditionally added

Also available in: Atom PDF