misc #9188

r43870 make benchmark/bm_so_k_nucleotide.rb slow

Added by Narihiro Nakamura 5 months ago. Updated about 1 month ago.

[ruby-core:58730]
Status:Closed
Priority:Normal
Assignee:Aman Gupta
Category:core
Target version:2.1.0

Description

Hi.

I think r43870 make benchmark/bmsok_nucleotide.rb slow.

r43870
% time ./miniruby ./benchmark/bmsoknucleotide.rb
./miniruby ./benchmark/bm
soknucleotide.rb 1.70s user 0.01s system 99% cpu 1.718 total

r43869
% time ./miniruby ./benchmark/bmsoknucleotide.rb
./miniruby ./benchmark/bm
soknucleotide.rb 1.52s user 0.03s system 99% cpu 1.559 total


Related issues

Related to ruby-trunk - Feature #8998: string keys for hash literals should use fstrings Closed 10/08/2013
Related to ruby-trunk - Bug #9382: [patch] add opt_aref_str and opt_aset_str Closed 01/08/2014

Associated revisions

Revision 44058
Added by Aman Gupta 4 months ago

hash.c: revert r43870 and add alternative parser patch for literal keys

  • hash.c (hashasetstr): revert r43870 due to performance issue [Bug #9188]
  • parse.y (assoc): convert literal string hash keys to fstrings
  • test/ruby/test_hash.rb (class TestHash): expand test

History

#1 Updated by Aman Gupta 5 months ago

=(

Maybe we should revert it?

#2 Updated by Eric Wong 5 months ago

Unfortunately, r43870 is a trade-off and optimized for frequently-reused
keys.

How about only using rb_fstring for string literal hash keys?

#3 Updated by Aman Gupta 5 months ago

How about only using rb_fstring for string literal hash keys?

I think this makes sense. We won't see a 10% savings like we did in #8998, but it's a fair compromise.

Do you have any thoughts on an implementation? I'll revert the existing commit in the meantime.

#4 Updated by Eric Wong 5 months ago

"tmm1 (Aman Gupta)" ruby@tmm1.net wrote:

Do you have any thoughts on an implementation? I'll revert the
existing commit in the meantime.

Not really, but I think it needs to be done in the parser which I'm not
familiar with. Hopefully you or somebody else has time/interest,
I've got more critical problems/projects to attend to than performance.

#5 Updated by Aman Gupta 4 months ago

There were some minor improvements to rb_fstring() recently, but the hash changes are still slow on trunk.
In my environment:

trunk

./miniruby -I. benchmark/bmsoknucleotide.rb > /dev/null 3.72s user 0.03s system 99% cpu 3.757 total
./miniruby -I. benchmark/bm
soknucleotide.rb > /dev/null 3.76s user 0.03s system 99% cpu 3.794 total
./miniruby -I. benchmark/bmsoknucleotide.rb > /dev/null 3.70s user 0.03s system 99% cpu 3.736 total
./miniruby -I. benchmark/bm
soknucleotide.rb > /dev/null 3.78s user 0.03s system 99% cpu 3.817 total

revert r43870

./miniruby -I. benchmark/bmsoknucleotide.rb > /dev/null 3.50s user 0.02s system 99% cpu 3.528 total
./miniruby -I. benchmark/bm
soknucleotide.rb > /dev/null 3.49s user 0.02s system 99% cpu 3.515 total
./miniruby -I. benchmark/bmsoknucleotide.rb > /dev/null 3.54s user 0.02s system 99% cpu 3.570 total
./miniruby -I. benchmark/bm
soknucleotide.rb > /dev/null 3.52s user 0.02s system 99% cpu 3.546 total

I'm trying to come up with an alternative patch to keep the string literal hash key test passing, but I'm having a hard time with the parser.
The best I can come up with is the following, which works but feels wrong:

@@ -535,7 +535,7 @@ newhash
for (i = num; i > 0; i -= 2) {
const VALUE v = TOPN(i - 2);
const VALUE k = TOPN(i - 1);
- rbhashaset(val, k, v);
+ rbhashaset(val, RBTYPEP(k, TSTRING) ? rbfstring(k) : k, v);
}
POPN(num);
}

Ideally, I'd like to replace alternate putstring instructions involved in any newhash instruction with putobject instructions.
Could someone point out the best place in the compiler to do this?

#6 Updated by Aman Gupta 4 months ago

After a few more false starts in the compiler, I ended up with the following patch to the parser.

diff --git a/parse.y b/parse.y
index 8207ad7..4629a60 100644
--- a/parse.y
+++ b/parse.y
@@ -4912,7 +4912,11 @@ assocs : assoc
assoc : argvalue tASSOC argvalue
{
/%%%/
- $$ = listappend(NEWLIST($1), $3);
+ if (ndtype($1) == NODESTR) {
+ $$ = listappend(NEWLIST(NEWLIT(rbfstring($1->ndlit))), $3);
+ } else {
+ $$ = list
append(NEWLIST($1), $3);
+ }
/*%
$$ = dispatch2(assoc
new, $1, $3);
%*/
diff --git a/test/ruby/testhash.rb b/test/ruby/testhash.rb
index 6aaba46..42429b4 100644
--- a/test/ruby/testhash.rb
+++ b/test/ruby/test
hash.rb
@@ -209,10 +209,11 @@ class TestHash < Test::Unit::TestCase
assert_equal(256, h[z])
end

  • def testASETstring
  • def testfstringliteralkey a = {"ABC" => :t} b = {"ABC" => :t} assertsame a.keys[0], b.keys[0]
  • assert_same "ABC".freeze, a.keys[0]
    end

    def test_EQUAL # '=='

#7 Updated by Aman Gupta 4 months ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r44058.
Narihiro, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


hash.c: revert r43870 and add alternative parser patch for literal keys

  • hash.c (hashasetstr): revert r43870 due to performance issue [Bug #9188]
  • parse.y (assoc): convert literal string hash keys to fstrings
  • test/ruby/test_hash.rb (class TestHash): expand test

#8 Updated by Aman Gupta 4 months ago

@charliesome helped me with my compiler patch. It adds a new optasetstr instruction that handles the string literal hash_aset case (hsh["abc"]=).

Unfortunately, it did not make a big difference in long-lived strings generated in our app. The patch follows for posterity.

diff --git a/compile.c b/compile.c
index 812f692..1cc038c 100644
--- a/compile.c
+++ b/compile.c
@@ -5290,6 +5290,24 @@ iseqcompileeach(rbiseqt *iseq, LINKANCHOR *ret, NODE * node, int poped)
break;
}
case NODE
ATTRASGN:{
+ if (node->ndmid == idASET &&
+ node->nd
recv != (NODE *)1 &&
+ node->ndargs &&
+ nd
type(node->ndargs) == NODEARRAY &&
+ node->ndargs->ndalen == 2 &&
+ ndtype(node->ndargs->ndhead) == NODESTR)
+ {
+ COMPILE(ret, "recv", node->ndrecv);
+ COMPILE(ret, "value", node->nd
args->ndnext->ndhead);
+ ADDINSN2(ret, line, optasetstr,
+ new
callinfo(iseq, idASET, 2, 0, 0),
+ rbfstring(node->ndargs->ndhead->ndlit));
+ if (poped) {
+ ADDINSN(ret, line, pop);
+ }
+ break;
+ }
+
DECL
ANCHOR(recv);
DECLANCHOR(args);
VALUE flag = 0;
diff --git a/hash.c b/hash.c
index 1321b83..8bbd569 100644
--- a/hash.c
+++ b/hash.c
@@ -2368,7 +2368,7 @@ static VALUE rb
hashcomparebyidp(VALUE hash);
* h1["a"] #=> 100
* h1.comparebyidentity
* h1.comparebyidentity? #=> true
- * h1["a"] #=> nil # different objects.
+ * h1["a".dup] #=> nil # different objects.
* h1[:c] #=> "c" # same symbols are all same.
*
*/
diff --git a/insns.def b/insns.def
index 63a36b3..dcecfdc 100644
--- a/insns.def
+++ b/insns.def
@@ -1903,6 +1903,27 @@ opt_aset

/**
@c optimize
+ @e recv[str] = set
+ @j 最適化された recv[str] = set。
+ /
+DEFINEINSN
+opt
asetstr
+(CALL
INFO ci, VALUE key)
+(VALUE recv, VALUE val)
+(VALUE val)
+{
+ if (!SPECIALCONSTP(recv) && RBASICCLASS(recv) == rbcHash && BASICOPUNREDEFINEDP(BOPASET, HASHREDEFINEDOPFLAG)) {
+ rb
hashaset(recv, key, val);
+ } else {
+ PUSH(recv);
+ PUSH(rb
strresurrect(key));
+ PUSH(val);
+ CALL
SIMPLE_METHOD(recv);
+ }
+}
+
+/
*
+ @c optimize
@e optimized length
@j 最適化された recv.length()。
*/
diff --git a/test/ruby/testhash.rb b/test/ruby/testhash.rb
index 4244b86..ef606bf 100644
--- a/test/ruby/testhash.rb
+++ b/test/ruby/test
hash.rb
@@ -209,6 +209,13 @@ class TestHash < Test::Unit::TestCase
assert_equal(256, h[z])
end

  • def testASETfstring_key
  • a, b = {}, {}
  • a["abc"] = 1
  • b["abc"] = 1
  • assert_same a.keys[0], b.keys[0]
  • end + def testNEWHASHfstringkey a = {"ABC" => :t} b = {"ABC" => :t} @@ -897,7 +904,7 @@ class TestHash < Test::Unit::TestCase h = @cls[] h.compareby_identity h["a"] = 1
  • h["a"] = 2
  • h["a".dup] = 2 assert_equal(["a",1], h.assoc("a")) end

#9 Updated by Eric Wong 4 months ago

Btw, I took some time to work on this further. Only very lightly
tested (make check passes)

# GC::Profiler.report doesn't even show anything with this patch applied
# because GC never happens.
h = {}
GC::Profiler.enable
10000000.times do
h["HI"] = 0
h["HI"]
end
GC::Profiler.report

---------------------------8<-------------------------------
Subject: [PATCH] prefreeze literal strings for hash aset/aref

Based on a patch by Charlie Somerville and Aman Gupta:

http://mid.gmane.org/redmine.journal-43505.20131208105635@ruby-lang.org


The following changes since commit 12b09864056bfb961f06b0ef675b9fc2fabb9238:

 * properties. (2014-01-03 01:51:05 +0000)

are available in the git repository at:

 git://80x24.org/ruby.git opt_aref_aset_str

for you to fetch changes up to 2906ef4bf2aaa0873f198cc1a949c1cc7740be7f:

 prefreeze literal strings for hash aset/aref (2014-01-03 03:44:51 +0000)

Eric Wong (1):
prefreeze literal strings for hash aset/aref

compile.c              | 36 ++++++++++++++++++++++++++++++++++++
hash.c                 |  2 +-
insns.def              | 41 +++++++++++++++++++++++++++++++++++++++++
test/ruby/test_hash.rb |  9 ++++++++-
4 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/compile.c b/compile.c
index 5b28401..f12f40d 100644
--- a/compile.c
+++ b/compile.c
@@ -4330,6 +4330,23 @@ iseqcompileeach(rbiseqt iseq, LINKANCHOR *ret, NODE * node, int poped)
}
break;
}
+ if (node->nd
mid == idAREF &&
+ node->ndrecv != (NODE *)1 &&
+ node->nd
args &&
+ ndtype(node->ndargs) == NODEARRAY &&
+ node->nd
args->ndalen == 1 &&
+ nd
type(node->ndargs->ndhead) == NODESTR)
+ {
+ VALUE str = rb
fstring(node->ndargs->ndhead->ndlit);
+ node->nd
args->ndhead->ndlit = str;
+ COMPILE(ret, "recv", node->ndrecv);
+ ADD
INSN2(ret, line, optarefstr,
+ newcallinfo(iseq, idAREF, 1, 0, 0), str);
+ if (poped) {
+ ADD
INSN(ret, line, pop);
+ }
+ break;
+ }
case NODEFCALL:
case NODE
VCALL:{ /
VCALL: variable or call /
/

@@ -5300,6 +5317,25 @@ iseqcompileeach(rbiseqt *iseq, LINK_ANCHOR *ret, NODE * node, int poped)
VALUE flag = 0;
VALUE argc;

  • if (node->nd_mid == idASET &&
  • node->nd_recv != (NODE *)1 &&
  • node->nd_args &&
  • ndtype(node->ndargs) == NODE_ARRAY &&
  • node->ndargs->ndalen == 2 &&
  • ndtype(node->ndargs->ndhead) == NODESTR)
  • {
  • VALUE str = rbfstring(node->ndargs->ndhead->ndlit);
  • node->ndargs->ndhead->nd_lit = str;
  • COMPILE(ret, "recv", node->nd_recv);
  • COMPILE(ret, "value", node->ndargs->ndnext->nd_head);
  • ADDINSN2(ret, line, optaset_str,
  • new_callinfo(iseq, idASET, 2, 0, 0), str);
  • if (poped) {
  • ADD_INSN(ret, line, pop);
  • }
  • break;
  • } + INITANCHOR(recv); INITANCHOR(args); argc = setupargs(iseq, args, node->ndargs, &flag); diff --git a/hash.c b/hash.c index 0eca4b9..3cf7d8d 100644 --- a/hash.c +++ b/hash.c @@ -2390,7 +2390,7 @@ static VALUE rbhashcomparebyid_p(VALUE hash);
    • h1["a"] #=> 100
    • h1.comparebyidentity
    • h1.comparebyidentity? #=> true
  • * h1["a"] #=> nil # different objects.
    • h1["a".dup] #=> nil # different objects.
    • h1[:c] #=> "c" # same symbols are all same. * */ diff --git a/insns.def b/insns.def index ad4bba6..616838d 100644 --- a/insns.def +++ b/insns.def @@ -1903,6 +1903,47 @@ opt_aset

    /**
    @c optimize

  • @e recv[str] = set

  • @j 最適化された recv[str] = set。

  • */
    +DEFINEINSN
    +opt
    asetstr
    +(CALL
    INFO ci, VALUE key)
    +(VALUE recv, VALUE val)
    +(VALUE val)
    +{

  • if (!SPECIALCONSTP(recv) && RBASICCLASS(recv) == rbcHash && BASICOPUNREDEFINEDP(BOPASET, HASHREDEFINEDOP_FLAG)) {

  • rbhashaset(recv, key, val);

  • } else {

  • PUSH(recv);

  • PUSH(rbstrresurrect(key));

  • PUSH(val);

  • CALLSIMPLEMETHOD(recv);

  • }
    +}
    +
    +/**

  • @c optimize

  • @e recv[str]

  • @j 最適化された recv[str]。

  • */
    +DEFINEINSN
    +opt
    arefstr
    +(CALL
    INFO ci, VALUE key)
    +(VALUE recv)
    +(VALUE val)
    +{

  • if (!SPECIALCONSTP(recv) && RBASICCLASS(recv) == rbcHash && BASICOPUNREDEFINEDP(BOPAREF, HASHREDEFINEDOP_FLAG)) {

  • val = rbhasharef(recv, key);

  • } else {

  • PUSH(recv);

  • PUSH(rbstrresurrect(key));

  • CALLSIMPLEMETHOD(recv);

  • }
    +}
    +
    +/**

  • @c optimize
    @e optimized length
    @j 最適化された recv.length()。
    */
    diff --git a/test/ruby/testhash.rb b/test/ruby/testhash.rb
    index 70c0442..f5af4dd 100644
    --- a/test/ruby/testhash.rb
    +++ b/test/ruby/test
    hash.rb
    @@ -209,6 +209,13 @@ class TestHash < Test::Unit::TestCase
    assert_equal(256, h[z])
    end

  • def testASETfstring_key

  • a, b = {}, {}

  • a["abc"] = 1

  • b["abc"] = 1

  • assert_same a.keys[0], b.keys[0]

  • end
    +
    def testNEWHASHfstringkey
    a = {"ABC" => :t}
    b = {"ABC" => :t}
    @@ -946,7 +953,7 @@ class TestHash < Test::Unit::TestCase
    h = @cls[]
    h.compare
    by_identity
    h["a"] = 1

  • h["a"] = 2

  • h["a".dup] = 2
    assert_equal(["a",1], h.assoc("a"))
    end

#10 Updated by Koichi Sasada 3 months ago

(2014/01/03 12:49), Eric Wong wrote:

Btw, I took some time to work on this further. Only very lightly
tested (make check passes)

What the difference between charliesome/tmm1's patch?
(just curious, I want to know)

--
// SASADA Koichi at atdot dot net

#11 Updated by Eric Wong 3 months ago

SASADA Koichi ko1@atdot.net wrote:

(2014/01/03 12:49), Eric Wong wrote:

Btw, I took some time to work on this further. Only very lightly
tested (make check passes)

What the difference between charliesome/tmm1's patch?
(just curious, I want to know)

Mine replaces ndlit in node directly (seems OK, since other
rb
fstring uses do that). I also implement aref, not just aset.

This is my first time working in compile.c and I'm not familiar with
this code at all, but it still seems to be still working :)
Other than my trivial loop, I haven't tested performance.

#12 Updated by Koichi Sasada 3 months ago

(2014/01/06 15:49), Eric Wong wrote:

Mine replaces ndlit in node directly (seems OK, since other
rb
fstring uses do that). I also implement aref, not just aset.

This is my first time working in compile.c and I'm not familiar with
this code at all, but it still seems to be still working :)
Other than my trivial loop, I haven't tested performance.

I got it.

My concern is performance regression with huge entries of fstring table
with this technique. Maybe we can avoid such regression with smart data
structure (for example, do not use st).

--
// SASADA Koichi at atdot dot net

#13 Updated by Eric Wong 3 months ago

SASADA Koichi ko1@atdot.net wrote:

I got it.

My concern is performance regression with huge entries of fstring table
with this technique. Maybe we can avoid such regression with smart data
structure (for example, do not use st).

Yes, sttableentry is gigantic, I think we should go back to unordered
st for things where order does not matter. But we can probably do
better for the fstring table, even.

#14 Updated by Aman Gupta 3 months ago

My concern is performance regression with huge entries of fstring table
with this technique. Maybe we can avoid such regression with smart data
structure (for example, do not use st).

The new optasetstr and optarefstr instructions only affect string literals, and all strings literals are already in the fstring table in 2.1.
I don't think there is any possible performance regression with this technique.

#15 Updated by Eric Wong about 1 month ago

Eric Wong normalperson@yhbt.net wrote:

The design is based on st, but uses linked-list of cache-sized arrays
for chaining, so it's as if each bucket is an st-packed array.

Fwiw, I don't think this is a good design for our internal data
structures. I'll experiment with others in a few weeks/months
where the lookup node is embedded in the struct itself. It would
use offsetof (via container_of macro) to get back the original
object, meaning we avoid extra pointer chasing. This won't
be good for things which use VALUEs as keys, but probably good
for method tables.

I'm not sure if we can/should change hash.c and its st.c usage, yet,
due to public API compatibility.

Also available in: Atom PDF