Backport #8142

[patch] iseq: reduce array allocations for simple sequences

Added by Aman Gupta about 1 year ago. Updated 10 months ago.

[ruby-core:53619]
Status:Closed
Priority:Normal
Assignee:Tomoyuki Chikanaga

Description

Allocate iseq->mark_ary on demand, only if needed.

In my application, this reduces long lived arrays on the heap significantly.

:TARRAY=>88166 # before
:T
ARRAY=>62932 # after

diff --git a/compile.c b/compile.c
index 9360f5b..aafae05 100644
--- a/compile.c
+++ b/compile.c
@@ -416,7 +416,7 @@ static int
iseqaddmarkobject(rbiseqt *iseq, VALUE v)
{
if (!SPECIAL
CONSTP(v)) {
- rb
arypush(iseq->markary, v);
+ rbiseqaddmarkobject(iseq, v);
}
return COMPILEOK;
}
diff --git a/insns.def b/insns.def
index 979aa1c..bb9fc3f 100644
--- a/insns.def
+++ b/insns.def
@@ -1237,7 +1237,7 @@ setinlinecache
(VALUE val)
{
if (ic->ic
value.value == Qundef) {
- rbarypush(GETISEQ()->markary, val);
+ rbiseqaddmarkobject(GETISEQ(), val);
}
ic->ic
value.value = val;
ic->icvmstat = GETVMSTATEVERSION() - rubyvmconstmissingcount;
diff --git a/iseq.c b/iseq.c
index 288d3bf..eab237a 100644
--- a/iseq.c
+++ b/iseq.c
@@ -237,6 +237,17 @@ setrelation(rbiseq_t *iseq, const VALUE parent)
}
}

+void
+rbiseqaddmarkobject(rbiseqt *iseq, VALUE obj)
+{
+ if (!RTEST(iseq->markary)) {
+ iseq->mark
ary = rbarytmpnew(3);
+ OBJ
UNTRUST(iseq->markary);
+ RBASIC(iseq->mark
ary)->klass = 0;
+ }
+ rbarypush(iseq->markary, obj);
+}
+
static VALUE
prepare
iseqbuild(rbiseqt *iseq,
VALUE name, VALUE path, VALUE absolute
path, VALUE firstlineno,
@@ -259,9 +270,7 @@ prepare
iseqbuild(rbiseq_t *iseq,
}

 iseq->defined_method_id = 0;
  • iseq->markary = rbarytmpnew(3);
  • OBJUNTRUST(iseq->markary);
  • RBASIC(iseq->mark_ary)->klass = 0;
  • iseq->mark_ary = 0;

    /*
    @@ -2060,8 +2069,7 @@ rbiseqbuildforruby2cext(
    iseq->location.label = rbstrnew2(name);
    iseq->location.path = rbstrnew2(path);
    iseq->location.firstlineno = firstlineno;

  • iseq->markary = rbarytmpnew(3);

  • OBJUNTRUST(iseq->markary);

  • iseq->mark_ary = 0;
    iseq->self = iseqval;

    iseq->iseq = ALLOCN(VALUE, iseq->iseqsize);
    diff --git a/iseq.h b/iseq.h
    index 0790529..4de0816 100644
    --- a/iseq.h
    +++ b/iseq.h
    @@ -23,6 +23,7 @@ VALUE rbiseqbuildfromary(rbiseqt *iseq, VALUE locals, VALUE args,
    VALUE exception, VALUE body);

    /* iseq.c */
    +void rbiseqaddmarkobject(rbiseqt *iseq, VALUE obj);
    VALUE rbiseqload(VALUE data, VALUE parent, VALUE opt);
    VALUE rbiseqparameters(const rbiseqt *iseq, int isproc);
    struct st
    table *rubyinsnmakeinsntable(void);

Associated revisions

Revision 41682
Added by Tomoyuki Chikanaga 10 months ago

merge revision(s) 40336: [Backport #8142]

* compile.c (iseq_add_mark_object): Use new rb_iseq_add_mark_object().

* insns.def (setinlinecache): Ditto.

* iseq.c (rb_iseq_add_mark_object): New function to allocate
  iseq->mark_ary on demand. [Bug #8142]

* iseq.h (rb_iseq_add_mark_object): Ditto.

* iseq.c (prepare_iseq_build): Avoid allocating mark_ary until needed.

* iseq.c (rb_iseq_build_for_ruby2cext): Ditto.

History

#1 Updated by Koichi Sasada about 1 year ago

(2013/03/22 14:30), tmm1 (Aman Gupta) wrote:

Allocate iseq->mark_ary on demand, only if needed.

In my application, this reduces long lived arrays on the heap significantly.

Ah, I got it. Nice!! I'll introduce it.

##
BTW, in the future, I want avoid any live object relations from iseq.
For example, the program "str = 'hello'" (compiled iseq) has one
relation to a String object. However, this String object can be replaced
with non-VALUE memory object (not a VALUE, but a memory dump). If I can
replace with this technique, iseq->mark_ary is no longer needed.

--
// SASADA Koichi at atdot dot net

#2 Updated by Aman Gupta about 1 year ago

For example, the program "str = 'hello'" (compiled iseq) has one
relation to a String object. However, this String object can be replaced
with non-VALUE memory object (not a VALUE, but a memory dump).

Do you have any existing patch for this technique? I would like to try, for example to convert putstring instruction.

In my application there are lots of long lived strings. Many of these strings come from string literals in code.

GC.start
ObjectSpace.countobjects[:TSTRING]
=> 311117

ObjectSpace.each_object(String).count
=> 305230

ObjectSpace.each_object(String).select{ |s| s.frozen? }.size
=> 233336

Also I see a high level of frozen string duplication. Some of this is due to duplication of common iseq->location.label (like "initialize")

ObjectSpace.each_object(String).select{ |s| s.frozen? }.uniq.size
=> 118043

#3 Updated by Koichi Sasada about 1 year ago

(2013/03/22 17:19), tmm1 (Aman Gupta) wrote:

Do you have any existing patch for this technique? I would like to try, for example to convert putstring instruction.

In my application there are lots of long lived strings. Many of these strings come from string literals in code.

Yes. This is why I want to try it.
And this is good to hear we have killer user :)

Already, we have rough design about it.
This design includes Symbol GC :)
The key idea is making static string (or binary) table with reference
counter
.
We had (Heroku Matz team) discussed about it.

However, now I'm trying GC improvements (restricted gen gc).
So priority is not best.
I want to finish before RubyKaigi2013.
(Event Driven Development)

--
// SASADA Koichi at atdot dot net

#4 Updated by Aman Gupta about 1 year ago

However, now I'm trying GC improvements (restricted gen gc).

This is great news. With an incremental mark, long lived objects are less of a problem.

What is your plan for restricted GC?

#5 Updated by Koichi Sasada about 1 year ago

(2013/03/25 14:29), tmm1 (Aman Gupta) wrote:

What is your plan for restricted GC?

I'll make another feature request.

# On a preliminary evaluation, simple (ideal) micro benchmark was
# 20-30% faster.

--
// SASADA Koichi at atdot dot net

#6 Updated by Aman Gupta about 1 year ago

I'll make another feature request.

# On a preliminary evaluation, simple (ideal) micro benchmark was
# 20-30% faster.

Great. I tried some GC experiments recently (additional bit per object to track longlife generation), but without write barrier it was very tricky to implement. I look forward to seeing your idea.

However, this String object can be replaced
with non-VALUE memory object (not a VALUE, but a memory dump).

I am doing some experiments with this technique. For the putstring instruction, this is very simple- replace rbstrresurrect() with rbstrnew2() using memory dumped value.

But in DSTR, putobject instruction is used instead of putstring (https://github.com/ruby/ruby/commit/49371b54). Replacing this with rbstrnew every time will increase GC pressure, so it is not ideal.

One solution is to make memory dump re-use struct RString, so it can emulate a string object (special API to allocate strings outside the ruby heap, using ALLOC(struct RString)). These objects can also contain an extra reference count field.

But if putobject instruction gives out reference to these unmanaged object, then reference count is not enough to free. An additional GC mark/sweep will be required after refcount==0, to make sure no one is still referencing the string. This is still possible (similar technique is used for unlinked method entries?). I have some patches for this approach, but I am curious what you think. Is this a bad idea?

#7 Updated by Aman Gupta about 1 year ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r40336.
Aman, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


iseq: reduce array allocations for simple sequences

  • compile.c (iseqaddmarkobject): Use new rbiseqaddmark_object().

  • insns.def (setinlinecache): Ditto.

  • iseq.c (rbiseqaddmarkobject): New function to allocate
    iseq->mark_ary on demand. [Bug #8142]

  • iseq.h (rbiseqaddmarkobject): Ditto.

  • iseq.c (prepareiseqbuild): Avoid allocating mark_ary until needed.

  • iseq.c (rbiseqbuildforruby2cext): Ditto.

#8 Updated by Aman Gupta 12 months ago

  • Tracker changed from Bug to Backport
  • Project changed from ruby-trunk to Backport200
  • Status changed from Closed to Assigned
  • Assignee changed from Koichi Sasada to Tomoyuki Chikanaga

#9 Updated by Tomoyuki Chikanaga 10 months ago

  • Status changed from Assigned to Closed

This issue was solved with changeset r41682.
Aman, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


merge revision(s) 40336: [Backport #8142]

* compile.c (iseq_add_mark_object): Use new rb_iseq_add_mark_object().

* insns.def (setinlinecache): Ditto.

* iseq.c (rb_iseq_add_mark_object): New function to allocate
  iseq->mark_ary on demand. [Bug #8142]

* iseq.h (rb_iseq_add_mark_object): Ditto.

* iseq.c (prepare_iseq_build): Avoid allocating mark_ary until needed.

* iseq.c (rb_iseq_build_for_ruby2cext): Ditto.

Also available in: Atom PDF