Backport #8048

require() features_index bloats size of ruby heap

Added by Aman Gupta about 1 year ago. Updated 12 months ago.

[ruby-core:53216]
Status:Closed
Priority:Normal
Assignee:Tomoyuki Chikanaga

Description

The new features_index for require() in 2.0.0 creates a lot of extra ARRAY and STRING objects. In a big rails application, there are ~70k extra strings and ~70k extra arrays on the heap after boot.

$ RAILSENV=production ruby -e' puts RUBYVERSION; require "./config/environment"; p [:NUMLOADEDFEATURES, $LOADEDFEATURES.size]; GC.start; p ObjectSpace.countobjects.select{ |k,v| [:TSTRING, :TARRAY, :THASH].include? k } '
1.9.3
[:NUM
LOADEDFEATURES, 2933]
{:T
STRING=>291147, :TARRAY=>97402, :THASH=>9439}

$ RAILSENV=production ruby -e' puts RUBYVERSION; require "./config/environment"; p [:NUMLOADEDFEATURES, $LOADEDFEATURES.size]; GC.start; p ObjectSpace.countobjects.select{ |k,v| [:TSTRING, :TARRAY, :THASH].include? k } '
2.0.0
[:NUM
LOADEDFEATURES, 2945]
{:T
STRING=>363183, :TARRAY=>167871, :THASH=>9640}

The following patch reduces the number of arrays used by the features_index:

diff --git a/load.c b/load.c
index ea22d5b..1868518 100644
--- a/load.c
+++ b/load.c
@@ -186,11 +186,16 @@ featuresindexaddsingle(VALUE shortfeature, VALUE offset)
{
VALUE featuresindex, thisfeatureindex;
features
index = getloadedfeaturesindexraw();
- if ((thisfeatureindex = rbhashlookup(featuresindex, shortfeature)) == Qnil) {
- thisfeatureindex = rbarynew();
- rbhashaset(featuresindex, shortfeature, thisfeatureindex);
+ thisfeatureindex = rbhashlookup(featuresindex, shortfeature);
+
+ if (thisfeatureindex == Qnil) {
+ rbhashaset(featuresindex, shortfeature, offset);
+ } else if (RBTYPEP(thisfeatureindex, TFIXNUM)) {
+ this
featureindex = rbarynew3(2, thisfeatureindex, offset);
+ rb
hashaset(featuresindex, shortfeature, thisfeatureindex);
+ } else if (RB
TYPEP(thisfeatureindex, TARRAY)) {
+ rbarypush(thisfeatureindex, offset);
}
- rbarypush(thisfeatureindex, offset);
}

/* Add to the loaded-features index all the required entries for
@@ -389,8 +394,16 @@ rbfeaturep(const char *feature, const char *ext, int rb, int expanded, const c
or ends in '/'. This includes both match forms above, as well
as any distractors, so we may ignore all other entries in features.
*/
- for (i = 0; thisfeatureindex != Qnil && i < RARRAYLEN(thisfeatureindex); i++) {
- long index = FIX2LONG(rb
aryentry(thisfeatureindex, i));
+ for (i = 0; this
featureindex != Qnil; i++) {
+ long index;
+ if (RB
TYPEP(thisfeatureindex, TARRAY)) {
+ if (i >= RARRAYLEN(thisfeatureindex)) break;
+ index = FIX2LONG(rb
aryentry(thisfeatureindex, i));
+ } else {
+ if (i > 0) break;
+ index = FIX2LONG(this
featureindex);
+ }
+
v = RARRAY
PTR(features)[index];
f = StringValuePtr(v);
if ((n = RSTRING_LEN(v)) < len) continue;


Related issues

Related to ruby-trunk - Bug #9201: [patch] remove GC overhead for loaded_features_index Closed 12/03/2013

Associated revisions

Revision 40397
Added by Tomoyuki Chikanaga 12 months ago

merge revision(s) 39644,39646,39647,39874,39898: [Backport #8048]

* load.c (features_index_add_single, rb_feature_p): store single index
  as Fixnum to reduce the number of arrays for the indexes.  based on
  the patch by tmm1 (Aman Gupta) in  [Bug #8048].

* load.c (rb_feature_p), vm_core.h (rb_vm_struct): turn
  loaded_features_index into st_table.  patches by tmm1 (Aman Gupta)
  in  and  [Bug #8048]

History

#1 Updated by Nobuyoshi Nakada about 1 year ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r39644.
Aman, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


load.c: reduce indexes arrays

  • load.c (featuresindexaddsingle, rbfeature_p): store single index as Fixnum to reduce the number of arrays for the indexes. based on the patch by tmm1 (Aman Gupta) in [Bug #8048].

#2 Updated by Aman Gupta about 1 year ago

Thank you.

Is there some reason loadedfeaturesindex cannot use stinitstrtable()? Will you accept an additional patch to this effect?

#3 Updated by Aman Gupta about 1 year ago

diff --git a/load.c b/load.c
index 6330e50..3010b18 100644
--- a/load.c
+++ b/load.c
@@ -171,7 +171,7 @@ resetloadedfeaturessnapshot(void)
rb
aryreplace(vm->loadedfeaturessnapshot, vm->loadedfeatures);
}

-static VALUE
+static struct sttable *
get
loadedfeaturesindexraw(void)
{
return GET
VM()->loadedfeaturesindex;
@@ -186,12 +186,19 @@ getloadingtable(void)
static void
featuresindexaddsingle(VALUE shortfeature, VALUE offset)
{
- VALUE featuresindex, thisfeatureindex;
+ struct st
table *featuresindex;
+ VALUE this
featureindex = Qnil;
+ char *short
featurecstr;
+
+ Check
Type(offset, TFIXNUM);
+ Check
Type(shortfeature, TSTRING);
+ shortfeaturecstr = RSTRINGPTR(shortfeature);
+
featuresindex = getloadedfeaturesindexraw();
- this
featureindex = rbhashlookup(featuresindex, shortfeature);
+ st
lookup(featuresindex, (stdatat)shortfeaturecstr, (stdatat *)&thisfeature_index);

 if (NIL_P(this_feature_index)) {
  • rbhashaset(featuresindex, shortfeature, offset);
  • stinsert(featuresindex, (stdatat)rubystrdup(shortfeaturecstr), (stdatat)offset); } else if (RBTYPEP(thisfeatureindex, TFIXNUM)) { VALUE featureindexes[2]; @@ -199,7 +206,7 @@ featuresindexaddsingle(VALUE shortfeature, VALUE offset) featureindexes[1] = offset; thisfeatureindex = rbarytmpnew(numberof(featureindexes)); rbarycat(thisfeatureindex, featureindexes, numberof(featureindexes));
  • rbhashaset(featuresindex, shortfeature, thisfeatureindex);
  • stinsert(featuresindex, (stdatat)shortfeaturecstr, (stdatat)thisfeatureindex); } else { CheckType(thisfeatureindex, TARRAY); @@ -254,7 +261,14 @@ featuresindexadd(VALUE feature, VALUE offset) } }

-static VALUE
+static int
+loadedfeaturesindexcleari(stdatat key, stdatat val, stdatat arg)
+{
+ xfree((char) key);
+ return STDELETE;
+}
+
+static st
table *
getloadedfeaturesindex(void)
{
VALUE features;
@@ -264,7 +278,7 @@ get
loadedfeaturesindex(void)
if (!rbarysharedwithp(vm->loadedfeaturessnapshot, vm->loaded_features)) {
/
The sharing was broken; something (other than us in rbprovidefeature())
modified loadedfeatures. Rebuild the index. */
- rb
hashclear(vm->loadedfeaturesindex);
+ st
foreach(vm->loadedfeaturesindex, loadedfeaturesindexcleari, 0);
features = vm->loadedfeatures;
for (i = 0; i < RARRAY
LEN(features); i++) {
VALUE entry, asstr;
@@ -357,10 +371,10 @@ loaded
featurepathi(stdatat v, stdatat b, stdatat f)
static int
rbfeaturep(const char feature, const char *ext, int rb, int expanded, const char *fn)
{
- VALUE features, featuresindex, featureval, thisfeatureindex, v, p, loadpath = 0;
+ VALUE features, feature
val, thisfeatureindex = Qnil, v, p, loadpath = 0;
const char *f, *e;
long i, len, elen, n;
- st
table *loadingtbl;
+ st
table *loadingtbl, *featuresindex;
stdatat data;
int type;

@@ -379,7 +393,7 @@ rbfeaturep(const char *feature, const char *ext, int rb, int expanded, const c
featuresindex = getloadedfeaturesindex();

 feature_val = rb_str_new(feature, len);
  • thisfeatureindex = rbhashlookup(featuresindex, featureval);
  • stlookup(featuresindex, (stdatat)feature, (stdatat )&thisfeatureindex); / We search features for an entry such that either "#{features[i]}" == "#{loadpath[j]}/#{feature}#{e}" for some j, or @@ -1120,8 +1134,7 @@ Initload() rbdefinevirtualvariable("$LOADEDFEATURES", getloadedfeatures, 0); vm->loadedfeatures = rbarynew(); vm->loadedfeaturessnapshot = rbarytmpnew(0);
  • vm->loadedfeaturesindex = rbhashnew();
  • RBASIC(vm->loadedfeaturesindex)->klass = 0;
  • vm->loadedfeaturesindex = stinitstrtable();

    rbdefineglobalfunction("load", rbfload, -1);
    rb
    defineglobalfunction("require", rbfrequire, 1);
    diff --git a/vm.c b/vm.c
    index 9126502..b4132cd 100644
    --- a/vm.c
    +++ b/vm.c
    @@ -1565,7 +1565,6 @@ rbvmmark(void *ptr)
    RUBYMARKUNLESSNULL(vm->expandedloadpath);
    RUBY
    MARKUNLESSNULL(vm->loadedfeatures);
    RUBY
    MARKUNLESSNULL(vm->loadedfeaturessnapshot);

  • RUBYMARKUNLESSNULL(vm->loadedfeaturesindex);
    RUBY
    MARKUNLESSNULL(vm->topself);
    RUBY
    MARKUNLESSNULL(vm->coverages);
    rbgcmarklocations(vm->specialexceptions, vm->specialexceptions + rubyspecialerrorcount);
    @@ -1573,6 +1572,9 @@ rbvmmark(void *ptr)
    if (vm->loadingtable) {
    rb
    marktbl(vm->loadingtable);
    }

  • if (vm->loadedfeaturesindex) {

  •   rb_mark_tbl(vm->loaded_features_index);
    
  • }

    rbvmtracemarkeventhooks(&vm->eventhooks);

diff --git a/vmcore.h b/vmcore.h
index 1838f2a..2ac57fc 100644
--- a/vmcore.h
+++ b/vm
core.h
@@ -364,7 +364,7 @@ typedef struct rbvmstruct {
VALUE expandedloadpath;
VALUE loadedfeatures;
VALUE loaded
featuressnapshot;
- VALUE loaded
featuresindex;
+ struct st
table *loadedfeaturesindex;
struct sttable *loadingtable;

 /* signal */

#4 Updated by Aman Gupta about 1 year ago

  • CheckType(shortfeature, T_STRING);
  • shortfeaturecstr = RSTRINGPTR(shortfeature);

This should be StringValueCStr(), because shortfeature is created from rbstr_substr().

Is it possible for features to contain a NUL byte? Maybe it is preferable to add stinitbintable() which can accept a string length.

#5 Updated by Nobuyoshi Nakada about 1 year ago

tmm1 (Aman Gupta) wrote:

Is it possible for features to contain a NUL byte?

No, feature names are checked with FilePathValue().

Maybe it is preferable to add stinitbintable() which can accept a string length.

Since st interfaces take only one parameter for the key, it will need a struct, but we already have such struct, RString.
Perhaps we could make a custom st_table which uses the functions in string.c.

#6 Updated by Aman Gupta about 1 year ago

nobu-san,

What do you think of applying StringValueCStr() version of this patch? Since feature names cannot contain NULL bytes, it should be OK I think.

#7 Updated by Tomoyuki Chikanaga 12 months ago

  • Tracker changed from Bug to Backport
  • Project changed from ruby-trunk to Backport200
  • Status changed from Closed to Assigned
  • Assignee set to Tomoyuki Chikanaga

I will backport this because #8165 depend on this feature.

#8 Updated by Tomoyuki Chikanaga 12 months ago

... and add r39646, r39647 to reduce conflict.

#9 Updated by Tomoyuki Chikanaga 12 months ago

  • Status changed from Assigned to Closed

This issue was solved with changeset r40397.
Aman, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


merge revision(s) 39644,39646,39647,39874,39898: [Backport #8048]

* load.c (features_index_add_single, rb_feature_p): store single index
  as Fixnum to reduce the number of arrays for the indexes.  based on
  the patch by tmm1 (Aman Gupta) in  [Bug #8048].

* load.c (rb_feature_p), vm_core.h (rb_vm_struct): turn
  loaded_features_index into st_table.  patches by tmm1 (Aman Gupta)
  in  and  [Bug #8048]

Also available in: Atom PDF