Project

General

Profile

Feature #12024

Add String.buffer, for creating strings with large capacities

Added by jeremyevans0 (Jeremy Evans) almost 3 years ago. Updated almost 3 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:73487]

Description

If you know you are going to need to create a large string,
it's better to create it with a large capacity. Otherwise,
ruby will need to continuously resize the string as it grows.
For example, if you will be producing a string that is
100000 bytes, String.buffer(100000) will avoid 10 separate
resizes compared to using String.new.

Performance-wise, String.new is 1.33x slower than
String.buffer(100000) if appending in 1000 byte chunks,
and 1.64x slower than String.buffer(1000) if appending
in 100 byte chunks.

To make sure this works correctly with String subclasses,
a static rb_str_buf_new_with_class function is added, which
both String.buffer and rb_str_buf_new now call.


Files


Related issues

Related to Ruby trunk - Feature #905: Add String.new(fixnum) to preallocate large bufferClosedActions

Associated revisions

Revision d46e2aea
Added by naruse (Yui NARUSE) almost 3 years ago

  • string.c (rb_str_init): introduce String.new(capacity: size) [Feature #12024]

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53850 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 53850
Added by naruse (Yui NARUSE) almost 3 years ago

  • string.c (rb_str_init): introduce String.new(capacity: size) [Feature #12024]

Revision 53850
Added by naruse (Yui NARUSE) almost 3 years ago

  • string.c (rb_str_init): introduce String.new(capacity: size) [Feature #12024]

Revision 53850
Added by naruse (Yui NARUSE) almost 3 years ago

  • string.c (rb_str_init): introduce String.new(capacity: size) [Feature #12024]

Revision 53850
Added by naruse (Yui NARUSE) almost 3 years ago

  • string.c (rb_str_init): introduce String.new(capacity: size) [Feature #12024]

Revision ed5401a6
Added by normal over 2 years ago

string.c: reduce malloc overhead for default buffer size

  • string.c (STR_BUF_MIN_SIZE): reduce from 128 to 127 [ruby-core:76371] [Feature #12025]
  • string.c (rb_str_buf_new): adjust for above reduction

From Jeremy Evans code@jeremyevans.net:

This changes the minimum buffer size for string buffers from 128 to

  1. The underlying C buffer is always 1 more than the ruby buffer, so this changes the actual amount of memory used for the minimum string buffer from 129 to 128. This makes it much easier on the malloc implementation, as evidenced by the following code (note that time -l is used here, but Linux systems may need time -v).

$ cat bench_mem.rb
i = ARGV.first.to_i
Array.new(1000000){" " * i}
$ /usr/bin/time -l ruby bench_mem.rb 128
3.10 real 2.19 user 0.46 sys
289080 maximum resident set size
72673 minor page faults
13 block output operations
29 voluntary context switches
$ /usr/bin/time -l ruby bench_mem.rb 127
2.64 real 2.09 user 0.27 sys
162720 maximum resident set size
40966 minor page faults
2 block output operations
4 voluntary context switches

To try to ensure a power-of-2 growth, when a ruby string capacity
needs to be increased, after doubling the capacity, add one. This
ensures the ruby capacity will be odd, which means actual amount
of memory used will be even, which is probably better than the
current case of the ruby capacity being even and the actual amount
of memory used being odd.

A very similar patch was proposed 4 years ago in feature #5875. It
ended up being rejected, because no performance increase was shown.
One reason for that is that ruby does not use STR_BUF_MIN_SIZE
unless rb_str_buf_new is called, and that previously did not have
a ruby API, only a C API, so unless you were using a C extension
that called it, there would be no performance increase.

With the recently proposed feature #12024, String.buffer is added,
which is a ruby API for creating string buffers. Using
String.buffer(100) wastes much less memory with this patch, as the
malloc implementation can more easily deal with the power-of-2
sized memory usage. As measured above, memory usage is 44% less,
and performance is 17% better.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@55686 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 55686
Added by normalperson (Eric Wong) over 2 years ago

string.c: reduce malloc overhead for default buffer size

  • string.c (STR_BUF_MIN_SIZE): reduce from 128 to 127 [ruby-core:76371] [Feature #12025]
  • string.c (rb_str_buf_new): adjust for above reduction

From Jeremy Evans code@jeremyevans.net:

This changes the minimum buffer size for string buffers from 128 to

  1. The underlying C buffer is always 1 more than the ruby buffer, so this changes the actual amount of memory used for the minimum string buffer from 129 to 128. This makes it much easier on the malloc implementation, as evidenced by the following code (note that time -l is used here, but Linux systems may need time -v).

$ cat bench_mem.rb
i = ARGV.first.to_i
Array.new(1000000){" " * i}
$ /usr/bin/time -l ruby bench_mem.rb 128
3.10 real 2.19 user 0.46 sys
289080 maximum resident set size
72673 minor page faults
13 block output operations
29 voluntary context switches
$ /usr/bin/time -l ruby bench_mem.rb 127
2.64 real 2.09 user 0.27 sys
162720 maximum resident set size
40966 minor page faults
2 block output operations
4 voluntary context switches

To try to ensure a power-of-2 growth, when a ruby string capacity
needs to be increased, after doubling the capacity, add one. This
ensures the ruby capacity will be odd, which means actual amount
of memory used will be even, which is probably better than the
current case of the ruby capacity being even and the actual amount
of memory used being odd.

A very similar patch was proposed 4 years ago in feature #5875. It
ended up being rejected, because no performance increase was shown.
One reason for that is that ruby does not use STR_BUF_MIN_SIZE
unless rb_str_buf_new is called, and that previously did not have
a ruby API, only a C API, so unless you were using a C extension
that called it, there would be no performance increase.

With the recently proposed feature #12024, String.buffer is added,
which is a ruby API for creating string buffers. Using
String.buffer(100) wastes much less memory with this patch, as the
malloc implementation can more easily deal with the power-of-2
sized memory usage. As measured above, memory usage is 44% less,
and performance is 17% better.

Revision 55686
Added by normal over 2 years ago

string.c: reduce malloc overhead for default buffer size

  • string.c (STR_BUF_MIN_SIZE): reduce from 128 to 127 [ruby-core:76371] [Feature #12025]
  • string.c (rb_str_buf_new): adjust for above reduction

From Jeremy Evans code@jeremyevans.net:

This changes the minimum buffer size for string buffers from 128 to

  1. The underlying C buffer is always 1 more than the ruby buffer, so this changes the actual amount of memory used for the minimum string buffer from 129 to 128. This makes it much easier on the malloc implementation, as evidenced by the following code (note that time -l is used here, but Linux systems may need time -v).

$ cat bench_mem.rb
i = ARGV.first.to_i
Array.new(1000000){" " * i}
$ /usr/bin/time -l ruby bench_mem.rb 128
3.10 real 2.19 user 0.46 sys
289080 maximum resident set size
72673 minor page faults
13 block output operations
29 voluntary context switches
$ /usr/bin/time -l ruby bench_mem.rb 127
2.64 real 2.09 user 0.27 sys
162720 maximum resident set size
40966 minor page faults
2 block output operations
4 voluntary context switches

To try to ensure a power-of-2 growth, when a ruby string capacity
needs to be increased, after doubling the capacity, add one. This
ensures the ruby capacity will be odd, which means actual amount
of memory used will be even, which is probably better than the
current case of the ruby capacity being even and the actual amount
of memory used being odd.

A very similar patch was proposed 4 years ago in feature #5875. It
ended up being rejected, because no performance increase was shown.
One reason for that is that ruby does not use STR_BUF_MIN_SIZE
unless rb_str_buf_new is called, and that previously did not have
a ruby API, only a C API, so unless you were using a C extension
that called it, there would be no performance increase.

With the recently proposed feature #12024, String.buffer is added,
which is a ruby API for creating string buffers. Using
String.buffer(100) wastes much less memory with this patch, as the
malloc implementation can more easily deal with the power-of-2
sized memory usage. As measured above, memory usage is 44% less,
and performance is 17% better.

Revision 55686
Added by normal over 2 years ago

string.c: reduce malloc overhead for default buffer size

  • string.c (STR_BUF_MIN_SIZE): reduce from 128 to 127 [ruby-core:76371] [Feature #12025]
  • string.c (rb_str_buf_new): adjust for above reduction

From Jeremy Evans code@jeremyevans.net:

This changes the minimum buffer size for string buffers from 128 to

  1. The underlying C buffer is always 1 more than the ruby buffer, so this changes the actual amount of memory used for the minimum string buffer from 129 to 128. This makes it much easier on the malloc implementation, as evidenced by the following code (note that time -l is used here, but Linux systems may need time -v).

$ cat bench_mem.rb
i = ARGV.first.to_i
Array.new(1000000){" " * i}
$ /usr/bin/time -l ruby bench_mem.rb 128
3.10 real 2.19 user 0.46 sys
289080 maximum resident set size
72673 minor page faults
13 block output operations
29 voluntary context switches
$ /usr/bin/time -l ruby bench_mem.rb 127
2.64 real 2.09 user 0.27 sys
162720 maximum resident set size
40966 minor page faults
2 block output operations
4 voluntary context switches

To try to ensure a power-of-2 growth, when a ruby string capacity
needs to be increased, after doubling the capacity, add one. This
ensures the ruby capacity will be odd, which means actual amount
of memory used will be even, which is probably better than the
current case of the ruby capacity being even and the actual amount
of memory used being odd.

A very similar patch was proposed 4 years ago in feature #5875. It
ended up being rejected, because no performance increase was shown.
One reason for that is that ruby does not use STR_BUF_MIN_SIZE
unless rb_str_buf_new is called, and that previously did not have
a ruby API, only a C API, so unless you were using a C extension
that called it, there would be no performance increase.

With the recently proposed feature #12024, String.buffer is added,
which is a ruby API for creating string buffers. Using
String.buffer(100) wastes much less memory with this patch, as the
malloc implementation can more easily deal with the power-of-2
sized memory usage. As measured above, memory usage is 44% less,
and performance is 17% better.

Revision 55686
Added by normal over 2 years ago

string.c: reduce malloc overhead for default buffer size

  • string.c (STR_BUF_MIN_SIZE): reduce from 128 to 127 [ruby-core:76371] [Feature #12025]
  • string.c (rb_str_buf_new): adjust for above reduction

From Jeremy Evans code@jeremyevans.net:

This changes the minimum buffer size for string buffers from 128 to

  1. The underlying C buffer is always 1 more than the ruby buffer, so this changes the actual amount of memory used for the minimum string buffer from 129 to 128. This makes it much easier on the malloc implementation, as evidenced by the following code (note that time -l is used here, but Linux systems may need time -v).

$ cat bench_mem.rb
i = ARGV.first.to_i
Array.new(1000000){" " * i}
$ /usr/bin/time -l ruby bench_mem.rb 128
3.10 real 2.19 user 0.46 sys
289080 maximum resident set size
72673 minor page faults
13 block output operations
29 voluntary context switches
$ /usr/bin/time -l ruby bench_mem.rb 127
2.64 real 2.09 user 0.27 sys
162720 maximum resident set size
40966 minor page faults
2 block output operations
4 voluntary context switches

To try to ensure a power-of-2 growth, when a ruby string capacity
needs to be increased, after doubling the capacity, add one. This
ensures the ruby capacity will be odd, which means actual amount
of memory used will be even, which is probably better than the
current case of the ruby capacity being even and the actual amount
of memory used being odd.

A very similar patch was proposed 4 years ago in feature #5875. It
ended up being rejected, because no performance increase was shown.
One reason for that is that ruby does not use STR_BUF_MIN_SIZE
unless rb_str_buf_new is called, and that previously did not have
a ruby API, only a C API, so unless you were using a C extension
that called it, there would be no performance increase.

With the recently proposed feature #12024, String.buffer is added,
which is a ruby API for creating string buffers. Using
String.buffer(100) wastes much less memory with this patch, as the
malloc implementation can more easily deal with the power-of-2
sized memory usage. As measured above, memory usage is 44% less,
and performance is 17% better.

History

Updated by matz (Yukihiro Matsumoto) almost 3 years ago

I like the idea.

Matz.

#2

Updated by mame (Yusuke Endoh) almost 3 years ago

  • Related to Feature #905: Add String.new(fixnum) to preallocate large buffer added

Updated by mame (Yusuke Endoh) almost 3 years ago

I think this ticket is a duplicate of #905.

--
Yusuke Endoh mame@ruby-lang.org

Updated by jeremyevans0 (Jeremy Evans) almost 3 years ago

I wasn't aware of #905 when I created this, I probably should have searched first.

This feature is similar to #905, but the implementation is different. The implementation in #905 is an instance method that changes the capacity of an existing string. This is a class method that creates a new string with the given capacity. This does use the method name recommended by matz in #905, by accident, but I'm guessing that makes it a good name.

Updated by mame (Yusuke Endoh) almost 3 years ago

I'm unsure if this pre-allocation is really effective since recent "realloc" implementations take just O(1). What operating system are you using?

In #905, pre-allocation was actually helpful in JRuby because JVM did not provide such a good "realloc" feature. But the speed up was not confirmed in Linux. So the issue was, "should MRI provide the feature for script compatibility, even if it is meaningless in Linux?"

However, even if the computational order is O(1), it may have a large constant term. I'm not against the proposal if a real-world quantitative evaluation shows use case where it is actually effective.

--
Yusuke Endoh mame@ruby-lang.org

Updated by kosaki (Motohiro KOSAKI) almost 3 years ago

On Tue, Jan 26, 2016 at 7:12 PM, merch-redmine@jeremyevans.net wrote:

Issue #12024 has been updated by Jeremy Evans.

I wasn't aware of #905 when I created this, I probably should have searched first.

This feature is similar to #905, but the implementation is different. The implementation in #905 is an instance method that changes the capacity of an existing string. This is a class method that creates a new string with the given capacity. This does use the method name recommended by matz in #905, by accident, but I'm guessing that makes it a good name.

Class method is really bad idea because of thread unsafe.

bad scenario example:

Thread-A Thread-B


String.buffer(10000)
s1 = String.new
s2 = String.new
String.buffer(10)

In above scenario, s2 accidentally get 10000 capability. This is huge
memory waste.
And as mame said, we already talk about this feature doesn't improve
performance in #905.

-1.

Updated by naruse (Yui NARUSE) almost 3 years ago

%  cat test.rb
# frozen_string_literal: true
require 'benchmark'
N = 50_000_000
Benchmark.bm 5 do |r|
  r.report "buffer" do
    i = 0
    buf = String.buffer(N)
    while i < N
      buf << 'a'
      i += 1
    end
  end
  r.report "normal" do
    i = 0
    buf = String.new
    while i < N
      buf << 'a'
      i += 1
    end
  end
end

On FreeBSD

%  ./ruby test.rb
            user     system      total        real
buffer  6.765625   0.039062   6.804688 (  6.802679)
normal  6.796875   0.062500   6.859375 (  6.864754)

Updated by naruse (Yui NARUSE) almost 3 years ago

Motohiro KOSAKI wrote:

Class method is really bad idea because of thread unsafe.

It return a new string object.

Updated by naruse (Yui NARUSE) almost 3 years ago

More effective example.

%  cat test.rb; ./ruby test.rb
# frozen_string_literal: true
require 'benchmark'
M = 100
N = 25
Benchmark.bm 5 do |r|
  r.report "buffer" do
    j = 0
    while j < M
      i = 0
      buf = String.buffer(2**N)
      buf << "a"
      while i < N
        buf << buf
        i += 1
      end
      j += 1
    end
  end
  r.report "normal" do
    j = 0
    while j < M
    i = 0
    buf = String.new
    buf << "a"
    while i < N
      buf << buf
      i += 1
    end
      j += 1
    end
  end
end
ruby 2.4.0dev (2016-01-27 trunk 53675) [x86_64-freebsd10.2]
            user     system      total        real
buffer  0.976562   1.070312   2.046875 (  2.048577)
normal  2.085938   2.867188   4.953125 (  4.954006)

Updated by darix (Marcus Rückert) almost 3 years ago

Yui NARUSE wrote:

Motohiro KOSAKI wrote:

Class method is really bad idea because of thread unsafe.

It return a new string object.

why not something like "String.new(:buffersize => 10000)" then?

Updated by mame (Yusuke Endoh) almost 3 years ago

Yui NARUSE wrote:

More effective example.

What an artificial example that pinpoints realloc overhead :-)

--
Yusuke Endoh mame@ruby-lang.org

Updated by jeremyevans0 (Jeremy Evans) almost 3 years ago

Here's the benchmark I wrote for this:

require 'benchmark/ips'

eval("def a; b = ' ' * 1000; String.new " + "<< b " * 100 + "end")
eval("def c; b = ' ' * 1000; String.buffer(100000) " + "<< b " * 100 + "end")

Benchmark.ips do |x|
  x.report("String.new - 100000/1000"){a}
  x.report("String.buffer - 100000/1000"){c}
  x.compare!
end

eval("def a; b = ' ' * 100; String.new " + "<< b " * 10 + "end")
eval("def c; b = ' ' * 100; String.buffer(1000) " + "<< b " * 10 + "end")

Benchmark.ips do |x|
  x.report("String.new - 1000/100"){a}
  x.report("String.buffer - 1000/100"){c}
  x.compare!
end

The problem with the benchmark in note 7 is the bottleneck is high ruby execution overhead, due to individual character appends. The problem with the benchmark in note 9 is that is attempts to specifically exploit realloc issues. This benchmark is more realistic, I think.

As to why String.buffer(10000) instead of String.new(:buffersize=>10000), see https://bugs.ruby-lang.org/issues/905#note-2

Updated by akr (Akira Tanaka) almost 3 years ago

Jeremy Evans wrote:

As to why String.buffer(10000) instead of String.new(:buffersize=>10000), see https://bugs.ruby-lang.org/issues/905#note-2

I think String.new(:buffersize=>10000) is better because it can be combined with :encoding such as:
String.new(:encoding=>foo, :buffersize=>10000).
This way avoids multiplicative methods explosion: with/without encoding, with/without buffersize, ...
I can't imagine other attributes for String, though.

I guess matz doesn't apply the logic of https://bugs.ruby-lang.org/issues/905#note-2 here.
Keyword arguments may be considered different from type based dispatch.

Updated by mame (Yusuke Endoh) almost 3 years ago

Thank you for your benchmark. I tried it and confirmed the speed up in Linux. Now I have no strong objection against your proposal.

realloc takes O(1) if the size is much larger than page size (4096B in Linux), but the benchmark starts from a much shorter string, which requires memcpy. I thought this constant term was not so big, but seems it is. Kosaki-san may have an opinion about this.

I have no strong objection, but still I have a question. I guess that creating a large string by concatenating many short strings is natural use case in Ruby. However, are there so much cases where we can predict the final length of a generated string before starting creating it?

I think the most typical application that can benefit from this feature is a template engine, but it is difficult to tell the length of a string being generated before it is actually generated.

--
Yusuke Endoh mame@ruby-lang.org

Updated by jeremyevans0 (Jeremy Evans) almost 3 years ago

Template engines would be the main use case for this. Note that you don't need to know the exact length of the string. If you know your average generated string is 8k, you can use a 8k buffer by default. If the string generated is only 2k, I'm guessing it still increases performance, and doesn't really cost you any memory in the long term as such strings are garbage collected quickly. If the page generated is 100k, starting with an 8k buffer will still increase performance as it saves you many reallocs.

Updated by duerst (Martin Dürst) almost 3 years ago

On 2016/01/30 21:17, mame@ruby-lang.org wrote:

I have no strong objection, but still I have a question. I guess that creating a large string by concatenating many short strings is natural use case in Ruby. However, are there so much cases where we can predict the final length of a generated string before starting creating it?

Usually, when doing something like

result = ""
loop do
  result << some_string
end
result

this starts to get slow quite quickly. The standard advice (which I have
myself used often) is to convert this to:

result = []
loop do
  result << some_string
end
result.join

This often speeds up things considerably.

I haven't completely analyzed the code for Array#join, but on first
impression, it seems to do exactly what you suggest: calculate the
overall length first, then do the actual copying.

Updated by matz (Yukihiro Matsumoto) almost 3 years ago

The original proposal #905 was 7 years old. But now we have keyword argument, so adding capacity: keyword to String#new is better idea, I think.

Matz.

#18

Updated by naruse (Yui NARUSE) almost 3 years ago

  • Status changed from Open to Closed

Applied in changeset r53850.


  • string.c (rb_str_init): introduce String.new(capacity: size) [Feature #12024]

Also available in: Atom PDF