Project

General

Profile

Actions

Feature #12024

closed

Add String.buffer, for creating strings with large capacities

Added by jeremyevans0 (Jeremy Evans) about 8 years ago. Updated about 8 years ago.

Status:
Closed
Assignee:
-
Target version:
-
[ruby-core:73487]

Description

If you know you are going to need to create a large string,
it's better to create it with a large capacity. Otherwise,
ruby will need to continuously resize the string as it grows.
For example, if you will be producing a string that is
100000 bytes, String.buffer(100000) will avoid 10 separate
resizes compared to using String.new.

Performance-wise, String.new is 1.33x slower than
String.buffer(100000) if appending in 1000 byte chunks,
and 1.64x slower than String.buffer(1000) if appending
in 100 byte chunks.

To make sure this works correctly with String subclasses,
a static rb_str_buf_new_with_class function is added, which
both String.buffer and rb_str_buf_new now call.


Files


Related issues 1 (0 open1 closed)

Related to Ruby master - Feature #905: Add String.new(fixnum) to preallocate large bufferClosedmatz (Yukihiro Matsumoto)Actions

Updated by matz (Yukihiro Matsumoto) about 8 years ago

I like the idea.

Matz.

Actions #2

Updated by mame (Yusuke Endoh) about 8 years ago

  • Related to Feature #905: Add String.new(fixnum) to preallocate large buffer added

Updated by mame (Yusuke Endoh) about 8 years ago

I think this ticket is a duplicate of #905.

--
Yusuke Endoh

Updated by jeremyevans0 (Jeremy Evans) about 8 years ago

I wasn't aware of #905 when I created this, I probably should have searched first.

This feature is similar to #905, but the implementation is different. The implementation in #905 is an instance method that changes the capacity of an existing string. This is a class method that creates a new string with the given capacity. This does use the method name recommended by matz in #905, by accident, but I'm guessing that makes it a good name.

Updated by mame (Yusuke Endoh) about 8 years ago

I'm unsure if this pre-allocation is really effective since recent "realloc" implementations take just O(1). What operating system are you using?

In #905, pre-allocation was actually helpful in JRuby because JVM did not provide such a good "realloc" feature. But the speed up was not confirmed in Linux. So the issue was, "should MRI provide the feature for script compatibility, even if it is meaningless in Linux?"

However, even if the computational order is O(1), it may have a large constant term. I'm not against the proposal if a real-world quantitative evaluation shows use case where it is actually effective.

--
Yusuke Endoh

Updated by kosaki (Motohiro KOSAKI) about 8 years ago

On Tue, Jan 26, 2016 at 7:12 PM, wrote:

Issue #12024 has been updated by Jeremy Evans.

I wasn't aware of #905 when I created this, I probably should have searched first.

This feature is similar to #905, but the implementation is different. The implementation in #905 is an instance method that changes the capacity of an existing string. This is a class method that creates a new string with the given capacity. This does use the method name recommended by matz in #905, by accident, but I'm guessing that makes it a good name.

Class method is really bad idea because of thread unsafe.

bad scenario example:

Thread-A Thread-B

String.buffer(10000)
s1 = String.new
s2 = String.new
String.buffer(10)

In above scenario, s2 accidentally get 10000 capability. This is huge
memory waste.
And as mame said, we already talk about this feature doesn't improve
performance in #905.

-1.

Updated by naruse (Yui NARUSE) about 8 years ago

%  cat test.rb
# frozen_string_literal: true
require 'benchmark'
N = 50_000_000
Benchmark.bm 5 do |r|
  r.report "buffer" do
    i = 0
    buf = String.buffer(N)
    while i < N
      buf << 'a'
      i += 1
    end
  end
  r.report "normal" do
    i = 0
    buf = String.new
    while i < N
      buf << 'a'
      i += 1
    end
  end
end

On FreeBSD

%  ./ruby test.rb
            user     system      total        real
buffer  6.765625   0.039062   6.804688 (  6.802679)
normal  6.796875   0.062500   6.859375 (  6.864754)

Updated by naruse (Yui NARUSE) about 8 years ago

Motohiro KOSAKI wrote:

Class method is really bad idea because of thread unsafe.

It return a new string object.

Updated by naruse (Yui NARUSE) about 8 years ago

More effective example.

%  cat test.rb; ./ruby test.rb
# frozen_string_literal: true
require 'benchmark'
M = 100
N = 25
Benchmark.bm 5 do |r|
  r.report "buffer" do
    j = 0
    while j < M
      i = 0
      buf = String.buffer(2**N)
      buf << "a"
      while i < N
        buf << buf
        i += 1
      end
      j += 1
    end
  end
  r.report "normal" do
    j = 0
    while j < M
    i = 0
    buf = String.new
    buf << "a"
    while i < N
      buf << buf
      i += 1
    end
      j += 1
    end
  end
end
ruby 2.4.0dev (2016-01-27 trunk 53675) [x86_64-freebsd10.2]
            user     system      total        real
buffer  0.976562   1.070312   2.046875 (  2.048577)
normal  2.085938   2.867188   4.953125 (  4.954006)

Updated by darix (Marcus Rückert) about 8 years ago

Yui NARUSE wrote:

Motohiro KOSAKI wrote:

Class method is really bad idea because of thread unsafe.

It return a new string object.

why not something like "String.new(:buffersize => 10000)" then?

Updated by mame (Yusuke Endoh) about 8 years ago

Yui NARUSE wrote:

More effective example.

What an artificial example that pinpoints realloc overhead :-)

--
Yusuke Endoh

Updated by jeremyevans0 (Jeremy Evans) about 8 years ago

Here's the benchmark I wrote for this:

require 'benchmark/ips'

eval("def a; b = ' ' * 1000; String.new " + "<< b " * 100 + "end")
eval("def c; b = ' ' * 1000; String.buffer(100000) " + "<< b " * 100 + "end")

Benchmark.ips do |x|
  x.report("String.new - 100000/1000"){a}
  x.report("String.buffer - 100000/1000"){c}
  x.compare!
end

eval("def a; b = ' ' * 100; String.new " + "<< b " * 10 + "end")
eval("def c; b = ' ' * 100; String.buffer(1000) " + "<< b " * 10 + "end")

Benchmark.ips do |x|
  x.report("String.new - 1000/100"){a}
  x.report("String.buffer - 1000/100"){c}
  x.compare!
end

The problem with the benchmark in note 7 is the bottleneck is high ruby execution overhead, due to individual character appends. The problem with the benchmark in note 9 is that is attempts to specifically exploit realloc issues. This benchmark is more realistic, I think.

As to why String.buffer(10000) instead of String.new(:buffersize=>10000), see https://bugs.ruby-lang.org/issues/905#note-2

Updated by akr (Akira Tanaka) about 8 years ago

Jeremy Evans wrote:

As to why String.buffer(10000) instead of String.new(:buffersize=>10000), see https://bugs.ruby-lang.org/issues/905#note-2

I think String.new(:buffersize=>10000) is better because it can be combined with :encoding such as:
String.new(:encoding=>foo, :buffersize=>10000).
This way avoids multiplicative methods explosion: with/without encoding, with/without buffersize, ...
I can't imagine other attributes for String, though.

I guess matz doesn't apply the logic of https://bugs.ruby-lang.org/issues/905#note-2 here.
Keyword arguments may be considered different from type based dispatch.

Updated by mame (Yusuke Endoh) about 8 years ago

Thank you for your benchmark. I tried it and confirmed the speed up in Linux. Now I have no strong objection against your proposal.

realloc takes O(1) if the size is much larger than page size (4096B in Linux), but the benchmark starts from a much shorter string, which requires memcpy. I thought this constant term was not so big, but seems it is. Kosaki-san may have an opinion about this.

I have no strong objection, but still I have a question. I guess that creating a large string by concatenating many short strings is natural use case in Ruby. However, are there so much cases where we can predict the final length of a generated string before starting creating it?

I think the most typical application that can benefit from this feature is a template engine, but it is difficult to tell the length of a string being generated before it is actually generated.

--
Yusuke Endoh

Updated by jeremyevans0 (Jeremy Evans) about 8 years ago

Template engines would be the main use case for this. Note that you don't need to know the exact length of the string. If you know your average generated string is 8k, you can use a 8k buffer by default. If the string generated is only 2k, I'm guessing it still increases performance, and doesn't really cost you any memory in the long term as such strings are garbage collected quickly. If the page generated is 100k, starting with an 8k buffer will still increase performance as it saves you many reallocs.

Updated by duerst (Martin Dürst) about 8 years ago

On 2016/01/30 21:17, wrote:

I have no strong objection, but still I have a question. I guess that creating a large string by concatenating many short strings is natural use case in Ruby. However, are there so much cases where we can predict the final length of a generated string before starting creating it?

Usually, when doing something like

result = ""
loop do
  result << some_string
end
result

this starts to get slow quite quickly. The standard advice (which I have
myself used often) is to convert this to:

result = []
loop do
  result << some_string
end
result.join

This often speeds up things considerably.

I haven't completely analyzed the code for Array#join, but on first
impression, it seems to do exactly what you suggest: calculate the
overall length first, then do the actual copying.

Updated by matz (Yukihiro Matsumoto) about 8 years ago

The original proposal #905 was 7 years old. But now we have keyword argument, so adding capacity: keyword to String#new is better idea, I think.

Matz.

Actions #18

Updated by naruse (Yui NARUSE) about 8 years ago

  • Status changed from Open to Closed

Applied in changeset r53850.


  • string.c (rb_str_init): introduce String.new(capacity: size)
    [Feature #12024]
Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0