Bug #20585

Size of memory allocated by String.new(:capacity) is different from the specified value

Added by os (Shigeki OHARA) 5 months ago. Updated 5 months ago.

Status: Closed
Assignee: -
Target version: -
ruby -v: ruby 3.3.2 (2024-05-30 revision e5a195edf6) [x86_64-freebsd14.0]
[ruby-core:118345]

Description

IMHO, if :capacity is specified in String.new, capa should be exactly that value.

In fact, Ruby 3.2 seems to allocate the size as specified.

% cat string_capacity.rb
unless /\A3\.[23]\./ =~ RUBY_VERSION
  raise NotImplementedError, 'Not Supported Ruby Version'
end

require 'inline'

class String
  def super_inspect
    self.class.superclass.instance_method(:inspect).bind(self).call
  end
  inline do |builder|
    builder.include '<stdio.h>'
    builder.add_compile_flags '-Wall'
    builder.c_raw <<~CODE
      VALUE capacity(int argc, VALUE *argv, VALUE self) {
        struct RString *rstring = RSTRING(self);

        if (! (RBASIC(self)->flags & RSTRING_NOEMBED)) {
          return rb_to_symbol(rb_str_new_cstr("EMBED"));
        } else {
          if (RBASIC(self)->flags & ELTS_SHARED) {
            return rb_to_symbol(rb_str_new_cstr("SHARED"));
          } else {
            return LONG2NUM(rstring->as.heap.aux.capa);
          }
        }
        return Qnil; /* NOTREACHED */
      }
    CODE
  end
end
% irb -I. -rstring_capacity
irb(main):001:0> [RUBY_PLATFORM, RUBY_VERSION]
=> ["x86_64-freebsd14.0", "3.2.4"]
irb(main):002:0> String.new('', capacity: 1024).capacity
=> 1024
irb(main):003:0> String.new('*'*1024, capacity: 1024).capacity
=> 1024
irb(main):004:0>

This is what I expect.

However, Ruby 3.3 seems to behave differently.

% irb -I. -rstring_capacity
irb(main):001> [RUBY_PLATFORM, RUBY_VERSION]
=> ["x86_64-freebsd14.0", "3.3.2"]
irb(main):002> String.new('', capacity: 1024).capacity
=> 1023
irb(main):003> String.new('*'*1024, capacity: 1024).capacity
=> 2047
irb(main):004>
  • If only :capacity is specified, one byte less is allocated.
  • If the initial string and its bytesize are specified, about twice the size is allocated.

Is this intentional?

Updated by byroot (Jean Boussier) 5 months ago

Most of this comes from: https://github.com/ruby/ruby/pull/8825

Long story short, capacity is a bit confusing: Ruby strings are null-terminated, so at least one extra byte is always needed. It's therefore debatable whether the terminating byte should count toward the capacity.

I see the argument: the goal of String.new(capacity:) is to avoid reallocation, so if you precomputed the final string size, subtracting the terminator from the requested capacity defeats the purpose. The other side of the coin is that if you pass a size like 4096 hoping to fit a specific allocation size in memory, the extra terminator byte makes it not behave as you'd hoped.
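For anyone who wants to look at this without compiling a C extension, here is a rough sketch using ObjectSpace.memsize_of from the objspace standard library. It is only an approximation: the reported figure is the object's total memory footprint as CRuby estimates it, not the capa field itself, and the exact numbers depend on the Ruby version and platform. The helper name string_memsize is made up for this example.

```ruby
require 'objspace'

# Hypothetical helper: total reported memory for a freshly built String.
# Note this is NOT the raw capa field, only CRuby's size estimate for the object.
def string_memsize(initial, capacity)
  ObjectSpace.memsize_of(String.new(initial, capacity: capacity))
end

sizes = {
  'empty, capacity: 1024'      => string_memsize('', 1024),
  '1024 bytes, capacity: 1024' => string_memsize('*' * 1024, 1024),
  'empty, capacity: 4096'      => string_memsize('', 4096),
}

# Print the figures so they can be compared across Ruby versions.
sizes.each { |label, bytes| puts "#{label}: #{bytes} bytes" }
```

Comparing the first two lines between Ruby 3.2 and 3.3 should make a doubled buffer visible, since the second figure would be roughly twice the first.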

> If the initial string and its bytesize are specified, about twice the size is allocated.

I need to dig more to answer this one.

Updated by byroot (Jean Boussier) 5 months ago

  • Backport changed from 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN to 3.1: DONTNEED, 3.2: DONTNEED, 3.3: REQUIRED

> If the initial string and its bytesize are specified, about twice the size is allocated.

Alright, this was just fallout from the other change. The buffer was one byte too small for the initial string, so it had to grow when that string was copied in, which doubled its capacity.

I opened: https://github.com/ruby/ruby/pull/11018
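
The doubling can be reproduced with a small pure-Ruby model. This is a sketch, not a transcription of string.c: the growth rule `2 * capa + termlen` and the method name modeled_capacity are assumptions, chosen because they match the 1023 and 2047 values reported above.

```ruby
TERMLEN = 1 # assumed terminator length in bytes

# Hypothetical model of the resulting capa for String.new(initial, capacity:).
# subtract_termlen: true models the Ruby 3.3.2 behavior reported in this issue.
def modeled_capacity(requested, initial_bytesize, subtract_termlen:)
  capa = subtract_termlen ? [requested - TERMLEN, 0].max : requested
  # Assumed growth rule: keep doubling until the initial contents fit.
  capa = 2 * capa + TERMLEN while initial_bytesize > capa
  capa
end

puts modeled_capacity(1024, 0,    subtract_termlen: true)  # capacity only
puts modeled_capacity(1024, 1024, subtract_termlen: true)  # initial string fills it
puts modeled_capacity(1024, 1024, subtract_termlen: false) # without the subtraction
```

With the subtraction, a 1024-byte initial string no longer fits in the 1023-byte buffer, so one growth step is triggered; without it, the string fits and no growth happens.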

Updated by byroot (Jean Boussier) 5 months ago

  • Status changed from Open to Closed

Applied in changeset git|83f57ca3d225ce06abbc5eef6aec37de4fa36d58.


String.new(capacity:) don't substract termlen

[Bug #20585]

This was changed in 36a06efdd9f0604093dccbaf96d4e2cb17874dc8 because
String.new(1024) would end up allocating 1025 bytes, but the problem
with this change is that the caller may be trying to right size a String.

So instead, we should just better document the behavior of capacity:.
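
To illustrate what "right sizing" means for a caller, here is a minimal sketch: the final byte size is computed up front and passed as capacity:, with the intent that the appends never have to grow the buffer. The sample data is made up, and whether a reallocation is actually avoided depends on the Ruby version, as discussed in this issue.

```ruby
# Illustrative data only; any list of strings would do.
parts = ['HTTP/1.1 200 OK', "\r\n", 'Content-Length: 0', "\r\n\r\n"]

# Precompute the exact number of bytes the final string will hold.
total = parts.sum(&:bytesize)

# Ask for exactly that much room, then fill the buffer.
buffer = String.new(capacity: total)
parts.each { |part| buffer << part }

puts "precomputed: #{total}, actual: #{buffer.bytesize}"
```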

Updated by Dan0042 (Daniel DeLorme) 5 months ago

What about allocating capacity+1 unless capacity is a power of two?
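
A sketch of how that suggestion might look, purely for discussion. The method names and the TERMINATOR_BYTES constant are hypothetical; nothing like this exists in Ruby.

```ruby
TERMINATOR_BYTES = 1 # assumed terminator size in bytes

# True when n is a positive power of two (exactly one bit set).
def power_of_two?(n)
  n > 0 && (n & (n - 1)).zero?
end

# Hypothetical: bytes to allocate for a requested capacity under the proposal.
def proposed_allocation(capacity)
  power_of_two?(capacity) ? capacity : capacity + TERMINATOR_BYTES
end

[1000, 1024, 4096, 5000].each do |capacity|
  bytes = proposed_allocation(capacity)
  puts "capacity: #{capacity} -> allocate #{bytes}, usable #{bytes - TERMINATOR_BYTES}"
end
```

The trade-off is visible in the output: power-of-two requests stay within their allocation size but get one usable byte fewer than requested, while all other requests get the full capacity.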

Updated by k0kubun (Takashi Kokubun) 5 months ago

  • Backport changed from 3.1: DONTNEED, 3.2: DONTNEED, 3.3: REQUIRED to 3.1: DONTNEED, 3.2: DONTNEED, 3.3: DONE