Feature #20878
openA new C API to create a String by adopting a pointer: `rb_enc_str_adopt(const char *ptr, long len, long capa, rb_encoding *enc)`
Description
Context¶
A common use case when writing C extensions is to generate text or bytes into a buffer, and to return it back
wrapped into a Ruby String. Examples are JSON.generate(obj) -> String
, and all other format serializers,
compression libraries such as ZLib.deflate
, etc, but also methods such as Time.strftime
,
Current Solution¶
Work in a buffer and copy the result¶
The most often used solution is to work with a native buffer and to manage a native allocated buffer,
and once the generation is done, call rb_str_new*
to copy the result inside memory managed by Ruby.
It works, but isn't very efficient because it cause an extra copy and an extra free()
.
On ruby/json
macro-benchmarks, this represent around 5% of the time spent in JSON.generate
.
static void fbuffer_free(FBuffer *fb)
{
if (fb->ptr && fb->type == FBUFFER_HEAP_ALLOCATED) {
ruby_xfree(fb->ptr);
}
}
static VALUE fbuffer_to_s(FBuffer *fb)
{
VALUE result = rb_utf8_str_new(FBUFFER_PTR(fb), FBUFFER_LEN(fb));
fbuffer_free(fb);
return result;
}
Work inside RString allocated memory¶
Another way this is currently done, is to allocate an RString
using rb_str_buf_new
,
and write into it with various functions such as rb_str_catf
,
or writing past RString.len
through RSTRING_PTR
and then resize it with rb_str_set_len
.
The downside with this approach is that it contains a lot of inefficiencies, as rb_str_set_len
will perform
numerous safety checks, compute coderange, and write the string terminator on every invocation.
Another major inneficiency is that this API make it hard to be in control of the buffer
growth, so it can result in a lot more realloc()
calls than manually managing the buffer.
This method is used by Kernel#sprintf
, Time#strftime
etc, and when I attempted to improve Time#strftime
performance, this problem showed up as the biggest bottleneck:
- https://github.com/ruby/ruby/pull/11547
- https://github.com/ruby/ruby/pull/11544
- https://github.com/ruby/ruby/pull/11542
Proposed API¶
I think a more effcient way to do this would be to work with a native buffer, and then build a RString
that "adopt" the memory region.
Technically, you can currently do this by reaching directly into RString
members, but I don't think it's clean,
and a dedicated API would be preferable:
/**
* Similar to rb_str_new(), but it adopts the pointer instead of copying.
*
* @param[in] ptr A memory region of `capa` bytes length. MUST have been allocated with `ruby_xmalloc`
* @param[in] len Length of the string, in bytes, not including the
* terminating NUL character, not including extra capacity.
* @param[in] capa The usable length of `ptr`, in bytes, including the
* terminating NUL character.
* @param[in] enc Encoding of `ptr`.
* @exception rb_eArgError `len` is negative.
* @return An instance of ::rb_cString, of `len` bytes length, `capa - 1` bytes capacity,
* and of `enc` encoding.
* @pre At least `capa` bytes of continuous memory region shall be
* accessible via `ptr`.
* @pre `ptr` MUST have been allocated with `ruby_xmalloc`.
* @pre `ptr` MUST not be manually freed after `rb_enc_str_adopt` has been called.
* @note `enc` can be a null pointer. It can also be seen as a routine
* identical to rb_usascii_str_new() then.
*/
rb_enc_str_adopt(const char *ptr, long len, long capa, rb_encoding *enc);
An alternative to the adopt
term, could be move
.
Files
Updated by Eregon (Benoit Daloze) about 1 month ago
LGTM, +1.
Maybe simply rb_str_adopt()
for the name?
That way it's closer to rb_str_new()
, and these days all String C API taking a C string should also take an encoding anyway so we don't need enc_
and enc-less variants.
Updated by byroot (Jean Boussier) about 1 month ago
Maybe simply rb_str_adopt() for the name?
I don't have a strong opinion here, I just went with the current convention.
On another note:
ptr
MUST have been allocated withruby_xmalloc
.
I'm actually not sure this really need to be a MUST, I suppose what is a MUST is that the pointer should be freeable
with ruby_xfree
, but that's it.
Updated by nobu (Nobuyoshi Nakada) about 1 month ago
I think it is unsafe for memory leak, in comparison with "RString allocated memory".
Updated by byroot (Jean Boussier) about 1 month ago
I think it is unsafe for memory leak, in comparison with "RString allocated memory".
I'm sorry I don't follow, could you expand on what you mean is unsafe? The entire "adopt" idea?
Updated by shyouhei (Shyouhei Urabe) about 1 month ago
byroot (Jean Boussier) wrote in #note-4:
I think it is unsafe for memory leak, in comparison with "RString allocated memory".
I'm sorry I don't follow, could you expand on what you mean is unsafe? The entire "adopt" idea?
There is no reason for us to believe that the const char *ptr
was allocated by malloc. It could be done by mmap or dlopen or anything. Ruby cannot garbage collect the string because it simply doesn't know how. Memory leak here is kind of inevitable.
Updated by nobu (Nobuyoshi Nakada) about 1 month ago
byroot (Jean Boussier) wrote in #note-4:
I think it is unsafe for memory leak, in comparison with "RString allocated memory".
I'm sorry I don't follow, could you expand on what you mean is unsafe? The entire "adopt" idea?
Whenever you allocate a new object, there is a risk of a memory error.
In that case, who will look after the pointer that is about to be "adopted"?
Updated by byroot (Jean Boussier) about 1 month ago
There is no reason for us to believe that the const char *ptr was allocated by malloc.
The proposed function documentation state that the pointer MUST have been allocated with ruby_xmalloc
.
henever you allocate a new object, there is a risk of a memory error. In that case, who will look after the pointer that is about to be "adopted"?
I see. From my understanding, the only possible error is OutOfMemory, what if rb_enc_str_adopt
would directly call ruby_xfree
on the pointer in such case? Would that cover your concern?
Updated by shyouhei (Shyouhei Urabe) about 1 month ago
byroot (Jean Boussier) wrote in #note-7:
There is no reason for us to believe that the const char *ptr was allocated by malloc.
The proposed function documentation state that the pointer MUST have been allocated with
ruby_xmalloc
.
If that's okay that's okay. For instance a return value of asprintf cannot be "adopt"ed then because obviously, that's not allocated by ruby_xmalloc.
Updated by rhenium (Kazuki Yamaguchi) about 1 month ago
byroot (Jean Boussier) wrote:
Work inside RString allocated memory¶
[...]
The downside with this approach is that it contains a lot of inefficiencies, asrb_str_set_len
will perform
numerous safety checks, compute coderange, and write the string terminator on every invocation.
I thought rb_str_set_len()
was supposed to be the efficient alternative to rb_str_resize()
meant for such a purpose.
I think an assert on the capacity or filling the terminator is cheap enough that it won't matter. That it computes coderange is news to me - I found it was since commit 6b66b5fdedb2c9a9ee48e290d57ca7f8d55e01a2 / [Bug #19902] in 2023. I think correcting coderange after directly modifying the RString-managed buffer is the caller's responsibility. Perhaps it could be reversed?
Updated by byroot (Jean Boussier) about 1 month ago
If that's okay that's okay. For instance a return value of asprintf cannot be "adopt"ed then because obviously, that's not allocated by ruby_xmalloc.
Yes, that's why I'm wondering if this requirement should be relaxed to "MUST be freeable by ruby_xfree
", which I believe would be true for asprintf
.
I think an assert on the capacity or filling the terminator is cheap enough that it won't matter.
It seemed to matter when I profiled. In some cases like strftime
the string is written byte by byte, so it basically double the cost of appending a byte.
Updated by kddnewton (Kevin Newton) about 1 month ago
I would use this in Prism as well. There are many cases where we allocate a string in the parser and then when we reify the Ruby AST we have to copy the string over. But the string content was allocated with ruby_xmalloc. So it would be nice to just hand over the string content without having to make a copy.
Personally I would prefer move as a naming convention, just because it mirrors what I would expect from std::move.
Updated by mdalessio (Mike Dalessio) about 1 month ago
This would likely be useful in Nokogiri as well. The two key places I have in mind are
- returning a large serialization string generated within libxml2 (which is configured to use
ruby_xmalloc
by default) - assembling an HTML5-compliant serialization within the extension (which currently uses
rb_enc_str_buf_cat
)
Updated by byroot (Jean Boussier) 28 days ago
Proposed implementation: https://github.com/ruby/ruby/pull/12143
Updated by nobu (Nobuyoshi Nakada) 27 days ago
Rather I want to propose an opposite:
char *rb_str_new_buffer(volatile VALUE *new_string, long size, rb_encoding *enc);
Updated by byroot (Jean Boussier) 27 days ago
How would that work? e.g. when you need to resize it?
Updated by nobu (Nobuyoshi Nakada) 9 days ago
byroot (Jean Boussier) wrote in #note-15:
How would that work? e.g. when you need to resize it?
VALUE string;
char *buffer = rb_str_new_buffer(&string, size, enc);
memcpy(buffer, somestring, length);
// ...
rb_str_modify_expand(string, 10); // expand 10 bytes
buffer = RSTRING_PTR(string); // re-get the pointer
Updated by byroot (Jean Boussier) 9 days ago
Right, so that's not really different from https://bugs.ruby-lang.org/issues/20878#Work-inside-RString-allocated-memory. IT's something that's already done, that new function would just be a shortcut for:
VALUE str = rb_str_buf_new(capa);
char *buffer = RSTRING_PTR(str);
Updated by nobu (Nobuyoshi Nakada) 8 days ago
Yes, and enc
.
Finally you want to allocate the String for a String-manageable pointer, why not allocate a managing String from the beginning?
Updated by byroot (Jean Boussier) 8 days ago · Edited
why not allocate a managing String from the beginning?
I explained it in the issue body. If you want to append one character to an RString, you need something like:
void
buf_append_c(VALUE buf, char c)
{
long capa = rb_str_capacity(buf);
if (RSTRING_LEN(buf) + 1 > capa) {
rb_str_modify_expand(buf, capa); // double capa
}
char *ptr;
long len;
RSTRING_GETMEM(buf, ptr, len);
ptr[len] = c;
// Lenght must be set right away in case GC
// triggers and tries to re-embed the buffer.
rb_str_set_len(buf, len + 1);
}
First that a lot more complicated than just working with a raw malloced buffer, you need some pretty good knowledge of Ruby inner workings not to make a mistake. For example, you could save some metadata like capacity
, but any time GC triggers, it's potentially no longer valid.
Second, all the rb_str_*
function will do a lot of costly sanity checking.
If I profile a simple script that's calling Time#strftime
, which is internally using the APIs you suggest:
time = Time.now
i = 10_000_000
while i > 0
i -= 1
time.strftime("%FT%T.%6N")
end
It looks like this: https://share.firefox.dev/3ZNdAfg
A ton of time is spent in:
-
rb_str_set_len
(9.8%
) -
rb_str_resize
(6.8%
) -
RB_FL_TEST_RAW
(to getRSTRING_PTR
etc) (5.9%
)
All together, that's more than the time spent doing the actual formatting work in BSD_vfprintf
, this seems like a major overhead to me.
If at least the API was easier to work with, I wouldn't mind so much, but in my opinion it's actually harder to work with.
Updated by nobu (Nobuyoshi Nakada) 7 days ago
byroot (Jean Boussier) wrote in #note-19:
why not allocate a managing String from the beginning?
I explained it in the issue body. If you want to append one character to an RString, you need something like:
It is same as rb_str_cat(buf, &c, 1)
.
First that a lot more complicated than just working with a raw malloced buffer, you need some pretty good knowledge of Ruby inner workings not to make a mistake. For example, you could save some metadata like
capacity
, but any time GC triggers, it's potentially no longer valid.
I can't get your point here.
Your proposal does need the knowledge more, I think.
A ton of time is spent in:
rb_str_set_len
(9.8%
)rb_str_resize
(6.8%
)RB_FL_TEST_RAW
(to getRSTRING_PTR
etc) (5.9%
)All together, that's more than the time spent doing the actual formatting work in
BSD_vfprintf
, this seems like a major overhead to me.
Recursive format in Time#strftime
may have a room for improvement.
Updated by byroot (Jean Boussier) 7 days ago
It is same as rb_str_cat(buf, &c, 1).
Yes and:
- You can't always use
rb_str_cat
, sometimes you have to pass a pointer to an existing API. -
rb_str_cat
does all the checks I mentioned and even more.
I can't get your point here.
I'm proposing a way to build strings that is both more convenient and more efficient.
The typical use case being ruby/json
fbuffer.h
and similar buffers in other gems such as msgpack
etc.
Updated by mame (Yusuke Endoh) 7 days ago
Discussed at the dev meeting.
Yes, that's why I'm wondering if this requirement should be relaxed to “MUST be freeable by ruby_xfree”, which I believe would be true for asprintf.
No, ruby_xfree
is not a simple delegator to system free, depending on environment and configuration.
However, in many environments (at the moment), it only delegates to system free, so it would be very hard to notice if it inadvertently depends on the implementation. This proposed API risks promoting such misuse.
So, it must be proved that this API is really unavoidable. We would like you to try to use String objects as buffers instead of memory pointer. If that brings performance problems, consider again.
Updated by byroot (Jean Boussier) 7 days ago
We would like you to try to use String objects as buffers instead of memory pointer.
Yes, I was planning to try to convert JSON's fbuffer
to use RString to show the impact. I'll update here once I have a working implementation.
Updated by byroot (Jean Boussier) 7 days ago
Done: https://github.com/byroot/json/pull/1
It's basically twice as slow on almost all benchmarks. My implementation is rather naive, I'm sure a bit of performance can be reclaimed by using various tricks, but it's risky as you have to be careful when GC trigger and kinda defeat the purpose of using a higher level API.
Updated by byroot (Jean Boussier) 7 days ago
No, ruby_xfree is not a simple delegator to system free, depending on environment and configuration.
I see, I didn't know about that compilation flag, thanks for letting me know that.
One solution I see (but that I'm not very fond of) is to pass the free
function that must be used to adopt
, e.g.
str = rb_enc_str_adopt(buf->ptr, buf->len, buf->capa, rb_utf8_encoding(), ruby_xfree);
// or
str = rb_enc_str_adopt(buf->ptr, buf->len, buf->capa, rb_utf8_encoding(), free);
This way rb_enc_str_adopt
can check if adopting the string is legal, and if it isn't deoptimize it into a copy?
Updated by rhenium (Kazuki Yamaguchi) 6 days ago
byroot (Jean Boussier) wrote in #note-19:
First that a lot more complicated than just working with a raw malloced buffer, you need some pretty good knowledge of Ruby inner workings not to make a mistake. For example, you could save some metadata like
capacity
, but any time GC triggers, it's potentially no longer valid.
I think I understood what you meant. From what I remember, compaction should exclude objects pinned by rb_gc_mark()
or referenced from the machine stack. In the Time#strftime
example, much fewer rb_str_set_len()
calls should be necessary since the String would be always on the stack.
Perhaps a more explicit way to prevent the re-embedding should be provided (for example by having the compaction code check the rb_str_locktmp()
status).
The proposed API seems tricky to me in that it requires the user to allocate a buffer including the terminator length. The NUL terminator is an implementation detail left out from the public API so far, and I believe it's something we've wanted to eventually get rid of for the SHARABLE_MIDDLE_SUBSTRING
optimization. I'm not sure if exposing it is a good idea.
Updated by byroot (Jean Boussier) 6 days ago
I think I understood what you meant. From what I remember, compaction should exclude objects pinned by rb_gc_mark() or referenced from the machine stack.
You are correct.
In the Time#strftime example, much fewer rb_str_set_len() calls should be necessary since the String would be always on the stack.
I don't think many of these could really be eluded, because you need to call it before every other call to a rb_str_
method, e.g:
// from strftime.c
rb_str_set_len(ftime, s-start); // must set the length before calling rb_str_append
rb_str_append(ftime, tmp);
RSTRING_GETMEM(ftime, s, len); // rb_str_append have changed the length and potentially the pointer.
The NUL terminator is an implementation detail left out from the public API so far [...] I'm not sure if exposing it is a good idea.
That's a very good point. I think it could be rephrased to say the NUL terminator is optional. Given the pointer is semantically adopter as soon as the function is called, it could do a realloc
is needed to add the terminator. Making it an implementation detail.
Updated by shyouhei (Shyouhei Urabe) 1 day ago
One thing pointed out in the last developer meeting was that future MMTK might want to break "asprintf return values can be reclaimable using ruby_xfree" assumption at process startup, by choosing different memory management schemes.