Feature #15923

New independent string without memcpy

Added by puchuu (Andrew Aladjev) 4 days ago. Updated 2 days ago.

Hello. I've just tried to implement an extension for Ruby that provides large binary strings.

I've inspected the latest Ruby source code and found two functions: rb_str_new and rb_str_new_static.

  • rb_str_new allocates new memory and uses memcpy to copy the source string into it.
  • rb_str_new_static uses the existing source string as is, but sets the STR_NOFREE flag.
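
To make the difference between the two behaviours concrete, here is a minimal, self-contained sketch. It is a toy model only: toy_str, toy_str_new, toy_str_new_static and toy_str_free are hypothetical names invented for this illustration, and nothing here is Ruby's actual implementation. It only mirrors the behaviour described in the two bullets above (copy into fresh memory versus borrow the caller's buffer and mark it as not to be freed).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: these names are hypothetical, not Ruby internals. */
typedef struct {
    char  *ptr;
    size_t len;
    int    nofree;  /* 1: buffer is borrowed and must never be freed here */
} toy_str;

/* Modelled on the described rb_str_new: allocate and memcpy. */
static toy_str toy_str_new(const char *src, size_t len)
{
    toy_str s;
    s.ptr = malloc(len + 1);
    assert(s.ptr != NULL);
    memcpy(s.ptr, src, len);
    s.ptr[len] = '\0';
    s.len = len;
    s.nofree = 0;
    return s;
}

/* Modelled on the described rb_str_new_static: no copy, buffer is borrowed. */
static toy_str toy_str_new_static(const char *src, size_t len)
{
    toy_str s;
    s.ptr = (char *)src;
    s.len = len;
    s.nofree = 1;
    return s;
}

/* Releases the string; returns 1 if the buffer was freed, 0 if left alone. */
static int toy_str_free(toy_str *s)
{
    int freed = 0;
    if (!s->nofree) {
        free(s->ptr);
        freed = 1;
    }
    s->ptr = NULL;
    s->len = 0;
    return freed;
}
```

In this model the copying constructor owns its buffer and frees it, while the static constructor never frees anything. The question in this ticket is whether a third combination (no copy, but freed automatically) exists.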

Is it possible to create an independent string from a source string without memcpy, one that will be freed automatically? Thank you.


Updated by shyouhei (Shyouhei Urabe) 4 days ago

puchuu (Andrew Aladjev) wrote:

Is it possible to create an independent string from a source string without memcpy, one that will be freed automatically?

In C there are several ways to free a memory region, depending on how that region was allocated.
"Every string must be able to be freed using free()" is simply a wrong assertion.

So no, there is no way for Ruby to automatically free memory allocated by others.
C is not made that way.
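
As an illustration of the point above (not code from this thread), the following sketch shows three buffers that each need a different release action: one from malloc, one carved out of a static arena, and one in static storage. All names (owned_buf, arena_alloc, heap_release and so on) are hypothetical. A bare char * carries none of this information, which is why the receiver cannot pick the right deallocator on its own.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names, for illustration only. */
typedef void (*release_fn)(void *);

typedef struct {
    void      *ptr;
    release_fn release;  /* NULL: nothing to release (e.g. static storage) */
} owned_buf;

static char   arena[256];
static size_t arena_used = 0;
static int    heap_releases = 0;

/* Bump allocator: hands out slices of one static block (no alignment
 * handling). Passing such a slice to free() would be undefined behaviour. */
static void *arena_alloc(size_t n)
{
    void *p;
    assert(n <= sizeof(arena) - arena_used);
    p = arena + arena_used;
    arena_used += n;
    return p;
}

/* Arena slices are reclaimed with the whole arena, so this does nothing. */
static void arena_release(void *p) { (void)p; }

/* Only memory obtained from malloc may be handed to free(). */
static void heap_release(void *p)
{
    free(p);
    heap_releases++;
}

/* The owner can release correctly only because it was told how to. */
static void owned_buf_release(owned_buf *b)
{
    if (b->release != NULL) b->release(b->ptr);
    b->ptr = NULL;
}
```

The sketch works only because each buffer travels together with its own release function. A pointer handed over without that information gives the receiver no safe default.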

Updated by luke-gru (Luke Gruber) 3 days ago

I think what puchuu is asking is whether he can pass a malloc'd string to a Ruby function that creates a new string object which frees the underlying buffer when the string object is destroyed. Having read the code, I didn't come upon such a function, but I imagine it's possible with a slight hack (untested by me, however):

VALUE str = rb_str_new_static(buffer, buflen); /* no malloc or memcpy done here, just ownership change of buffer */
FL_UNSET(str, STR_NOFREE); /* STR_NOFREE isn't actually defined in internal.h unfortunately, it's currently the same as FL_USER18, but could change. */

Perhaps a new ruby string creation function would be useful? Something like rb_str_new_take(). Just a thought.

Of course the allocator used to allocate the buffer would have to be the same as Ruby's allocator or bad things will happen...

Updated by nobu (Nobuyoshi Nakada) 3 days ago

ruby_xfree != free.
Using the former on a malloc'ed buffer can cause a crash.

Updated by luke-gru (Luke Gruber) 2 days ago

Thank you, Nobu. I thought that might be the case but wasn't sure, as I'm not familiar with the GC subsystem. I also think shyouhei was saying the same thing; I was just too dense to understand the specifics of what he was saying :)

Having taken a cursory look, it seems Ruby adds some bookkeeping information at the start of every memory buffer allocated by ruby_xmalloc and family. It returns a pointer to the memory just after this bookkeeping information (a region of the size actually asked for). When this buffer is given to ruby_xfree, Ruby calculates the real starting point by moving backwards by one bookkeeping structure, then passes that address to free.
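
The scheme described above can be sketched with a toy allocator. This is a minimal model under stated assumptions: toy_header, toy_xmalloc, toy_xsize, toy_xfree and toy_raw_block are hypothetical names, and the header layout is invented for the example. Ruby's real bookkeeping structure may look quite different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping header; Ruby's real layout may differ.
 * The extra union members only keep the user area suitably aligned. */
typedef union {
    size_t      size;  /* number of bytes the caller asked for */
    long double ld;
    long long   ll;
    void       *p;
} toy_header;

/* Allocate header + payload, return a pointer just past the header. */
static void *toy_xmalloc(size_t size)
{
    toy_header *h = malloc(sizeof(toy_header) + size);
    assert(h != NULL);
    h->size = size;
    return h + 1;
}

/* Read the recorded size by stepping back one header. */
static size_t toy_xsize(const void *ptr)
{
    return ((const toy_header *)ptr - 1)->size;
}

/* The address that malloc really returned for this buffer. */
static void *toy_raw_block(void *ptr)
{
    return (toy_header *)ptr - 1;
}

/* Step back one header and free the real block. */
static void toy_xfree(void *ptr)
{
    if (ptr == NULL) return;
    free(toy_raw_block(ptr));
}
```

If a pointer obtained from plain malloc were passed to toy_xfree, the function would step back to an address that malloc never returned and hand that to free, which is undefined behaviour. In this model, that mismatch is the kind of crash the "ruby_xfree != free" remark warns about.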

So you would have to allocate using ruby_xmalloc and friends anyway, in which case a function like rb_str_new_take seems useless.
