Feature #15923
closed
New independent string without memcpy
Added by puchuu (Andrew Aladjev) almost 5 years ago.
Updated over 4 years ago.
Description
Hello. I've just tried to implement an extension for Ruby that will provide large binary strings.
I've inspected the latest Ruby source code and found 2 functions: rb_str_new and rb_str_new_static.
- rb_str_new allocates new memory and uses memcpy to copy from source string to new memory.
- rb_str_new_static uses existing source string as it is, but adds STR_NOFREE flag.
Is it possible to create independent string from source string without memcpy that will be freed automatically? Thank you.
puchuu (Andrew Aladjev) wrote:
Is it possible to create independent string from source string without memcpy that will be freed automatically?
In C there are several ways to free a memory region, depending on how that string was allocated.
"Every string must be able to be freed using free()" is simply a wrong assertion.
So no, there is no way for ruby to automatically free memory allocated by others.
C is not made that way.
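The point about deallocators can be sketched in plain C. This is only an illustration and not part of Ruby's API: the `owned_buf` type and every function below are hypothetical names. It shows that a buffer can only be released safely when the matching release function travels with it, which is exactly the information a bare `char *` does not carry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: a buffer paired with the function that must release it. */
typedef struct {
    char  *ptr;
    size_t len;
    void (*release)(void *); /* NULL means "never free" (e.g. static storage) */
} owned_buf;

/* Counts calls so the example can show which release path ran. */
static int custom_release_calls = 0;

/* Stand-in for a library-specific deallocator (not a real API). */
static void custom_release(void *p)
{
    custom_release_calls++;
    free(p);
}

/* Buffer obtained from malloc: must be released with free. */
static owned_buf owned_buf_from_malloc(const char *src, size_t len)
{
    owned_buf b;
    b.ptr = malloc(len ? len : 1);
    b.len = b.ptr ? len : 0;
    b.release = free;
    if (b.ptr != NULL) memcpy(b.ptr, src, len);
    return b;
}

/* Buffer in static storage: must never be passed to any free function. */
static owned_buf owned_buf_from_static(const char *literal)
{
    owned_buf b;
    b.ptr = (char *)literal;
    b.len = strlen(literal);
    b.release = NULL;
    return b;
}

/* Buffer from a (simulated) library allocator with its own release call. */
static owned_buf owned_buf_from_custom(size_t len)
{
    owned_buf b;
    b.ptr = malloc(len ? len : 1);
    b.len = b.ptr ? len : 0;
    b.release = custom_release;
    return b;
}

/* Releases the buffer with whatever function its creator recorded. */
static void owned_buf_destroy(owned_buf *b)
{
    if (b->release != NULL) b->release(b->ptr);
    b->ptr = NULL;
    b->len = 0;
    b->release = NULL;
}
```

Calling `owned_buf_destroy` on each of the three kinds of buffer takes a different path, and none of them could be guessed from the pointer alone.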
I think what puchuu is asking is if he can pass a malloc'd string to a ruby function that will create a new string object that frees the given underlying buffer when the string object is destructed. Having read the code, I didn't come upon such a case but I imagine it's possible with a slight hack (untested by me, however):
VALUE str = rb_str_new_static(buffer, buflen); /* no malloc or memcpy done here, just ownership change of buffer */
FL_UNSET(str, STR_NOFREE); /* STR_NOFREE isn't actually defined in internal.h unfortunately, it's currently same as FL_USER18, but could change. */
Perhaps a new ruby string creation function would be useful? Something like rb_str_new_take(). Just a thought.
Of course the allocator used to allocate the buffer would have to be the same as Ruby's allocator or bad things will happen...
ruby_xfree != free.
Using the former on malloc'ed buffer can cause a crash.
Thank you Nobu, I thought that might be the case but was unaware as I'm not familiar with the GC subsystem. Also I think shyouhei was saying the same thing, I was just too dense to understand the specifics of what he was saying :)
Having taken a cursory look, it seems ruby is adding some bookkeeping information at the start of every memory buffer allocated by ruby_xmalloc and family. It returns the memory after this bookkeeping information (the actual buffer size asked for), and when this buffer is given to ruby_xfree, ruby calculates the actual starting point by moving backwards 1 bookkeeping structure, then passes this to free.
So, you would have to allocate using ruby_xmalloc and friends anyway, in which case it seems useless to provide such a function like rb_str_new_take.
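The bookkeeping scheme described above can be sketched in plain C. This is not Ruby's actual code: `sketch_header`, `sketch_xmalloc`, `sketch_xfree`, and the other names are hypothetical, and whether and how a real Ruby build stores such a header depends on its version and build configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping record stored in front of each user buffer.
 * The extra union members only pad the header so that the address
 * right after it stays aligned for common types. */
typedef union {
    size_t      size; /* number of bytes the caller asked for */
    long double pad_ld;
    long long   pad_ll;
    void       *pad_ptr;
} sketch_header;

/* Running total of bytes handed out, as an example of bookkeeping. */
static size_t sketch_live_bytes = 0;

/* Allocates header + size bytes and returns the address after the header. */
static void *sketch_xmalloc(size_t size)
{
    sketch_header *h;
    if (size > SIZE_MAX - sizeof(*h)) return NULL;
    h = malloc(sizeof(*h) + size);
    if (h == NULL) return NULL;
    h->size = size;
    sketch_live_bytes += size;
    return h + 1;
}

/* Reads the size recorded in the header in front of ptr. */
static size_t sketch_usable_size(const void *ptr)
{
    return ((const sketch_header *)ptr - 1)->size;
}

/* Steps back one header to find the real start, then frees that. */
static void sketch_xfree(void *ptr)
{
    sketch_header *h;
    if (ptr == NULL) return;
    h = (sketch_header *)ptr - 1;
    sketch_live_bytes -= h->size;
    free(h);
}
```

Passing a pointer that came straight from malloc to `sketch_xfree` would make it step back into memory that was never part of the allocation, which is undefined behavior. That is the reason a buffer from another allocator cannot simply be handed over.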
Instead of working on a separate buffer then asking Ruby to take ownership, you could make changes to the buffer of a string:
VALUE new_string = rb_str_new("", 0);
rb_str_resize(new_string, size_you_want);
do_work(RSTRING_PTR(new_string), RSTRING_LEN(new_string));
Would this be good enough?
It should be OK when passing the buffer from callers, but doesn't work with a library which returns a buffer allocated inside.
FYI: you can allocate the buffer by rb_str_new(NULL, size_you_want) at once.
nobu (Nobuyoshi Nakada) wrote:
It should be OK when passing the buffer from callers, but doesn't work with a library which returns a buffer allocated inside.
FYI: you can allocate the buffer by rb_str_new(NULL, size_you_want) at once.
Thanks all, I see. Ruby has some kind of internal memory allocation mechanism and it is not recommended to use strings allocated outside.
Integrating rb_str_resize into the buffer growth mechanism is a good but complex solution. I will keep the string copy.
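For readers wondering what such an integration might look like, here is a minimal plain C sketch of a producer that grows its output through a caller-supplied callback. All names (`grow_fn`, `plain_owner`, `plain_grow`, `produce`) are hypothetical. The callback here uses realloc so the example runs on its own; in a Ruby extension the callback would presumably call rb_str_resize on the string and return RSTRING_PTR, which is an assumption and untested.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback: make the owner's buffer at least new_cap bytes
 * and return its (possibly moved) address, or NULL on failure. */
typedef char *(*grow_fn)(void *owner, size_t new_cap);

/* Stand-in owner backed by realloc. */
typedef struct {
    char  *data;
    size_t cap;
} plain_owner;

static char *plain_grow(void *owner, size_t new_cap)
{
    plain_owner *o = owner;
    char *p = realloc(o->data, new_cap);
    if (p == NULL) return NULL; /* old buffer stays valid in o->data */
    o->data = p;
    o->cap = new_cap;
    return p;
}

/* Stand-in producer: writes `total` bytes of 'x' directly into the
 * owner's buffer, doubling the capacity through the callback when full.
 * Returns the number of bytes actually written. */
static size_t produce(void *owner, grow_fn grow, size_t total)
{
    size_t cap = 16, len = 0;
    char *buf = grow(owner, cap);
    if (buf == NULL) return 0;
    while (len < total) {
        if (len == cap) {
            cap *= 2;
            buf = grow(owner, cap); /* the address may change here */
            if (buf == NULL) return len;
        }
        buf[len++] = 'x';
    }
    return len;
}
```

The producer never allocates on its own, so the final bytes already live in the owner's buffer and no copy is needed at the end. The cost is that every place where the producer grows its output has to go through the callback, which is the complexity mentioned above.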
- Status changed from Open to Rejected
I didn't read all the comments, but it seems solved.
Please reopen it if it is my mistake.