Feature #21861

open

C API: expose `ruby_xfree_sized`, `ruby_xrealloc_sized`, etc

Added by byroot (Jean Boussier) 17 days ago. Updated 12 days ago.

Status:
Open
Assignee:
-
Target version:
-
[ruby-core:124677]

Description

Context

For a long time, Ruby has had internal APIs such as ruby_sized_xfree, ruby_sized_xrealloc and ruby_sized_xrealloc2.
These are identical to their non-sized counterparts, except that they accept the original size of the buffer being freed or reallocated.

This information is then fed to the garbage collector to improve heuristics as to when GC should run (malloc_increase_bytes and oldmalloc_increase_bytes).

When the old buffer size isn't provided, Ruby tries to obtain it via malloc_usable_size, which isn't available on all platforms and isn't quite correct either, as it generally over-reports. It also happens to be quite slow on some allocators, such as mallocng, the allocator used by musl libc and hence by Alpine Linux.

C23 free_sized

The C23 standard added the free_sized(void *ptr, size_t oldsize) API, which is now available in glibc 2.43. Other popular allocators such as jemalloc and tcmalloc have had it for a few years.

Original spec: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2699.htm

Interestingly, the standard doesn't include a realloc_sized API; I'm unsure why.

For most allocators, providing the size to free_sized allows them to skip some steps, which provides a speedup. I've read claims of around 30%, but I haven't benchmarked it myself.

One thing to note is that according to the standard, passing the wrong size is undefined behavior:

If ptr is the result obtained from a call to malloc(size), realloc(old_ptr, size), or calloc(nmemb, memb_size), where nmemb * memb_size is equal to size, this function behaves equivalently to free(ptr). Otherwise, the result is undefined.
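
As a rough illustration of that contract, here is a minimal sketch of a wrapper that uses free_sized when it exists and falls back to plain free otherwise. HAVE_FREE_SIZED is a hypothetical feature macro (the kind a configure check would define), and sketch_free_sized and sketch_roundtrip are names made up for this example; none of them exist in Ruby or in the standard.

```c
#include <stdlib.h>
#include <string.h>

/* HAVE_FREE_SIZED is a hypothetical macro: assume the build system defines
 * it only after verifying that the libc really provides free_sized(). */
static void
sketch_free_sized(void *ptr, size_t size)
{
#ifdef HAVE_FREE_SIZED
    free_sized(ptr, size); /* C23: size must equal the size originally requested */
#else
    (void)size;            /* no sized free available: drop the hint */
    free(ptr);
#endif
}

/* Copy `len` bytes into a fresh buffer, then release it with the exact
 * size that was passed to malloc. Returns 1 if the copy matched the
 * source, 0 if it did not, and -1 if the allocation failed. */
static int
sketch_roundtrip(const char *src, size_t len)
{
    char *copy = malloc(len);
    if (copy == NULL) return -1;
    memcpy(copy, src, len);
    int ok = (memcmp(copy, src, len) == 0);
    sketch_free_sized(copy, len); /* same `len` as the malloc call above */
    return ok;
}
```

The important part is that the size handed to the free path is the one given to malloc, not the usable size or a later length of the data.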

Recently I refactored most internal calls to xfree inside Ruby itself to use ruby_sized_xfree instead, and would like to have it use free_sized when available.

Also, when RUBY_DEBUG is defined, xmalloc records the originally allocated size and enforces that the size passed to ruby_sized_xfree matches it. This uncovered several small bugs, notably in string.c, where we'd sometimes end up with an incorrect capacity attribute after an encoding change.
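
To make the idea of such a debug check concrete, here is a small standalone sketch. It is not Ruby's actual implementation: dbg_header, dbg_xmalloc, dbg_recorded_size and dbg_xfree_sized are hypothetical names, and storing the size in a header in front of the buffer is just one assumed way to record it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header stored in front of every allocation. The union with max_align_t
 * keeps the pointer handed to the caller aligned for any type. */
typedef union {
    size_t size;
    max_align_t align;
} dbg_header;

/* Allocate `size` bytes and remember the requested size. */
static void *
dbg_xmalloc(size_t size)
{
    dbg_header *h = malloc(sizeof(dbg_header) + size);
    if (h == NULL) return NULL;
    h->size = size;
    return h + 1; /* caller sees the memory right after the header */
}

/* Size recorded when `ptr` was allocated with dbg_xmalloc. */
static size_t
dbg_recorded_size(const void *ptr)
{
    return ((const dbg_header *)ptr - 1)->size;
}

/* Free `ptr`, aborting if the caller's idea of the size is wrong. */
static void
dbg_xfree_sized(void *ptr, size_t oldsize)
{
    if (ptr == NULL) return;
    dbg_header *h = (dbg_header *)ptr - 1;
    assert(h->size == oldsize && "size passed to free does not match allocation");
    free(h);
}
```

With a check like this, a caller whose tracked capacity has drifted from the real allocation fails immediately at the free site instead of silently skewing statistics.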

Benefits

Aside from the performance gains on some allocators, and the gains from having more accurate GC statistics, I believe these sized interfaces help ensure correctness of the code.

With a few exceptions, most code dealing with a buffer must know its size to avoid going out of bounds, so providing it to ruby_xfree_sized makes it possible to assert in debug builds that the size was correctly tracked. I also believe ASAN or UBSAN are likely to enforce this soon (if they don't already), so it could help catch more bugs early.

Proposal

I would like to expose these sized APIs to C extensions, so that they can benefit from them as well:

  • void ruby_xfree_sized(void *ptr, size_t oldsize)
  • void *ruby_xrealloc_sized(void *ptr, size_t newsize, size_t oldsize)
  • void *ruby_xrealloc2_sized(void *ptr, size_t n, size_t size, size_t old_n)

I also think it would be useful to expose the corresponding macros:

  • RB_REALLOC_SIZED_N
  • RB_FREE_SIZED (shorthand to free a simple pointer: ruby_xfree_sized(ptr, sizeof(*ptr))).
  • RB_FREE_SIZED_N
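
For illustration, here is a sketch of how an extension that already tracks its buffer capacity might call the proposed functions. Since the API is only proposed, the two ruby_* functions below are local stand-ins with the signatures listed above, implemented on top of realloc and free so the example compiles without Ruby; ext_buffer and its helpers are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the proposed API (assumption: a real extension would get
 * these from Ruby's headers if the proposal is accepted). */
static void
ruby_xfree_sized(void *ptr, size_t oldsize)
{
    (void)oldsize;
    free(ptr);
}

static void *
ruby_xrealloc_sized(void *ptr, size_t newsize, size_t oldsize)
{
    (void)oldsize;
    return realloc(ptr, newsize);
}

/* A growable byte buffer: the capacity is already tracked, so passing it
 * to the sized calls costs nothing extra. */
typedef struct {
    char *ptr;
    size_t len;
    size_t capa;
} ext_buffer;

/* Append `n` bytes, doubling the capacity as needed. Returns 0 on success
 * and -1 if the stand-in allocator fails. */
static int
ext_buffer_append(ext_buffer *buf, const char *src, size_t n)
{
    if (buf->len + n > buf->capa) {
        size_t new_capa = buf->capa ? buf->capa * 2 : 16;
        while (new_capa < buf->len + n) new_capa *= 2;
        char *p = ruby_xrealloc_sized(buf->ptr, new_capa, buf->capa);
        if (p == NULL) return -1;
        buf->ptr = p;
        buf->capa = new_capa;
    }
    memcpy(buf->ptr + buf->len, src, n);
    buf->len += n;
    return 0;
}

/* Release the buffer, reporting the capacity that was allocated. */
static void
ext_buffer_free(ext_buffer *buf)
{
    ruby_xfree_sized(buf->ptr, buf->capa);
    buf->ptr = NULL;
    buf->len = buf->capa = 0;
}
```

Note that the size passed on free is the capacity (what was allocated), not the length (what is in use); mixing the two up is exactly the kind of mistake a debug build could catch.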

Related issues: 1 (0 open, 1 closed)

Related to Ruby - Bug #21868: Prism doesn't use the ruby allocator (Closed)

#1 Updated by byroot (Jean Boussier) 15 days ago

  • Related to Bug #21868: Prism doesn't use the ruby allocator added

#2 Updated by byroot (Jean Boussier) 12 days ago

  • Description updated