Feature #21853: Make Embedded TypedData a public API

Added by byroot (Jean Boussier) 26 days ago. Updated 24 days ago.

Status:
Open
Assignee:
-
Target version:
-
[ruby-core:124635]

Description

As part of Ruby 3.3, we added a private RUBY_TYPED_EMBEDDABLE flag to the TypedData API to allow TypedData to use variable width allocation.

Technically, we inadvertently exposed that flag in public headers, so third-party extensions can already make use of it. However, it's not considered public API because it's not documented, so relying on it would be a poor decision.

This API has both memory and speed benefits, as it avoids some malloc/free churn, reduces pointer chasing, etc.

For instance, when we converted Time to be embedded, it improved allocation performance by 30% and also reduced memory usage by 20%: https://github.com/ruby/ruby/commit/aa6642de630cfc10063154d84e45a7bff30e9103
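To make the malloc/free and pointer-chasing argument concrete, here is a minimal standalone sketch in plain C. It does not use the Ruby C API: `payload_t`, `separate_obj_t`, `embedded_obj_t` and all functions are hypothetical names for illustration, and the object itself is allocated with `calloc()` here, whereas a real VM would take it from its GC heap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical payload an extension might wrap (illustrative only). */
typedef struct { long sec; long nsec; } payload_t;

#define FLAG_EMBEDDED 1UL

/* Non-embedded model: the object holds a pointer to a separate payload. */
typedef struct { unsigned long flags; payload_t *data; } separate_obj_t;

/* Embedded model: the payload lives inside the object's own allocation. */
typedef struct { unsigned long flags; payload_t payload; } embedded_obj_t;

static int alloc_count = 0; /* heap allocations made by this sketch */

static void *counted_calloc(size_t size) {
    void *p = calloc(1, size);
    if (p == NULL) abort();
    alloc_count++;
    return p;
}

/* Two allocations: one for the object, one for the payload. */
separate_obj_t *separate_new(void) {
    separate_obj_t *obj = counted_calloc(sizeof *obj);
    obj->data = counted_calloc(sizeof *obj->data);
    return obj;
}

/* One allocation covers both the object header and the payload. */
embedded_obj_t *embedded_new(void) {
    embedded_obj_t *obj = counted_calloc(sizeof *obj);
    obj->flags = FLAG_EMBEDDED;
    return obj;
}

/* Needs a load of obj->data before the payload can be touched. */
payload_t *separate_payload(separate_obj_t *obj) { return obj->data; }

/* The payload address is computed from obj itself, with no extra load. */
payload_t *embedded_payload(embedded_obj_t *obj) {
    assert(obj->flags & FLAG_EMBEDDED);
    return &obj->payload;
}

void separate_free(separate_obj_t *obj) { free(obj->data); free(obj); }
void embedded_free(embedded_obj_t *obj) { free(obj); }
```

In this model, creating a non-embedded object costs two allocations and two frees, while the embedded one costs one of each, and reading the payload skips one pointer dereference.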

I believe numerous third-party native extensions could benefit from it (I would certainly make use of it in ruby/json).
Now that we have used it internally for several years, I'd like to work on making it a public API for Ruby 4.1.

Updated by Eregon (Benoit Daloze) 26 days ago Actions #1 [ruby-core:124636]

I'm thinking about this in the context of TruffleRuby, where RTypedData never moves (it's allocated via system calloc()).
I think the best option there would be to ignore this new flag entirely, so the public API should be designed in a way that lets it be implemented as if the data were not embedded.

Related: https://github.com/truffleruby/truffleruby/issues/4130
So on TruffleRuby I think we could always use the same allocation for the RTypedData + data struct, when using TypedData_Make_Struct(), effectively the same as embedded TypedData but never moving.
But not when using TypedData_Wrap_Struct() since that uses an existing data pointer.

Updated by byroot (Jean Boussier) 26 days ago Actions #2 [ruby-core:124637]

> So on TruffleRuby I think we could always use the same allocation for the RTypedData + data struct, when using TypedData_Make_Struct(), effectively the same as embedded TypedData but never moving.

I don't think so, because you still need to support DATA_PTR(obj) = ptr, which isn't allowed for embedded typed datas.

Updated by Eregon (Benoit Daloze) 24 days ago · Edited Actions #3 [ruby-core:124647]

Good point! How do embedded typed datas handle this, do they raise an exception in such a case?
Seems tricky given the DATA_PTR(obj) API returning a pointer.

I'd actually love it if we had a separate API for changing the data pointer, as a macro or function (e.g. RTYPEDDATA_SET_DATA(obj, new_data_pointer), to mirror RTYPEDDATA_GET_DATA), so we know better when it can be changed.
Currently, as a workaround in TruffleRuby, after every native call that accesses a T_DATA we have to check whether the data pointer has changed :/

Of course we wouldn't be able to remove DATA_PTR() yet, but we could maybe deprecate it and/or at some point make it return a const pointer or so to prevent writes.
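A rough sketch of what a dedicated setter could buy, written as a standalone C model rather than against the real headers. Everything here is hypothetical: `obj_t`, `obj_set_data()` and the `version` counter are assumptions made for illustration, not an existing or proposed Ruby API. The idea is that a single entry point can refuse writes on embedded objects and record that a change happened.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define OBJ_EMBEDDED 1UL

/* Simplified stand-in for a typed data object (not Ruby's RTypedData). */
typedef struct {
    unsigned long flags;   /* OBJ_EMBEDDED set => payload is stored inline */
    void *data;            /* only meaningful for non-embedded objects */
    unsigned long version; /* bumped on every successful pointer change */
} obj_t;

/*
 * Hypothetical setter: the only sanctioned way to replace the data pointer.
 * Returns false and leaves the object untouched if it is embedded.
 */
bool obj_set_data(obj_t *obj, void *new_data) {
    if (obj->flags & OBJ_EMBEDDED) {
        return false; /* a real API might raise an exception here instead */
    }
    obj->data = new_data;
    obj->version++;   /* lets the runtime detect changes cheaply */
    return true;
}

/* A runtime can compare versions instead of re-reading the pointer itself. */
bool obj_changed_since(const obj_t *obj, unsigned long seen_version) {
    return obj->version != seen_version;
}
```

With a function like this, an implementation only has to react when the setter is actually called, instead of re-checking the pointer after every native call.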

Updated by byroot (Jean Boussier) 24 days ago Actions #4 [ruby-core:124648]

> How do embedded typed datas handle this, do they raise an exception in such a case?

Unfortunately not. It ends up with data corruption.
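The corruption can be illustrated with a small standalone model. This is a sketch under one stated assumption: the inline payload begins at the location where the data pointer would otherwise be stored. `slot_t` and the `slot_*` functions are hypothetical names, not Ruby's actual structures or macros.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOT_EMBEDDED   1UL
#define INLINE_CAPACITY 16

/* Simplified stand-in for an object slot (assumed layout, see lead-in). */
typedef struct {
    unsigned long flags;
    union {
        void *data;                                  /* non-embedded: external payload */
        unsigned char inline_bytes[INLINE_CAPACITY]; /* embedded: payload starts here */
    } as;
} slot_t;

/* Set up an embedded slot and copy the payload into it. */
void slot_init_embedded(slot_t *s, const void *payload, size_t len) {
    assert(len <= INLINE_CAPACITY);
    memset(s, 0, sizeof *s);
    s->flags = SLOT_EMBEDDED;
    memcpy(s->as.inline_bytes, payload, len);
}

/* Reader: picks the payload location based on the layout flag. */
void *slot_get_data(slot_t *s) {
    return (s->flags & SLOT_EMBEDDED) ? (void *)s->as.inline_bytes : s->as.data;
}

/* Models a blind `DATA_PTR(obj) = ptr` style write: no layout check at all. */
void slot_raw_set_ptr(slot_t *s, void *ptr) {
    s->as.data = ptr;
}
```

After `slot_raw_set_ptr()` on an embedded slot, `slot_get_data()` still returns the inline location, but the first `sizeof(void *)` bytes of the payload now hold the raw pointer value. Nothing fails at the point of the write, which is why it shows up as silent corruption rather than an error.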

> I'd actually love if we had a separate API for changing the data pointer as a macro or function

Makes sense.
