Feature #15408: Deprecate ObjectSpace._id2ref - Ruby - Ruby Issue Tracking System

Actions

Copy link

Feature #15408

closed

Deprecate ObjectSpace._id2ref

Added by headius (Charles Nutter) over 6 years ago. Updated 3 months ago.

Status:

Closed

Assignee:

headius (Charles Nutter)

Target version:

[ruby-core:90464]

Description

Ruby currently provides the object_id method to get a "identifier" for a given object. According to the documentation, this ID is the same for every object_id call against a given object, and guaranteed not to be the same as any other active (i.e. alive) object. However, no guarantee is made about the ID being reused for a future object after the original has been garbage collected.

As a result, object_id can't be used to uniquely identify any object that might be garbage collected, since that ID may be associated with a completely different object in the future.

Ruby also provides a method to go from an object_id to the object reference itself: ObjectSpace._id2ref. This method has been in Ruby for decades and is often used to implement a weak hashmap from ID to reference, since holding the ID will not keep the object alive. However due to the problems with object_id not actually being unique, it's possible for _id2ref to return a different object than originally had that ID as object slots are reused in the heap.

The only way to implement object_id safely (with idempotency guarantees) would be to assign to all objects a monotonically-increasing ID. Alternatively, this ID could be assigned lazily only for those objects on which the code calls object_id. JRuby implements object_id in this way currently.

The only way to implement _id2ref safely would be to have a mapping in memory from those monotonically-increasing IDs to the actual objects. This would have to be a weak mapping to prevent the objects from being garbage collected. JRuby currently only supports _id2ref via a flag, since the additional overhead of weakly tracking every requested object_id is extremely high. An alternative for MRI would be to implement _id2ref as a heap scan, as it is implemented in Rubinius. This would make it entirely unpractical due to the cost of scanning the heap for every ID lookup.

I propose that both methods should immediately be deprecated for removal in Ruby 3.0.

They do not do what people expect.
They cannot reliably do what they claim to do.
They eventually lead to difficult-to-diagnose bugs in every possible use case.

Put simply, both methods have always been broken in MRI and making them unbroken would render them useless.

Actions

Copy link

#1 [ruby-core:90465]

Updated by headius (Charles Nutter) over 6 years ago

I should point out that even the monotonically-increasing ID will eventually break once enough objects have been created to roll over a 64-bit integer limit unless object_id can be a Bignum, which would mean holding references Bignums for every ID from that point on.

Actions

Copy link

#2 [ruby-core:90469]

Updated by shyouhei (Shyouhei Urabe) over 6 years ago

100% agree. These methods show too much internals (bare C pointer values). Maybe I can compromise on them moving into ext/objspace, but at least they should be hidden from the core API.

Actions

Copy link

#3 [ruby-core:90472]

Updated by ko1 (Koichi Sasada) over 6 years ago

I use object_id for debugging to compare identities.
comparison methods can be enough.

Actions

Copy link

#4 [ruby-core:90473]

Updated by headius (Charles Nutter) over 6 years ago

I use object_id for debugging to compare identities.
comparison methods can be enough.

I believe you are saying that using equal? is an acceptable alternative to comparing object_id's, yes?

Actions

Copy link

#5 [ruby-core:90474]

Updated by headius (Charles Nutter) over 6 years ago

These methods show too much internals (bare C pointer values).

It is possible to implement them as lazy monotonically increasing integers, but you either have to accept that they will overflow into bignum or wrap around at some point. JRuby has opted for wrap-around currently, so even our impl is not truly idempotent.

...they should be hidden from the core API.

I'd love to see a hard break sooner than later, but I think deprecation will quickly get the Ruby community to clean up uses.

It would also be a good idea for WeakMap to become a standard, public class rather than one hidden inside ObjectSpace, since that's typically the use case for _id2ref. Libraries can assign their own keys for objects in whatever way they choose (e.g. monotonically increasing integer).

Actions

Copy link

#6 [ruby-core:90475]

Updated by headius (Charles Nutter) over 6 years ago

Oh, I'd also rather not even see these features moved into ext/objspace since that just means people will start adding require 'objspace' so they can keep using the features. They should always warn, and ideally just disappear altogether.

Or as in JRuby, you have to enable them at the command line, with no alternative (to make it an explicit opt-in that no library can override).

Actions

Copy link

#7 [ruby-core:90476]

Updated by headius (Charles Nutter) over 6 years ago

There are currently 64 references in ext+lib to object_id, __id__, or _id2ref.

[] ~/projects/ruby $ git grep object_id lib | wc -l
      40

[] ~/projects/ruby $ git grep __id__ lib | wc -l
       9

[] ~/projects/ruby $ git grep object_id ext | wc -l
      14

[] ~/projects/ruby $ git grep __id__ ext | wc -l
       0

[] ~/projects/ruby $ git grep _id2ref lib | wc -l
       1

[] ~/projects/ruby $ git grep _id2ref ext | wc -l
       0

Actions

Copy link

#8 [ruby-core:90477]

Updated by ahorek (Pavel Rosický) over 6 years ago

https://github.com/rspec/rspec-expectations/blob/848fcb426cca1977b75909d0cb8a10f03a104b36/lib/rspec/matchers/built_in/equal.rb#L76
object_id is used even in rspec for debugging.
I also find it useful to estimate how many objects were created, but it may be wrong usage. Equal? can be enough for most cases.

_id2ref should be hidden.

Actions

Copy link

#9 [ruby-core:90478]

Updated by headius (Charles Nutter) over 6 years ago

My proposal would make WeakMap a standard Ruby feature, so you can implement your own ID system and weakly track objects in the right way. Maybe we need IDMap that returns a guaranteed-unique generated key when you insert an object.

https://github.com/headius/weakling/blob/master/lib/weakling/collections.rb

Actions

Copy link

#10 [ruby-core:90479]

Updated by headius (Charles Nutter) over 6 years ago

object_id is used even in rspec for debugging.

All objects need to have a base hashcode; that's what we should be logging instead. In Java, this is accessible using System.identityHashCode(obj). JRuby uses this for the base object hashcode (i.e. we do not trigger object_id to be created just for inspect).

I also find it useful to estimate how many objects were created, but it may be wrong usage. Equal? can be enough for most cases.

I'm not sure I understand this. It wouldn't be accurate to just track seen object_id's anyway since they'll get reused.

Actions

Copy link

#11 [ruby-core:90482]

Updated by spatulasnout (B Kelly) over 6 years ago

On 12/12/2018 4:53 PM, headius@headius.com wrote:

I propose that both methods should immediately be deprecated for removal in Ruby 3.0.

Agree on id2ref; strongly disagree on object_id.

They do not do what people expect.

object_id does precisely what I expect.

They cannot reliably do what they claim to do.

If so, let's fix the documentation of object_id.

They eventually lead to difficult-to-diagnose bugs in every possible use case.

How?

Sources of the embedded ruby portion of a robust C++ desktop application continuously
developed for 15+ years:

$ git grep object_id | wc -l
111

Put simply, both methods have always been broken in MRI and making them unbroken would render them useless.

object_id has never been broken. No need to tar it with id2ref's failings.

I should point out that even the monotonically-increasing ID will eventually break once enough objects have been created to roll over a 64-bit integer limit

((2**64)/1_000_000_000)/60.0/60.0/24.0/365.25
=> 584.5420460681421

(Merely an aside, since I maintain object_id is fine as-is. But this
appears to be a billion objects per second for 584 years.)

tl;dr: id2ref can go; please leave object_id alone.

Regards,

Bill

Actions

Copy link

#12 [ruby-core:90484]

Updated by spatulasnout (B Kelly) over 6 years ago

((2**64)/1_000_000_000)/60.0/60.0/24.0/365.25
=> 584.5420460681421

(Merely an aside, since I maintain object_id is fine as-is. But this
appears to be a billion objects per second for 584 years.)

Sorry, subtract the two bits needed to distinguish Fixnum from other
objects. Still: 146 years?

Actions

Copy link

#13 [ruby-core:90485]

Updated by duerst (Martin Dürst) over 6 years ago

headius (Charles Nutter) wrote:

I should point out that even the monotonically-increasing ID will eventually break once enough objects have been created to roll over a 64-bit integer limit unless object_id can be a Bignum, which would mean holding references Bignums for every ID from that point on.

I'd like to mention that in that case, the Bignum object will also need an object_id, which will also be a Bignum object,...

Fortunately, the numbers that @spatulasnout (B Kelly) has given show that this scenario is rather irrelevant.

On the other hand, if object_id is just a form of the pointer, then I wonder whether we're safe for transient heaps and the like, where objects are being moved around.

Actions

Copy link

#14 [ruby-core:90487]

Updated by shyouhei (Shyouhei Urabe) over 6 years ago

duerst (Martin Dürst) wrote:

On the other hand, if object_id is just a form of the pointer, then I wonder whether we're safe for transient heaps and the like, where objects are being moved around.

Just to tell you that transient heaps are for non-object allocations, never for objects. They include pointers to objects, though.

ObjectSpace is a collection of objects
An array is an object, which resides inside of an objspace, which has a pointer to a memory region allocated in a transient heap.
That region inside of a transient heap has pointers to other objects, but never objects themselves.

+--objspace--+
|            |
| +-array--+ |   +-----theap-----+
| |        | |   |               |
| | (ptr) --------> +--region--+ |
| |        | |   |  |   ...    | |
| +--------+ |   |  |          | |
| | other  <---------- (ptr)   | |
| | random | |   |  |          | |
| | object | |   |  +----------+ |
| +--------+ |   |  | other    | |
| |        | |   |  | random   | |
     ...         |  | region   | |
| |        | |         ...
| +--------+ |   |  |          | |
|            |   |  +----------+ |
+------------+   |               |
                 +---------------+

Anyways this is an implementation detail that should never be visible from any ruby programs.

Actions

Copy link

#15 [ruby-core:90490]

Updated by headius (Charles Nutter) over 6 years ago

object_id does precisely what I expect.

Then your expectations do not include that it's actually an ID, since it's literally just a pointer into the heap. In the short term, that pointer will likely be occupied by other objects. Longer term for Ruby that pointer value will actually change as the heap gets compacted or objects are moved to other generations.

The problems of object_id could possibly be solved with a rename or better documentation. I think it needs to either be:

A strictly-monotonically increasing value. As you pointer out, it would be difficult with current Ruby implementations on current hardware to blow out the 62-bit limit. However that integer needs to be atomically updated for every object that needs an ID. It would still be best to do this lazily only for objects where it's needed. This is exactly the JRuby implementation right now (though we don't lose those two bits).

A pseudo-random hash value never guaranteed to be unique but guaranteed to have a reasonable hash distribution. This is the JVM's identityHashCode which JRuby uses for the base hash value for all objects. MRI currently uses object_id both as a base hash and as an unreliable pseudo-ID.

If _id2ref goes away and object_id becomes one of the above, that's likely acceptable. I don't like changing how object_id works in such a drastic way without naming it something more appropriate, though.

If so, let's fix the documentation of object_id.

Or we fix object_id to actually be an ID. Or we get rid of it and replace it with something more like a base hash. Both are better options than leaving it in place, since it's not an ID, it's not idempotent, and it doesn't do what most people expect.

They eventually lead to difficult-to-diagnose bugs in every possible use case.
How?

Nearly all uses of object_id I have seen treat it as a reliable alias for the object itself. All such code is broken. Exceptions include object_id used solely for base object hash calculation or inspect output, neither of which really require uniqueness.

Sources of the embedded ruby portion of a robust C++ desktop application continuously developed for 15+ years

This proves nothing without knowing how object_id is being used. What are you using those object_ids for? Show us please.

object_id has never been broken. No need to tar it with id2ref's failings.

One of the primary use cases of object_id is pairing it with _id2ref. As I've said a couple times, _id2ref most definitely needs to go away. Once it does, we have to decide what object_id is really supposed to be, because it can't be what it is now and be safely usable for more than logging or debugging.

Sorry, subtract the two bits needed to distinguish Fixnum from other objects. Still: 146 years?

Assuming 64-bit systems, you're right, it would take a long time with current Ruby implementations. That's why we thought it acceptable to implement it this way in JRuby many years ago, since JRuby has always had 64-bit Fixnums.

On the other hand, if object_id is just a form of the pointer...

Hopefully it's clear by now that it can't just be a form of the pointer, since the pointers are reused today and will be reused even more in the future.

Actions

Copy link

#16 [ruby-core:90496]

Updated by ko1 (Koichi Sasada) over 6 years ago

headius (Charles Nutter) wrote:

I use object_id for debugging to compare identities.
comparison methods can be enough.

I believe you are saying that using equal? is an acceptable alternative to comparing object_id's, yes?

Yes. The problem is I always forget which is identity comparison, equal?, eql?, eq?, ==.
:p

Actions

Copy link

#17 [ruby-core:90504]

Updated by Eregon (Benoit Daloze) over 6 years ago

Agreed we should remove ObjectSpace._id2ref, since it's fundamentally broken on MRI with the current MRI object_id semantics (could get another object if that memory address is reused after GC'ing the old object).

Removing #object_id doesn't make sense to me if we remove _id2ref: #object_id has the same semantics as System.identityHashCode(), and it is idempotent, but not unique (at least on MRI).
The Ruby documentation doesn't guarantee uniqueness:

The same number will be returned on all calls to object_id for a given object,
and no two active objects will share an id.

That's the same as System.identityHashCode().
I think we do need a identity-based hash for hashtables and hashing arbitrary objects which do not redefine #hash.

So, how about deprecating _id2ref in 2.6 or 2.7, and removing it in 3.0?

It seems only drb is using _id2ref in the standard library. We will need to find a replacement for that usage.

Aside: I wonder how _id2ref works in MRI if objects are moved by the GC (since #object_id is just the address), or maybe objects are never moved in MRI?

Actions

Copy link

#18 [ruby-core:90505]

Updated by shevegen (Robert A. Heiler) over 6 years ago

I originally wanted to write a very long reply but I think it becomes too difficult
to keep track of what is being said, so just my opinion in a somewhat more condensed
form than before:

I have no strong opinion on _id2ref. I think I used it only once or twice in 13 years.
I am not convinced that .object_id should be removed. We would lose some introspection
here, wouldn't we?

I would recommend to postpone deprecation after ruby 3.x IF it is decided to remove
object_id. But I am not convinced that it should be removed - it is not only a question
of implementation details, or how people use it "incorrectly", but it is whether we
should be able to query object ids, find them in ObjectSpace etc...

I would suggest this to ask matz in an upcoming developer meeting since part of the
question is the intent of how matz may suggest or think that ruby users may use
object_id. I don't have much code that relies on object_id, so an any change would
not affect me that much - but I am in disfavour of removing it eventually without
really knowing what the alternatives are. Or whether we would just lose functionality
without any real gain, with which I would disagree very strongly. So my own opinion
is much closer as to what spatulasnout wrote.

PS: It is a bit difficult to reason in pro/con when it comes to "incorrect" usage
of ruby code. Ultimately the parser allows something or does not; and how matz
designed ruby + ruby's philosophy plays in, which of course determines what the
parser allows (compare to python requiring () for method calls and ruby not
"caring" that much, unless it is ambiguous). Anything aside from that is heavily
subject to a personal opinion, and this becomes difficult to reason about.

The world is not going to end if object_id is removed; but it is not going to
end when object_id remains, either.

For sake of completion, I would also like to point out that the pickaxe mentioned
object_id several times, so I assume quite some people know and have used object_id
(and probably not many used _id2ref); and the name used to be .id if I recall
correctly before it was changed to object_id. But anyway, I highly recommend to
have this be discussed some time in 2019 at a developer meeting.

PSS: Perhaps for alternatives, such as WeakMap, it could be tested extensively,
perhaps as a separate gem, if only to provide a proof-of-concept.

Actions

Copy link

#19 [ruby-core:90507]

Updated by mame (Yusuke Endoh) over 6 years ago

Eregon (Benoit Daloze) wrote:

It seems only drb is using _id2ref in the standard library. We will need to find a replacement for that usage.

I've heard that pycall is also using _id2ref. This API seems to be useful to create a remote wrapper of a Ruby object out of the Ruby process; 1) Ruby passes object_id of some object to another process (say, Python), 2) Python creates a wrapper object that contains the Ruby object_id, and 3) Python asks the Ruby process to invoke a method of the object that has the object_id (_id2ref is actually useful here). I'm unsure how to garbage-collect the wrapped objects, though. Maybe they are assumed to be always marked during the session. Or a kind of distributed GC may be needed. (I'm not familiar with this area.)

Aside: I wonder how _id2ref works in MRI if objects are moved by the GC (since #object_id is just the address), or maybe objects are never moved in MRI?

Objects are never moved in MRI.

Actions

Copy link

#20 [ruby-core:90508]

Updated by headius (Charles Nutter) over 6 years ago

I'm glad most of us are in agreement about _id2ref!

object_id ... is idempotent,

Yeah you're right here...I realize the lack of idempotency applies to using it in _id2ref, since it will eventually return a different result over time, but a given object currently does maintain a consistent object_id.

and no two active objects will share an id.

That's the same as System.identityHashCode().

System.identityHashCode makes no uniqueness guarantees at all. It's absolutely possible for two objects to have the same identityHashCode, especially because it's only a 32-bit signed integer.

object_id guarantees uniqueness against other currently-alive objects, since it's the pointer to each object.

I mentioned above, I'd be mostly satisfied if object_id were reduced to an identity hash code OR if it were generated and guaranteed unique to the lifetime run of the process. It's somewhere in the middle right now and that's where the problems come from.

I wonder how _id2ref works in MRI if objects are moved by the GC

ko1 and tenderlove know current status of this, but up until recently no objects were moved in MRI ever. Now with generational GC and compaction, they'll absolutely be moved around, so object_id must change, regardless of how much people love it the way it is. The two options I've spelled out are reasonable alternatives.

At this point, my main concern is having it be called anything like "ID" without uniqueness.

If it remains "object_id" I think it needs to use the monotonically-increasing value.

If it will be reduced to an identity hashcode, it should not be named "object_id". "identity_hash" (mirrors its use in Hash) or something similar would be better/more accurate/more descriptive.

Actions

Copy link

#21 [ruby-core:90509]

Updated by headius (Charles Nutter) over 6 years ago

The world is not going to end if object_id is removed; but it is not going to end when object_id remains, either.

I don't think the world is going to end regardless of what we do. But object_id is going to become even less ID-like as we improve MRI's GC, so it really does have to change now. And obviously it can't be based on the pointer to the object once objects can move on the heap (compaction, generational), so that forces the issue right there.

Actions

Copy link

#22 [ruby-core:90510]

Updated by headius (Charles Nutter) over 6 years ago

I've heard that pycall is also using _id2ref. This API seems to be useful to create a remote wrapper of a Ruby object out of the Ruby process...

This isn't a problem. pycall needs to have its own unique ID generator for each object passed out to python, and a weak map to recover the object reference from that ID. Making WeakMap official largely covers this use case.

FWIW this is one reason why Java's native interface (JNI) requires you to explicitly request a handle you can save globally for a given object reference. Maintaining this mapping from a handle or ID back to a movable object reference is tricky and often expensive.

Actions

Copy link

#23 [ruby-core:90511]

Updated by headius (Charles Nutter) over 6 years ago

I'm unsure how to garbage-collect the wrapped objects, though

Oh this leads to another item Ruby really needs to add, related to the _id2ref removal: reference queues.

On the JVM, when you create a WeakReference, you can register it with a ReferenceQueue. When the object associated with the WeakReference is collected, the WeakReference is emptied and pushed onto the ReferenceQueue (by the GC). Later on, or in another thread, you can pull from that queue to clean up resources like wrappers, native pointers, or Hash entries.

Without ReferenceQueue, Ruby has no efficient way of cleaning up evacuated WeakRef objects (you have to scan for empty ones). I fixed this for JRuby in the weakling gem by exposing the ReferenceQueue implementation of the JVM: https://github.com/headius/weakling/blob/master/ext/org/jruby/ext/RefQueueLibrary.java#L56

With proper weak references and a reference queue, anyone can implement WeakMap efficiently on their own, and that covers most uses of _id22ref. We should still ship an official supported WeakMap, though.

Actions

Copy link

#24 [ruby-core:90512]

Updated by mrkn (Kenta Murata) over 6 years ago

Endo-san, thank you for describing the case of pycall.
But, unfortunately, I'm not using object_id and _id2ref for managing Ruby object references in Python.

Rather, in pycall, I use object_id and _id2ref for managing wrappers of Python classes and Python modules in Ruby.
The reason why I use them is I want to reuse existing wrappers, while I don't want to protect the wrappers from garbage collection.
For this purpose, pycall has mappings from a Python object pointer to an ID of Ruby object, which is a wrapper of the Python object.
The existing WeakRef can be used for this purpose, but I didn't want to use it because I want to reduce the frequency of object generation.

Actions

Copy link

#25 [ruby-core:90513]

Updated by headius (Charles Nutter) over 6 years ago

For this purpose, pycall has mappings from a Python object pointer to an ID of Ruby object, which is a wrapper of the Python object.

Could you use WeakMap to simply map the Python object pointer to the wrapper? I'm guessing Python doesn't move objects either, since they have to reference-count everything.

Actions

Copy link

#26 [ruby-core:90514]

Updated by Eregon (Benoit Daloze) over 6 years ago

mrkn (Kenta Murata) wrote:

Rather, in pycall, I use object_id and _id2ref for managing wrappers of Python classes and Python modules in Ruby.
The reason why I use them is I want to reuse existing wrappers, while I don't want to protect the wrappers from garbage collection.
For this purpose, pycall has mappings from a Python object pointer to an ID of Ruby object, which is a wrapper of the Python object.
The existing WeakRef can be used for this purpose, but I didn't want to use it because I want to reduce the frequency of object generation.

But then this is exposed to the _id2ref bug, i.e., it could return another wrapper if the wrapper is GC'd and a new wrapper gets the same address.
Then that would probably be a very serious bug exposed to pycall users.

Actions

Copy link

#27 [ruby-core:90515]

Updated by mrkn (Kenta Murata) over 6 years ago

Could you use WeakMap to simply map the Python object pointer to the wrapper?

Yes, I think I can use WeakMap for my purpose, maybe.

If my memory serves me correctly, when I implemented such a table, I investigated WeakMap. I found the following note in the comment, and then I decided to write it myself.

*  This class is mostly used internally by WeakRef, please use
*  +lib/weakref.rb+ for the public interface.

i.e., it could return another wrapper if the wrapper is GC'd and a new wrapper gets the same address.

In this case, I can check whether the Python object pointer in the returned wrapper is equal to the expected Python object pointer.

Actions

Copy link

#28 [ruby-core:90516]

Updated by headius (Charles Nutter) over 6 years ago

I found the following note in the comment

Yes, I want WeakMap to become part of the public API. I think that needs to happen to smoothly replace _id2ref.

I can check whether the Python object pointer in the returned wrapper is equal to the expected Python object pointer.

Or you can just wrap the whole thing with StandardError like the newrelic_rpm gem 😬

https://github.com/newrelic/rpm/blob/c529a7dce6afb520a587079a609d39afa83368aa/lib/new_relic/agent/transaction_time_aggregator.rb#L105

(Please don't do this)

Actions

Copy link

#29 [ruby-core:90517]

Updated by Hanmac (Hans Mackowiak) over 6 years ago

i hope you don't plan to mess with the VALUE type too

i jumped through many loops to get it right. I connected the Ruby GC with the Object handing from wxWidgets.

the wx Object holds a Holder Type which has the Ruby value inside
the Ruby value holds a pointer to the wx Object inside
then there is a RefCounter added, (so some are extra protected)
and i added the RubyHash as holder marked as global so as long as the Values stays in this map, they are not freed

=> Result:

For as long as the wx Object lives, the Holder Object inside it is alive too
When the wx Object gets deleted, the older object is deleted too, that does decrease the ref count, which when reaching zero does remove it from the Hash.
this does allow the ruby side allow the object to be freed too.

And yeah there is protection in Case the wx Object is deleted before the ruby Object, in that case the functions throw an Exception

Actions

Copy link

#30 [ruby-core:90518]

Updated by headius (Charles Nutter) over 6 years ago

i hope you don't plan to mess with the VALUE type too

Again, ko1 can explain better, but my understanding is that any direct references to VALUE (pointers to actual heap objects) will mark those objects as "shady" and they will not be moved. I don't think that protects the pointer from eventually pointing at another object, though.

object_id will have to be modified to not return the pointer value, so if you're relying on that you might have issues. But otherwise I think object_id is unrelated to VALUE or how objects are managed in C extensions.

Actions

Copy link

#31 [ruby-core:90550]

Updated by matz (Yukihiro Matsumoto) over 6 years ago

I agree with removing _id2ref (gradually for not breaking existing code, of course).
I am against removing object_id since it is used widely. But by removing _id2ref, the implementation of object_id can be separated from pointer values.

Matz.

Actions

Copy link

#32 [ruby-core:90559]

Updated by Eregon (Benoit Daloze) over 6 years ago

@naruse (Yui NARUSE) @matz (Yukihiro Matsumoto) Is it still time to deprecate ObjectSpace._id2ref for 2.6, or is it too late already in the release cycle?

Actions

Copy link

#33 [ruby-core:90561]

Updated by naruse (Yui NARUSE) over 6 years ago

Eregon (Benoit Daloze) wrote:

@naruse (Yui NARUSE) @matz (Yukihiro Matsumoto) Is it still time to deprecate ObjectSpace._id2ref for 2.6, or is it too late already in the release cycle?

For 2.6, it's too late because people don't have a chance to check the breakage with preview/rc.
Try on 2.7.

Actions

Copy link

#34 [ruby-core:90563]

Updated by Eregon (Benoit Daloze) over 6 years ago

Target version set to 2.7

Let's deprecate ObjectSpace._id2ref in 2.7 then.
I think we also need to address the usage in DRb. cc @seki (Masatoshi Seki)

Actions

Copy link

#35 [ruby-core:90950]

Updated by Eregon (Benoit Daloze) over 6 years ago

Assignee set to headius (Charles Nutter)

@headius (Charles Nutter) Would you like to start the deprecation of ObjectSpace._id2ref since you started this issue?

Actions

Copy link

#36 [ruby-core:91072]

Updated by naruse (Yui NARUSE) over 6 years ago

Target version deleted (~~2.7~~)

Target version is used by release engineering; don't use this as just a goal.
In this case set target version after drb removes ObjectSpace._id2ref.

Actions

Copy link

#37

Updated by headius (Charles Nutter) over 6 years ago

Sorry for the delay folks, I was intermittently blocked from accessing bugs.ruby-lang.org, but things are working now!

I'll move forward with fixing DRb using the logic JRuby ships in https://bugs.ruby-lang.org/issues/15711.

Remaining work here then would be to mark the method deprecated in the preferred way (for 2.7, hopefully, since it relates to https://bugs.ruby-lang.org/issues/15626 which it sounds like we want to land in 2.7).

Actions

Copy link

#38

Updated by hsbt (Hiroshi SHIBATA) over 1 year ago

Status changed from Open to Assigned

Actions

Copy link

#39 [ruby-core:121712]

Updated by byroot (Jean Boussier) 4 months ago

I opened a PR to finally deprecated ObjectSpace._id2ref: https://github.com/ruby/ruby/pull/13157

Actions

Copy link

#40

Updated by byroot (Jean Boussier) 4 months ago

Subject changed from Deprecate object_id and _id2ref to Deprecate ObjectSpace._id2ref

Actions

Copy link

#41 [ruby-core:121889]

Updated by Dan0042 (Daniel DeLorme) 3 months ago · Edited

The original reason to deprecate _id2ref is no longer relevant now that object_id is a monotonically-increasing ID. So I don't think it should be deprecated, unless there's some other benefit/reason that wasn't mentioned above.

Actions

Copy link

#42 [ruby-core:121910]

Updated by Eregon (Benoit Daloze) 3 months ago

@Dan0042 It still is, CRuby has to keep a weak map of all objects for which the object_id was asked in case _id2ref is used.
That's a big footprint and performance overhead.
It's also impossible to implement this efficiently, as the original description already established.

Actions

Copy link

#43 [ruby-core:121911]

Updated by Eregon (Benoit Daloze) 3 months ago

https://byroot.github.io/ruby/performance/2025/04/26/unlocking-ractors-object-id.html explains the problems of _id2ref in details, and how it changed.

Actions

Copy link

#44 [ruby-core:121916]

Updated by headius (Charles Nutter) 3 months ago

@Dan0042 I think @Eregon (Benoit Daloze) covered what I was going to say, but I'd also like to know why you want _id2ref to stick around. There's generally better ways (more reliable, easier for VMs to implement) to everything I have seen it used for.

Actions

Copy link

#45 [ruby-core:121921]

Updated by mame (Yusuke Endoh) 3 months ago

This was discussed at the dev meeting, and @matz (Yukihiro Matsumoto) agreed to add the deprecation warning.

Actions

Copy link

#46 [ruby-core:121931]

Updated by byroot (Jean Boussier) 3 months ago

I was hoping for a removal timeline, but alright. I'll merge my PR.

Actions

Copy link

#47

Updated by byroot (Jean Boussier) 3 months ago

Status changed from Assigned to Closed

Applied in changeset git|4de049a3f9709ab8b0b192c40e83b910b80d6de3.

Deprecate ObjectSpace._id2ref

[Feature #15408]

Matz decided to deprecate it for Ruby 2.6 or 2.7 but that never
actually happened.

Given the object_id table is a contention point for Ractors
it seems sensible to finally deprecate this API so we can
generate and store object ids more efficiently in the future.

Actions

Copy link

#48 [ruby-core:121948]

Updated by Dan0042 (Daniel DeLorme) 3 months ago

headius (Charles Nutter) wrote in #note-44:

I'd also like to know why you want _id2ref to stick around.

I think that question is backwards. There is no need for any reason to keep existing functionality; backward compatibility should be the default. The question is rather why you want _id2ref to be deprecated, since the previous reason given was no longer applicable. I don't think the memory usage of the id-to-ref mapping is much of a concern, but I agree that the synchronization issue with Ractors explained in [1] makes it fairly reasonable to deprecate this API. I imagine _id2ref could be fixed to mostly avoid synchronization by using one lookup table per Ractor + one table for shareable objects, but that sounds like a fairly complex fix and probably not worth the trouble.

[1]https://byroot.github.io/ruby/performance/2025/04/26/unlocking-ractors-object-id.html

Actions

Copy link

#49 [ruby-core:121949]

Updated by headius (Charles Nutter) 3 months ago

The question is rather why you want _id2ref to be deprecated

I want it removed, actually! 😀

You have seen all the reasons why it is problematic, but on top of all that it was never intended to be an official API in the first place (I'm not sure where but matz said so in some issue somewhere) which is why it's the only method in the API with a single leading underscore. Since this issue was filed, nearly all of my requirements have been met:

object_id is now monotonically increasing.
WeakMap is an official API.
Weakref is implemented in terms of WeakMap rather than _id2ref.
DRb had incorporated a WeakMap implementation.

With no further need for _id2ref to exist, and numerous reasons to eliminate it, it's time to start that process.

If you need the functionality of _id2ref, just put objects into a WeakMap with object_id as the key.

Actions

Copy link

Also available in: Atom PDF

Like0

Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0

Project

General

Profile

Ruby

Tags

Custom queries

Feature #15408

Deprecate ObjectSpace._id2ref

Updated by headius (Charles Nutter) over 6 years ago

Updated by shyouhei (Shyouhei Urabe) over 6 years ago

Updated by ko1 (Koichi Sasada) over 6 years ago

Updated by headius (Charles Nutter) over 6 years ago

Updated by headius (Charles Nutter) over 6 years ago

Updated by headius (Charles Nutter) over 6 years ago

Updated by headius (Charles Nutter) over 6 years ago

Updated by ahorek (Pavel Rosický) over 6 years ago

Updated by headius (Charles Nutter) over 6 years ago

Updated by headius (Charles Nutter) over 6 years ago

Updated by spatulasnout (B Kelly) over 6 years ago

Updated by spatulasnout (B Kelly) over 6 years ago

Updated by duerst (Martin Dürst) over 6 years ago

Updated by shyouhei (Shyouhei Urabe) over 6 years ago

Updated by headius (Charles Nutter) over 6 years ago

Updated by ko1 (Koichi Sasada) over 6 years ago

Updated by Eregon (Benoit Daloze) over 6 years ago

Updated by shevegen (Robert A. Heiler) over 6 years ago

Updated by mame (Yusuke Endoh) over 6 years ago

Updated by headius (Charles Nutter) over 6 years ago

Updated by headius (Charles Nutter) over 6 years ago

Updated by headius (Charles Nutter) over 6 years ago

Updated by headius (Charles Nutter) over 6 years ago

Updated by mrkn (Kenta Murata) over 6 years ago

Updated by headius (Charles Nutter) over 6 years ago

Updated by Eregon (Benoit Daloze) over 6 years ago

Updated by mrkn (Kenta Murata) over 6 years ago

Updated by headius (Charles Nutter) over 6 years ago

Updated by Hanmac (Hans Mackowiak) over 6 years ago

Updated by headius (Charles Nutter) over 6 years ago

Updated by matz (Yukihiro Matsumoto) over 6 years ago

Updated by Eregon (Benoit Daloze) over 6 years ago

Updated by naruse (Yui NARUSE) over 6 years ago

Updated by Eregon (Benoit Daloze) over 6 years ago

Updated by Eregon (Benoit Daloze) over 6 years ago

Updated by naruse (Yui NARUSE) over 6 years ago

Updated by headius (Charles Nutter) over 6 years ago

Updated by hsbt (Hiroshi SHIBATA) over 1 year ago

Updated by byroot (Jean Boussier) 4 months ago

Updated by byroot (Jean Boussier) 4 months ago

Updated by Dan0042 (Daniel DeLorme) 3 months ago · Edited

Updated by Eregon (Benoit Daloze) 3 months ago

Updated by Eregon (Benoit Daloze) 3 months ago

Updated by headius (Charles Nutter) 3 months ago

Updated by mame (Yusuke Endoh) 3 months ago

Updated by byroot (Jean Boussier) 3 months ago

Updated by byroot (Jean Boussier) 3 months ago

Updated by Dan0042 (Daniel DeLorme) 3 months ago

Updated by headius (Charles Nutter) 3 months ago