Feature #14555
OpenStruct performance doesn't have to be slow...
Description
Coming from a JS background, I find it baffling that OpenStruct is incredibly slow*, slower even than a standard hash. When I looked at the implementation, I saw one very odd design choice: it doesn't even try to use MRI's caching mechanism. It uses a hash for all data, and it creates closures for reading and writing properties. Hashes on their own aren't substantially slower than instance variables, especially for larger objects (I checked, and both use a similar storage mechanism), but the way the readers and writers are implemented prevents MRI from caching and optimizing member accesses. If you have to enter a closure and read an up-value on every property access, it's bound to be slow. (This is difficult to optimize in any language or runtime, even with a very advanced JIT/compiler that does polymorphic inlining of closures. I wouldn't even expect Haskell to be that clever.)
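To make the claim concrete, here is an illustrative micro-benchmark sketch comparing repeated reads through a Hash, an OpenStruct, and plain attr_accessor methods. (Exact numbers vary by Ruby version; the class and variable names here are just for the example.)

```ruby
require 'ostruct'
require 'benchmark'

N = 500_000

hash = { name: "x" }
os   = OpenStruct.new(name: "x")

# A plain class whose reader is a C-implemented attr_accessor,
# so MRI's method/inline caches apply to it.
klass = Class.new { attr_accessor :name }
obj = klass.new
obj.name = "x"

Benchmark.bm(14) do |bm|
  bm.report("Hash#[]")       { N.times { hash[:name] } }
  bm.report("OpenStruct")    { N.times { os.name } }
  bm.report("attr_accessor") { N.times { obj.name } }
end
```

On the interpreters I tried, the attr_accessor reads come out well ahead of OpenStruct, with Hash access in between.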
Here are a few options I propose:
Fix the Ruby module
This is probably the easiest and most conservative option. Unless changing private methods or adding undocumented instance variables counts as breaking, I doubt this would be too breaking to ship in a patch release.
- You could use a table purely for remapping keys to their instance variable equivalent.
- You could use the table to iterate keys without giving up the performance benefits of a hash.
- You could use instance variables to store the actual values, and just call attr_accessor on the singleton class to add them.
This would bring speed at least close to the other two options, making repeated accesses highly optimized. This might be enough for the common case, but it's also the most memory-intensive option (each key is effectively duplicated as an instance variable name).
My local experiment was about ~150 SLoC**, compared to the existing one's ~140.
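The approach above can be sketched roughly as follows. This is a hedged illustration, not the attached experiment: the class name and helper are hypothetical, and a real implementation would need to namespace the instance variables (e.g. with a prefix) so a key like :table can't clobber internal state.

```ruby
# Sketch: store values in instance variables, define real accessors on
# the singleton class, and keep @table only for key lookup/iteration.
class FastishOpenStruct
  def initialize(hash = nil)
    @table = {} # key => ivar name
    hash&.each_pair { |k, v| set_member(k.to_sym, v) }
  end

  def [](name)
    ivar = @table[name.to_sym]
    ivar && instance_variable_get(ivar)
  end

  def []=(name, value)
    set_member(name.to_sym, value)
  end

  def each_pair
    return to_enum(:each_pair) unless block_given?
    @table.each { |k, ivar| yield k, instance_variable_get(ivar) }
  end

  def method_missing(name, *args)
    if name.to_s.end_with?("=")
      set_member(name.to_s.chomp("=").to_sym, args.first)
    else
      nil # OpenStruct returns nil for unknown members
    end
  end

  def respond_to_missing?(name, include_private = false)
    @table.key?(name.to_s.chomp("=").to_sym) || super
  end

  private

  def set_member(key, value)
    unless @table.key?(key)
      # attr_accessor defines key/key= as C-implemented methods that
      # MRI's inline caches can optimize; no closures involved.
      singleton_class.send(:attr_accessor, key)
      @table[key] = :"@#{key}"
    end
    instance_variable_set(@table[key], value)
  end
end
```

After the first write, every subsequent read of a member goes through a cached attr_accessor rather than method_missing or a closure.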
Make it a builtin C type instead
This is less conservative (and also moderately breaking), but it's still pretty straightforward and puts it close to standard objects in performance. As long as you maintain a no-op 'ostruct' module, this might be viable for a minor revision.
- The C interface appears to combine both the read and write methods for attributes into a single pseudo-method, which makes (de-)registering them easy.
- The C interface offers the ability to iterate instance variables without having to create an intermediary array to do so.
- The C interface provides ways to invoke primitive operations without risk of them getting overwritten.
- As an optimization, you could query if a value is referenced elsewhere, and if not, you could just 1. move the state table and 2. convert all the keys to IDs.
In practice, this turned out to be easier to implement in C using the instance variable APIs than the hash APIs. It also lets the backing store take part in MRI's optimizations for objects with only a few instance variables. This option provides highly optimized code paths, and avoids most of the memory overhead of the previous one.
My local experiment for this is about ~275 SLoC**, which is a bit bigger, but still isn't absurd (it's smaller than I would've expected.)
Make it a core primitive type
This is the least conservative and the most complex to implement, but it would put it on par with objects in performance, if not faster. This would definitely need to hit a minor revision, but with the same caveats as the previous option.
- Method access for those doesn't have to go through the slow path of method_missing; it can be caught between singleton and method dispatch.
- Assignment can directly create the property without even looking up the method, unless the object is a subclass instance or has a singleton assignment method.
- Access can directly read the property unless 1. it's a subclass instance, 2. it has singleton methods, or 3. the property is missing. In the last case, it only needs to attempt to call the parent method, returning nil if no such method exists.
- Other methods would go through normal dispatch.
- Keyword arguments would not need to be allocated when initializing; they could be used directly to build the initial store.
- MJIT could optimize this much more easily than the other two options.
This would make OpenStructs viable for temporary data without running into performance cliffs. Creation and modification would both be cheap, costing less than even a standard class. The main wins over the other two options are moderately faster access and far cheaper creation.
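The dispatch rules above can be modeled in plain Ruby. This is a hedged sketch of the intended fast-path semantics only; CoreOStruct, read_member, and write_member are hypothetical stand-ins for what would be C-level dispatch logic, not a proposed API.

```ruby
# Model of the proposed core fast paths: take the direct path only when
# no subclassing or singleton methods could change dispatch behavior.
class CoreOStruct
  def initialize(**kwargs)
    @table = kwargs # kwargs used directly as the initial store
  end

  # Fast-path read: fall back to full dispatch only for subclass
  # instances, objects with singleton methods, or missing members.
  def read_member(name)
    if instance_of?(CoreOStruct) && singleton_methods.empty?
      return @table[name] if @table.key?(name)
    end
    respond_to?(name) ? public_send(name) : nil
  end

  # Fast-path write: create the property directly under the same rules.
  def write_member(name, value)
    if instance_of?(CoreOStruct) && singleton_methods.empty?
      @table[name] = value
    else
      public_send(:"#{name}=", value)
    end
  end
end
```

In C, the instance_of?/singleton checks would be cheap flag tests performed before method lookup, which is what makes this faster than method_missing-based dispatch.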
I don't have a local experiment available for this, due to the scope of such a change.
* JS object literals are basically a looser version of this, and modern engines can optimize this to oblivion. (It's easier to optimize literals than class creation.)
** I've attached the experimental implementations I created. Note that they're wholly untested and I was just trying to experiment with them at a high level before filing this bug.