Feature #17316
Open
On memoization
Description
I have seen so many attempts to memoize a value in the form:
@foo ||= some_heavy_calculation(...)
improperly, i.e., even when the value can potentially be falsy. This practice is widespread. Since memoization is mostly about efficiency, and it is rarely critical if the caching does not work correctly, people do not seem to care much about correcting the wrong usage.
In such a case, the correct form would be:
unless instance_variable_defined?(:@foo)
  @foo = some_heavy_calculation(...)
end
but this looks too long, and perhaps that is keeping people away from using it.
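The difference between the two forms shows up as soon as the computed value is falsy. A minimal sketch, in which the class `Lookup`, its methods, and its call counter are hypothetical names introduced only for illustration:

```ruby
# Hypothetical class: `compute` stands in for some_heavy_calculation
# and always returns nil, a legitimate but falsy result.
class Lookup
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def compute
    @calls += 1
    nil
  end

  # `||=` re-runs the calculation whenever the cached value is nil or false.
  def with_or_assign
    @a ||= compute
  end

  # Checking for definedness caches the falsy result as well.
  def with_defined_check
    @b = compute unless instance_variable_defined?(:@b)
    @b
  end
end

naive = Lookup.new
3.times { naive.with_or_assign }
naive.calls   # 3: computed on every call

careful = Lookup.new
3.times { careful.with_defined_check }
careful.calls # 1: computed once
```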
What about allowing Kernel#instance_variable_set to take a block instead of the second argument, in which case the assignment should be done only when the instance variable is not defined?
instance_variable_set(:@foo){some_heavy_calculation(...)}
Or, if that does not look right or seems to depart from the original usage of instance_variable_set, then what about having a new method?
memoize(:foo){some_heavy_calculation(...)}
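One way such a method could behave can be sketched in plain Ruby. This is only an illustration under assumptions: `memoize` is not an existing Ruby method, and the module `Memoize` and class `Report` are made-up names.

```ruby
# Hypothetical helper, not part of Ruby: assigns @<name> from the block
# only if that instance variable is not yet defined.
module Memoize
  def memoize(name)
    ivar = :"@#{name}"
    if instance_variable_defined?(ivar)
      instance_variable_get(ivar)
    else
      instance_variable_set(ivar, yield)
    end
  end
end

class Report
  include Memoize
  attr_reader :runs

  def initialize
    @runs = 0
  end

  def flag
    memoize(:flag) do
      @runs += 1
      false # a falsy result is still cached
    end
  end
end

report = Report.new
report.flag
report.flag
report.runs # 1
```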
Updated by marcandre (Marc-Andre Lafortune) almost 5 years ago
Memoization is tricky, not just for nil/false values. What about freezing that object? What about calling Ractor.make_shareable on it?
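The freezing problem can be reproduced with the usual idiom: lazily assigning an instance variable on a frozen object raises FrozenError. A small sketch, where `Point` is a hypothetical class used only for this demonstration:

```ruby
class Point
  def initialize(x, y)
    @x = x
    @y = y
  end

  # Lazily memoized with the usual idiom.
  def norm
    @norm ||= Math.sqrt(@x**2 + @y**2)
  end
end

point = Point.new(3, 4).freeze

error = nil
begin
  point.norm
rescue FrozenError => e
  error = e # assigning @norm on a frozen object is not allowed
end
```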
I just released a small gem to deal with memoization that:
- works with nil/false results
- works for methods accepting arguments
- works for frozen objects
- is Ractor-ready in that the object can be made Ractor-shareable.
Gem is here: https://github.com/marcandre/ractor-cache
Comments welcome :-)
I think more strategies might be useful, for example accessing the cache via a Ractor/SharedHash, but haven't implemented that.
Updated by sawa (Tsuyoshi Sawada) almost 5 years ago
marcandre (Marc-Andre Lafortune) wrote in #note-3:
I just released a small gem to deal with memoization
Looks interesting.
Updated by marcandre (Marc-Andre Lafortune) almost 5 years ago
What about allowing Kernel#instance_variable_set to take a block instead of the second argument, in which case the assignment should be done only when the instance variable is not defined?
I would like Kernel#instance_variable_get (not _set) to accept a block like Hash#fetch for when the instance variable is not set.
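A rough sketch of the Hash#fetch-like behaviour, written as a separate helper so the real Kernel#instance_variable_get stays untouched. The names `IvarFetch`, `instance_variable_fetch`, and `Config` are hypothetical. As with Hash#fetch, the block's value is returned but not stored, so the block itself performs the assignment here:

```ruby
# Hypothetical helper: returns the instance variable if it is defined,
# otherwise returns whatever the block evaluates to.
module IvarFetch
  def instance_variable_fetch(name)
    if instance_variable_defined?(name)
      instance_variable_get(name)
    else
      yield # like Hash#fetch, the result is returned, not stored
    end
  end
end

class Config
  include IvarFetch
  attr_reader :loads

  def initialize
    @loads = 0
  end

  def timeout
    instance_variable_fetch(:@timeout) { @timeout = load_timeout }
  end

  private

  def load_timeout
    @loads += 1
    nil # "no timeout configured" is a valid, falsy answer
  end
end

config = Config.new
2.times { config.timeout }
config.loads # 1
```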
Updated by sawa (Tsuyoshi Sawada) almost 5 years ago
marcandre (Marc-Andre Lafortune) wrote in #note-5:
What about allowing Kernel#instance_variable_set to take a block instead of the second argument, in which case the assignment should be done only when the instance variable is not defined?
I would like Kernel#instance_variable_get (not _set) to accept a block like Hash#fetch for when the instance variable is not set.
That also makes sense. Either is fine with me.
Updated by Dan0042 (Daniel DeLorme) over 4 years ago
marcandre (Marc-Andre Lafortune) wrote in #note-3:
Gem is here: https://github.com/marcandre/ractor-cache
Comments welcome :-)
Since you say so... :-)
An additional strategy might be to wrap the @cache in a Ractor::LVar (if/once available). I tend to use memoization to cache DB access rather than long calculations, and for a given class I would probably not use all (or even a majority) of memoized methods at once. So pre-computing values before deep-freezing is not a good option for me.
But I find it interesting that this memoization stuff keeps getting reimplemented.
https://rubygems.org/search?utf8=%E2%9C%93&query=memoization
Not to mention all the people (including me) who have implemented this in their private code.
And everyone tends to have a slightly different implementation based on the features they need.
For example my own implementation is compatible with shallow-freezing and falsy values, but not with methods that take arguments; instead I wanted cache-busting based on dependent values. And multiple-assignment aliases.
memo ->{id},   #memo-busting lambda
  :foo, :bar,  #aliases for foobar[0] and foobar[1]
def foobar
  obj = get_foobar_from_db(id)
  [obj.foo, obj.bar]
end
All this to say that since the specifics can vary, it's probably better to leave that level of memoization to gems and individual developers. I can somewhat agree with something simple like instance_variable_get(:@v){ @v = calc() } ... but then again we can already do this just as easily now with return @v if defined? @v; @v = calc()
Updated by sebyx07 (Sebastian Buza) over 4 years ago
IMO there should be an operator directly in the language to keep it more DRY.
def my_method # current implementation
  return @cache if defined? @cache
  @cache = some_heavy_calculation
end
def my_new_method
  @cache ?= some_heavy_calculation
end
Updated by marcandre (Marc-Andre Lafortune) over 4 years ago
Dan0042 (Daniel DeLorme) wrote in #note-8:
marcandre (Marc-Andre Lafortune) wrote in #note-3:
Gem is here: https://github.com/marcandre/ractor-cache
Comments welcome :-)
Since you say so... :-)
An additional strategy might be to wrap the @cache in a Ractor::LVar (if/once available).
Indeed. I refactored it to use Ractor.current[] and a WeakMap. I removed the other ways as I can't think of a case where this isn't the best way to go.
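The general idea of combining Ractor-local storage with a WeakMap can be sketched as follows. This is not the gem's actual code: the module `RactorLocalMemo` and the storage key `:memo_cache` are hypothetical, the sketch keeps only one value per object, and it assumes Ruby 3.0 or later.

```ruby
# Hypothetical module: the cache lives in Ractor-local storage, keyed by
# the object itself, so the (possibly frozen) object is never mutated.
module RactorLocalMemo
  def self.cache
    Ractor.current[:memo_cache] ||= ObjectSpace::WeakMap.new
  end

  # Returns the cached value for `object`, running the block only on a miss.
  def self.fetch(object)
    map = cache
    if map.key?(object)
      map[object]
    else
      map[object] = yield
    end
  end
end

frozen = [3, 4].freeze
runs = 0
2.times do
  RactorLocalMemo.fetch(frozen) do
    runs += 1
    frozen.sum
  end
end
runs # 1
```

Note that ObjectSpace::WeakMap references its values weakly as well, so a cached value that nothing else refers to may be garbage collected and recomputed later.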
Updated by joel@drapper.me (Joel Drapper) over 3 years ago
I've been experimenting with doing memoization by passing a block to attr_reader / attr_accessor, e.g.
attr_reader(:foo) { something_slow }
or
attr_reader :foo do
  something_slow
end
I prototyped this in Ruby to get a feel for what it's like to use. https://gist.github.com/joeldrapper/7e35f2f5f906344195c121801ddd28d4
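For readers who want a feel for the approach without following the link, here is an independent sketch, not taken from the gist. It uses a made-up macro name, `memo_reader`, so that the built-in attr_reader is not overridden; `MemoReader` and `Page` are likewise hypothetical.

```ruby
# Hypothetical class-level macro: defines a reader that evaluates the
# block once per instance and caches the result, even when it is falsy.
module MemoReader
  def memo_reader(name, &block)
    ivar = :"@#{name}"
    define_method(name) do
      if instance_variable_defined?(ivar)
        instance_variable_get(ivar)
      else
        instance_variable_set(ivar, instance_exec(&block))
      end
    end
  end
end

class Page
  extend MemoReader
  attr_reader :renders

  def initialize
    @renders = 0
  end

  memo_reader(:body) do
    @renders += 1
    nil # stands in for something_slow returning a falsy value
  end
end

page = Page.new
2.times { page.body }
page.renders # 1
```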
Updated by marksiemers (Mark Siemers) 16 days ago · Edited
I agree with Sebastian that an operator is the best way to keep this code elegant. My proposal is to:
Introduce a new operator @|| that does instance variable aware memoization succinctly
# Allow this syntax
def result
  @result @||= expensive_calculation
end
# As functionally equivalent to this
def result
  if instance_variable_defined?(:@result)
    @result
  else
    @result = expensive_calculation
  end
end
Reasons for this suggestion:
- @| is already not valid syntax, so there should not be any issue with breaking legacy code
- Use of the @ symbol is a strong indicator that this impacts and is useful for instance variables
- Keeping the || as part of the operator is familiar syntax for memoization
- It is succinct, keeps things on one line, adds only one character, does not require blocks or method calls on separate lines (e.g. memoize(:result))
- If searching for memoization in the codebase - searching for ||= will still work
A big motivation for this comes from a recent change to rubocop-rails, which is now enforcing "Rails/FindByOrAssignmentMemoization" (see: https://rails.rubystyle.guide/#find-by-memoization)
A more rails-specific example below of this proposal:
# Previously allowed (though not performant in the case of nil returned by find_by)
def foo
  @foo ||= Foo.find_by(id:)
end
# The new rule enforces this syntax
def foo
  if instance_variable_defined?(:@foo)
    @foo
  else
    @foo = Foo.find_by(id:)
  end
end
Here is the proposal in the context of rails find_by
def foo
  @foo @||= Foo.find_by(id:)
end
Here are some other ideas, but I don't think any of them are as good as @||:
@result |||= expensive_calculation
@result -||= expensive_calculation
@result +||= expensive_calculation
@result _||= expensive_calculation
@result &||= expensive_calculation
@result %||= expensive_calculation
Updated by Dan0042 (Daniel DeLorme) 16 days ago
marksiemers (Mark Siemers) wrote in #note-12:
A big motivation for this comes from a recent change to rubocop-rails, which is now enforcing "Rails/FindByOrAssignmentMemoization" (see: https://rails.rubystyle.guide/#find-by-memoization)
A more rails-specific example below of this proposal:
# Previously allowed (though not performant in the case of nil returned by find_by)
def foo
  @foo ||= Foo.find_by(id:)
end
# The new rule enforces this syntax
def foo
  if instance_variable_defined?(:@foo)
    @foo
  else
    @foo = Foo.find_by(id:)
  end
end
This is appalling. If "best practices" are going to encourage this kind of un-ruby-ish horror, this memoization issue is more urgent to solve than I had expected.
Personally I prefer a DSL like memo def foo but this @||= idea is pretty interesting in how it communicates this is a syntax that applies to instance variables.
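A `memo def foo` style DSL works because `def` returns the method name as a Symbol, which a class-level method can then wrap. A minimal sketch under assumptions: `memo` is not a Ruby built-in, `Memo` and `Account` are made-up names, and this version handles only methods without arguments whose names are valid instance variable suffixes.

```ruby
# Hypothetical DSL: wraps the named method in a prepended module that
# caches the first result, including nil and false.
module Memo
  def memo(name)
    ivar = :"@_memo_#{name}"
    wrapper = Module.new do
      define_method(name) do
        if instance_variable_defined?(ivar)
          instance_variable_get(ivar)
        else
          instance_variable_set(ivar, super())
        end
      end
    end
    prepend wrapper
    name
  end
end

class Account
  extend Memo
  attr_reader :queries

  def initialize
    @queries = 0
  end

  memo def owner
    @queries += 1
    nil # e.g. a lookup that found nothing
  end
end

account = Account.new
2.times { account.owner }
account.queries # 1
```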
Updated by ixti (Alexey Zapparov) 15 days ago
def foo
  if instance_variable_defined?(:@foo)
    @foo
  else
    @foo = Foo.find_by(id:)
  end
end
This is appalling. If "best practices" are going to encourage this kind of un-ruby-ish horror, this memoization issue is more urgent to solve than I had expected.
Firstly, rubocop is totally optional. Secondly, even rubocop is suggesting something more elegant than the above example:
def current_user
  return @current_user if defined?(@current_user)
  @current_user = User.find_by(id: session[:user_id])
end
Which is, in my humble opinion, pretty common practice for memoizing falsey values in general.
Personally I prefer a DSL like memo def foo but this @||= idea is pretty interesting in how it communicates this is a syntax that applies to instance variables.
Neither a DSL nor the proposed syntax is good IMO, as both add a lot of cognitive burden. That said, I think a DSL is still the better of the two.
Updated by Dan0042 (Daniel DeLorme) 15 days ago
ixti (Alexey Zapparov) wrote in #note-14:
Firstly, rubocop is totally optional. Secondly, even rubocop is suggesting something more elegant than the above example:
Ah yes, you're right. I read "The new rule enforces this syntax" and took it at face value. But yeah, the link has the much more common idiom return @foo if defined?(@foo).
Also I got the impression this was an official style guide from the Rails team, but now I'm not so sure.
Sorry for the overreaction.
Updated by matz (Yukihiro Matsumoto) 6 days ago
I understand the motivation, but the proposed "@||=" is unacceptable. Other suggestions were made, but I don't think any of them adequately express the intent.
If there are no good operator name suggestions in the future, it would be better to explicitly use defined?.
Matz.
Updated by matheusrich (Matheus Richard) 6 days ago
@matz (Yukihiro Matsumoto) I propose ?= as the uninitialized assignment operator:
@foo ?= some_heavy_calculation(...)
Updated by matz (Yukihiro Matsumoto) 6 days ago
Currently, ?= means "a single character =". We don't want to break current available syntax, unless absolutely necessary.
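The existing meaning is easy to check: `?x` is a character literal that evaluates to a one-character String, so `?=` is the string "=". In the sketch below, `echo` is a hypothetical method used only to show how `name ?=` is read today when `name` is a method.

```ruby
# `?=` is a character literal, i.e. a one-character String.
char = ?=

# Hypothetical method, defined only for this demonstration.
def echo(arg)
  arg
end

# Parsed as a method call with the character literal as its argument.
result = echo ?=
```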
Matz.
Updated by Dan0042 (Daniel DeLorme) 6 days ago
The single-character syntax itself isn't a problem if limited to instance variables. foo ?= is already valid and shouldn't change, but @foo ?= isn't currently valid, so it could serve as "assign if ivar undefined."
The problem is consistency. All operator-equals decompose uniformly (a += b -> a = a + b) but ?= and its variations (like @||=) would be very different. What should happen with cases like:
foo ?= expr
foo[key] ?= expr
foo.bar ?= expr
If that's a syntax error, it's inconsistent with other operators like += or ||=.
If it means "assign if nil", it's inconsistent with @foo ?= expr as "assign if undefined".
And if @foo ?= expr is changed to mean "assign if nil" then we're back to the nil coalescing operator which is a different topic (#13820)
An operator for "assign if undefined" is appealing, but the implications don't work out cleanly.
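The uniformity being referred to can be seen with the operators that exist today: the same compound assignment works on local variables, element references, attributes, and instance variables. A short sketch, where `Box` is a hypothetical Struct used only to provide an attribute target:

```ruby
# Hypothetical Struct, used only to demonstrate an attribute target.
Box = Struct.new(:bar)

count = 1
count += 2            # same as: count = count + 2

table = {}
table[:key] ||= "x"   # roughly: table[:key] || (table[:key] = "x")

box = Box.new
box.bar ||= "y"       # roughly: box.bar || (box.bar = "y")

@ivar = nil
@ivar ||= "z"         # nil is falsy, so the assignment happens
```

An "assign if undefined" operator has no obvious counterpart for `table[:key]` or `box.bar`, since "defined" is only a meaningful question for variables.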