Ruby Issue Tracking System https://bugs.ruby-lang.org/ 2020-11-25T21:34:04Z
Ruby master - Feature #17342: Hash#fetch_set https://bugs.ruby-lang.org/issues/17342?journal_id=88746 2020-11-25T21:34:04Z chrisseaton (Chris Seaton) chris@chrisseaton.com
<ul></ul><p>Thanks I've always wanted this feature. Whenever I write <code>cache.fetch(key) { cache[key] = calculation }</code> (or more often <code>cache.fetch(key) { |k| cache[k] = calculation }</code>) I think 'surely this pattern is worth making simpler.'</p> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=887472020-11-25T22:10:48Zduerst (Martin Dürst)duerst@it.aoyama.ac.jp
<ul></ul><p>I think the feature in general is okay, but I have two concerns:</p>
<ol>
<li>
<p>The name very easily suggests that the method is fetching a set, rather than fetching and setting at the same time.</p>
</li>
<li>
<p>Why do we need a block for the second parameter? Can't that just be an ordinary parameter?</p>
</li>
</ol> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=887482020-11-25T22:16:40Zchrisseaton (Chris Seaton)chris@chrisseaton.com
<ul></ul><p><code>#fetch_or_set</code> could be a good name.</p>
<blockquote>
<p>Why do we need a block for the second parameter? Can't that just be an ordinary parameter?</p>
</blockquote>
<p>Because we don't want to do the work of calculating the initial value if it isn't needed because the value is already set.</p> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=887492020-11-25T23:25:19Zphluid61 (Matthew Kerwin)matthew@kerwin.net.au
<ul></ul><p>chrisseaton (Chris Seaton) wrote in <a href="#note-3">#note-3</a>:</p>
<blockquote>
<p><code>#fetch_or_set</code> could be a good name.</p>
<blockquote>
<p>Why do we need a block for the second parameter? Can't that just be an ordinary parameter?</p>
</blockquote>
<p>Because we don't want to do the work of calculating the initial value if it isn't needed because the value is already set.</p>
</blockquote>
<p>I can see the utility; it would be good if it had a similar signature to #fetch:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="n">hsh</span> <span class="o">=</span> <span class="p">{}</span>
<span class="n">hsh</span><span class="p">.</span><span class="nf">fetch_or_set</span><span class="p">(</span><span class="ss">:a</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span> <span class="c1"># => 1, hsh = {:a => 1}</span>
<span class="n">hsh</span><span class="p">.</span><span class="nf">fetch_or_set</span><span class="p">(</span><span class="ss">:b</span><span class="p">)</span> <span class="p">{</span><span class="o">|</span><span class="n">key</span><span class="o">|</span> <span class="n">key</span><span class="p">.</span><span class="nf">to_s</span> <span class="p">}</span> <span class="c1"># => "b", hsh = {:a => 1, :b => "b"}</span>
</code></pre>
<p>The only reason I don't suggest it as a keyword argument to #fetch is the edge case of including the kwarg and <em>not</em> including the default value/block; otherwise <code>hsh.fetch(:a, 1, store_default: true)</code> would be okay.</p> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=887502020-11-26T00:42:30ZMaxLap (Maxime Lapointe)hunter_spawn@hotmail.com
<ul></ul><p>I didn't put an example using 2 parameters instead of a block, but yes, that option is also available. It has the same signature as a normal <code>fetch</code> would. In the examples, calculation was meant to be a complex thing, possibly a method call, which we want to avoid doing if the key is already in the Hash.</p>
<p>I understand the idea that it could be about fetching a set, but I don't see what set that would be. So it doesn't feel ambiguous to me. I did consider the <code>fetch_or_set</code> alternative, but core methods normally prefer to be shorter when the general idea is pretty clear. But it's not a bad name and I would still be happy to have the feature if that was it.</p> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=887512020-11-26T04:01:56Znobu (Nobuyoshi Nakada)nobu@ruby-lang.org
<ul></ul><p>XXX_or_YYY doesn't feel like a good name in general.<br>
Why does it need to be a method of <code>Hash</code>, and built-in?</p>
<p>Why not a separate class?</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="k">class</span> <span class="nc">Cache</span> <span class="o"><</span> <span class="no">Hash</span>
<span class="k">def</span> <span class="nf">fetch</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="o">&</span><span class="n">block</span><span class="p">)</span>
<span class="k">super</span><span class="p">(</span><span class="n">key</span><span class="p">)</span> <span class="p">{</span><span class="nb">self</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">=</span> <span class="k">yield</span><span class="p">(</span><span class="n">key</span><span class="p">)}</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=887612020-11-26T07:57:20Zjbeschi (jacopo beschi)intrip@gmail.com
<ul></ul><p><code>fetch_set</code> mixes the concept of query with the concept of command and I think it's not a good approach.</p>
<p>Moreover I don't really see any big advantage in writing</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="n">cache</span><span class="p">.</span><span class="nf">fetch_set</span><span class="p">(</span><span class="n">key</span><span class="p">)</span> <span class="p">{</span> <span class="n">calculation</span> <span class="p">}</span>
</code></pre>
<p>vs</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="n">cache</span><span class="p">.</span><span class="nf">fetch</span><span class="p">(</span><span class="n">key</span><span class="p">)</span> <span class="p">{</span> <span class="n">cache</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">=</span> <span class="n">calculation</span> <span class="p">}</span>
</code></pre>
<p>Therefore I don't understand why we should add it to the language.</p> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=887712020-11-26T14:37:03ZEregon (Benoit Daloze)
<ul></ul><p>Another name for this is <code>compute_if_absent</code>.<br>
It's notably the name used in concurrent-ruby:<br>
<a href="http://ruby-concurrency.github.io/concurrent-ruby/1.1.5/Concurrent/Map.html#compute_if_absent-instance_method" class="external">http://ruby-concurrency.github.io/concurrent-ruby/1.1.5/Concurrent/Map.html#compute_if_absent-instance_method</a></p>
<p>And in Java, where it's even defined on Map (not just ConcurrentMap):<br>
<a href="https://docs.oracle.com/javase/8/docs/api/java/util/Map.html#computeIfAbsent-K-java.util.function.Function-" class="external">https://docs.oracle.com/javase/8/docs/api/java/util/Map.html#computeIfAbsent-K-java.util.function.Function-</a></p>
<p>Having it as a built-in method, it also makes it possible to avoid computing the key's <code>#hash</code> twice, and potentially avoid redoing the lookup in the Hash.</p>
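<p>A minimal sketch of the repeated hashing, not taken from the thread: the singleton <code>hash</code> method and the <code>hash_calls</code> counter are illustrative assumptions. The exact count depends on the Ruby implementation and on the state of the Hash, so no particular number is claimed here.</p>

```ruby
# Illustrative only: count how often the key's #hash is invoked
# by the classic fetch-then-assign pattern.
hash_calls = 0
key = Object.new
key.define_singleton_method(:hash) do
  hash_calls += 1
  42 # any fixed Integer works as a hash value
end

cache = { other: 1 } # pre-existing, unrelated entry
cache.fetch(key) { cache[key] = :computed }

puts "#hash was called #{hash_calls} time(s)"
```

<p>A built-in method could, in principle, hash the key once and reuse the result for both the lookup and the store, which this pure-Ruby pattern cannot express.</p>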
<p>Another way to do this pattern is to use the default block:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="no">RequestStore</span><span class="p">.</span><span class="nf">store</span> <span class="o">=</span> <span class="no">Hash</span><span class="p">.</span><span class="nf">new</span> <span class="k">do</span> <span class="o">|</span><span class="n">h</span><span class="p">,</span> <span class="n">k</span><span class="o">|</span>
<span class="n">h</span><span class="p">[</span><span class="n">k</span><span class="p">]</span> <span class="o">=</span> <span class="o">!</span><span class="no">MonitorValue</span><span class="p">.</span><span class="nf">where</span><span class="p">(</span><span class="s1">'date >= ?'</span><span class="p">,</span> <span class="no">Time</span><span class="p">.</span><span class="nf">now</span> <span class="o">-</span> <span class="mi">5</span><span class="p">.</span><span class="nf">minutes</span><span class="p">).</span><span class="nf">exists?</span>
<span class="k">end</span>
<span class="no">RequestStore</span><span class="p">.</span><span class="nf">store</span><span class="p">[</span><span class="ss">:monitor_value_is_delayed?</span><span class="p">]</span>
</code></pre>
<p>Which already works fine.<br>
And it has the advantage that if multiple places want to read from the Hash they don't have to repeat the code.<br>
Is there a case this pattern wouldn't work and where <code>Hash#fetch_set</code> would work?</p>
<p>This pattern can be made to work with parallelism too, see <a href="https://eregon.me/blog/assets/research/thesis-thread-safe-data-representations-in-dynamic-languages.pdf" class="external">Idiomatic Concurrent Hash Operations</a>, page 83.</p>
<p>Regarding concurrency and parallelism, we need to define the semantics if we add this method.</p>
<p>Of course, the assignment should not be performed if there is already a key, it must be "put if absent" semantics<br>
(<code>cache.fetch(key) { cache[key] = calculation }</code> is actually breaking that).</p>
<p>The question is whether the given block can be executed multiple times for a given key.<br>
If not, it requires synchronization while calling the block, which can lead to deadlocks.<br>
If yes, it doesn't require synchronization while calling the block which seems safer, but it means the block can be called multiple times.</p> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=887732020-11-26T14:53:07ZMaxLap (Maxime Lapointe)hunter_spawn@hotmail.com
<ul></ul><p>I forgot to mention, but this pattern can also be more performant, as the key only needs to be hashed once for both the fetching and setting part. It's minor, but I do think of it every time I write the <code>fetch</code> pattern, and when the key is complex, this could be meaningful. (Dang, Eregon beat me to this by 2 minutes!)</p>
<hr>
<p>nobu (Nobuyoshi Nakada) wrote in <a href="#note-6">#note-6</a>:</p>
<blockquote>
<p>Why not a separate class?</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="k">class</span> <span class="nc">Cache</span> <span class="o"><</span> <span class="no">Hash</span>
<span class="k">def</span> <span class="nf">fetch</span><span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="o">&</span><span class="n">block</span><span class="p">)</span>
<span class="k">super</span><span class="p">(</span><span class="n">key</span><span class="p">)</span> <span class="p">{</span><span class="nb">self</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">=</span> <span class="k">yield</span><span class="p">(</span><span class="n">key</span><span class="p">)}</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre>
</blockquote>
<p>The Hash that we are using is not always under our control. In my second example, I'm using the <code>request_store</code> gem, which exposes a <code>Hash</code> (<code>RequestStore.store</code>) as a cache. It would be quite dirty to monkey patch this. I could add that custom Hash in the <code>store</code>, but that would lose the cleaner and shorter code that I'm trying to achieve with this feature request.</p>
<p>Using such a class is also less performant rather than more. (This is a very minor point.)</p>
<p>Also: If I see somewhere <code>cache.fetch(key) { calculation }</code> I will instantly be worried:</p>
<ul>
<li>Did the person forget to set the key?</li>
<li>Is it only set elsewhere?<br>
=> Oh wait, it's a custom class that does something different.</li>
<li>Will one of my colleagues copy this pattern without remembering that this is a different class?</li>
</ul>
<p>So if I was to make a different class, I would still use a different name just to avoid the frictions it would cause.</p>
<hr>
<p>jbeschi (jacopo beschi) wrote in <a href="#note-7">#note-7</a>:</p>
<blockquote>
<p><code>fetch_set</code> mixes the concept of query with the concept of command and I think it's not a good approach.</p>
</blockquote>
<p>Maybe in name it appears so, but not in spirit. All this does is set a value if one isn't already there, and in common Ruby spirit, it returns something that can be useful, which is the value at the key.</p>
<p>This is where the Python name for the function comes from, <code>setdefault</code> sets a value "by default" for a single key, and is often used to replace <code>cache[key] ||= []</code> since Python doesn't support this syntax.</p>
<hr>
<p>Eregon (Benoit Daloze) wrote in <a href="#note-8">#note-8</a>:</p>
<blockquote>
<p>Another name for this is <code>compute_if_absent</code>.</p>
</blockquote>
<p>True, however my thinking is this could also be used with a 2nd argument, like <code>fetch</code>, in which case there is no "computing" to speak of.</p>
<blockquote>
<p>Another way to do this pattern is to use the default block:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="no">RequestStore</span><span class="p">.</span><span class="nf">store</span> <span class="o">=</span> <span class="no">Hash</span><span class="p">.</span><span class="nf">new</span> <span class="k">do</span> <span class="o">|</span><span class="n">h</span><span class="p">,</span> <span class="n">k</span><span class="o">|</span>
<span class="n">h</span><span class="p">[</span><span class="n">k</span><span class="p">]</span> <span class="o">=</span> <span class="o">!</span><span class="no">MonitorValue</span><span class="p">.</span><span class="nf">where</span><span class="p">(</span><span class="s1">'date >= ?'</span><span class="p">,</span> <span class="no">Time</span><span class="p">.</span><span class="nf">now</span> <span class="o">-</span> <span class="mi">5</span><span class="p">.</span><span class="nf">minutes</span><span class="p">).</span><span class="nf">exists?</span>
<span class="k">end</span>
<span class="no">RequestStore</span><span class="p">.</span><span class="nf">store</span><span class="p">[</span><span class="ss">:monitor_value_is_delayed?</span><span class="p">]</span>
</code></pre>
<p>Which already works fine.<br>
And it has the advantage that if multiple places want to read from the Hash they don't have to repeat the code.<br>
Is there a case this pattern wouldn't work and where <code>Hash#fetch_set</code> would work?</p>
</blockquote>
<p>RequestStore is used for lots of different things, setting a default like that means that if I use it for <code>RequestStore.store[:last_count]</code>, and that wasn't set, then I would instead be doing this MonitorValue check. And I can only use your pattern once per Hash.</p>
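<p>A small sketch of that side effect, using a plain Hash in place of <code>RequestStore.store</code>. The <code>calls</code> array and the placeholder value are assumptions for illustration only.</p>

```ruby
# A single default block serves every missing key, related or not.
calls = []
store = Hash.new do |h, k|
  calls.push(k)            # record which key triggered the block
  h[k] = :expensive_result # stand-in for the MonitorValue query
end

store[:monitor_value_is_delayed?] # intended use of the default block
store[:last_count]                # unrelated key runs the same block

p calls
```

<p>Both lookups run the block, so the unrelated <code>:last_count</code> key ends up holding the result meant for the other key.</p>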
<blockquote>
<p>This pattern can be made to work with parallelism too, see <a href="https://eregon.me/blog/assets/research/thesis-thread-safe-data-representations-in-dynamic-languages.pdf" class="external">Idiomatic Concurrent Hash Operations</a>, page 83.</p>
<p>Regarding concurrency and parallelism, we need to define the semantics if we add this method.</p>
<p>Of course, the assignment should not be performed if there is already a key, it must be "put if absent" semantics<br>
(<code>cache.fetch(key) { cache[key] = calculation }</code> is actually breaking that).</p>
</blockquote>
<p>I don't understand what you mean, how is it breaking that? You mean in the case of threading, where there could be 2 assignments if 2 threads go in at the same time? I don't think it's the job of the Hash to deal with this.</p>
<blockquote>
<p>The question is whether the given block can be executed multiple times for a given key.<br>
If not, it requires synchronization while calling the block, which can lead to deadlocks.<br>
If yes, it doesn't require synchronization while calling the block which seems safer, but it means the block can be called multiple times.</p>
</blockquote>
<p>I don't consider Hash to be a concurrency primitive in Ruby. So I wouldn't put any synchronization here. If synchronization is needed, it can be done from inside the block (or the block of the classic <code>fetch</code> pattern).</p> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=887762020-11-26T15:29:53ZEregon (Benoit Daloze)
<ul></ul><p>MaxLap (Maxime Lapointe) wrote in <a href="#note-9">#note-9</a>:</p>
<blockquote>
<p>The Hash that we are using is not always under our control. In my second example, I'm using the <code>request_store</code> gem, which exposes a <code>Hash</code> (<code>RequestStore.store</code>) as a cache.</p>
</blockquote>
<p>I see, <a href="https://github.com/steveklabnik/request_store" class="external">https://github.com/steveklabnik/request_store</a>, so you might want a different block per key essentially, and so a single default block isn't enough.</p>
<blockquote>
<p>It would be quite dirty to monkey patch this.</p>
</blockquote>
<p>There is <code>Hash#default_proc=</code> but it would be dirty in this case indeed.</p>
<blockquote>
<p>I don't understand what you mean, how is it breaking that? You mean in the case of threading, where there could be 2 assignments if 2 threads go in at the same time? I don't think it's the job of the Hash to deal with this.</p>
</blockquote>
<p>Yes, exactly.<br>
Many gems assume Hash is thread-safe, so in my opinion it needs to be thread-safe, and it is at least in CRuby.<br>
Even without threads you need to consider cases where the block itself has already set the key, in which case you shouldn't set it again.</p>
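<p>One way to picture "put if absent" in pure Ruby is sketched below. The top-level <code>fetch_set</code> helper is hypothetical (it is not an existing Hash method), and this is only one possible reading of the semantics under discussion.</p>

```ruby
# Hypothetical helper: store the block's result only if the key is
# still absent once the block has finished running.
def fetch_set(hash, key)
  hash.fetch(key) do
    value = yield(key)
    # Re-check: the block may have stored the key itself.
    hash.key?(key) ? hash[key] : (hash[key] = value)
  end
end

cache = {}
result = fetch_set(cache, :a) do |k|
  cache[k] = :from_block # the block stores the key on its own
  :from_return           # this return value is then discarded
end

p result
p cache
```

<p>Here the value stored by the block wins, and the block's return value is not written over it. Without the re-check, the plain <code>fetch</code> pattern would overwrite it.</p>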
<blockquote>
<p>I don't consider Hash to be a concurrency primitive in Ruby. So I wouldn't put any synchronization here. If synchronization is needed, it can be done from inside the block (or the block of the classic <code>fetch</code> pattern).</p>
</blockquote>
<p>This method fits very clearly for caches, and caches are typically used from multiple threads, so I think we should consider the concurrency semantics.<br>
Concurrency can also happen for Hash when mutating while iterating, or some methods can be called recursively, so even without threads it needs to be considered.<br>
Agreed no synchronization when calling the block is better, and the block can make its own synchronization to avoid running twice per key if really needed.</p>
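<p>A minimal sketch of a block doing its own synchronization, written against the classic <code>fetch</code> pattern. The <code>lock</code>, <code>runs</code>, and <code>lookup</code> names are assumptions for illustration, and a single global Mutex is the simplest possible choice, not a recommendation.</p>

```ruby
cache = {}
lock  = Mutex.new
runs  = 0 # how many times the expensive part actually ran

lookup = lambda do |key|
  cache.fetch(key) do
    # Only threads that missed the key reach the lock.
    lock.synchronize do
      # Check again: another thread may have filled the key meanwhile.
      cache.fetch(key) do
        runs += 1
        cache[key] = key.to_s * 2 # stand-in for the real calculation
      end
    end
  end
end

threads = 4.times.map { Thread.new { lookup.call(:a) } }
threads.each { |t| t.join }

p runs
p cache
```

<p>The outer <code>fetch</code> stays lock-free for keys that are already present. Only a miss pays for the Mutex, and the second check inside it keeps the calculation from running twice for the same key.</p>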
<p>On CRuby, the GIL will be enough to do a check if the key is already set just before setting it from the result of the block, but the code needs to be careful there is no Ruby call between that check and assignment, or it would break with multiple threads.</p> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=887872020-11-27T02:05:34Znobu (Nobuyoshi Nakada)nobu@ruby-lang.org
<ul></ul><p>Eregon (Benoit Daloze) wrote in <a href="#note-8">#note-8</a>:</p>
<blockquote>
<p>Having it as a built-in method, it also makes it possible to avoid computing the key's <code>#hash</code> twice, and potentially avoid redoing the lookup in the Hash.</p>
</blockquote>
<p>It is true <strong>ideally</strong>, but no one can guarantee the hash value never changes.</p>
<hr>
<p>MaxLap (Maxime Lapointe) wrote in <a href="#note-9">#note-9</a>:</p>
<blockquote>
<p>The Hash that we are using is not always under our control. In my second example, I'm using the <code>request_store</code> gem, which exposes a <code>Hash</code> (<code>RequestStore.store</code>) as a cache. It would be quite dirty to monkey patch this. I could add that custom Hash in the <code>store</code>, but that would lose the cleaner and shorter code that I'm trying to achieve with this feature request.</p>
</blockquote>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="k">module</span> <span class="nn">Caching</span>
<span class="n">refine</span> <span class="no">Hash</span> <span class="k">do</span>
<span class="k">def</span> <span class="nf">cache</span><span class="p">(</span><span class="n">key</span><span class="p">)</span>
<span class="n">fetch</span><span class="p">(</span><span class="n">key</span><span class="p">)</span> <span class="p">{</span><span class="nb">self</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">=</span> <span class="k">yield</span><span class="p">(</span><span class="n">key</span><span class="p">)}</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="k">end</span>
<span class="n">using</span> <span class="no">Caching</span>
<span class="no">RequestStore</span><span class="p">.</span><span class="nf">store</span><span class="p">.</span><span class="nf">cache</span><span class="p">(</span><span class="ss">:monitor_value_is_delayed?</span><span class="p">)</span> <span class="k">do</span>
<span class="o">!</span><span class="no">MonitorValue</span><span class="p">.</span><span class="nf">where</span><span class="p">(</span><span class="s1">'date >= ?'</span><span class="p">,</span> <span class="no">Time</span><span class="p">.</span><span class="nf">now</span> <span class="o">-</span> <span class="mi">5</span><span class="p">.</span><span class="nf">minutes</span><span class="p">).</span><span class="nf">exists?</span>
<span class="k">end</span>
</code></pre> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=887952020-11-27T06:05:56Zmarcandre (Marc-Andre Lafortune)marcandre-ruby-core@marc-andre.ca
<ul></ul><p>On Thu, Nov 26, 2020 at 9:05 PM <a href="mailto:nobu@ruby-lang.org" class="email">nobu@ruby-lang.org</a> wrote:</p>
<blockquote>
<p>It is true <strong>ideally</strong>, but no one can guarantee the hash value never changes.</p>
</blockquote>
<p>Please, let's be serious. <code>hash[obj] ||= something_that_changes_obj</code> is nonsense code.</p> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=887972020-11-27T07:26:55Zshyouhei (Shyouhei Urabe)shyouhei@ruby-lang.org
<ul></ul><p>+1 for the feature. I have had chances to write what is proposed here several times.</p>
<p>marcandre (Marc-Andre Lafortune) wrote in <a href="#note-12">#note-12</a>:</p>
<blockquote>
<p>On Thu, Nov 26, 2020 at 9:05 PM <a href="mailto:nobu@ruby-lang.org" class="email">nobu@ruby-lang.org</a> wrote:</p>
<blockquote>
<p>It is true <strong>ideally</strong>, but no one can guarantee the hash value never changes.</p>
</blockquote>
<p>Please, let's be serious. <code>hash[obj] ||= something_that_changes_obj</code> is nonsense code.</p>
</blockquote>
<p>Yes I’m quite sure he is dead serious. There are idiots who write nonsense codes. As a language ruby has to be fool-proof.</p> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=888962020-12-03T07:03:18Zmarcandre (Marc-Andre Lafortune)marcandre-ruby-core@marc-andre.ca
<ul></ul><p>I think this is a good feature. I like the fact that this cannot be implemented in pure Ruby as efficiently (i.e. with the hash calculated once).</p>
<p>An alternative name is <code>fetch_init</code>, as it may give a better idea that it is meant to initialize the value (once) for the key. That being said, <code>fetch_set</code> is a good name too.</p> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=889082020-12-03T19:13:19ZEregon (Benoit Daloze)
<ul></ul><p>shyouhei (Shyouhei Urabe) wrote in <a href="#note-13">#note-13</a>:</p>
<blockquote>
<p>Yes I’m quite sure he is dead serious. There are idiots who write nonsense codes. As a language ruby has to be fool-proof.</p>
</blockquote>
<p>I think such semantics (calling #hash only once) could be part of the method documentation.<br>
OTOH calling <code>#hash</code> twice is probably not very expensive, and for some cases (e.g. Integer, String) can be optimized by a JIT or by memoizing the hash on the instance.</p>
<p>I don't like the name <code>fetch_set</code> much (does it fetch a Set?).<br>
I think <code>fetch_or_set</code> or <code>cache</code> are better.</p> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=890712020-12-10T07:32:10Zmatz (Yukihiro Matsumoto)matz@ruby.or.jp
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Feedback</i></li></ul><p>In most cases, <code>hash[:key] ||= init</code> works. The exception is when the <code>init</code> value is false. But it should be rare.<br>
So could you explain the concrete use-case for <code>fetch_set()</code>?</p>
<p>Besides that I am not fully satisfied with the name <code>fetch_set()</code>.</p>
<p>Matz.</p> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=890722020-12-10T07:40:36Zmarcandre (Marc-Andre Lafortune)marcandre-ruby-core@marc-andre.ca
<ul></ul><p>matz (Yukihiro Matsumoto) wrote in <a href="#note-16">#note-16</a>:</p>
<blockquote>
<p>In most cases, <code>hash[:key] ||= init</code> works. The exception is when the <code>init</code> value is false. But it should be rare.</p>
</blockquote>
<p>We could look into public codebases for better numbers; it is not the most common case, but it is definitely encountered with some frequency (more with <code>nil</code> than <code>false</code>).</p>
<blockquote>
<p>So could you explain the concrete use-case for <code>fetch_set()</code>?</p>
</blockquote>
<p>One additional use-case would be a <em>thread-safe equivalent</em> to <code>Ractor.current[:my_key] ||= init</code>.</p>
<blockquote>
<p>Besides that I am not fully satisfied with the name <code>fetch_set()</code>.</p>
</blockquote>
<p>Did you consider <code>fetch_init</code>?</p> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=891582020-12-10T15:51:19ZDan0042 (Daniel DeLorme)
<ul></ul><p>marcandre (Marc-Andre Lafortune) wrote in <a href="#note-17">#note-17</a>:</p>
<blockquote>
<p>One additional use-case would be a <em>thread-safe equivalent</em> to <code>Ractor.current[:my_key] ||= init</code>.</p>
</blockquote>
<p>+1</p> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=891632020-12-10T18:17:04ZMaxLap (Maxime Lapointe)hunter_spawn@hotmail.com
<ul></ul><p>matz (Yukihiro Matsumoto) wrote in <a href="#note-16">#note-16</a>:</p>
<blockquote>
<p>In most cases, <code>hash[:key] ||= init</code> works. The exception is when the <code>init</code> value is false. But it should be rare.</p>
</blockquote>
<p>The problem is not only with <code>false</code>, but with <code>nil</code> too.</p>
<p>Anytime you want to cache something and nil (or false) is a possible value (often meaning the data is missing), this problem arises, as it means these cases are recalculated each time, to again indicate that the data is missing.</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="c1"># If there is no config with that internal_id, each time you try to use the </span>
<span class="c1"># cached version, which is meant to be fast, it would actually do a query to the database</span>
<span class="k">def</span> <span class="nc">self</span><span class="o">.</span><span class="nf">cached_by_internal_id</span><span class="p">(</span><span class="n">internal_id</span><span class="p">)</span>
<span class="n">my_cache</span><span class="p">[</span><span class="n">internal_id</span><span class="p">]</span> <span class="o">||=</span> <span class="no">Config</span><span class="p">.</span><span class="nf">where</span><span class="p">(</span><span class="ss">internal_id: </span><span class="n">internal_id</span><span class="p">).</span><span class="nf">first</span>
<span class="k">end</span>
</code></pre>
<p>It is very easy to forget that nil values mean the cache is not going to act as a cache. It probably happens in many places, but it is not critical; it just makes things a little slower by redoing work many times.</p>
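<p>The difference can be sketched with a stand-in for the database query. The <code>find_config</code> lambda and the <code>lookups</code> counter are assumptions for illustration. The lambda always returns <code>nil</code> to simulate a missing record.</p>

```ruby
lookups = 0
find_config = lambda do |internal_id|
  lookups += 1
  nil # simulate "no config with that internal_id"
end

# With ||=, a cached nil is indistinguishable from a missing key.
memo = {}
3.times { memo[:x] ||= find_config.call(:x) }
lookups_with_or_assign = lookups

# With fetch plus assignment, the nil is stored once and then reused.
lookups = 0
memo = {}
3.times { memo.fetch(:x) { memo[:x] = find_config.call(:x) } }
lookups_with_fetch = lookups

p lookups_with_or_assign # 3
p lookups_with_fetch     # 1
```

<p>The <code>||=</code> version repeats the lookup on every call, while the <code>fetch</code> version treats the stored <code>nil</code> as a valid cached answer.</p>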
<hr>
<p>As for the name, there are other options that were suggested:</p>
<ul>
<li>fetch_init</li>
<li>cache</li>
</ul> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=891652020-12-10T22:34:28Zp8 (Petrik de Heus)
<ul></ul><p>Elixir has put_new and put_new_lazy:<br>
<a href="https://hexdocs.pm/elixir/Map.html#put_new_lazy/3" class="external">https://hexdocs.pm/elixir/Map.html#put_new_lazy/3</a></p>
<p>Maybe store_if_new or store_new ?</p>
<p><code>cache.store_new(key) { calculation } </code></p> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=891742020-12-11T03:53:28Zaustin (Austin Ziegler)halostatue@gmail.com
<ul></ul><p>My emails appear not to be making it through the mailing list gateway into the tickets, so…</p>
<p><a href="#note-15">#note-15</a></p>
<p>I’m mostly doing Elixir these days, and one of the caching modules I use has as its main interface <code>fetch_or_set</code>, so I think that’s the best choice if this is added.</p>
<p>On the other hand, Nobu’s refinement implementation (<a href="#note-11">#note-11</a>) looks pretty good (and I say this not using refinements yet, even though they’ve been around for a while).</p>
<p><a href="#note-20">#note-20</a></p>
<p><code>put_new</code> doesn’t work like this in Elixir, but it is similar in some ways (put only if the key doesn’t exist).</p>
<p>The correct Ruby implementation for this would be something like</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="nb">hash</span><span class="p">.</span><span class="nf">key?</span><span class="p">(</span><span class="n">key</span><span class="p">)</span> <span class="o">||</span> <span class="nb">hash</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">=</span> <span class="n">value</span>
</code></pre>
<p>The advantage of a <code>fetch_or_set</code> method is that it can be implemented atomically for multithreading runtimes.</p> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=892722020-12-17T05:28:19Zshyouhei (Shyouhei Urabe)shyouhei@ruby-lang.org
<ul></ul><p>MaxLap (Maxime Lapointe) wrote in <a href="#note-19">#note-19</a>:</p>
<blockquote>
<p>matz (Yukihiro Matsumoto) wrote in <a href="#note-16">#note-16</a>:</p>
<blockquote>
<p>In most cases, <code>hash[:key] ||= init</code> works. The exception is when the <code>init</code> value is false. But it should be rare.</p>
</blockquote>
<p>The problem is not only with <code>false</code>, but with <code>nil</code> too.</p>
<p>Anytime you want to cache something and nil (or false) is a possible value (often meaning the data is missing), this problem arises,</p>
</blockquote>
<p>Matz says it's rare.</p>
<p>I'm not sure if that is true, but so far nobody has shown any evidence of its rarity.</p> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=1069972024-02-26T15:43:09ZEregon (Benoit Daloze)
<ul><li><strong>Related to</strong> <i><a class="issue tracker-2 status-1 priority-4 priority-default" href="/issues/20300">Feature #20300</a>: Hash: set value and get pre-existing value in one call</i> added</li></ul> Ruby master - Feature #17342: Hash#fetch_sethttps://bugs.ruby-lang.org/issues/17342?journal_id=1070002024-02-26T15:55:42ZDan0042 (Daniel DeLorme)
<ul></ul><p>It's confusing that this ticket, <a class="issue tracker-2 status-7 priority-4 priority-default closed" title="Feature: Hash#fetch_set (Feedback)" href="https://bugs.ruby-lang.org/issues/17342">#17342</a>, is struck out even though the status is "Feedback".</p>
<p>Another name for this could be <code>Hash#add</code>; at least that's the terminology used in memcached and I believe some other caching servers.</p>