https://bugs.ruby-lang.org/https://bugs.ruby-lang.org/favicon.ico?17113305112019-09-05T23:25:42ZRuby Issue Tracking SystemRuby master - Feature #16146: Array .difference allow custom comparisonhttps://bugs.ruby-lang.org/issues/16146?journal_id=814092019-09-05T23:25:42Zngomez (Nancy Gomez)
<ul><li><strong>Subject</strong> changed from <i>Array .distance allow custom comparison</i> to <i>Array .difference allow custom comparison</i></li></ul> Ruby master - Feature #16146: Array .difference allow custom comparisonhttps://bugs.ruby-lang.org/issues/16146?journal_id=814112019-09-06T02:05:35Zduerst (Martin Dürst)duerst@it.aoyama.ac.jp
<ul></ul><p>Please provide an actual usage example directly in this issue, not only by reference.</p> Ruby master - Feature #16146: Array .difference allow custom comparisonhttps://bugs.ruby-lang.org/issues/16146?journal_id=814172019-09-06T08:39:25Zshevegen (Robert A. Heiler)shevegen@gmail.com
<ul></ul><p>Seems to be the same link as <a href="https://bugs.ruby-lang.org/issues/16118" class="external">https://bugs.ruby-lang.org/issues/16118</a>; so perhaps accidental double<br>
post; could close one of these two - or perhaps both. :D</p>
<p>As there was not yet any comment on the other post, I'll comment briefly here.</p>
<p>I think the standard way for custom comparisons in ruby is to define the strategy you wish to<br>
employ in order to compare e. g elements in an array. It's a bit short-sighted to think that<br>
one has to rely solely on .difference alone for that comparison.</p>
<blockquote>
<p>such as #uniq but #difference does not :(</p>
</blockquote>
<p>If the issue is related between #uniq and #difference then I think the issue request should<br>
focus on precisely that part. This would make it easier for the ruby core team (respectively<br>
matz) to consider the merits (pro/con) of the suggestion itself. It's too difficult to<br>
try to understand what is meant in external sites otherwise, at the least I think it is.</p> Ruby master - Feature #16146: Array .difference allow custom comparisonhttps://bugs.ruby-lang.org/issues/16146?journal_id=814842019-09-09T17:12:02Zngomez (Nancy Gomez)
<ul></ul><p>As recommended I'll place in example directly here:</p>
<p>The feature I am requesting is to allow custom comparison of the Array #difference.<br>
Currently it is comparing with each item's #eql and #hash but in the case of Objects you would need to redefine these to get the intended result.</p>
<p>In my case, I can not simply redefine these as getting the difference of the two arrays of Objects is one small portion of the overall logic, the objects I am using come from a third party which also uses the #eql and #hash for other logic I rely on. So I would have to set and somehow unset my definition (which not sure I can unset anyway) or I can just not use the #difference which seems like such a waste because it would be such an elegant and clear solution.</p>
<p>Here is an example of my desired usage, let's say I have the following class:</p>
<pre><code># Defined by a third party which uses #eql and #hash for other logic
class Num
def initialize(val)
@val = val
end
def val
@val
end
end
</code></pre>
<p>And I have the following arrays:</p>
<pre><code>all = [Num.new(1), Num.new(2), Num.new(3)]
subset = [Num.new(1), Num.new(2)]
</code></pre>
<p>I want to find all the values that the <code>subset</code> is missing from <code>all</code>:</p>
<pre><code>all.difference(subset)
=> [#<Num:0x00007fcae19e9540 @val=1>, #<Num:0x00007fcae19e9518 @val=2>, #<Num:0x00007fcae19e94f0 @val=3>]
</code></pre>
<p>But of course, #difference is comparing view #eql and #hash so I attempt to define how to compare:</p>
<pre><code>all.difference(subset) { |a, b| a.val <=> b.val }
=> [#<Num:0x00007fcae19e9540 @val=1>, #<Num:0x00007fcae19e9518 @val=2>, #<Num:0x00007fcae19e94f0 @val=3>]
```
Still no luck, and I'm not 100% sure that the syntax is exactly correct anyway, so I take a look at the docs: https://ruby-doc.org/core-2.6/Array.html#method-i-difference
And notice no example shows custom comparison, and the implementation doesn't seem to support it either.
I would love it if instead this would be the result:
```
all.difference(subset) { |a, b| a.val <=> b.val }
=> [#<Num:0x00007fcae19e94f0 @val=3>]
```
Does that make sense?
</code></pre> Ruby master - Feature #16146: Array .difference allow custom comparisonhttps://bugs.ruby-lang.org/issues/16146?journal_id=815152019-09-11T14:10:47ZDan0042 (Daniel DeLorme)
<ul></ul><p>What you're asking for is a O(n²) operation. I'm not sure it's a good idea to make that so easy and transparent as a core method.</p>
<p>This is quite easy to code and makes the (inefficiency of) two loops more apparent:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="n">all</span><span class="p">.</span><span class="nf">reject</span><span class="p">{</span> <span class="o">|</span><span class="n">a</span><span class="o">|</span> <span class="n">subset</span><span class="p">.</span><span class="nf">find</span><span class="p">{</span> <span class="o">|</span><span class="n">s</span><span class="o">|</span> <span class="n">s</span><span class="p">.</span><span class="nf">val</span> <span class="o">==</span> <span class="n">a</span><span class="p">.</span><span class="nf">val</span> <span class="p">}</span> <span class="p">}</span>
</code></pre>
<p>The one advantage I can see to the <=> comparator though is that it could possibly result in an algorithm more efficient than O(n²). The implementation could sort the two lists and then do a diff-like operation.</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="n">only_all</span><span class="p">,</span><span class="n">intersection</span><span class="p">,</span><span class="n">only_subset</span> <span class="o">=</span> <span class="n">all</span><span class="p">.</span><span class="nf">uniqdiff3</span><span class="p">(</span><span class="n">subset</span><span class="p">){</span> <span class="o">|</span><span class="n">a</span><span class="p">,</span><span class="n">b</span><span class="o">|</span> <span class="n">a</span><span class="p">.</span><span class="nf">val</span> <span class="o"><=></span> <span class="n">b</span><span class="p">.</span><span class="nf">val</span> <span class="p">}</span>
<span class="n">union</span> <span class="o">=</span> <span class="n">only_all</span> <span class="o">+</span> <span class="n">intersection</span> <span class="o">+</span> <span class="n">only_subset</span>
</code></pre> Ruby master - Feature #16146: Array .difference allow custom comparisonhttps://bugs.ruby-lang.org/issues/16146?journal_id=819222019-10-06T16:41:02Zjonathanhefner (Jonathan Hefner)jonathan@hefner.pro
<ul></ul><p>Dan0042 (Daniel DeLorme) wrote:</p>
<blockquote>
<p>What you're asking for is a O(n²) operation. I'm not sure it's a good idea to make that so easy and transparent as a core method.</p>
</blockquote>
<p>Although the given example specified a two-argument block, the original post references <code>uniq</code>, which accepts a one-argument block. Using a one-argument block can result in an O(n) implementation. Here is a different example:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="no">Dependency</span> <span class="o">=</span> <span class="no">Struct</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="ss">:name</span><span class="p">,</span> <span class="ss">:version</span><span class="p">)</span>
<span class="n">deps1</span> <span class="o">=</span> <span class="p">[</span><span class="no">Dependency</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">,</span> <span class="s2">"1.1"</span><span class="p">),</span> <span class="no">Dependency</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="s2">"bar"</span><span class="p">,</span> <span class="s2">"2.0"</span><span class="p">),</span> <span class="no">Dependency</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="s2">"qux"</span><span class="p">,</span> <span class="s2">"1.0"</span><span class="p">)]</span>
<span class="n">deps2</span> <span class="o">=</span> <span class="p">[</span><span class="no">Dependency</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">,</span> <span class="s2">"1.0"</span><span class="p">),</span> <span class="no">Dependency</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="s2">"bar"</span><span class="p">,</span> <span class="s2">"1.0"</span><span class="p">)]</span>
<span class="n">deps1</span><span class="p">.</span><span class="nf">difference</span><span class="p">(</span><span class="n">deps2</span><span class="p">,</span> <span class="o">&</span><span class="ss">:name</span><span class="p">)</span>
<span class="c1"># == [Dependency.new("qux", "1.0")]</span>
<span class="n">deps1</span><span class="p">.</span><span class="nf">difference</span><span class="p">(</span><span class="n">deps2</span><span class="p">){</span><span class="o">|</span><span class="n">dep</span><span class="o">|</span> <span class="p">[</span><span class="n">dep</span><span class="p">.</span><span class="nf">name</span><span class="p">,</span> <span class="n">dep</span><span class="p">.</span><span class="nf">version</span><span class="p">[</span><span class="sr">/^\d+/</span><span class="p">]]</span> <span class="p">}</span>
<span class="c1"># == [Dependency.new("bar", "2.0"), Dependency.new("qux", "1.0")]</span>
</code></pre> Ruby master - Feature #16146: Array .difference allow custom comparisonhttps://bugs.ruby-lang.org/issues/16146?journal_id=820132019-10-14T02:02:43Zduerst (Martin Dürst)duerst@it.aoyama.ac.jp
<ul></ul><p>jonathanhefner (Jonathan Hefner) wrote:</p>
<blockquote>
<p>Dan0042 (Daniel DeLorme) wrote:</p>
<blockquote>
<p>What you're asking for is a O(n²) operation. I'm not sure it's a good idea to make that so easy and transparent as a core method.</p>
</blockquote>
<p>Although the given example specified a two-argument block, the original post references <code>uniq</code>, which accepts a one-argument block. Using a one-argument block can result in an O(n) implementation.</p>
</blockquote>
<p>Well, that assumes that the block return values can be ordered. But it's easy to create cases where values can be compared for equality, but not ordered. Example: <code>[1, 'abc']</code>.</p>
<p>Also, this proposal is about adding a block to <code>difference</code>, but if we do that, we should add it to <code>union</code> and <code>intersection</code> and so on, too.</p> Ruby master - Feature #16146: Array .difference allow custom comparisonhttps://bugs.ruby-lang.org/issues/16146?journal_id=824622019-11-04T19:25:30Zjonathanhefner (Jonathan Hefner)jonathan@hefner.pro
<ul></ul><p>duerst (Martin Dürst) wrote:</p>
<blockquote>
<p>Well, that assumes that the block return values can be ordered. But it's easy to create cases where values can be compared for equality, but not ordered. Example: <code>[1, 'abc']</code>.</p>
</blockquote>
<p>I'm not sure that I follow. Here is a naive implementation of what I was envisioning:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="k">def</span> <span class="nf">difference</span><span class="p">(</span><span class="o">*</span><span class="n">other_arrays</span><span class="p">,</span> <span class="o">&</span><span class="n">block</span><span class="p">)</span>
<span class="k">if</span> <span class="n">block</span>
<span class="n">rejected</span> <span class="o">=</span> <span class="n">other_arrays</span><span class="p">.</span><span class="nf">flatten</span><span class="p">(</span><span class="mi">1</span><span class="p">).</span><span class="nf">map</span><span class="p">(</span><span class="o">&</span><span class="n">block</span><span class="p">).</span><span class="nf">to_set</span>
<span class="nb">self</span><span class="p">.</span><span class="nf">reject</span><span class="p">{</span><span class="o">|</span><span class="n">element</span><span class="o">|</span> <span class="n">rejected</span><span class="p">.</span><span class="nf">include?</span><span class="p">(</span><span class="n">block</span><span class="p">.</span><span class="nf">call</span><span class="p">(</span><span class="n">element</span><span class="p">))</span> <span class="p">}</span>
<span class="k">else</span>
<span class="c1"># original implementation</span>
<span class="k">end</span>
<span class="k">end</span>
</code></pre>
<blockquote>
<p>Also, this proposal is about adding a block to <code>difference</code>, but if we do that, we should add it to <code>union</code> and <code>intersection</code> and so on, too.</p>
</blockquote>
<p>Agreed, that would probably be best for consistency.</p> Ruby master - Feature #16146: Array .difference allow custom comparisonhttps://bugs.ruby-lang.org/issues/16146?journal_id=824752019-11-05T07:01:18Zduerst (Martin Dürst)duerst@it.aoyama.ac.jp
<ul></ul><p>jonathanhefner (Jonathan Hefner) wrote:</p>
<blockquote>
<p>duerst (Martin Dürst) wrote:</p>
<blockquote>
<p>Well, that assumes that the block return values can be ordered. But it's easy to create cases where values can be compared for equality, but not ordered. Example: <code>[1, 'abc']</code>.</p>
</blockquote>
<p>I'm not sure that I follow.</p>
</blockquote>
<p>In my sentence above, 'that' refers to 'an O(n) implementation'. Naive implementations don't have that problem, but they are slow (O(n²)).</p>
<blockquote>
<p>Here is a naive implementation of what I was envisioning:</p>
</blockquote> Ruby master - Feature #16146: Array .difference allow custom comparisonhttps://bugs.ruby-lang.org/issues/16146?journal_id=826242019-11-11T20:34:21Zjonathanhefner (Jonathan Hefner)jonathan@hefner.pro
<ul></ul><p>duerst (Martin Dürst) wrote:</p>
<blockquote>
<p>In my sentence above, 'that' refers to 'an O(n) implementation'. Naive implementations don't have that problem, but they are slow (O(n²)).</p>
</blockquote>
<p>But the implementation I posted is O(n). (Note that <code>rejected</code> is a Set rather than an Array, so <code>rejected.include?</code> should be O(1)).</p>