Feature #16155

Add an Array#intersection method

Added by connorshea (Connor Shea) 11 months ago. Updated 10 months ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
[ruby-core:94855]

Description

Array#union and Array#difference were added in Ruby 2.6 (see this bug), but an equivalent for & (intersection) was not.

I'd like to propose Array#intersection. This would essentially just be a more readable alias for Array#&, in the same way that Array#| and Array#- have Array#union and Array#difference.

I think it'd make sense for Ruby to have a more readable name for this method :)

Current syntax:

[ 1, 1, 3, 5 ] & [ 3, 2, 1 ]                 #=> [ 1, 3 ]
[ 'a', 'b', 'b', 'z' ] & [ 'a', 'b', 'c' ]   #=> [ 'a', 'b' ]

What I'd like to see added:

[ 1, 1, 3, 5 ].intersection([ 3, 2, 1 ])                #=> [ 1, 3 ]
[ 'a', 'b', 'b', 'z' ].intersection([ 'a', 'b', 'c' ])  #=> [ 'a', 'b' ]

mame asked about intersection in this comment on the union/difference bug, but as far as I can tell it was never addressed.

Set#intersection already exists and is an alias for Set#&, so there's precedent for such a method to exist.
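
As a quick illustration of that precedent, here is a minimal sketch showing that Set#& and Set#intersection give the same result. The variable names are made up for the example.

```ruby
require 'set'

a = Set[1, 3, 5]
b = Set[3, 2, 1]

# Operator form and named form of the same operation.
via_operator = a & b
via_method   = a.intersection(b)

# Both calls should produce an equal set.
same = (via_operator == via_method)
puts same
```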

Thanks for Ruby, I enjoy using it a lot! :)

Updated by shevegen (Robert A. Heiler) 11 months ago

I sort of agree with your reasoning. A slight add-on, though,
on that part:

This would essentially just be a more readable alias for Array#&

It is probably easier to search for (e.g. a Google search for
.intersection) than &. But if we compare relative merits, we
should also remember that & has the advantage of being
short/succinct.

It's an aside, though, because I agree with your reasoning anyway;
it would make sense to also have .intersection, IMO. :)

You could consider adding this to the upcoming developer meeting:

https://bugs.ruby-lang.org/issues/16152

And ask matz whether he would be fine with the functionality
itself; or, more importantly, the name for the method
(#intersection) since the functionality already exists.

Updated by connorshea (Connor Shea) 11 months ago

shevegen (Robert A. Heiler) wrote:

You could consider adding this to the upcoming developer meeting:

https://bugs.ruby-lang.org/issues/16152

And ask matz whether he would be fine with the functionality
itself; or, more importantly, the name for the method
(#intersection) since the functionality already exists.

Thanks, I will :)

After thinking about the proposal a bit more, one other thing I wanted to ask is whether we should allow multiple arrays to be passed as arguments. Array#difference and Array#union both allow this, so it might make sense to have Array#intersection also take multiple arguments.

For example:

# intersection with multiple arguments.
[ 'a', 'b', 'b', 'z' ].intersection([ 'a', 'b', 'c' ], [ 'b' ])  #=> [ 'b' ]

Should we keep intersection as a simple alias for & or have it allow multiple arrays, to match the difference and union methods?
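
For comparison, a small sketch of how the existing methods behave with several arguments (Ruby 2.6 or later assumed), plus what a multi-argument intersection would presumably be equivalent to when written with chained &. The variable names are only for illustration.

```ruby
# Array#union and Array#difference accept any number of arrays.
merged    = [1, 2].union([2, 3], [3, 4])
remaining = [1, 2, 3, 4].difference([1], [4])

# A multi-argument intersection would presumably match chaining &.
chained = ['a', 'b', 'b', 'z'] & ['a', 'b', 'c'] & ['b']

p merged, remaining, chained
```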

Updated by phluid61 (Matthew Kerwin) 11 months ago

connorshea (Connor Shea) wrote:

Should we keep intersection as a simple alias for & or have it allow multiple arrays, to match the difference and union methods?

The question should be: is it needed?

Updated by matz (Yukihiro Matsumoto) 11 months ago

Accepted. PR welcome.

Matz.

Updated by prajjwal (Prajjwal Singh) 10 months ago

Hi all,

I've added an initial implementation of this based on Array#& on GitHub (https://github.com/ruby/ruby/pull/2533).

One thing that stands out to me is the naming inconsistency between rb_ary_and and rb_ary_intersection_multi, and I think either could be changed to be consistent with the other. Thoughts?

Updated by prajjwal (Prajjwal Singh) 10 months ago

Implementation is currently based on Array#&, which is elegant but might end up allocating a whole bunch of arrays holding intermediate results. If needed I can implement Array#intersection so it only allocates the result array once, but then I would like to rewrite Array#& in terms of Array#intersection to keep things DRY.

static VALUE
rb_ary_intersection_multi(int argc, VALUE *argv, VALUE ary)
{
    VALUE result = rb_ary_dup(ary);
    int i;

    for (i = 0; i < argc; i++) {
        result = rb_ary_and(result, argv[i]);
    }

    return result;
}

Let me know what you think.
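
To make the trade-off easier to see, here is a rough Ruby sketch of the two approaches. Both intersection_fold and intersection_once are hypothetical helper names for illustration, not the code in the PR. The first mirrors the C loop above and creates one intermediate array per argument. The second builds the result array only once, though it still allocates a lookup hash per argument.

```ruby
# Hypothetical helper mirroring the C loop: fold & over every argument.
def intersection_fold(ary, *others)
  others.reduce(ary.dup) { |result, other| result & other }
end

# Hypothetical variant that builds the result array a single time.
# An element is kept only if every other array contains it, and
# duplicates are dropped, as Array#& does.
def intersection_once(ary, *others)
  return ary.dup if others.empty?

  # One lookup hash per argument (Array#to_h with a block needs Ruby 2.6+).
  lookups = others.map { |other| other.to_h { |e| [e, true] } }
  seen = {}
  ary.each_with_object([]) do |e, result|
    next if seen.key?(e)
    next unless lookups.all? { |lookup| lookup.key?(e) }

    seen[e] = true
    result << e
  end
end

p intersection_fold(['a', 'b', 'b', 'z'], ['a', 'b', 'c'], ['b'])
p intersection_once(['a', 'b', 'b', 'z'], ['a', 'b', 'c'], ['b'])
```

Both helpers keep the receiver's element order, so for the inputs used in this thread they should agree with each other.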

Updated by nobu (Nobuyoshi Nakada) 10 months ago

  • Status changed from Open to Closed
