Feature #6515

array.c: added method that verifies if an Array is part of another

Added by Li Ellis Galardo almost 2 years ago. Updated over 1 year ago.

[ruby-core:45316]
Status:Feedback
Priority:Normal
Assignee:Yukihiro Matsumoto
Category:-
Target version:next minor

Description

This method is like the include? method but instead of receiving a value and check if the array has it, it receives an array an check if it's part of another one.

https://github.com/ruby/ruby/pull/127

array.c Magnifier (127 KB) Li Ellis Galardo, 05/30/2012 12:14 PM

test_array.rb Magnifier (60.2 KB) Li Ellis Galardo, 05/30/2012 12:14 PM

feature_6515_array_part_of.patch Magnifier (2.34 KB) Zachary Scott, 05/30/2012 09:00 PM

feature_6515_array_part_of_v2.patch Magnifier (983 Bytes) Li Ellis Galardo, 05/30/2012 11:32 PM

feature_6515_array_part_of_v3.patch Magnifier (2.11 KB) Li Ellis Galardo, 06/02/2012 03:00 AM

History

#1 Updated by Nobuyoshi Nakada almost 2 years ago

  • % Done changed from 100 to 0

Why attaching whole files?

#2 Updated by Li Ellis Galardo almost 2 years ago

Sorry, I wasn;t sure what to upload. Please remove them.

nobu (Nobuyoshi Nakada) wrote:

Why attaching whole files?

#3 Updated by Zachary Scott almost 2 years ago

I've added Li's patch from GH#127

#4 Updated by Yui NARUSE almost 2 years ago

  • Status changed from Open to Assigned
  • Assignee set to Yukihiro Matsumoto

#5 Updated by Li Ellis Galardo almost 2 years ago

I'm adding a new patch that fixed a typo. I also did some refactor to the code.

#6 Updated by Nobuyoshi Nakada almost 2 years ago

=begin
I'm not in favor with this proposal, because of its name mainly.
'Array ((|X|)) is part of array ((|Y|))' sounds like 'all elements in ((|X|)) are
arranged in ((|Y|)) successively in the same order', to me.

If the order is not your concern, why it isn't a method of (({Set}))?

Also, (({Array#&})) isn't necessary here.
(ary1 - ary2).empty?
would work enough.
=end

#7 Updated by Li Ellis Galardo almost 2 years ago

@nobu you are right. I'll change the method to (arry1 - arry2).empty?. About the method functionally I thought of it like the subset method from Set. That's why it doesn't care about the order.

#8 Updated by Nobuyoshi Nakada almost 2 years ago

  • Status changed from Assigned to Feedback

=begin
Then why don't you use (({Set#subset?})) method?

What's the rationale that (({Array})) also should have similar method?
And why different name?
=end

#9 Updated by Li Ellis Galardo almost 2 years ago

Sorry but i'm not sure what you are asking. In other words you really think it's easier to do this:

a = [1,2,3,4,5,6,7].toset
=> #
b = [1,2,3].to
set
=> #
b.subset? a
=> true

If you answer is yes, then we are doing this wrong because subset? or part_of method should be in a parent class (maybe Enumerable class ) in order for it to work for subset, array, hash and any data structure that inherit from it Enumerable.

There's a difference between subset and arrays isn't it? they should have different methods, eventually some of them will work for subset just as they work for arrays.

#10 Updated by Benoit Daloze almost 2 years ago

lellisga (Li Ellis Galardo) wrote:

a = [1,2,3,4,5,6,7].toset
=> #
b = [1,2,3].to
set
=> #
b.subset? a
=> true

If you answer is yes, then we are doing this wrong because subset? or part_of method should be in a parent class (maybe Enumerable class ) in order for it to work for subset, array, hash and any data structure that inherit from it Enumerable.

Taking you example of a the first set and b the maybe-subset:
Well, probably because #subset? is O(m) (m is the size of b), and doing it naively on Array/Enumerable is O(m*n) (n being the size of a). Using (b-a).empty? is actually O(n+m) because it creates a hash (think set in this case) underneath and iterates b. Note Array#- mentions the Set library, because it's likely what you want if you need this kind of operations. Of course you might need both Array and Set operations, in which case you may:
- transform them to set when needed: which is O(n+m + m) so O(n+m).
- use (a-b).empty? which I believe is clear and concise (although a comment might help for other readers) but creates the diff Array for nothing. You might want something like:

as = a.to_set
b.all? { |e| as.include?(e) }

Which does not allocate an intermediary Array, and is O(n+m) as well.

Often, when you want this kind of behavior, you don't need Array-specific operations, which is why using a Set directly is better (and usually more efficient).

#11 Updated by Benoit Daloze almost 2 years ago

Eregon (Benoit Daloze) wrote:

as = a.to_set
b.all? { |e| as.include?(e) }

Which does not allocate an intermediary Array, and is O(n+m) as well.

But does allocate the Set for it of course (so likely not better).

#12 Updated by Yusuke Endoh over 1 year ago

  • Target version changed from 2.0.0 to next minor

Also available in: Atom PDF