Feature #14473
closedAdd Range#subrange?
Description
Hi there,
I'd like to propose a method that returns true when a range that the method gets called on is a subrange of a range passed in as an argument.
Example:
(2..4).subrange?(1...4)
=> true
(-2..2).subrange?(-1..3)
=> false
Files
Updated by duerst (Martin Dürst) almost 7 years ago
- Tracker changed from Bug to Feature
- Backport deleted (
2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN)
Updated by greggzst (Grzegorz Jakubiak) almost 7 years ago
- Subject changed from Range#subrange? to Add Range#subrange?
Updated by owst (Owen Stephens) almost 7 years ago
+1 for this suggestion - we have a similar method in our code base, implemented approximately as follows:
Range.class_exec do
def subset?(other)
raise ArgumentError unless other.is_a?(Range)
first >= other.first && last <= other.last
end
end
As a Range
represents a set of elements, should there also be proper_subset?
, superset?
and proper_superset?
methods (and perhaps their operator aliases) similar to those on Set
(http://ruby-doc.org/stdlib-2.4.2/libdoc/set/rdoc/Set.html)?
Updated by owst (Owen Stephens) almost 7 years ago
So, something like (naive implementation in Ruby):
Range.class_exec do
def subset?(other)
raise ArgumentError unless other.is_a?(Range)
first >= other.first && last <= other.last
end
def proper_subset?(other)
raise ArgumentError unless other.is_a?(Range)
self != other && subset?(other)
end
def superset?(other)
raise ArgumentError unless other.is_a?(Range)
first <= other.first && last >= other.last
end
def proper_superset?(other)
raise ArgumentError unless other.is_a?(Range)
self != other && superset?(other)
end
end
require 'minitest/autorun'
class TestRange < MiniTest::Test
def test_subset
assert_equal true, (1..10).subset?(1..10)
assert_equal true, (2..9).subset?(1..10)
assert_equal true, (1..9).subset?(1..10)
assert_equal true, (2..10).subset?(1..10)
assert_equal false, (0..11).subset?(1..10)
assert_equal false, (0..10).subset?(1..10)
assert_equal false, (1..11).subset?(1..10)
assert_raises(ArgumentError) { (1..10).subset?('not a range') }
end
def test_proper_subset
assert_equal false, (1..10).proper_subset?(1..10)
assert_equal true, (2..9).proper_subset?(1..10)
assert_equal true, (1..9).proper_subset?(1..10)
assert_equal true, (2..10).proper_subset?(1..10)
assert_equal false, (0..11).proper_subset?(1..10)
assert_equal false, (0..10).proper_subset?(1..10)
assert_equal false, (1..11).proper_subset?(1..10)
assert_raises(ArgumentError) { (1..10).proper_subset?('not a range') }
end
def test_superset
assert_equal true, (1..10).superset?(1..10)
assert_equal false, (2..9).superset?(1..10)
assert_equal false, (1..9).superset?(1..10)
assert_equal false, (2..10).superset?(1..10)
assert_equal true, (0..11).superset?(1..10)
assert_equal true, (0..10).superset?(1..10)
assert_equal true, (1..11).superset?(1..10)
assert_raises(ArgumentError) { (1..10).superset?('not a range') }
end
def test_proper_superset
assert_equal false, (1..10).proper_superset?(1..10)
assert_equal false, (2..9).proper_superset?(1..10)
assert_equal false, (1..9).proper_superset?(1..10)
assert_equal false, (2..10).proper_superset?(1..10)
assert_equal true, (0..11).proper_superset?(1..10)
assert_equal true, (0..10).proper_superset?(1..10)
assert_equal true, (1..11).proper_superset?(1..10)
assert_raises(ArgumentError) { (1..10).proper_superset?('not a range') }
end
end
For the record, I'd be interested in taking a stab at implementing this in MRI, if the proposal is accepted.
Updated by nobu (Nobuyoshi Nakada) almost 7 years ago
It should consider exclude_end?
too.
Updated by owst (Owen Stephens) almost 7 years ago
- File 0001-range.c-add-subset-superset-methods.patch 0001-range.c-add-subset-superset-methods.patch added
Thank you nobu, I came to the same realisation when writing the tests for my attempt at an implementation (patch attached).
Please let me know what you think, when you have time. This is my first "proper" MRI patch, so all comments gratefully received!
Updated by duerst (Martin Dürst) almost 7 years ago
As long as we only consider ranges starting and ending with integers, concepts such as subset and superset make sense. But ranges can also be constructed from other numbers, and other objects in general.
What e.g. should be the result of (5..10).subset?(5.5..7.9)
?
Or what if we introduce something like Range.new(1, 10, step: 2)
, which would produce [1, 3, 5, 7, 9]
when converted to an Array
?
Updated by owst (Owen Stephens) over 6 years ago
Good questions duerst. I wonder if I should have stuck with the subrange
naming rather than subset
, due to the Numeric behaviour.
What e.g. should be the result of (5..10).subset?(5.5..7.9)?
The implementation I have used in my patch is equivalent to:
(5.5..7.9).cover?((5..10).min) && (5.5..7.9).cover?((5..10).max)
which is false, but means
(6..7).subset?(5.5..8.5)
is true as both (5.5..8.5).cover?(6)
and (5.5..8.5).cover?(7)
are true.
I think this result is what I would expect - what do you think?
Or what if we introduce something like Range.new(1, 10, step: 2), which would produce [1, 3, 5, 7, 9] when converted to an Array?
Assuming the implementation in terms of cover?
(and using the subrange?
naming) I think this would still make sense:
(2..4).subrange?(Range.new(1, 5, step: 2)) # => true
Updated by owst (Owen Stephens) over 6 years ago
- File v2-0001-range.c-add-subset-superset-methods.patch v2-0001-range.c-add-subset-superset-methods.patch added
Attached updated patch, the former had a syntax error (a missing )
).
Updated by al2o3cr (Matt Jones) over 6 years ago
Minor point: IMO a "Range with steps" isn't a Range, it's something else. It doesn't seem relevant to the discussion.
Bigger point: the tests on improper ranges in the patch seem to imply some odd consequences:
range_1 = "g".."f"
range_2 = "b".."e"
range_1.subset?(range_2) # => true
range_2.any? { |el| range_1.include? el } # => false
So range_1
claims to be a subset of range_2
, but no elements of range_2
are included in range_1
.
range_3 = "d".."a"
range_1.subset?(range_3) # => true
range_3.subset?(range_1) # => true
Generally, a <= b
and b <= a
implies a == b
. In this case, however, the early return in range_subset
means that any improper range passed to subset?
will return true
.
min
and max
also have bad performance characteristics for exclusive ranges that linear_object_p
returns false
for, as they need to generate all the intermediate elements using succ
. For instance, attempting to evaluate ("aaaaaaaaaa"..."zzzzzzzzzz").max
ran for over a minute before I gave up and aborted it. There's also explicit code (see discrete_object_p
and callsites) that makes exclusive ranges of Time
objects raise TypeError
if you call max
on them.
Updated by vo.x (Vit Ondruch) over 6 years ago
Why not modify Range#include? to accept Range object. Anyway, there is missing any justification for the RFE.
Updated by owst (Owen Stephens) over 6 years ago
Thanks for your thoughts al2o3cr, your "bigger point" is interesting:
So range_1 claims to be a subset of range_2, but no elements of range_2 are included in range_1.
My thinking here was that the empty set is a subset of all sets, so similar behaviour seemed reasonable for empty/invalid Ranges, i.e. phrase the subset "definition" as: "all elements of range_1 are covered by range_2", which is vacuously true.
You're right about the implication for <=
so maybe it's a bad idea to treat "empty"/invalid ranges as if they were empty sets (and thus all equivalent). However, the same thing is true for exclusive/inclusive ranges that contain the same elements (when considered as sets):
> (1..2).subset?(1...3)
=> true
> (1...3).subset?(1..2)
=> true
> (1..2) == (1...3)
=> false
> Set.new((1..2)) == Set.new((1...3))
=> true
With ..
vs ...
we can have non-equal Range representations of the same set of elements. I'm not sure what to suggest here. It doesn't quite feel right to restrict subset?
to only allow an argument with the same exclude_end?
value.
min and max also have bad performance characteristics for exclusive ranges that linear_object_p returns false for,
Yes, I didn't like this myself, I'm glad you've brought it up. An option would be to prevent such calls from occurring, which doesn't seem nice, but is probably preferable to performance issues?
Updated by owst (Owen Stephens) over 6 years ago
Good point v.ox, I did consider implementing subset?
as an overloading of include?
(or possibly cover?
?), as one can't have a range-of-ranges (so there is no ambiguity).
How would you distinguish between strict/non-strict - a strict:
kwarg, perhaps? Something like:
(1..3).cover?((1..3)) # => true
(1..3).cover?((1..3), strict: false) # => true
(1..3).cover?((1..3), strict: true) # => false
(1..3).cover?((1..2), strict: true) # => true
I'm not sure the level of justification required for a method to be included in the stdlib, but I can provide an example where we would (and do) use Range#subset?
: we have products that have a range of allowed values, and in some circumstances we restrict products to a dynamically-generated range of allowed values, and want to make sure that range is compatible with the product range, something like:
raise ArgumentError unless dynamic_value_range.subset?(product_value_range)
The other methods (strict_subset?
, and the *superset?
methods) were only added for symmetry.
Updated by owst (Owen Stephens) over 6 years ago
Are there any further thoughts on my latest patch? Based on the discussion above, I think I should rename the added methods: s/set/range/
and perhaps also prevent the "bad performance" cases where either range has exclude_end?
true and linear_object_p
false?
Updated by owst (Owen Stephens) over 6 years ago
- File v3-0001-range.c-add-subrange-superrange-methods.patch v3-0001-range.c-add-subrange-superrange-methods.patch added
I've attached an updated patch; it includes the renamed (by s/set/range/
) methods and a performance improvement for strict_subset?
(before, we would calculate max
on the receiver twice, which is potentially slow when the receiver is an exclude_end Range of non-Numeric objects, now it's only calculated once).
Any comments would be gratefully received!
Updated by matz (Yukihiro Matsumoto) over 6 years ago
- Status changed from Open to Feedback
Any (real-world) use-case?
Matz.
Updated by owst (Owen Stephens) over 6 years ago
Hi Matz!
A slightly contrived example - imagine we want to filter candidates for a new job position and harshly reject any who have a salary requirement that is not covered by our predetermined position salary range:
Candidate = Struct.new(:name, :desired_salary_range)
candidates = [
Candidate.new('Andrew', (15_000..20_500)),
Candidate.new('John', (25_000..35_000)),
Candidate.new('Owen', (15_000..17_500)),
Candidate.new('Zack', (5_000..9_000)),
]
position_salary_range = (10_000..20_000)
acceptable, unacceptable = candidates.partition { |c|
c.desired_salary_range.subrange?(position_salary_range)
}
puts "Unacceptable: #{unacceptable.map(&:name).join(',')}" # => Unacceptable: Andrew, John, Jack
puts "Acceptable: #{acceptable.map(&:name).join(',')}" # => Acceptable: Owen
Looks like I've got the job, as no-one else is acceptable! :-)
As a more serious example, in my job, we offer loans of between £1000 and £8000. We allow certain customers to top-up their loans with an additional advance (chosen from a range of possible advances), if the new loan (the top-up is considered as a new loan) will pay-off their existing loan and cover all possible additional advances. Therefore, to check if someone can top-up their loan we do something similar to:
class Range
def offset(n)
(first + n..last + n)
end
end
VALID_LOAN_VALUE_RANGE = (1000..8000)
TOP_UP_RANGE = (1..5) # For example; very unrealistic numbers
def top_up_message(existing_loan_settlement_value)
top_up_value_range = TOP_UP_RANGE.offset(existing_loan_settlement_value)
if top_up_value_range.subrange?(VALID_LOAN_VALUE_RANGE)
puts "you may top-up your loan with between £#{TOP_UP_RANGE.first} and £#{TOP_UP_RANGE.last}!"
else
puts "sorry, at this stage we can't top-up your loan"
end
end
puts top_up_message(7995) # => you may top-up...
puts top_up_message(7999) # => sorry...
N.b. this adds another useful Range method, Range#offset (which probably only makes sense for numeric ranges), would you accept a new Feature to add this method?
Finally, regarding naming, I think I now prefer overriding cover?
to accept a Range - when I described this feature to my colleague I used the phrase "does one range cover the other?" as an intuition, and indeed, that is how it's implemented.
Please let me know your thoughts.
Updated by shevegen (Robert A. Heiler) over 6 years ago
I have no particular opinion (neither pro nor con) on the issue itself.
On the comment:
[..] this adds another useful Range method, Range#offset (which probably only makes sense for
numeric ranges), would you accept a new Feature to add this method?
I think it would be best to add a new issue for Range#offset; you can then link into the
issue here too, from that other issue. It makes it easier for other people to comment
on it and, I think, also easier to approve it, if it is useful.
Updated by tarui (Masaya Tarui) over 6 years ago
As real-world use-case,
I want it(include,cover version) for guarding NArray's Index arguments.
NArray#[] can accept Integer and Range.
Currently, to limit the access range, We have to write complex code.
def check(idx,range)
case idx
when Integer
range.include?(idx)
when Range
range.begin < idx.begin && ( (!range.exclude_end? && !idx.exclude__end? || range.exclude_end?) ? range.end >=idx.end : range.end > idx.end)
end
end
I want to write simply
range.include?(idx)
Updated by marcandre (Marc-Andre Lafortune) over 6 years ago
Not directly related, but if your ranges are such that begin <= end, then I think you can use range.max > idx.max
for the last part.
Updated by znz (Kazuhiro NISHIYAMA) over 6 years ago
How about Range#{<,<=,>,>=}
like Hash#{<,<=,>,>=}
for method names?
Updated by matz (Yukihiro Matsumoto) over 6 years ago
- Status changed from Feedback to Open
The method name subrange?
may cause confusion that which includes which.
I propose cover?
to accept ranges.
Matz.
Updated by tarui (Masaya Tarui) over 6 years ago
- Status changed from Open to Assigned
- Assignee set to tarui (Masaya Tarui)
Updated by owst (Owen Stephens) over 6 years ago
- File v4-0001-range.c-allow-cover-to-accept-Range-argument.patch v4-0001-range.c-allow-cover-to-accept-Range-argument.patch added
Thank you for your proposal Matz, having thought about it over the last few months, I agree.
I have updated my patch accordingly (which has greatly simplified/improved it), I welcome any comments.
Updated by tarui (Masaya Tarui) over 6 years ago
Thank you for your patch owst.
I reviewed it and feel the behavior below is strange.
$ ruby -e 'p (1..3.1).cover?(1...3)'
true
$ ruby -e 'p (1..3.1).cover?(1...4)'
Traceback (most recent call last):
1: from -e:1:in `<main>'
-e:1:in `cover?': can't iterate from Float (TypeError)
At (a..b).cover?(c...d)
with b < d
,
is it reasonable to use (c...d).max
(but ignore exception) for comparison with b
?
Updated by owst (Owen Stephens) over 6 years ago
- File v5-0001-range.c-allow-cover-to-accept-Range-argument.patch v5-0001-range.c-allow-cover-to-accept-Range-argument.patch added
Hi tarui, thank you for reviewing and your suggestion/question.
In fact, with my v4 patch in the case you describe there is a bug: (1..3).cover?(1.0...4.0)
is true
not false
. This is prevented using b >= (c...d).max
as you suggest, but I see two issues:
First, the performance of Range#max
is very bad in certain cases (as al2o3c pointed out above) e.g. on my machine (Macbook Pro, 2.8 GHz i5) I now have:
$ time ruby -e "p ('aaaaa'..'zzzzz').cover?('aaaaa'...'zzzzz')"
true
ruby -e "p ('aaaaa'..'zzzzz').cover?('aaaaa'...'zzzzz')" 0.19s user 0.05s system 92% cpu 0.256 total
$ time ruby -e "p ('aaaaa'..'zzzzy').cover?('aaaaa'...'zzzzz')"
true
ruby -e "p ('aaaaa'..'zzzzy').cover?('aaaaa'...'zzzzz')" 8.12s user 0.07s system 98% cpu 8.329 total
Second, we have the following strange behaviour that is similar to that in your comment:
$ ruby -e "p (1..4).cover?(1.0...4.0)"
true
$ ruby -e "p (1..3).cover?(1.0...4.0)"
Traceback (most recent call last):
2: from -e:1:in `<main>'
1: from -e:1:in `cover?'
-e:1:in `max': cannot exclude non Integer end value (TypeError)
I have attached an updated patch, which uses max
, but does not address either of the above issues. Can you see an alternative that addresses these issues in a straightforward way?
Updated by Anonymous over 6 years ago
Thank you for the new patch.
At the first issue, I understood max method has performance issue at that case.
But how often will we encounter it? I think it is a very rare case.
Or please show the application.
At the second issue, as I already said 'ignore exception',
I think that it should return false rather than raising an exception.
Updated by owst (Owen Stephens) over 6 years ago
- File v6-0001-range.c-allow-cover-to-accept-Range-argument.patch v6-0001-range.c-allow-cover-to-accept-Range-argument.patch added
I agree that the max performance issue is likely to be a rare case; I am happy to leave it as-is.
Apologies on the second issue - I originally misunderstood "ignore" to mean "don't rescue".
I've attached another updated patch - it rescues the TypeError raised by max
, and also fixes the handling of endless/empty ranges.
Updated by tarui (Masaya Tarui) over 6 years ago
I am happy that match opinion with you.
I will commit based on your patch. It'll take a little while.
Updated by owst (Owen Stephens) over 6 years ago
Great, thank you tarui
Updated by tarui (Masaya Tarui) about 6 years ago
- Status changed from Assigned to Closed
Applied in changeset trunk|r64640.
range.c: Range#cover? accepts Range object. [Feature #14473]
* range.c (range_cover): add code for range argument.
If the argument is a Range, check it is or is not
covered by the reciver. If it can be treated as a
sequence, this method treats it that way.
* test/ruby/test_range.rb (class TestRange): add tests
for this feature.
This patch is written by Owen Stephens. thank you!