Bug #18580
closed
Range#include? inconsistency for beginless String ranges
Added by zverok (Victor Shepelev) almost 3 years ago.
Updated almost 2 years ago.
Description
The follow-up of #18577.
Range#include?
is specialized for strings. For all I can tell, this behavior is relatively new: it emerged when switching #===
to use #cover?
instead of #include?
:
- in 2.6, it was decided that String Ranges should continue to use
#include?
(#14575)
- ...but in 2.7, it was decided that they should use
#cover?
as all others (#15449)
The "despecialization" in 2.7 was actually done by adding more specialization in range_include_internal
.
This leads to the following:
# default Range behavior:
(..Date.today).include? Date.today
# (irb):10:in `each': can't iterate from NilClass (TypeError)
# String Range behavior:
(..'z').include?('w')
# => true
Why I think it is bad:
- This is a leftover of the relatively recent change; why it is 3 versions old, I doubt there is a lot of code relying on this quirk (only beginless ranges are affected)
- The more "invisible specialization", the more possibility of bugs (as #18577 shows)
- It is easy to stumble upon while learning Ruby/experimenting (strings and ranges are the most basic objects to be seen in many examples and courses), and to become confused, because there is no reasonable explanation for the semantics other than "well... for String, it is so"
- It can be said that the String ranges behavior have the same specialization as Numeric ones, but it is not so; while Numeric Ranges specialization is long-living legacy, it is at least consistent (
#include?
just behaves like #cover?
for them, always):
Numerics:
(1...3).include? 1.5
# => true, like #cover?
(...3).include? 1.5
# => true, same, the explanation is "just like #cover?"
Strings:
('a'..'z').include?('ww')
# => false, like #include? for any other type
(..'z').include?('ww')
# => true, this is deeply confusing
- Subject changed from Range#include? inconsistency for String ranges to Range#include? inconsistency for beginless String ranges
This was discussed during the February 2022 developer meeting, and @matz (Yukihiro Matsumoto) said he needs more time to consider it.
I once thought about removing each
+succ
semantics from include?
altogether for simplicity, but it is too big incompatibility.
So I decided to make include?
to raise exception for beginless/endless non-numeric ranges.
Matz.
matz (Yukihiro Matsumoto) wrote in #note-3:
So I decided to make include?
to raise exception for beginless/endless non-numeric ranges.
I understood that all non-numeric ranges raise ArgumentError
.
Only beginless/endless ranges?
nobu (Nobuyoshi Nakada) wrote in #note-4:
matz (Yukihiro Matsumoto) wrote in #note-3:
So I decided to make include?
to raise exception for beginless/endless non-numeric ranges.
I understood that all non-numeric ranges raise ArgumentError
.
Only beginless/endless ranges?
Assuming @matz (Yukihiro Matsumoto) only wants this behavior change for beginless/endless non-numeric ranges, I submitted a pull request to implement that: https://github.com/ruby/ruby/pull/6261
Beginless ranges other than those having a string end already raised TypeError in this case, and I think that makes the most sense, so I used TypeError for the exception.
I don't think we should change Range#include?
behavior for ranges with both a begin and an end, as that is likely to break a substantial amount of existing code.
jeremyevans0 (Jeremy Evans) wrote in #note-5:
Assuming @matz (Yukihiro Matsumoto) only wants this behavior change for beginless/endless non-numeric ranges, I submitted a pull request to implement that: https://github.com/ruby/ruby/pull/6261
OK, I just thought he was saying removing special handling of String all in the middle of the discussion.
About the implementation, how about splitting the function by string_use_cover
?
nobu (Nobuyoshi Nakada) wrote in #note-6:
About the implementation, how about splitting the function by string_use_cover
?
That seems like a good idea to me. I'll update the pull request to use that approach.
TBH, my position is that:
- I don't see why non-numeric
Range#include?
should be prohibited if it is possible to calculate. One example: Date.new(2022, 8, 1)...Date.new(2022, 9, 1)
currently has different semantics for cover?
(covers any Date
and DateTime
inside the period) and include?
(does this sequence of dates include the target date?); both are clear and useful.
- I don't see why we need any specialization for
String
ranges: their behavior should be just generic, like any other range.
I am honestly lost now.
- Status changed from Open to Closed
Applied in changeset git|04a92a6764bf678919cf4b68a27496a39d6b886a.
Raise TypeError for endless non-numeric range include?
Beginless ranges previously raised TypeError for this case,
except for string ranges, which had unexpected behavior:
('a'..'z').include?('ww') # false
(..'z').include?('ww') # previously true, now TypeError
Use of include? with endless ranges could previously result
in an infinite loop.
This splits off a range_string_cover_internal function from
range_include_internal.
Fixes [Bug #18580]
- Related to Bug #19533: Behavior of ===/include? on a beginless/endless range (nil..nil) changed in ruby 3.2 added
Unfortunately, this change broke an existing program: #19533.
The lesson here is that it would be best to avoid changing any behavior just for the sake of consistency unless it is proven that the behavior actually confuses a large number of people.
Also available in: Atom
PDF
Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0