Project

General

Profile

Actions

Bug #18580

closed

Range#include? inconsistency for beginless String ranges

Added by zverok (Victor Shepelev) 10 months ago. Updated 4 days ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:107547]

Description

The follow-up of #18577.

Range#include? is specialized for strings. For all I can tell, this behavior is relatively new: it emerged when switching #=== to use #cover? instead of #include?:

  • in 2.6, it was decided that String Ranges should continue to use #include? (#14575)
  • ...but in 2.7, it was decided that they should use #cover? as all others (#15449)

The "despecialization" in 2.7 was actually done by adding more specialization in range_include_internal.

This leads to the following:

# default Range behavior:
(..Date.today).include? Date.today
# (irb):10:in `each': can't iterate from NilClass (TypeError)

# String Range behavior:
(..'z').include?('w')
# => true 

Why I think it is bad:

  1. This is a leftover of the relatively recent change; why it is 3 versions old, I doubt there is a lot of code relying on this quirk (only beginless ranges are affected)
  2. The more "invisible specialization", the more possibility of bugs (as #18577 shows)
  3. It is easy to stumble upon while learning Ruby/experimenting (strings and ranges are the most basic objects to be seen in many examples and courses), and to become confused, because there is no reasonable explanation for the semantics other than "well... for String, it is so"
  4. It can be said that the String ranges behavior have the same specialization as Numeric ones, but it is not so; while Numeric Ranges specialization is long-living legacy, it is at least consistent (#include? just behaves like #cover? for them, always):

Numerics:

(1...3).include? 1.5
# => true, like #cover?

(...3).include? 1.5
# => true, same, the explanation is "just like #cover?"

Strings:

('a'..'z').include?('ww')
# => false, like #include? for any other type

(..'z').include?('ww')
# => true, this is deeply confusing
Actions #1

Updated by zverok (Victor Shepelev) 10 months ago

  • Subject changed from Range#include? inconsistency for String ranges to Range#include? inconsistency for beginless String ranges

Updated by jeremyevans0 (Jeremy Evans) 9 months ago

This was discussed during the February 2022 developer meeting, and @matz (Yukihiro Matsumoto) said he needs more time to consider it.

Updated by matz (Yukihiro Matsumoto) 3 months ago

I once thought about removing each+succ semantics from include? altogether for simplicity, but it is too big incompatibility.
So I decided to make include? to raise exception for beginless/endless non-numeric ranges.

Matz.

Updated by nobu (Nobuyoshi Nakada) 3 months ago

matz (Yukihiro Matsumoto) wrote in #note-3:

So I decided to make include? to raise exception for beginless/endless non-numeric ranges.

I understood that all non-numeric ranges raise ArgumentError.
Only beginless/endless ranges?

Updated by jeremyevans0 (Jeremy Evans) 3 months ago

nobu (Nobuyoshi Nakada) wrote in #note-4:

matz (Yukihiro Matsumoto) wrote in #note-3:

So I decided to make include? to raise exception for beginless/endless non-numeric ranges.

I understood that all non-numeric ranges raise ArgumentError.
Only beginless/endless ranges?

Assuming @matz (Yukihiro Matsumoto) only wants this behavior change for beginless/endless non-numeric ranges, I submitted a pull request to implement that: https://github.com/ruby/ruby/pull/6261

Beginless ranges other than those having a string end already raised TypeError in this case, and I think that makes the most sense, so I used TypeError for the exception.

I don't think we should change Range#include? behavior for ranges with both a begin and an end, as that is likely to break a substantial amount of existing code.

Updated by nobu (Nobuyoshi Nakada) 3 months ago

jeremyevans0 (Jeremy Evans) wrote in #note-5:

Assuming @matz (Yukihiro Matsumoto) only wants this behavior change for beginless/endless non-numeric ranges, I submitted a pull request to implement that: https://github.com/ruby/ruby/pull/6261

OK, I just thought he was saying removing special handling of String all in the middle of the discussion.

About the implementation, how about splitting the function by string_use_cover?

Updated by jeremyevans0 (Jeremy Evans) 3 months ago

nobu (Nobuyoshi Nakada) wrote in #note-6:

About the implementation, how about splitting the function by string_use_cover?

That seems like a good idea to me. I'll update the pull request to use that approach.

Updated by zverok (Victor Shepelev) 3 months ago

TBH, my position is that:

  1. I don't see why non-numeric Range#include? should be prohibited if it is possible to calculate. One example: Date.new(2022, 8, 1)...Date.new(2022, 9, 1) currently has different semantics for cover? (covers any Date and DateTime inside the period) and include? (does this sequence of dates include the target date?); both are clear and useful.
  2. I don't see why we need any specialization for String ranges: their behavior should be just generic, like any other range.

I am honestly lost now.

Actions #10

Updated by jeremyevans (Jeremy Evans) 4 days ago

  • Status changed from Open to Closed

Applied in changeset git|04a92a6764bf678919cf4b68a27496a39d6b886a.


Raise TypeError for endless non-numeric range include?

Beginless ranges previously raised TypeError for this case,
except for string ranges, which had unexpected behavior:

('a'..'z').include?('ww') # false
(..'z').include?('ww') # previously true, now TypeError

Use of include? with endless ranges could previously result
in an infinite loop.

This splits off a range_string_cover_internal function from
range_include_internal.

Fixes [Bug #18580]

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0