Feature #5607
closed
Inconsistent reaction in Range of String
Description
=begin
When I tried to access an Excel file, I found some inconsistent behavior in ranges of strings.
ruby-1.9.3-p0 :001 > ("A".."AB").to_a
=> ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", "AA", "AB"]
This behavior is what I expected.
ruby-1.9.3-p0 :002 > ("X".."AB").to_a
=> []
However, when I tried to go from "X" to "AB", the result was inconsistent with the example above.
I hope this behavior will be made consistent in a future release.
Thanks!
=end
Updated by Eregon (Benoit Daloze) about 13 years ago
Hi,
This is indeed surprising.
Range#to_a is calling Range#each which has a special case for Strings to call String#upto, which is said to use String#succ.
However, in rb_str_upto (string.c:2995), there is a test that does not yield in a way that corresponds to the #succ behavior mentioned in the documentation:
n = rb_str_cmp(beg, end);
if (n > 0 || (excl && n == 0)) return beg;
In your case "X" <=> "AB" returns 1, so nothing is yielded.
The assumption that nothing should be yielded when beg > end does not produce an intuitive result in this case, because <=> uses a different ordering than #succ, so a <=> a.succ can be either -1 or 1.
I believe this test should be changed to use a String#succ -based comparison, if this is possible.
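A minimal Ruby sketch of the mismatch, for illustration only. It compares the <=> result that the guard in rb_str_upto relies on with a plain walk along String#succ. The helper name succ_walk and its limit parameter are hypothetical and not part of Ruby.

```ruby
# Hypothetical helper (not part of Ruby): follow String#succ from first
# until last is reached, giving up after `limit` steps.
def succ_walk(first, last, limit: 1000)
  walk = [first]
  while walk.last != last
    return nil if walk.size > limit
    walk << walk.last.succ
  end
  walk
end

p "X" <=> "AB"          # 1, so the guard in rb_str_upto returns before yielding
p succ_walk("X", "AB")  # ["X", "Y", "Z", "AA", "AB"]

# a <=> a.succ is not always -1:
p "A" <=> "A".succ      # -1 ("A" vs "B")
p "Z" <=> "Z".succ      # 1  ("Z" vs "AA")
```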
P.S.: The documentation starts with:
Iterates through successive values, starting at str and
ending at other_str inclusive, [...]
I believe "inclusive" should be removed there, as it depends on whether the exclusive option is set, which is explained further on.
P.S.2: I'm not sure it is right to use different methods (not only #succ) in Range#each while being undocumented. It should probably mention it uses String#upto for String and Symbol.
P.S.3: Range uses #succ and #<=>, which might not be coherent as we see in the case of String. How to solve that?
Updated by Anonymous about 13 years ago
It should be forbidden to have a Class (here Range) whose instance
methods are linked by variable axiomatic relations, depending on the
actual instance. There are too many different concepts covered by the
same name Range.
Actually, as the example shows, the situation is already strange for
Strings (even before taking in account encodings, which add to the
confusion).
_md
Benoit Daloze wrote in post #1031231:
P.S.2: I'm not sure it is right to use different methods (not only
#succ) in Range#each while being undocumented. It should probably
mention it uses String#upto for String and Symbol.
P.S.3: Range uses #succ and #<=>, which might not be coherent as we see
in the case of String. How to solve that?
Updated by hasari (Hiro Asari) about 13 years ago
See #2323. In particular, in note 2, Matz acknowledges that the situation is muddled when it comes to Ranges specified by Strings.
Updated by alexeymuranov (Alexey Muranov) about 13 years ago
This behavior of range seems consistent with
"X"<"AB" # => false
in Ruby 1.9.3.
Updated by Anonymous about 13 years ago
Yes, but if "X < AB" is false, "X" should not be between "A" and "AB".
_md
-----Original Message-----
From: Alexey Muranov [mailto:muranov@math.univ-toulouse.fr]
Sent: Thursday, 10 November 2011 18:15
To: ruby-core@ruby-lang.org
Subject: [ruby-core:40914] [ruby-trunk - Feature #5607] Inconsistent reaction in Range of String
Issue #5607 has been updated by Alexey Muranov.
This behavior of range seems consistent with
"X"<"AB" # => false
in Ruby 1.9.3.
Updated by alexeymuranov (Alexey Muranov) about 13 years ago
Anonymous wrote:
Yes, but if "X < AB" is false, "X" should not be between "A" and "AB".
_md
I agree. ("A".."AB").to_a seems inconsistent with the ordering and with ("X".."AB").to_a.
It seems that the behavior of ("A".."AB").to_a is aimed at particular applications. I am against that: I think that String objects, as well as Range of String objects, are not supposed to know about traditions of naming spreadsheet columns.
Update: I see only one way to make ("X".."AB").to_a return ["X", "Y", "Z", "AA", "AB"]: let it know somehow (the object or the method) that the order used is DegLex, and that the set of admissible strings only includes strings of capital letters A-Z. I made my suggestions about this in the discussion of Issue #5534: http://redmine.ruby-lang.org/issues/5534#change-22110 but I am not sure how practical they are at the moment.
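A rough Ruby sketch of what a DegLex-aware range over A-Z strings could look like. This is only an illustration of the idea, not the proposal from Issue #5534; the names deglex_cmp and deglex_range are hypothetical and not part of Ruby.

```ruby
# Hypothetical helpers, not part of Ruby.

# DegLex (shortlex) comparison: shorter strings sort first, and strings of
# equal length are compared with the usual String#<=>.
def deglex_cmp(a, b)
  [a.length, a] <=> [b.length, b]
end

# Enumerate from first to last under DegLex, accepting only A-Z strings.
# For such strings String#succ yields the next string in DegLex order,
# so the walk reaches last whenever first does not come after it.
def deglex_range(first, last)
  [first, last].each do |s|
    raise ArgumentError, "only A-Z allowed: #{s.inspect}" unless s.match?(/\A[A-Z]+\z/)
  end
  return [] if deglex_cmp(first, last) > 0

  result = [first]
  result << result.last.succ until result.last == last
  result
end

p deglex_cmp("X", "AB")    # -1: "X" precedes "AB" under DegLex
p deglex_range("X", "AB")  # ["X", "Y", "Z", "AA", "AB"]
p deglex_range("AB", "X")  # []
```

Under this ordering the comparison and the succ-based enumeration agree, which is the consistency the plain String#<=> cannot provide.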
Updated by matz (Yukihiro Matsumoto) about 13 years ago
- Status changed from Open to Feedback
Ruby classes often play several roles. For example, Array can be an array, a stack or a queue, depending on which methods are used. Range is similar. A range is a class with a start point and an end point (and a flag for end-exclusion). You can use it as an interval or as a sequence of iterated objects from start to end.
In most cases (especially for numbers) those two behave the same, but for strings they behave quite differently, so you have to be careful about how you use ranges. The following methods treat ranges as intervals:
min, max, cover?
The other methods like the following treat ranges as sequences:
===, each, step, member?, include?, and methods inherited from Enumerable
matz.
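A small Ruby sketch of the two views on the range from this report, offered as an illustration only. The sequence view is shown through to_a so that it does not depend on how a given Ruby version implements the individual sequence methods.

```ruby
r = "A".."AB"

# Interval view: based on <=> against the endpoints only.
p r.cover?("X")      # false, because "X" > "AB" under String#<=>
p r.min              # "A"
p r.max              # "AB"

# Sequence view: based on walking String#succ from "A".
seq = r.to_a
p seq.include?("X")  # true
p seq.max            # "Z"
```

The same range therefore reports "AB" as its interval maximum while the largest element it enumerates is "Z".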
Updated by duerst (Martin Dürst) almost 13 years ago
- Status changed from Feedback to Open
Yukihiro Matsumoto wrote:
Ruby classes often play several roles. For example, Array can be an array, a stack or a queue, depending on which methods are used. Range is similar. A range is a class with a start point and an end point (and a flag for end-exclusion). You can use it as an interval or as a sequence of iterated objects from start to end.
In most cases (especially for numbers) those two behave the same, but for strings they behave quite differently, so you have to be careful about how you use ranges. The following methods treat ranges as intervals:
min, max, cover?
The other methods like the following treat ranges as sequences:
===, each, step, member?, include?, and methods inherited from Enumerable
This makes a lot of sense so far. But the example uses #to_a, which is inherited from Enumerable. And still it treats a range as an interval, not as a sequence.
I have reopened this. I think it should be a bug, not a feature. I would have changed this to a bug if I knew how. Or should we reopen #2323?
Updated by naruse (Yui NARUSE) almost 13 years ago
Martin Dürst wrote:
This makes a lot of sense so far. But the example uses #to_a, which is inherited from Enumerable. And still it treats a range as an interval, not as a sequence.
I have reopened this. I think it should be a bug, not a feature. I would have changed this to a bug if I knew how. Or should we reopen #2323?
What is your plan?
Updated by duerst (Martin Dürst) over 12 years ago
Yui NARUSE wrote:
What is your plan?
Short version: Make it work the way Matz described it in http://bugs.ruby-lang.org/issues/5607#note-7.
I haven't yet looked at the code, but Benoit provides some good pointers. I hope to have some time to give it a try, but I don't mind if somebody else is faster than me.
Updated by akr (Akira Tanaka) over 12 years ago
I gave a presentation on the String#succ mechanism:
http://www.a-k-r.org/pub/string-succ-rejectkaigi2008.pdf
(in Japanese)
Updated by duerst (Martin Dürst) over 12 years ago
We have discussed this issue at today's developers' meeting in Akihabara.
We agreed that it would be desirable to fix this, but that it may not be easy to implement. To avoid endless loops, one has to be able to check whether the start of the range will reach the end with a finite number of .succs.
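One way such a reachability check might be sketched in Ruby is shown here. This is an assumption-laden illustration, not the approach discussed at the meeting: the helper succ_reaches? and its max_steps bound are hypothetical, and the check relies on the observation that String#succ never shortens a string.

```ruby
# Hypothetical helper, not part of Ruby and not the fix that was discussed.
# Returns true if repeated String#succ calls starting at first reach last.
def succ_reaches?(first, last, max_steps: 100_000)
  current = first
  max_steps.times do
    return true if current == last
    # String#succ never makes a string shorter, so once current is longer
    # than last it can no longer become equal to it.
    return false if current.length > last.length
    current = current.succ
  end
  false # undecided within max_steps; treated as "not reached" here
end

p succ_reaches?("X", "AB")  # true  (X, Y, Z, AA, AB)
p succ_reaches?("AB", "X")  # false ("AB" is already longer than "X")
```

The max_steps bound is a blunt safeguard. A real implementation would need a cheaper test than walking every intermediate string.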
I have tentatively volunteered to look at this issue and try to implement it (but I can't guarantee a result, sorry).
Updated by mame (Yusuke Endoh) over 12 years ago
- Status changed from Open to Assigned
- Assignee set to duerst (Martin Dürst)
Martin-sensei,
I tentatively assign this ticket to you.
If you give up, please set the assignee to another person,
or leave it blank. Take it easy.
--
Yusuke Endoh mame@tsg.ne.jp
Updated by duerst (Martin Dürst) over 12 years ago
On 2012/03/28 0:10, mame (Yusuke Endoh) wrote:
Issue #5607 has been updated by mame (Yusuke Endoh).
Status changed from Open to Assigned
Assignee set to duerst (Martin Dürst)
Martin-sensei,
I tentatively assign this ticket to you.
If you give up, please set the assignee to another person,
or leave it blank. Take it easy.
I actually said I would take it at
http://bugs.ruby-lang.org/issues/5607#note-12, and I thought that I had
assigned it to me, but apparently, I forgot.
Also, I have already made some progress on how to address this. The
pointer from Akira
(http://www.a-k-r.org/pub/string-succ-rejectkaigi2008.pdf, in Japanese)
was very helpful.
Regards, Martin.
Updated by mame (Yusuke Endoh) almost 4 years ago
- Has duplicate Bug #13663: `String#upto` doesn't work as expected added
Updated by mame (Yusuke Endoh) almost 4 years ago
- Status changed from Assigned to Closed
The nearly identical issue #13663 has been closed.