Bug #7342

String#<=> checks for a #to_str method on other but never uses it?

Added by Joshua Ballanco over 1 year ago. Updated over 1 year ago.

[ruby-core:49279]
Status:Closed
Priority:Normal
Assignee:Nobuyoshi Nakada
Category:-
Target version:2.0.0
ruby -v:2.0.0 Backport:

Description

=begin
This isn't exactly a bug, as much as a request for clarification. I was looking at the semantics of the (({<=>})) operator and noticed something curious. For most classes, when evaluating (({thing <=> other})), if (({other})) is not of a compatible type, then (({nil})) is returned.
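A quick illustration of the default behaviour described above, using only core classes. This is a minimal sketch; the variable names are made up for the example.

```ruby
# Comparing values of incompatible types yields nil rather than raising.
int_vs_str = (1 <=> "a")
sym_vs_str = (:a <=> "a")
arr_vs_int = ([1] <=> 1)

# Comparing compatible values yields -1, 0, or 1.
int_vs_int = (1 <=> 2)

p int_vs_str, sym_vs_str, arr_vs_int, int_vs_int
```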

The exceptions (as far as I can find) are String and Time. For the Time class, if (({other})) is not a kind of (({Time})), then the reverse comparison (({other <=> thing})) is tried and the inverse of this result is returned (if not nil). For String, the reverse comparison is only tried if (({other.respond_to?(:to_str)})); however, the (({other.to_str})) method itself is never called. For example:

class NotAString
  def <=>(other)
    1
  end
  def to_str
    raise "I'm not a string!"
  end
end

"test" <=> NotAString.new #=> -1

This seems very counterintuitive to me. I would expect that if my class implemented (({to_str})), that the return value of this would be used for comparison.
=end

string_cmp.diff (1.9 KB) Joshua Ballanco, 11/21/2012 07:54 PM

Associated revisions

Revision 38044
Added by Nobuyoshi Nakada over 1 year ago

string.c: compare with to_str

  • string.c (rb_str_cmp_m): try to compare with to_str result if possible before calling <=> method. [Bug #7342]
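The order of operations in this commit message can be modelled in plain Ruby. This is only a rough sketch, not the C code from string.c: model_string_cmp, StringLike and ReverseOnly are hypothetical names, and returning nil when to_str yields a non-string is an assumption borrowed from the patch proposed in comment #1.

```ruby
# Hypothetical pure-Ruby model of the patched comparison order.
def model_string_cmp(str, other)
  # Plain strings are compared directly.
  return str <=> other if other.is_a?(String)

  # Step 1: prefer the result of to_str when the object provides it.
  if other.respond_to?(:to_str)
    converted = other.to_str
    # Assumption: a non-string result makes the comparison undefined (nil).
    return converted.is_a?(String) ? (str <=> converted) : nil
  end

  # Step 2: otherwise try the reverse comparison and flip its sign.
  # Assumes the reverse comparison returns nil or a number.
  reversed = other <=> str
  reversed.nil? ? nil : -(reversed <=> 0)
end

class StringLike
  def to_str
    "my string"
  end
end

class ReverseOnly
  def <=>(other)
    1
  end
end

p model_string_cmp("test", StringLike.new)  # compares "test" with "my string"
p model_string_cmp("test", ReverseOnly.new) # uses the reverse comparison
p model_string_cmp("test", Object.new)      # neither path applies
```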

History

#1 Updated by Joshua Ballanco over 1 year ago

I would expect something like the following patch makes more sense?

diff --git a/string.c b/string.c
index c63f59a..c9eed27 100644
--- a/string.c
+++ b/string.c
@@ -2385,8 +2385,12 @@ rb_str_cmp_m(VALUE str1, VALUE str2)
     long result;
 
     if (!RB_TYPE_P(str2, T_STRING)) {
-        if (!rb_respond_to(str2, rb_intern("to_str"))) {
-            return Qnil;
+        if (rb_respond_to(str2, rb_intern("to_str"))) {
+            VALUE tmp = rb_funcall(str2, rb_intern("to_str"), 0);
+            if (!RB_TYPE_P(tmp, T_STRING)) {
+                return Qnil;
+            }
+            result = rb_str_cmp(str1, tmp);
         }
         else if (!rb_respond_to(str2, rb_intern("<=>"))) {
             return Qnil;

#2 Updated by Jeremy Kemper over 1 year ago

When an object responds to #to_str, it's saying "I am a string." When an object responds to #to_s, it's saying "I have a string representation."

So checking for #to_str here is enough to know whether str2 is a string and can be compared.

For more background on implicit vs explicit coercion in Ruby: http://briancarper.net/blog/98/
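The difference between the two conversions is easy to see from Ruby itself. A minimal sketch, with Explicit and Implicit as made-up class names: String#+ asks for an implicit conversion (#to_str), while interpolation asks for an explicit one (#to_s).

```ruby
class Explicit
  def to_s
    "explicit"
  end
end

class Implicit
  def to_str
    "implicit"
  end
end

# String#+ performs implicit conversion, so it calls to_str.
joined = "value: " + Implicit.new

# Without to_str, the same call raises TypeError.
error = begin
  "value: " + Explicit.new
rescue TypeError => e
  e
end

# Interpolation performs explicit conversion, so it calls to_s.
interpolated = "value: #{Explicit.new}"

p joined, error.class, interpolated
```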

#3 Updated by Joshua Ballanco over 1 year ago

As the page you linked points out, #to_str is an implicit cast, i.e. it should be used internally to retrieve the string representation of an object. I think this is in keeping with all other uses of #to_str in Ruby source.

Another thing to note is that currently in Ruby if you have an object that provides #to_str but not #<=>, then it cannot be compared to a string object.

class Foo
  def to_str
    "my string"
  end
end

"test" < Foo.new #=> ArgumentError: comparison of String with Foo failed

#4 Updated by Jeremy Kemper over 1 year ago

"It should be used internally to retrieve the string representation of an object." That's explicit coercion. Implicit coercion with #to_str means the object acts as a string and the method needn't be called.

This method is used for more than its return value. It's in a strange limbo world between Ruby and the C API :)

The presence of #to_str indicates that the object obeys an entire String contract such that the C API can work with the object without making Ruby method calls. You note correctly that providing #to_str but not #<=> prohibits comparison. That's because by omitting #<=> you've already broken the "I am a string" contract.

Check out time.c for another example of checking #to_str and, more generally, see rb_check_convert_type for many other examples of implicit coercion in practice: to_path, to_int, to_ary, etc.
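For reference, the other implicit conversions named here can be observed from Ruby code as well. A small sketch, where Index, Pair and Location are hypothetical classes that each define exactly one conversion method:

```ruby
class Index
  def to_int
    1
  end
end

class Pair
  def to_ary
    [:left, :right]
  end
end

class Location
  def to_path
    "/tmp/example.txt"
  end
end

# Array#[] implicitly converts its index with to_int.
element = [10, 20, 30][Index.new]

# Multiple assignment implicitly converts the right-hand side with to_ary.
first, second = Pair.new

# File methods implicitly convert path arguments with to_path.
base = File.basename(Location.new)

p element, first, second, base
```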

#5 Updated by Nobuyoshi Nakada over 1 year ago

jballanc (Joshua Ballanco) wrote:

I would expect something like the following patch makes more sense?

You can use rb_check_funcall().

#6 Updated by Joshua Ballanco over 1 year ago

The presence of #to_str indicates that the object obeys an entire String contract such that the C API can work with the object without making Ruby method calls.

Hmm... I was always under the impression that the distinction between #to_s and #to_str is that #to_s provides a (potentially lossy) string representation of any object, but #to_str will return a "string equivalent" of the object. As for the C API, the rb_str_to_str method does call #to_str if v#to_str exists and v is not already a string. I guess it would be good to get some clarification on this issue.

You can use rb_check_funcall().

Thank you for the pointer, nobu! Actually, in looking at the implementation of String#<=> again I found some other oddities. For example, if Other#to_str is defined and Other#<=> returns a float, then "a string" <=> Other.new will return a float. I feel like this breaks the contract of #<=> as it should only ever return 1, 0, or -1. Anyhow, I've attached an updated patch that also includes some test fixes.

(Note: all tests in make test-all that passed before this patch pass after, however rubyspec will need to be updated. I will send a pull-request directly to the rubyspec project if this gets accepted.)
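To make the float concern above concrete, here is one way a comparison result could be clamped to the usual -1, 0, 1 (or nil). This is a sketch only: normalize_cmp and FloatCmp are hypothetical names, and it is not claimed to be what the attached patch does.

```ruby
# Hypothetical helper, not part of Ruby: clamp a comparison result
# to -1, 0, 1, or nil by comparing it against zero.
def normalize_cmp(result)
  return nil if result.nil?
  result <=> 0
end

# A class whose <=> returns a float, as in the oddity described above.
class FloatCmp
  def <=>(other)
    2.5
  end
end

raw = FloatCmp.new <=> "a string"
normalized = normalize_cmp(raw)

p raw, normalized
```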

#7 Updated by Yusuke Endoh over 1 year ago

  • Status changed from Open to Assigned
  • Assignee set to Nobuyoshi Nakada
  • Target version set to 2.0.0

#8 Updated by Nobuyoshi Nakada over 1 year ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r38044.
Joshua, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


string.c: compare with to_str

  • string.c (rb_str_cmp_m): try to compare with to_str result if possible before calling <=> method. [Bug #7342]
