Feature #1697

Object#<=>

Added by Marc-Andre Lafortune over 2 years ago. Updated 10 months ago.

[ruby-core:24063]
Status:Closed Start date:06/28/2009
Priority:Normal Due date:
Assignee:Yukihiro Matsumoto % Done:

100%

Category:core
Target version:1.9.2

Description

The definition of the <=> operator states:
"... And if the two operands are not comparable, it returns nil"  (The Ruby Programming Language, O'Reilly)

Attempting to compare objects, when one or both do not define the <=> operator, causes a NoMethodError to be raised. For example, "false <=> 0" raises in this way. This behavior is unexpected and runs counter to the principle defined above.

Further, "0 <=> false" returns nil. This is fundamentally inconsistent. The two comparisons are the other's converse, yet the raising of an exception in the first case implies that the programmer was in error; that the mere act of making this comparison was erroneous.

The solution is for Object to define a <=> operator. This will solve the case described above, along with the general case of comparing an object to another when one or both do not define <=>. Similarly to Time#<=>, it would return the converse of arg <=> self (i.e. nil => nil, num => -num). It needs to detect recursion, in which case it should return nil or 0 depending on the result of self == arg.

This change would make it always safe to call <=> without having to check first if it respond_to? :<=> (or rescuing NoMethodError).

The existence of Object#<=> would make it much easier for all programmers to define a good <=> operator for their classes. They can simply call super if they don't know how to handle some argument type. For example:

class MyClass
  include Comparable
  def <=> (arg)
    return super unless arg.is_a? MyClass
    # go on with comparison
  end
end

With this simple line, the developper has enabled other classes to be comparable with MyClass. No need to monkeypatch MyClass to ensure that comparing its objects with objects of class ComparableToMyClass will work. Without a 'super', implementing this becomes quite difficult and requires the use of recursion guards (which are not defined in the standard library).

Note that neither String#<=> nor Time#<=> currently use recursion guards, which is not robust and can lead to problems. For instance:

class MyClass
  include Comparable
  def <=> (arg)
    return -1 if arg.is_a? MyClass
    cmp = arg <=> self
    cmp ? -cmp : nil
  end
end

MyClass.new <=> Time.now
# ==> raises a SystemStackError

class Time
  alias_method :to_str, :to_s
end
"now" <=> Time.now
# ==> endless loop that can't be interrupted with ctrl-C.


In summary, defining Object#<=> would:
1) bring consistency between a <=> b and b <=> a
2) provide a sensible default (nil) for objects that can't be compared
3) make it easier for generic methods to call <=> (no rescue or :respond_to? needed)
4) make it much easier for developpers to write extensible <=> methods for their classes.


Side notes:

The proposition stands only for Object. BasicObject would still be available for developers preferring a class with a strict minimum of methods.

The only code that could break would have to be both checking respond_to? :<=> (or rescuing a NoMethodError) _and_ behaving differently than if the <=> method had returned nil. Such code would be quite nonsensical, given the definition of <=>

Other comparison operators like <, <=, >=, > would also gain in consistency if they were defined in terms of <=>. This way, 0 < nil and nil > 0 would raise the same errors instead of errors of different types. This is secondary to the main question: is it better to define Object#<=> or not?
My vote is on 'yes'.

(Thanks to the readers of my first draft)

Related issues

related to ruby-trunk - Bug #2047: Time#<=> Raises NoMethodError on Incomparable Argument Closed 09/05/2009
related to ruby-trunk - Bug #2276: Array#<=> should not raise when comparison fails Closed 10/26/2009

Associated revisions

Revision 25453
Added by Yukihiro Matsumoto over 2 years ago

* object.c (rb_obj_cmp): defines Object#<=>. [ruby-core:24063]

Revision 26161
Added by Marc-Andre Lafortune about 2 years ago

* lib/object.c (rb_obj_cmp): Default <=> operator returns 0 if objects are == [ruby-core:24063]

History

Updated by Rick DeNatale over 2 years ago

Well, a lot of users of <=> don't expect a nil result!

class Object
  def <=> other
    nil
  end
end

1 <=> nil     # => nil
nil <=> 1     # => nil
1 <=> false   # => nil
false <=> nil # => nil
nil <=> false # => nil
false <=> nil # => nil
[1, nil, false].sort # => 
# ~> -:15:in `sort': comparison of Fixnum with nil failed (ArgumentError)
# ~> 	from -:15

Updated by Eero Saynatkari over 2 years ago

Excerpts from Luiz Angelo Daros de Luca's message of Sun Jun 28 16:22:45 +0300 2009:
> Issue #1697 has been updated by Rick DeNatale.
> 
> 
> Well, a lot of users of <=> don't expect a nil result!

For what it is worth, when Marc-Andre brought the topic up
on #rubinius, I argued for it to always either return one
of -1, 0, 1 or then raise an error; and to never return nil
or any other value. The user may simply rescue (or not) and
differentiate between errors as needed.

Additionally, there is no need to check for nil separately
before working on the value.

One of the counterarguments was that errors are used for
"control flow" here, but I do not see it that way. I would,
in my code, probably never expect for #<=> to fail when I
use it (and never guard it with rescue): to me it indicates
a very likely design- or logic problem that should be fixed
anyway. Exceptions, pun intended, of course may exist.


Eero
--
Magic is insufficiently advanced technology.

Updated by Eero Saynatkari over 2 years ago

Excerpts from Marc-Andre's message of Mon Jun 29 15:35:52 +0300 2009:
> This really is a different and separate discussion though; one could
> also argue that  "nil =~ Date.new" should raise an error! Changing the
> semantics for <=> at this point would require a really compelling
> argument, since all existing Ruby code calling <=> or defining <=> has
> been written with the assumption that <=> should return nil when
> arguments can't be compared.

I disagree. Because there are currently no unified semantics
for #<=>, you are equally likely to find someone who relies
on a TypeError (or NoMethodError) being raised on an invalid
conversion. I think either change will require some people
to change their code, making the argument moot in terms of
which solution is "better."

(And indirectly, I think e.g. Array checks for a nil return
value and raises if it gets one.)

Is there a compelling case where not being able to compare
two objects is a valid expectation, and could not be better
addressed by actually implementing the comparison or some
other change in logic? On the flipside of that, there seems
to be potential for masking problems by evaluating to nil,
mostly in cases where #<=> is being used implicitly.

> My proposition doesn't change the semantics of <=>, rather it makes it
> is more respectful of it.

See above.


Eero
--
Magic is insufficiently advanced technology.

Updated by Nobuyoshi Nakada over 2 years ago

Hi,

At Sun, 28 Jun 2009 10:48:45 +0900,
Marc-Andre Lafortune wrote in [ruby-core:24063]:
> Other comparison operators like <, <=, >=, > would also gain
> in consistency if they were defined in terms of <=>. This
> way, 0 < nil and nil > 0 would raise the same errors instead
> of errors of different types. This is secondary to the main
> question: is it better to define Object#<=> or not?
> My vote is on 'yes'.

A patch for it.


Index: object.c
===================================================================
--- object.c	(revision 24752)
+++ object.c	(working copy)
@@ -32,5 +32,5 @@ VALUE rb_cTrueClass;
 VALUE rb_cFalseClass;

-static ID id_eq, id_eql, id_match, id_inspect, id_init_copy;
+static ID id_eq, id_eql, id_match, id_inspect, id_init_copy, id_cmp;

 /*
@@ -1116,4 +1116,14 @@ rb_obj_not_match(VALUE obj1, VALUE obj2)
 }

+/* :nodoc: */
+static VALUE
+rb_obj_cmp(VALUE obj, VALUE arg)
+{
+    if (!rb_respond_to(arg, id_cmp)) {
+	rb_notimplement();
+    }
+    return Qnil;
+}
+

 /***********************************************************************
@@ -2518,4 +2528,5 @@ Init_Object(void)
     rb_define_method(rb_mKernel, "=~", rb_obj_match, 1);
     rb_define_method(rb_mKernel, "!~", rb_obj_not_match, 1);
+    rb_define_method(rb_mKernel, "<=>", rb_obj_cmp, 1);
     rb_define_method(rb_mKernel, "eql?", rb_obj_equal, 1);
     rb_define_method(rb_mKernel, "hash", rb_obj_hash, 0);
@@ -2659,4 +2670,5 @@ Init_Object(void)
     id_inspect = rb_intern("inspect");
     id_init_copy = rb_intern("initialize_copy");
+    id_cmp = rb_intern("<=>");

     for (i=0; conv_method_names[i].method; i++) {


-- 
Nobu Nakada

Updated by Marc-Andre Lafortune over 2 years ago

I would modify slightly Nobu's patch to return 0 if the two compared objects are one and the same. I also think it is probably a better choice to return nil instead of raising an error.

With this patch in, running 'make test' doesn't generate any error. 'make test-all' generates two, on for rake because Rake::FileList has a weak way to find which methods are proper to Array, and another error because of Delegator, which doesn't inherit from BasicObject (see issue 1333) and doesn't list :<=> in its list of methods to undefine. So both of these are trivially fixed.

Although I really would like an implementation of a good double dispatch scheme, I'll leave that discussion for another time.


diff --git a/object.c b/object.c
index e81007b..fdf1ee6 100644
--- a/object.c
+++ b/object.c
@@ -1115,13 +1115,23 @@ rb_obj_not_match(VALUE obj1, VALUE obj2)
     return RTEST(result) ? Qfalse : Qtrue;
 }

-/* :nodoc: */
+/*
+ *  call-seq:
+ *     obj <=> other       => -1, 0, 1 or nil
+ *
+ *  Comparison---Returns -1 if <i>obj</i> is smaller than <i>other</i>,
+ *  0 if <i>obj</i> is the same as <i>other</i>, and +1 if <i>obj</i> is
+ *  greater than <i>other</i>. Returns <code>nil</code> if <i>obj</i>
+ *  can not be compared with <i>other</i>.
+ *  Typically, this method is overridden in descendent
+ *  classes to provide class-specific meaning.
+ */
+
 static VALUE
 rb_obj_cmp(VALUE obj, VALUE arg)
 {
-    if (!rb_respond_to(arg, id_cmp)) {
-       rb_notimplement();
-    }
+    if (obj == arg)
+       return INT2FIX(0);
     return Qnil;
 }

Updated by Yukihiro Matsumoto over 2 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100
This issue was solved with changeset r25453.
Marc-Andre, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.

Updated by Marc-Andre Lafortune over 2 years ago

  • Status changed from Closed to Open
Hi Matz,

In rb_obj_cmp, the rb_obj_equal is redundant. Did you mean the following?

diff --git a/object.c b/object.c
index 9f7537f..8f20621 100644
--- a/object.c
+++ b/object.c
@@ -1123,7 +1123,7 @@ rb_obj_not_match(VALUE obj1, VALUE obj2)
 static VALUE
 rb_obj_cmp(VALUE obj1, VALUE obj2)
 {
-    if (obj1 == obj2 || rb_obj_equal(obj1, obj2))
+    if (rb_equal(obj1, obj2))
        return INT2FIX(0);
     return Qnil;
 }

Updated by Yukihiro Matsumoto over 2 years ago

Hi,

In message "Re: [ruby-core:26318] [Feature #1697](Open) Object#<=>"
    on Mon, 26 Oct 2009 21:14:08 +0900, Marc-Andre Lafortune <redmine@ruby-lang.org> writes:

|In rb_obj_cmp, the rb_obj_equal is redundant. Did you mean the following?

It's a grain of optimization to reduce function call.

							matz.

Updated by Eero Saynatkari over 2 years ago

Excerpts from Yukihiro Matsumoto's message of Mon Oct 26 15:24:12 +0200 2009:
> Hi,
> 
> In message "Re: [ruby-core:26318] [Feature #1697](Open) Object#<=>"
>     on Mon, 26 Oct 2009 21:14:08 +0900, Marc-Andre Lafortune <redmine@ruby-lang.org> writes:
> 
> |In rb_obj_cmp, the rb_obj_equal is redundant. Did you mean the following?
> 
> It's a grain of optimization to reduce function call.

rb_obj_equal() is implemented just as ==, though, so it is
redundant unless I am missing something? I would assume that
the intended function was rb_eql() instead?

Marc-Andre's rb_equal() suggestion combines the two, although
it uses id_eq, not id_eql, so I can see where avoiding that
would be slightly better in performance.


Wishing there were fewer equality functions
or at least more variance in their names,
                                            Eero

--
Magic is insufficiently advanced technology.

Updated by Marc-Andre Lafortune over 2 years ago

Given your comment, and Eero rightly pointing out that it's better to use eql? than == for this case, 

diff --git a/object.c b/object.c
index 9f7537f..8f20621 100644
--- a/object.c
+++ b/object.c
@@ -1123,7 +1123,7 @@ rb_obj_not_match(VALUE obj1, VALUE obj2)
 static VALUE
 rb_obj_cmp(VALUE obj1, VALUE obj2)
 {
-    if (obj1 == obj2 || rb_obj_equal(obj1, obj2))
+    if (obj1 == obj2 || rb_eql(obj1, obj2))
       return INT2FIX(0);
    return Qnil;
 }


Note: I'm not saying that obj1 == obj2 is redundant. I'm saying rb_obj_equal is.

Updated by ujihisa . about 2 years ago

  • Status changed from Open to Assigned
As Marc-Andre says, the current following code is redundant.

    static VALUE
    rb_obj_cmp(VALUE obj1, VALUE obj2)
    {
        if (obj1 == obj2 || rb_obj_equal(obj1, obj2))
            return INT2FIX(0);
        return Qnil;
    }

where

    VALUE
    rb_obj_equal(VALUE obj1, VALUE obj2)
    {
        if (obj1 == obj2) return Qtrue;
        return Qfalse;
    }

in short, the original code means

    static VALUE
    rb_obj_cmp(VALUE obj1, VALUE obj2)
    {
        if (obj1 == obj2 || ((obj1 == obj2) ? Qtrue : Qfalse))
            return INT2FIX(0);
        return Qnil;
    }

As Marc-Andre says, there are two alternatives which are better than current code. If you prefer to allow Ruby to change the behavior of `Object#<=>` by overwriting `Object#==`, apply the patch he showed. (Current RubySpec is based on it)

Otherwise, how about applying this patch:

    diff --git a/object.c b/object.c
    index 503a7c5..3dd3290 100644
    --- a/object.c
    +++ b/object.c
    @@ -1131,7 +1131,7 @@ rb_obj_not_match(VALUE obj1, VALUE obj2)
     static VALUE
     rb_obj_cmp(VALUE obj1, VALUE obj2)
     {
    -    if (obj1 == obj2 || rb_obj_equal(obj1, obj2))
    +    if (obj1 == obj2)
              return INT2FIX(0);
         return Qnil;
     }

I'll fix the corresponding RubySpec after you decided which behavior is preferable.

Updated by Yukihiro Matsumoto about 2 years ago

Hi,

In message "Re: [ruby-core:27300] [Feature #1697](Assigned) Object#<=>"
    on Thu, 24 Dec 2009 07:15:13 +0900, "ujihisa ." <redmine@ruby-lang.org> writes:
|
|Issue #1697 has been updated by ujihisa ..
|
|Status changed from Open to Assigned
|
|As Marc-Andre says, the current following code is redundant.
|
|    static VALUE
|    rb_obj_cmp(VALUE obj1, VALUE obj2)
|    {
|        if (obj1 == obj2 || rb_obj_equal(obj1, obj2))
|            return INT2FIX(0);
|        return Qnil;
|    }

It should be:

|        if (obj1 == obj2 || rb_equal(obj1, obj2))

							matz.

Updated by Marc-Andre Lafortune about 2 years ago

  • Status changed from Assigned to Closed
This issue was solved with changeset r26161.
Marc-Andre, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.

Also available in: Atom PDF