Project

General

Profile

Bug #13397

#object_id should not be signed

Added by vo.x (Vit Ondruch) 10 months ago. Updated 7 months ago.

Status:
Open
Priority:
Normal
Assignee:
-
Target version:
-
ruby -v:
ruby 2.4.0p0 (2016-12-24 revision 57164) [i386-linux]
[ruby-core:80545]

Description

It is surprising that #object_id returns signed value. Let me explain show two examples. Working with 32b Ruby (ruby 2.4.0p0 (2016-12-24 revision 57164) [i386-linux]) to make this issue more apparent.

$ ruby << \EOR
GC.disable
3_000_000.times { p Object.new.inspect }
EOR
"#<Object:0x57d49a5c>"
"#<Object:0x57d499a8>"
"#<Object:0x57d49930>"
"#<Object:0x57d498b8>"

... snip ...

"#<Object:0x828bf164>"
"#<Object:0x828bf0ec>"
"#<Object:0x828bf074>"
"#<Object:0x828beffc>"
"#<Object:0x828bef84>"
^C-:2:in `p': Interrupt
    from -:2:in `block in <main>'
    from -:2:in `times'
    from -:2:in `<main>'
"#<Object:0x8290b1f4>"

In this example, the "object_id", which is part of the inspect object is unsigned, since it is printed using C sprintf with %p format. There are other libraries, which tries to mimic the output [ 1 ]. The implementation is approximately following:

$ ruby << \EOR
GC.disable
class A
  DEFAULT_OBJ_ID_STR_WIDTH = 0.size == 4 ? 7 : 14

  def inspect
    id_str = (object_id << 1).to_s(16).rjust(DEFAULT_OBJ_ID_STR_WIDTH, '0') 
    "#<#{self.class.name}:0x#{id_str}>"
  end
end
3_000_000.times { p A.new.inspect }
EOR

"#<A:0x58585428>"
"#<A:0x585852d4>"
"#<A:0x585851bc>"
"#<A:0x5858507c>"
"#<A:0x58584ec4>"
"#<A:0x58584d5c>"
"#<A:0x58584c1c>"
"#<A:0x58584adc>"

... snip ...

"#<A:0x7fff4888>"
"#<A:0x7fff47c0>"
"#<A:0x7fff46f8>"
"#<A:0x7fff4630>"
"#<A:0x7fff4568>"
"#<A:0x7fff44a0>"
"#<A:0x7fff43d8>"
"#<A:0x7fff4310>"
"#<A:0x7fff4248>"
"#<A:0x7fff4180>"
"#<A:0x7fff40b8>"
"#<A:0x-7fffc034>"
"#<A:0x-7fffc110>"
"#<A:0x-7fffc1ec>"
"#<A:0x-7fffc2c8>"
"#<A:0x-7fffc3a4>"
"#<A:0x-7fffc480>"
"#<A:0x-7fffc55c>"
"#<A:0x-7fffc638>"
^C-:10:in `p': Interrupt
    from -:10:in `block in <main>'
    from -:10:in `times'
    from -:10:in `<main>'

And the output is quite surprising to me. Why the object_id should be signed value? It doesn't make any sense to me. Is this implementation wrong or is Ruby wrong?

History

#1 [ruby-core:80547] Updated by nobu (Nobuyoshi Nakada) 10 months ago

It's not to make object_ids Bignum as possible.

#2 [ruby-core:80550] Updated by vo.x (Vit Ondruch) 10 months ago

Ok, you want to prevent Bignums, but what is the suggested solution here? Do some pack("l").unpack("L") to get the expected value? Or you can provide object_hexid [ 1 ]. Or provide some %p equivalent formatter?

Because I am afraid that the "inspect" misuse is widespread. Here are some examples, even in Ruby codebase:

https://github.com/ruby/ruby/blob/trunk/lib/rubygems/dependency_list.rb#L107
https://github.com/ruby/ruby/blob/trunk/lib/rubygems/platform.rb#L115
https://github.com/ruby/ruby/blob/trunk/lib/rdoc/attr.rb#L90
https://github.com/ruby-concurrency/concurrent-ruby/blob/master/lib/concurrent/map.rb#L218
https://github.com/rails/rails/blob/master/actionview/lib/action_view/template/resolver.rb#L59

#3 [ruby-core:80811] Updated by vo.x (Vit Ondruch) 9 months ago

Last CI build of concurrent-ruby in Fedora failed again:

https://apps.fedoraproject.org/koschei/package/rubygem-concurrent-ruby?collection=f27
https://kojipkgs.fedoraproject.org/work/tasks/6296/19116296/build.log

Is there a chance to find some generic reliable solution to this? Or is it just feature and I should persuade all the project to fix their Regexps 1 or implementation of #object_id?

#4 [ruby-core:80814] Updated by nobu (Nobuyoshi Nakada) 9 months ago

vo.x (Vit Ondruch) wrote:

Is there a chance to find some generic reliable solution to this? Or is it just feature and I should persuade all the project to fix their Regexps [1] or implementation of #object_id?

They seem to have the solution already.

https://github.com/ruby-concurrency/concurrent-ruby/blob/master/lib/concurrent/edge/promises.rb#L1904-L1906

      def inspect
        "#{to_s[0..-2]} intended_time: #{@IntendedTime}>"
      end

The default Kernel#to_s returns a string with the class name and the object ID.

#5 [ruby-core:80908] Updated by vo.x (Vit Ondruch) 9 months ago

I created PR fixing the projects listed above.

But still, this seems so generic and unexpected issue, possibly influencing libraries which are part of StdLib [3, 4], I'd like to see this improved. Couldn't the #format accept "object_id" formatter for example?

[1] https://github.com/ruby-concurrency/concurrent-ruby/pull/651
[2] https://github.com/rails/rails/pull/28902
[3] https://github.com/rubygems/rubygems/pull/1908
[4] https://github.com/rdoc/rdoc/pull/451

#6 [ruby-core:80917] Updated by nobu (Nobuyoshi Nakada) 9 months ago

Such feature would be nice, but can't help libraries which support older versions.

commit 1f7154a4ae0e480774dbfe79905b21d26d5b4cbc
Author: Nobuyoshi Nakada <nobu@ruby-lang.org>
Date:   Fri Apr 28 16:35:48 2017

    `%I` for object ID

diff --git a/sprintf.c b/sprintf.c
index f2d51f1c76..758e32fed4 100644
--- a/sprintf.c
+++ b/sprintf.c
@@ -25,6 +25,7 @@

 static char *fmt_setup(char*,size_t,int,int,int,int);
 static char *ruby_ultoa(unsigned long val, char *endp, int base, int octzero);
+static char *ruby_ptoa(VALUE val, char *endp, int base, int octzero);

 static char
 sign_bits(int base, const char *p)
@@ -781,6 +782,18 @@ rb_str_format(int argc, const VALUE *argv, VALUE fmt)
        }
        break;

+     case 'I':
+       {
+       VALUE arg = GETARG();
+       prec = (SIZEOF_VOIDP * CHAR_BIT + 3) / 4;
+       CHECK(prec + 2);
+       PUSH_("0x", 2);
+       t = ruby_ptoa(arg, &buf[blen + prec], 16, 0);
+       if (t > &buf[blen]) memset(&buf[blen], '0', t - &buf[blen]);
+       blen += prec;
+       }
+       break;
+
      case 'd':
      case 'i':
      case 'o':
@@ -1255,6 +1268,18 @@ ruby_ultoa(unsigned long val, char *endp, int base, int flags)
     return BSD__ultoa(val, endp, base, octzero, xdigs);
 }

+static char *
+ruby_ptoa(VALUE val, char *endp, int base, int flags)
+{
+    const char *xdigs = lower_hexdigits;
+    int octzero = flags & FSHARP;
+#ifdef _HAVE_SANE_QUAD_
+    return BSD__uqtoa(val, endp, base, octzero, xdigs);
+#else
+    return BSD__ultoa(val, endp, base, octzero, xdigs);
+#endif
+}
+
 int
 ruby_vsnprintf(char *str, size_t n, const char *fmt, va_list ap)
 {
diff --git a/test/ruby/test_sprintf.rb b/test/ruby/test_sprintf.rb
index 1bf65f1eab..0b7a11a33b 100644
--- a/test/ruby/test_sprintf.rb
+++ b/test/ruby/test_sprintf.rb
@@ -513,4 +513,9 @@
     assert_equal before + 1, after, 'only new string is the created one'
     assert_equal '1970-01-01', val
   end
+
+  def test_object_id
+    x = Object.new
+    assert_equal(x.to_s, sprintf("#<%s:%I>", x.class, x))
+  end
 end

#7 [ruby-core:80922] Updated by vo.x (Vit Ondruch) 9 months ago

The patch is missing documentation, otherwise I love it!

Also available in: Atom PDF