Bug #6192

closed

Integer() doesn't handle UTF-16 input

Added by john_firebaugh (John Firebaugh) about 12 years ago. Updated about 12 years ago.

Status:
Closed
Assignee:
-
Target version:
-
ruby -v:
ruby 1.9.3p125 (2012-02-16 revision 34643) [x86_64-darwin11.3.0]
Backport:
[ruby-core:43566]

Description

Integer("2007".encode("UTF-16le"))
ArgumentError: string contains null byte
from (irb):209:in `Integer'
from (irb):209
from /Users/john/.rvm/rubies/ruby-1.9.3-p125/bin/irb:16:in `<main>'

Updated by john_firebaugh (John Firebaugh) about 12 years ago

Related, String#to_i:

"2007".encode("UTF-16le").to_i
=> 2
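
For context, a look at the raw bytes suggests why both calls misbehave on 1.9.3: UTF-16LE stores each ASCII digit as the digit byte followed by a zero byte, so a byte-oriented parser sees "2" and then a NUL. A minimal sketch (the variable names are only for illustration):

```ruby
# UTF-16LE encodes each ASCII digit as the digit byte followed by 0x00.
s = "2007".encode("UTF-16LE")

bytes = s.bytes
# "2" is 50, "0" is 48, "7" is 55; every digit is followed by a zero byte.
p bytes                          # [50, 0, 48, 0, 48, 0, 55, 0]

# UTF-16LE is not an ASCII-compatible encoding.
p s.encoding.ascii_compatible?   # false
```

This would explain both symptoms: Integer() rejects the embedded null byte, while String#to_i stops parsing at it and returns 2.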

Updated by drbrain (Eric Hodel) about 12 years ago

=begin
I made this patch:

Index: bignum.c
===================================================================
--- bignum.c (revision 35117)
+++ bignum.c (working copy)
@@ -11,6 +11,7 @@
 
 #include "ruby/ruby.h"
 #include "ruby/util.h"
+#include "ruby/encoding.h"
 #include "internal.h"
 
 #include <math.h>
@@ -24,6 +25,7 @@
 VALUE rb_cBignum;
 
 static VALUE big_three = Qnil;
+static VALUE sym_replace = Qnil;
 
 #if defined __MINGW32__
 #define USHORT _USHORT
@@ -773,8 +775,21 @@ rb_str_to_inum(VALUE str, int base, int
     long len;
     VALUE v = 0;
     VALUE ret;
+    VALUE encopts;
+    rb_encoding *enc;
 
     StringValue(str);
+
+    enc = rb_enc_from_index(ENCODING_GET((str)));
+
+    if (enc != rb_usascii_encoding()) {
+        encopts = rb_hash_new();
+        rb_hash_aset(encopts, sym_replace, rb_str_new2(" "));
+        rb_obj_freeze(encopts);
+
+        str = rb_str_conv_enc_opts(str, enc, rb_usascii_encoding(), 0, encopts);
+    }
+
     if (badcheck) {
         s = StringValueCStr(str);
     }
@@ -3809,5 +3824,6 @@ Init_Bignum(void)
     power_cache_init();
 
     big_three = rb_uint2big(3);
+    sym_replace = ID2SYM(rb_intern("replace"));
     rb_gc_register_mark_object(big_three);
 }
Index: test/ruby/test_literal.rb
===================================================================
--- test/ruby/test_literal.rb (revision 35117)
+++ test/ruby/test_literal.rb (working copy)
@@ -261,6 +261,23 @@ class TestRubyLiteral < Test::Unit::Test
     }
   end
 
+  def test_integer_encoding
+    bug6192 = '[bug#6192]'
+
+    s = "2007".encode(Encoding::UTF_16LE)
+
+    assert_equal(2007, Integer(s), bug6192)
+
+    s = "3.14 is \xCF\x80"
+    s.force_encoding Encoding::UTF_8
+
+    e = assert_raises(ArgumentError, bug6192) do
+      Integer(s)
+    end
+
+    assert_equal("Invalid value for Integer(): \"#{s}\"", e.message)
+  end
+
   def test_float
     head = ['', '-', '+']
     chars = ['0', '1', '_', '9', 'f', '.']

But there is a problem:

1) Failure:

test_integer_utf_16(TestRubyLiteral) [/Users/drbrain/Work/svn/ruby/trunk/test/ruby/test_literal.rb:278]:
<"Invalid value for Integer(): "3.14 is π""> expected but was
<"invalid value for Integer(): "3.14 is \xCF\x80"">.

I'm not sure if this output is acceptable or not.
=end
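
The approach in the patch above (transcode to US-ASCII, substituting a space for anything that does not fit) can be approximated at the Ruby level. This is only a rough analog, not a translation of the C code: `ascii_for_parse` is a hypothetical helper name, and the `invalid:`/`undef:` options are my assumption about the intended replacement behavior.

```ruby
# Hypothetical helper: transcode to US-ASCII, replacing any character that
# cannot be represented with a space, roughly as the C patch intends.
def ascii_for_parse(str)
  str.encode(Encoding::US_ASCII, invalid: :replace, undef: :replace, replace: " ")
end

utf16 = "2007".encode(Encoding::UTF_16LE)
p Integer(ascii_for_parse(utf16))   # 2007

# The pi character has no US-ASCII equivalent, so it becomes a space.
pi_text = "3.14 is \u03C0"
p ascii_for_parse(pi_text)          # "3.14 is  "
```

One consequence of this design is that the string reported in an error message is the transcoded copy rather than the caller's original, which may be related to the mismatch in the failure above.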


Updated by nobu (Nobuyoshi Nakada) about 12 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r35120.
John, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • bignum.c (rb_str_to_inum): must be ASCII compatible encoding as
    well as String#hex and String#oct. [ruby-core:43566][Bug #6192]
  • string.c (rb_must_asciicompat): check if ASCII compatible.
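
With the committed change, strings in ASCII-incompatible encodings are rejected rather than converted, so callers holding UTF-16 data need to transcode first. A minimal sketch, assuming a hypothetical helper named `integer_from` (not part of Ruby):

```ruby
# Hypothetical helper, not part of Ruby: transcode ASCII-incompatible
# strings to UTF-8 before handing them to Integer().
def integer_from(str)
  str = str.encode(Encoding::UTF_8) unless str.encoding.ascii_compatible?
  Integer(str)
end

utf16 = "2007".encode(Encoding::UTF_16LE)

begin
  Integer(utf16)
rescue Encoding::CompatibilityError, ArgumentError => e
  # Which error is raised depends on the Ruby version.
  puts "#{e.class}: #{e.message}"
end

p integer_from(utf16)   # 2007
p integer_from("42")    # 42
```

Strings that are already ASCII compatible pass through untouched, so the helper only pays the transcoding cost when it is needed.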