Bug #6566

JSON.dump can generate invalid UTF-8 sequence

Added by Shyouhei Urabe almost 2 years ago. Updated about 1 year ago.

[ruby-core:45535]
Status:Assigned
Priority:Normal
Assignee:Yui NARUSE
Category:M17N
Target version:next minor
ruby -v:ruby 2.0.0dev (2012-06-09) [x86_64-linux]
Backport:

Description

=begin
In the following code, JSON.dump outputs a byte sequence that is not valid UTF-8.

  # -*- encoding: utf-8 -*-
  require 'json'
  IO.popen('hexdump -C', 'w') do |fp|
    JSON.dump(["\xea"], fp)
  end

RFC 4627 says that JSON text "SHALL" be encoded in Unicode, so this is an RFC violation.

=end
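
The report above can also be checked without an external hexdump process. The following is a minimal sketch, not part of the original report: probe_generate is a hypothetical helper name, and the result for invalid input is assumed to depend on the installed json version (an affected version returns an invalid string, while a version that rejects such input raises an error).

```ruby
require 'json'

# Hypothetical helper (not part of the json library): classify what
# JSON.generate does with the given object.
def probe_generate(obj)
  out = JSON.generate(obj)
  # :invalid means the generator emitted bytes that are not valid in the
  # output string's encoding, which is the behavior reported in this issue.
  out.valid_encoding? ? :valid : :invalid
rescue JSON::GeneratorError, EncodingError
  # The generator refused the input instead of emitting broken output.
  :rejected
end

# A lone 0xEA byte labeled as UTF-8, equivalent to "\xea" in a UTF-8 source file.
bad = [0xEA].pack('C*').force_encoding(Encoding::UTF_8)

puts probe_generate(["ok"])
puts probe_generate([bad])
```

On a json version affected by this bug the second call is expected to report :invalid; which of :invalid or :rejected appears is an assumption about the installed version, not something the report states.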

bug-6566.diff - reject invalid UTF-8 sequence in JSON.generate (1.62 KB) Nobuyoshi Nakada, 06/10/2012 07:19 AM

History

#1 Updated by Nobuyoshi Nakada almost 2 years ago

=begin
More simply, it seems wrong that
  JSON.generate(["\xea"]).valid_encoding?
returns (({false})).

I think this is a bug in the json generator, but what should happen
in this case? It seems that (({convertUTF8toJSONASCII()})) is intended
to reject invalid sequences.
=end
=end
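
Until the generator itself rejects such input, a caller could guard against it. The following is a rough sketch under stated assumptions, not the attached patch or the library's behavior: invalid_strings and strict_generate are hypothetical names, and the check uses String#valid_encoding?, which validates each string against its own declared encoding rather than against UTF-8 specifically.

```ruby
require 'json'

# Hypothetical helper: collect every String in a nested Array/Hash
# structure whose bytes are invalid for its declared encoding.
def invalid_strings(obj, found = [])
  case obj
  when String
    found << obj unless obj.valid_encoding?
  when Array
    obj.each { |v| invalid_strings(v, found) }
  when Hash
    obj.each do |k, v|
      invalid_strings(k, found)
      invalid_strings(v, found)
    end
  end
  found
end

# Hypothetical wrapper: refuse to serialize when any string is invalid,
# otherwise delegate to JSON.generate unchanged.
def strict_generate(obj)
  bad = invalid_strings(obj)
  unless bad.empty?
    raise ArgumentError, "invalid byte sequence in #{bad.first.inspect}"
  end
  JSON.generate(obj)
end
```

Raising before generation keeps the decision with the caller; replacing the offending bytes would be another option, but that changes the data silently.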

#2 Updated by Yui NARUSE almost 2 years ago

json is not only for Ruby 1.9, so nobu's patch is not acceptable.
I opened https://github.com/flori/json/pull/139 instead.

#3 Updated by Yui NARUSE about 1 year ago

  • Target version changed from 2.0.0 to next minor
