Bug #6566

JSON.dump can generate invalid UTF-8 sequence

Added by Shyouhei Urabe over 2 years ago. Updated about 2 years ago.

[ruby-core:45535]
Status:Assigned
Priority:Normal
Assignee:Yui NARUSE
ruby -v:ruby 2.0.0dev (2012-06-09) [x86_64-linux]
Backport:

Description

=begin
In the following code, JSON.dump outputs a byte sequence that is not valid UTF-8.

  # -*- encoding: utf-8 -*-
  require 'json'
  # "\xea" is a lone lead byte, not a complete UTF-8 character.
  IO.popen('hexdump -C', 'w') do |fp|
    JSON.dump(["\xea"], fp)
  end

RFC 4627 says that JSON text "SHALL be encoded in Unicode", so this is an RFC violation.

=end

bug-6566.diff - reject invalid UTF-8 sequence in JSON.generate (1.62 KB) Nobuyoshi Nakada, 06/10/2012 07:19 AM

History

#1 Updated by Nobuyoshi Nakada over 2 years ago

=begin
Put more simply, it seems wrong that
  JSON.generate(["\xea"]).valid_encoding?
returns (({false})).

I think this is a bug in the json generator, but what should happen
in this case? It seems that (({convert_UTF8_to_JSON_ASCII()})) is meant to
reject invalid sequences.
=end
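
Until the generator itself settles what to do, a caller can guard against this on its own side. The following is only a minimal sketch: (({invalid_strings})) and (({strict_generate})) are hypothetical helper names, not part of the json library, and raising (({ArgumentError})) is an assumption about the desired behaviour rather than what any patch here does.

```ruby
# -*- encoding: utf-8 -*-
require 'json'

# Hypothetical helper (not part of the json library): walk a nested
# structure and collect every String whose bytes are not valid in its
# own encoding.
def invalid_strings(obj)
  case obj
  when String then obj.valid_encoding? ? [] : [obj]
  when Array  then obj.flat_map { |e| invalid_strings(e) }
  when Hash   then obj.flat_map { |k, v| invalid_strings(k) + invalid_strings(v) }
  else []
  end
end

# Hypothetical wrapper: refuse to generate JSON at all when the input
# contains a String with an invalid byte sequence.
def strict_generate(obj)
  bad = invalid_strings(obj)
  unless bad.empty?
    raise ArgumentError, "invalid byte sequence in #{bad.first.inspect}"
  end
  JSON.generate(obj)
end
```

With this, (({strict_generate(["\xea"])})) raises before the generator ever sees the bad byte, while well-formed input goes through (({JSON.generate})) unchanged. It relies on (({String#valid_encoding?})), so it only works on 1.9 or later.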

#2 Updated by Yui NARUSE over 2 years ago

The json library is not only for 1.9, so nobu's patch is not acceptable.
I opened https://github.com/flori/json/pull/139 instead.
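
On the compatibility point: String#valid_encoding? only exists from 1.9 on, so a check that also has to run on older Rubies cannot use it. Below is a rough sketch of one alternative, assuming that String#unpack with the 'U*' directive raises ArgumentError on malformed UTF-8. The name utf8_bytes_valid? is hypothetical, this is not necessarily what the pull request does, and the check may be looser than a strict UTF-8 validator.

```ruby
# Hypothetical helper: true if the bytes of +str+ can be decoded as
# UTF-8 by unpack('U*'), false if unpack reports a malformed character.
# Does not rely on String#valid_encoding?.
def utf8_bytes_valid?(str)
  str.unpack('U*')
  true
rescue ArgumentError
  false
end
```

For the string from this report, utf8_bytes_valid?("\xea") returns false, since "\xea" announces a three-byte character but is followed by nothing.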

#3 Updated by Yui NARUSE about 2 years ago

  • Target version changed from 2.0.0 to next minor
