Bug #6566

JSON.dump can generate invalid UTF-8 sequence

Added by Shyouhei Urabe over 3 years ago. Updated almost 3 years ago.

Status: Assigned
Priority: Normal
Assignee:
ruby -v: ruby 2.0.0dev (2012-06-09) [x86_64-linux]
Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN, 2.2: UNKNOWN, 2.3: UNKNOWN
[ruby-core:45535]

Description

=begin
In the following code, JSON.dump outputs a byte sequence that is not valid UTF-8.

  # -*- encoding: utf-8 -*-
  require 'json'
  # Pipe the generated JSON through hexdump to inspect the raw bytes.
  IO.popen('hexdump -C', 'w') do |fp|
    JSON.dump(["\xea"], fp)
  end

RFC 4627 says that JSON text "SHALL" be encoded in Unicode, so this is an RFC violation.

=end
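
For context, the byte 0xEA is the lead byte of a three-byte UTF-8 sequence, so on its own it cannot be valid UTF-8. The sketch below checks this without involving the json library at all; the helper name valid_utf8_bytes? is made up for illustration and is not part of Ruby or json.

```ruby
# -*- encoding: utf-8 -*-

# Hypothetical helper: returns true when +bytes+ (an array of integers)
# forms a valid UTF-8 byte sequence.
def valid_utf8_bytes?(bytes)
  bytes.pack('C*').force_encoding(Encoding::UTF_8).valid_encoding?
end

p valid_utf8_bytes?([0xEA])              # lone lead byte        => false
p valid_utf8_bytes?([0xEA, 0xB0, 0x80])  # complete U+AC00       => true
p valid_utf8_bytes?([0x41])              # plain ASCII "A"       => true
```

If the generator copies such a string through unchanged, the resulting JSON text inherits the invalid byte.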

bug-6566.diff - reject invalid UTF-8 sequence in JSON.generate (1.62 KB) Nobuyoshi Nakada, 06/10/2012 07:19 AM

History

#1 [ruby-core:45539] Updated by Nobuyoshi Nakada over 3 years ago

=begin
More simply, it seems wrong that
  JSON.generate(["\xea"]).valid_encoding?
returns (({false})).

I think this is a bug in the json generator, but what should happen
in this case? It seems (({convert_UTF8_to_JSON_ASCII()})) is meant to reject
invalid sequences.
=end
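
One possible answer to "what should happen" is to reject the input before generation. The following is a minimal caller-side sketch of that idea, not the attached patch and not the json library's own behavior; assert_valid_strings! and strict_generate are hypothetical names introduced here, and raising ArgumentError is an assumption.

```ruby
require 'json'

# Hypothetical helper (not part of the json library): walk a JSON-compatible
# structure and raise if any string is not valid in its own encoding.
def assert_valid_strings!(obj)
  case obj
  when String
    unless obj.valid_encoding?
      raise ArgumentError, "invalid byte sequence in #{obj.encoding}: #{obj.inspect}"
    end
  when Array
    obj.each { |v| assert_valid_strings!(v) }
  when Hash
    obj.each do |k, v|
      assert_valid_strings!(k)
      assert_valid_strings!(v)
    end
  end
  obj
end

# Hypothetical wrapper: validate first, then delegate to JSON.generate.
def strict_generate(obj)
  JSON.generate(assert_valid_strings!(obj))
end
```

With this wrapper, strict_generate(["\xea"]) raises instead of producing output, while well-formed input is passed to JSON.generate unchanged.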

#2 [ruby-core:45555] Updated by Yui NARUSE over 3 years ago

The json library is not only for Ruby 1.9, so nobu's patch is not acceptable.
I made https://github.com/flori/json/pull/139.

#3 [ruby-core:52377] Updated by Yui NARUSE almost 3 years ago

  • Target version changed from 2.0.0 to next minor
