Bug #13292

Invalid encodings in UTF-32

Added by rbjl (Jan Lelis) over 3 years ago. Updated over 3 years ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
ruby -v: ruby 2.4.0p0 (2016-12-24 revision 57164) [x86_64-linux]
[ruby-core:79966]

Description

Ruby is very strict about valid UTF-8 encodings, which is great.

Strings that encode surrogates (U+D800 to U+DFFF) or code points above U+10FFFF are not valid.
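To make the UTF-8 strictness concrete, here is a small sketch. The byte sequences are hand-built by applying the usual 3-byte and 4-byte UTF-8 bit patterns to U+D800, 0x110000, and U+10FFFF; the variable names are only illustrative.

```ruby
# U+D800 (a surrogate) forced through the 3-byte UTF-8 pattern: ED A0 80
surrogate_utf8 = [0xED, 0xA0, 0x80].pack("C*").force_encoding("UTF-8")

# 0x110000 (one past the Unicode maximum) in the 4-byte pattern: F4 90 80 80
too_large_utf8 = [0xF4, 0x90, 0x80, 0x80].pack("C*").force_encoding("UTF-8")

# U+10FFFF, the largest valid code point: F4 8F BF BF
max_utf8 = [0xF4, 0x8F, 0xBF, 0xBF].pack("C*").force_encoding("UTF-8")

surrogate_utf8.valid_encoding? # => false
too_large_utf8.valid_encoding? # => false
max_utf8.valid_encoding?       # => true
```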

However, in UTF-32, it is possible to encode such values, and Ruby treats them as valid:

Example 1 (too large value)

a = [0, 0, 17, 0].pack("C*").force_encoding("UTF-32LE") # => "\u{110000}"
a.valid_encoding? # => true

Example 2 (surrogate)

b = [0, 216, 0, 0].pack("C*").force_encoding("UTF-32LE") # => "\uD800"
b.valid_encoding? # => true

The behaviour should be changed so that String#valid_encoding? returns false in both cases.

For reference: http://unicode.org/versions/Unicode9.0.0/UnicodeStandard-9.0.pdf (page 71)

#1

Updated by nobu (Nobuyoshi Nakada) over 3 years ago

  • Status changed from Open to Closed

Applied in changeset r57816.


fix UTF-32 valid_encoding?

  • enc/utf_32be.c (utf32be_mbc_enc_len): check arguments precisely.
    [ruby-core:79966] [Bug #13292]

  • enc/utf_32le.c (utf32le_mbc_enc_len): ditto.

  • regenc.h (UNICODE_VALID_CODEPOINT_P): predicate for valid
    Unicode codepoints.
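For readers who want to see the kind of check such a predicate performs, here is a rough Ruby sketch. It is not the C macro from regenc.h: the method names `unicode_valid_codepoint?` and `valid_utf32_bytes?` are hypothetical, and the code only assumes the rule stated in the description (no surrogates, nothing above U+10FFFF).

```ruby
# Hypothetical Ruby counterpart of a "valid Unicode code point" predicate.
# Assumption: valid means 0..0x10FFFF, excluding surrogates 0xD800..0xDFFF.
def unicode_valid_codepoint?(cp)
  cp.between?(0, 0x10FFFF) && !cp.between?(0xD800, 0xDFFF)
end

# Hypothetical helper: checks raw UTF-32 bytes without relying on
# String#valid_encoding?, so it behaves the same on any Ruby version.
# "V*" unpacks 32-bit little-endian units, "N*" 32-bit big-endian units.
def valid_utf32_bytes?(bytes, little_endian: true)
  return false unless (bytes.bytesize % 4).zero?

  bytes.unpack(little_endian ? "V*" : "N*").all? do |cp|
    unicode_valid_codepoint?(cp)
  end
end

valid_utf32_bytes?([0, 0, 17, 0].pack("C*"))   # => false (0x110000)
valid_utf32_bytes?([0, 216, 0, 0].pack("C*"))  # => false (U+D800)
valid_utf32_bytes?([0x41, 0, 0, 0].pack("C*")) # => true  (U+0041)
```

The two rejected inputs are the byte sequences from Example 1 and Example 2 in the description.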

#2

Updated by naruse (Yui NARUSE) over 3 years ago

  • Backport changed from 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: UNKNOWN to 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: DONE

ruby_2_4 r57935 merged revision(s) 57816,57817.

#3

Updated by nagachika (Tomoyuki Chikanaga) over 3 years ago

  • Backport changed from 2.2: UNKNOWN, 2.3: UNKNOWN, 2.4: DONE to 2.2: REQUIRED, 2.3: REQUIRED, 2.4: DONE

#4

Updated by usa (Usaku NAKAMURA) over 3 years ago

  • Backport changed from 2.2: REQUIRED, 2.3: REQUIRED, 2.4: DONE to 2.2: DONE, 2.3: REQUIRED, 2.4: DONE

ruby_2_2 r58103 merged revision(s) 57816,57817.

#5

Updated by nagachika (Tomoyuki Chikanaga) over 3 years ago

  • Backport changed from 2.2: DONE, 2.3: REQUIRED, 2.4: DONE to 2.2: DONE, 2.3: DONE, 2.4: DONE

ruby_2_3 r58183 merged revision(s) 57816,57817.
