Feature #4066

Encoding GBK needs update

Added by oCameLo (oCameLo oTnTh) about 9 years ago. Updated over 8 years ago.

Status:
Rejected
Priority:
Normal
Target version:

Description

=begin
When GBK was released in 1995, it included 95 characters that were not included in Unicode 1.1. Even now (Windows 7), these characters are still assigned Unicode PUA code points in CP936.
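
To make the PUA point concrete, here is a minimal sketch of how one might check what Ruby's GBK table does with a given byte sequence. The names `pua?` and `classify_gbk` are hypothetical helpers, not part of Ruby; the range U+E000..U+F8FF is the Private Use Area of the Basic Multilingual Plane. What a particular two-byte sequence actually yields depends on the Ruby version's table, so no specific result is claimed here.

```ruby
# BMP Private Use Area as defined by the Unicode Standard.
PUA_RANGE = (0xE000..0xF8FF)

# Hypothetical helper: true when the code point lies in the BMP PUA.
def pua?(codepoint)
  PUA_RANGE.cover?(codepoint)
end

# Hypothetical helper: decode raw bytes as GBK and classify the result.
#   :unmapped  conversion failed (no table entry, or malformed bytes)
#   :pua       at least one decoded code point is in the PUA
#   :assigned  every decoded code point is outside the PUA
def classify_gbk(bytes)
  text = bytes.pack("C*").force_encoding("GBK").encode("UTF-8")
  text.codepoints.any? { |cp| pua?(cp) } ? :pua : :assigned
rescue Encoding::UndefinedConversionError, Encoding::InvalidByteSequenceError
  :unmapped
end
```

For example, `classify_gbk([0xD6, 0xD0])` decodes an ordinary character (U+4E2D), while a sequence such as `[0xA6, 0xD9]` can be passed in to see how the local table treats one of the characters in question.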

GBK isn't an official standard, so I don't think it will be updated anymore. GB18030 is official, however, and its subset consisting of one-byte and two-byte characters is sometimes also referred to as GBK. In GB18030-2005, 81 of the characters that had been assigned to the PUA are now defined in Unicode.

(Reference: http://en.wikipedia.org/wiki/GBK#History)

Actually, the remaining 14 characters are now defined in Unicode, too. Please see the light grey and light yellow ones in gbk_fe05.gif.

These 95 characters are all defined in Unicode now (see gbk_mod.htm), so I think we should add them to gbk-tbl.rb. It won't cause any compatibility issues, at least on the Ruby side.
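
As an illustration of the kind of mapping change being proposed, the following sketch remaps PUA code points to their standard counterparts after decoding, on the application side. It does not touch gbk-tbl.rb. `PUA_TO_STANDARD` and `remap_pua` are hypothetical names, and the single pair in the table is an assumption used for illustration; the authoritative list of 95 pairs is the one in gbk_mod.htm.

```ruby
# Hypothetical supplement table: PUA code point => standard code point.
# The pair below is an assumed example (GBK A8BF); check it against
# gbk_mod.htm before relying on it.
PUA_TO_STANDARD = {
  0xE7C8 => 0x01F9, # assumed: LATIN SMALL LETTER N WITH GRAVE
}.freeze

# Replace every code point that has an entry in the table and leave all
# other code points untouched. Returns a new UTF-8 string.
def remap_pua(utf8_text)
  utf8_text.codepoints.map { |cp| PUA_TO_STANDARD.fetch(cp, cp) }.pack("U*")
end
```

A caller would run this on text that has already been converted from GBK to UTF-8, for example `remap_pua(gbk_string.encode("UTF-8"))`.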
=end


Files

gbk-tbl.95_chars.diff (4.27 KB), oCameLo (oCameLo oTnTh), 11/18/2010 03:54 AM
gbk_fe05.gif (33.9 KB), oCameLo (oCameLo oTnTh), 11/18/2010 03:54 AM
gbk_mod.htm (6.36 KB), oCameLo (oCameLo oTnTh), 11/18/2010 03:54 AM

History

#1

Updated by shyouhei (Shyouhei Urabe) about 9 years ago

  • Category changed from test to M17N
  • Status changed from Open to Assigned
  • Assignee set to naruse (Yui NARUSE)


#2

Updated by naruse (Yui NARUSE) about 9 years ago

=begin
I understand your point and it seems reasonable.
Anyway, is there any other implementation whose conversion table includes such characters?
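
One rough way to look at this from inside Ruby is to decode the same bytes under more than one encoding name and compare the results. This is only a sketch: `decode_with` and `compare_tables` are hypothetical helpers, and it compares Ruby's own GBK and GB18030 tables, not other implementations such as iconv or web browsers.

```ruby
# Decode raw bytes under the given encoding name and return the resulting
# code points, or :unmapped when the conversion fails.
def decode_with(bytes, encoding_name)
  bytes.pack("C*").force_encoding(encoding_name).encode("UTF-8").codepoints
rescue Encoding::UndefinedConversionError, Encoding::InvalidByteSequenceError
  :unmapped
end

# Build a hash of encoding name => decoding result for side-by-side comparison.
def compare_tables(bytes, encoding_names = %w[GBK GB18030])
  encoding_names.map { |name| [name, decode_with(bytes, name)] }.to_h
end
```

Calling `compare_tables([0xD6, 0xD0])` gives the same code point for both names; bytes from the 95 characters under discussion can be passed in to see where the two tables differ.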
=end

#3

Updated by oCameLo (oCameLo oTnTh) about 9 years ago

=begin
I could find only one mailing list thread about this problem: http://sources.redhat.com/ml/libc-alpha/2000-09/msg00394.html

For compatibility, we should accept this patch. But from the standpoint of the standard, we should let it go.

Both ways are acceptable.
=end

#4

Updated by naruse (Yui NARUSE) about 9 years ago

=begin
(2010/11/19 20:12), oCameLo oTnTh wrote:

> I could find only one mailing list thread about this problem:
> http://sources.redhat.com/ml/libc-alpha/2000-09/msg00394.html
>
> For compatibility, we should accept this patch. But from the standpoint of
> the standard, we should let it go.

Ruby's mapping table should follow de facto or de jure standards.
In the current situation, I have to say that the expectation of compatibility
is wrong (as you may know, the Euro sign is also incompatible).

So until other implementations such as converters, editors, or web browsers
support such a table, Ruby won't support it.

--
NARUSE, Yui naruse@airemix.jp

=end

#5

Updated by naruse (Yui NARUSE) about 9 years ago

  • Status changed from Assigned to Rejected

