Backport #8323

no conversion by "bom|utf-8"

Added by Nobuyoshi Nakada 12 months ago. Updated 12 months ago.

[ruby-core:54563]
Status:Closed
Priority:Normal
Assignee:Tomoyuki Chikanaga

Description

The mode spec accepted by open and similar methods allows a BOM-prefixed UTF encoding name. However, if the external and internal encodings given there are the same, no conversion takes place, regardless of the actual external encoding.
Since an encoding name prefixed with "BOM" is not a real encoding but just a fallback, the conversion should honor the encoding detected from the BOM.
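
A minimal sketch of the behavior described here, for illustration only. The helper name read_bom_sample and the temporary file are assumptions, not part of the report. It writes "a" preceded by a BOM in a given encoding and reads it back with the same fallback name as both external and internal encoding:

```ruby
require "tmpdir"

# Hypothetical helper (not from the patch): writes "a" preceded by a BOM in
# the given encoding, then reads it back with a BOM-prefixed mode spec whose
# fallback external encoding equals the internal encoding.
def read_bom_sample(encoding_name)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "sample.txt")
    File.binwrite(path, "\uFEFFa".encode(encoding_name))
    File.open(path, "rb:BOM|UTF-8:UTF-8") { |f| f.read }
  end
end

%w[UTF-8 UTF-16LE UTF-16BE].each do |name|
  text = read_bom_sample(name)
  puts "#{name}: #{text.inspect} (#{text.encoding})"
end
```

On a Ruby that includes the fix, each line should report the string "a" in UTF-8. Without it, the UTF-16 samples would be expected to come back unconverted.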

0001-io.c-conversion-from-bom-encoding.patch (8.76 KB) Nobuyoshi Nakada, 04/25/2013 01:42 PM

Associated revisions

Revision 40541
Added by Tomoyuki Chikanaga 12 months ago

merge revision(s) 40462: [Backport #8323]

* io.c (rb_io_ext_int_to_encs, parse_mode_enc): bom-prefixed name is
  not a real encoding name, just a fallback.  so the proper conversion
  should take place even if the internal encoding is equal to the
  bom-prefixed name, unless actual encoding is equal to the internal
  encoding.   [Bug #8323]

* io.c (io_set_encoding_by_bom): reset external encoding if no BOM
  found.
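
The rule in this commit message can be modeled roughly as follows. This is a hypothetical pure-Ruby sketch; the method name plan_encodings and its keyword arguments are invented for illustration, and the actual logic is C code in io.c that is not reproduced here:

```ruby
# Hypothetical model of the rule in the commit message, not the io.c code.
#   fallback: the encoding named after "BOM|" in the mode spec
#   detected: the encoding found from the BOM, or nil if no BOM was found
#   internal: the requested internal encoding, or nil
def plan_encodings(fallback:, detected:, internal:)
  # The BOM wins; with no BOM the external encoding falls back to the name.
  external = detected || fallback
  if internal.nil? || internal == external
    { external: external, internal: nil }       # no conversion needed
  else
    { external: external, internal: internal }  # convert external -> internal
  end
end

p plan_encodings(fallback: Encoding::UTF_8,
                 detected: Encoding::UTF_16LE,
                 internal: Encoding::UTF_8)
p plan_encodings(fallback: Encoding::UTF_8,
                 detected: nil,
                 internal: Encoding::UTF_8)
```

The first call models a UTF-16LE file opened with "BOM|UTF-8:UTF-8", where conversion is planned. The second models a file without a BOM, where external and internal end up equal and no converter is needed.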

History

#1 Updated by Nobuyoshi Nakada 12 months ago

  • Description updated (diff)

Sorry, this was mis-posted to ruby-core.

#2 Updated by Nobuyoshi Nakada 12 months ago

  • File 0001-io.c-conversion-from-bom-encoding.patch added

Patch attached.

#3 Updated by Yui NARUSE 12 months ago

The patch doesn't work in the following case:

% ./ruby -e'IO.write"p","a";open("p","r:BOM|utf-8:utf-8"){|f|p f.read.size}'
-e:1:in `read': code converter not found (UTF-8 to UTF-8) (Encoding::ConverterNotFoundError)
        from -e:1:in `block in <main>'
        from -e:1:in `open'
        from -e:1:in `<main>'
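
For reference, the same case written out as a small script. This is a sketch that assumes a temporary directory instead of a file named "p" in the current directory, and the helper name read_size_without_bom is hypothetical:

```ruby
require "tmpdir"

# Hypothetical helper: same steps as the one-liner above, using a temp dir.
def read_size_without_bom
  Dir.mktmpdir do |dir|
    path = File.join(dir, "p")
    IO.write(path, "a")  # a single byte, no BOM
    # Fallback external encoding and internal encoding are both UTF-8.
    File.open(path, "r:BOM|utf-8:utf-8") { |f| f.read.size }
  end
end

puts read_size_without_bom
```

With the updated patch the read should succeed and the size should be 1, since no BOM is found and no converter is needed.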

#5 Updated by Nobuyoshi Nakada 12 months ago

  • File deleted (0001-io.c-conversion-from-bom-encoding.patch)

#6 Updated by Yui NARUSE 12 months ago

nobu (Nobuyoshi Nakada) wrote:

Updated.

OK, commit please.

#7 Updated by Nobuyoshi Nakada 12 months ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r40462.
Nobuyoshi, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


io.c: conversion from bom encoding

  • io.c (rb_io_ext_int_to_encs, parse_mode_enc): bom-prefixed name is not a real encoding name, just a fallback. so the proper conversion should take place even if the internal encoding is equal to the bom-prefixed name, unless actual encoding is equal to the internal encoding. [Bug #8323]
  • io.c (io_set_encoding_by_bom): reset external encoding if no BOM found.

#8 Updated by Nobuyoshi Nakada 12 months ago

  • Backport changed from 1.9.3: UNKNOWN, 2.0.0: UNKNOWN to 1.9.3: REQUIRED, 2.0.0: REQUIRED

#9 Updated by Nobuyoshi Nakada 12 months ago

  • Tracker changed from Bug to Backport
  • Project changed from ruby-trunk to Backport200
  • Category deleted (M17N)
  • Status changed from Closed to Assigned
  • Assignee changed from Yui NARUSE to Tomoyuki Chikanaga
  • Target version deleted (2.1.0)

#10 Updated by Tomoyuki Chikanaga 12 months ago

  • Status changed from Assigned to Closed

This issue was solved with changeset r40541.
Nobuyoshi, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


merge revision(s) 40462: [Backport #8323]

* io.c (rb_io_ext_int_to_encs, parse_mode_enc): bom-prefixed name is
  not a real encoding name, just a fallback.  so the proper conversion
  should take place even if the internal encoding is equal to the
  bom-prefixed name, unless actual encoding is equal to the internal
  encoding.   [Bug #8323]

* io.c (io_set_encoding_by_bom): reset external encoding if no BOM
  found.
