Bug #20924

closed

IO#readline ignores the limit argument when the encoding is UTF-32LE and the limit would split a character

Added by javanthropus (Jeremy Bopp) 29 days ago. Updated 14 days ago.

Status: Closed
Assignee: -
Target version: -
ruby -v: ruby 3.4.0dev (2024-11-28T12:38:16Z master 3af1a04741) +PRISM [x86_64-linux]
[ruby-core:120058]

Description

require 'tempfile'

Tempfile.open(binmode: true, encoding: 'utf-32le') do |f|
  f.write('0123456789')
  f.rewind

  # A limit that would truncate a character becomes completely ignored
  f.readline(3).bytesize  # => 40; should be 4
end

Tempfile.open(binmode: true, encoding: 'utf-32le') do |f|
  f.write('0123456789')
  f.rewind

  # A limit on character boundaries is respected
  f.readline(4).bytesize  # => 4
end

Tempfile.open(encoding: 'utf-8:utf-32le') do |f|
  f.write('0123456789')
  f.rewind

  # A limit that would truncate a character becomes completely ignored
  f.readline(3).bytesize  # => 40; should be 4
end

Tempfile.open(encoding: 'utf-8:utf-32le') do |f|
  f.write('0123456789')
  f.rewind

  # A limit on character boundaries is respected
  f.readline(4).bytesize  # => 4
end

This doesn't happen with UTF-32BE. It also doesn't happen in Ruby 3.3, but it does happen in 3.4-dev and in Ruby 3.0 through 3.2.
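To compare the two byte orders side by side on a given Ruby build, the cases above can be folded into a small probe. This is only a sketch: `expected_bytesize` and `probe_readline` are hypothetical helper names introduced here, and the expected value assumes that a limit which falls inside a character should be rounded up to the next 4-byte boundary, as the "should be 4" comments above suggest. The output depends on the Ruby version, so none is shown.

```ruby
require 'tempfile'

# Hypothetical helper: the smallest whole-character byte count that covers
# `limit` bytes, assuming a fixed-width encoding (4 bytes per character for
# UTF-32). Only meaningful while the limit is shorter than the line itself.
def expected_bytesize(limit, char_width = 4)
  ((limit + char_width - 1) / char_width) * char_width
end

# Hypothetical helper: write ten ASCII digits to a tempfile opened with the
# given external encoding, then return the bytesize of readline(limit).
def probe_readline(encoding, limit)
  Tempfile.open(binmode: true, encoding: encoding) do |f|
    f.write('0123456789')
    f.rewind
    f.readline(limit).bytesize
  end
end

# Report actual versus expected sizes for both byte orders and for limits
# that do (3) and do not (4) split a character.
%w[utf-32le utf-32be].each do |encoding|
  [3, 4].each do |limit|
    actual = probe_readline(encoding, limit)
    expected = expected_bytesize(limit)
    status = actual == expected ? 'ok' : 'MISMATCH'
    puts "#{encoding} limit=#{limit}: got #{actual}, expected #{expected} (#{status})"
  end
end
```

On an affected build, only the utf-32le line with limit=3 should be flagged as a mismatch.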
