Bug #20185

String#ascii_only? buggy in ruby 3.3

Added by chucke (Tiago Cardoso) 3 months ago. Updated 3 months ago.

Status: Closed
Assignee: -
Target version: -
[ruby-core:116203]

Description

This was the smallest reduction of the bug I could come up with:

require "stringio"

puts StringIO::VERSION

def is_ascii(buffer)
  str = buffer.string
  puts "\"#{str}\" is ascii: #{str.ascii_only?}"
end

buffer = StringIO.new("".b)

buffer.write("a=b&c=d")
buffer.rewind
is_ascii(buffer)
buffer.write "богус"
is_ascii(buffer)

# in ruby 3.3
#=> 3.1.0
#=> "a=b&c=d" is ascii: true
#=> "богус" is ascii: true

# in ruby 3.2
#=> 3.0.4
#=> "a=b&c=d" is ascii: true
#=> "богус" is ascii: true

# in ruby 3.1
#=> 3.0.1
#=> "a=b&c=d" is ascii: true
#=> "богус" is ascii: false

I believe that only the 3.1 result is correct, as the first character of "богус" is not ASCII.
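
As a cross-check that does not go through String#ascii_only?, the bytes can be inspected directly. This is only a minimal sketch: `ascii_bytes_only?` is a hypothetical helper name introduced for illustration, not part of Ruby or StringIO.

```ruby
# Hypothetical helper (not part of Ruby or StringIO): scans the bytes
# directly instead of calling String#ascii_only?.
def ascii_bytes_only?(str)
  str.each_byte.all? { |byte| byte < 0x80 }
end

puts ascii_bytes_only?("a=b&c=d".b) # prints true
puts ascii_bytes_only?("богус".b)   # prints false
```

Calling it on `buffer.string` in the reproduction above would give a byte-level answer to compare against what `ascii_only?` reports.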

Updated by andrykonchin (Andrew Konchin) 3 months ago

I cannot reproduce the issue with plain String (without StringIO) on Ruby 3.2, 3.1 and 3.0. ascii_only? reports false for "богус":

ruby -e 'p "богус".ascii_only?'
false

I believe that in the examples involving StringIO, the observed behaviour is caused by StringIO#string preserving its encoding. The StringIO instance is initialised with a String literal in binary encoding, and any modification such as writing doesn't change the encoding, even when a UTF-8 String is written:

io = StringIO.new "".b
io.string.encoding # => #<Encoding:ASCII-8BIT>

io.write "汉"
io.string.encoding # => #<Encoding:ASCII-8BIT>

In case of the "богус" String literal there are bytes greater than 127 so they are treated as non-ASCII:

io = StringIO.new "".b
io.write "богус"
io.string.bytes # => [208, 177, 208, 190, 208, 179, 209, 131, 209, 129]
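
The same bytes can be looked at without StringIO by taking a binary copy of the literal with String#b. A small sketch, assuming a UTF-8 source file (Ruby's default); the variable names are illustrative only:

```ruby
utf8 = "богус"
binary = utf8.b  # String#b returns a copy tagged as binary (ASCII-8BIT)

p binary.encoding == Encoding::BINARY  # => true
p utf8.length                          # => 5 (characters)
p binary.bytesize                      # => 10 (two bytes per Cyrillic letter)
p binary.bytes.min                     # => 129
p binary.ascii_only?                   # => false
```

Every byte is above 127, so a plain binary String with this content is reported as non-ASCII.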

Updated by nobu (Nobuyoshi Nakada) 3 months ago

  • Status changed from Open to Closed

Updated by chucke (Tiago Cardoso) 3 months ago

nobu, can I ask why the ticket was closed? Even considering the comment from andrykonchin, he clearly points out at the end that there are bytes greater than 128 in the string (therefore .ascii_only? should be false).

Updated by jeremyevans0 (Jeremy Evans) 3 months ago

chucke (Tiago Cardoso) wrote in #note-3:

nobu, can I ask why the ticket was closed? Even considering the comment from andrykonchin, he clearly points out at the end that there are bytes greater than 128 in the string (therefore .ascii_only? should be false).

This was fixed by 6283ae8d369bd2f8a022bb69bc5b742c58529dec

Updated by chucke (Tiago Cardoso) 3 months ago

Apologies everyone, I got temporary Redmine visual impairment. Thank you.

Updated by Eregon (Benoit Daloze) 3 months ago

Indeed on Redmine I see no link to the commit in https://bugs.ruby-lang.org/issues/20185?tab=history#note-2, it seems like a bug.

Updated by hsbt (Hiroshi SHIBATA) 3 months ago

No, "Fix https://bugs.ruby-lang.org/issues/20185" in the commit message is not the correct format for Redmine autolinking.
