Bug #18353

Czech keyboard input encoding on Czech Windows

Added by koleq (Ondřej Kurz) over 2 years ago. Updated over 2 years ago.

Status: Closed
Assignee: -
Target version: -
ruby -v: ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x64-mingw32]
[ruby-core:106191]

Description

Inputting Czech characters on Czech Windows does not work unless "text.force_encoding("CP852")" is used. I would expect this to work seamlessly, just like it does in Python.

This issue also does not happen in WSL (Windows Subsystem for Linux), where it just works without encoding issues.

To test, run this code, then copy "ěščřžýáíé" and paste it as input.
You will see that the first print works just fine but the echoed input does not.

I do not know if it is reproducible on other language versions of Windows.

Ruby

puts("ěščřžýáíé")
text = gets
# text.force_encoding("CP852") fixes the input, but hard-coding it is probably not the best solution if other Windows language versions use a different code page.
puts(text)

output:

ěščřžýáíé
ěščřžýáíé
����젡�

"text.encoding" returns "UTF-8"
"text.bytes.inspect" returns "[216, 231, 159, 253, 167, 236, 160, 161, 130, 10]"
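
The reported bytes can be checked offline. The following is a minimal sketch, assuming the input arrived in code page 852 (as the "force_encoding" workaround suggests) and was only labelled as UTF-8. The variable names are illustrative.

```ruby
# Bytes reported by text.bytes.inspect in this report.
bytes = [216, 231, 159, 253, 167, 236, 160, 161, 130, 10]

# Rebuild the raw string exactly as it was read (binary, no interpretation).
raw = bytes.pack("C*")

# Labelled as UTF-8, the byte sequence is not valid, which explains the
# replacement characters in the output.
as_utf8 = raw.dup.force_encoding("UTF-8")
puts as_utf8.valid_encoding?   # false

# Interpreted as CP852 and transcoded to UTF-8, the original text comes back.
decoded = raw.dup.force_encoding("CP852").encode("UTF-8")
puts decoded                   # ěščřžýáíé
```

So the bytes themselves are intact; only the encoding label attached to the string is wrong.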

Python 3

print("ěščřžýáíé")
text = input()
print(text)

output:

ěščřžýáíé
ěščřžýáíé
ěščřžýáíé

I don't know how to check the encoding or get the bytes in the current encoding in Python.

I was told on the Ruby Discord that my terminal is misconfigured, but that is not the case: it happens in multiple terminals, and I can't expect users to change their terminal settings.

Other languages, such as Python or C#, do not seem to have this issue.

I wonder what Python does to get around encoding issues on Czech Windows.
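
One way to avoid hard-coding "CP852" is sketched below. It is a workaround idea, not a confirmed fix: "decode_console_input" is a hypothetical helper, and it assumes that Encoding.find("locale") reports the console code page on the affected system, which has not been verified here.

```ruby
# Hypothetical helper (not part of Ruby): reinterpret bytes read from the
# console in the given encoding and transcode them to UTF-8.
# Assumption: the "locale" encoding matches the console code page.
def decode_console_input(raw, console_encoding = Encoding.find("locale"))
  raw.dup
     .force_encoding(console_encoding)
     .encode("UTF-8", invalid: :replace, undef: :replace)
end

# Demonstration with the bytes from this report, passing the code page
# explicitly so the result does not depend on the machine running it.
sample = [216, 231, 159, 253, 167, 236, 160, 161, 130, 10].pack("C*")
puts decode_console_input(sample, "CP852")   # ěščřžýáíé
```

In the script from this report, it would be used as "text = decode_console_input(gets)".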
