Bug #5714

Unexpected error of STDIN#read with non-ascii input on Windows XP

Added by Heesob Park over 2 years ago. Updated over 2 years ago.

[ruby-core:41496]
Status:Closed
Priority:Normal
Assignee:Usaku NAKAMURA
Category:build
Target version:-
ruby -v:- Backport:

Description

When the input contains non-ascii character, STDIN#read raised Permission denied or Invalid argument error with MSVC compiled version on Windows XP.

C:\work>ruby -ve 'p STDIN.read(5)'
ruby 2.0.0dev (2011-12-05 trunk 33955) [i386-mswin32_100]
한글abcd
-e:1:in `read': Permission denied - <STDIN> (Errno::EACCES)
    from -e:1:in `<main>'

C:\>irb
irb(main):001:0> STDIN.read(5)
한글abcd
Errno::EINVAL: Invalid argument - <STDIN>
    from (irb):1:in `read'
    from (irb):1
    from c:/usr/bin/irb.bat:19:in `<main>'

If the input is ascii only, STDIN.read works fine.

C:\work>ruby -ve 'p STDIN.read(5)'
ruby 2.0.0dev (2011-12-05 trunk 33955) [i386-mswin32_100]
abcdefg
"abcde"

C:\>irb
irb(main):001:0> STDIN.read(5)
abcdefg
=> "abcde"

It is odd but the Mingw compiled version works fine.

C:\work>ruby -ve 'p STDIN.read(5)'
ruby 2.0.0dev (2011-12-05 trunk 33955) [i386-mingw32]
한글abcde
"\xC7\xD1\xB1\xDBabc"

And Ruby 1.9.3p0 works fine.

C:\>ruby -ve 'p STDIN.read(5)'
ruby 1.9.3p0 (2011-10-30 revision 33570) [i386-mswin32_100]
한글abcd
"\xC7\xD1\xB1\xDBabc"

read_binmode.patch Magnifier (861 Bytes) Hiroshi Shirosaki, 12/08/2011 02:54 PM

read_binmode2.patch Magnifier (933 Bytes) Hiroshi Shirosaki, 12/10/2011 03:10 PM

read_binmode_fix_r34024.patch Magnifier (5.89 KB) Hiroshi Shirosaki, 12/13/2011 04:44 PM

read_binmode_fix_r34035.patch Magnifier (7.7 KB) Hiroshi Shirosaki, 12/14/2011 05:04 PM

set_binmode_fix.patch Magnifier (2.61 KB) Hiroshi Shirosaki, 12/15/2011 02:58 PM

set_binmode_fix_r34120.patch Magnifier (3.46 KB) Hiroshi Shirosaki, 12/25/2011 11:17 PM


Related issues

Related to Backport93 - Backport #5791: Please backport r34043, r34045 (Unexpected error of STDIN... Closed 12/23/2011

Associated revisions

Revision 34043
Added by Usaku NAKAMURA over 2 years ago

  • win32/win32.c, include/ruby/win32.h (rbw32fdistext): new function.

  • win32/win32.c (init_stdhandle): set default mode of stdin as binmode.

  • io.c (setbinarymodewithseekcur): new function to replace
    SET
    BINARYMODEWITHSEEKCUR macro. now returns previous mode of the
    fd and take care of LF in rbuf.

  • io.c (do_writeconv): set text mode when needed.

  • io.c (io_read): need to change the mode of the IO to binmode
    temporally when the length for IO#read, because IO#read with length
    must behave so.

  • test/ruby/testiom17n.rb (TestIOM17N#est{readwithlength,
    readwithlengthbinmode,get[cs]andreadwithbinmode,
    read
    withbinmodeandget[cs],readwritewithbinmode}): tests for
    above changes.

all patches are written by Hiroshi Shirosaki.
[Feature #5714]

Revision 34132
Added by shirosaki over 2 years ago

  • io.c (rbsysfailpath): move the definition.
    Move above for using it in set
    binarymodewithseekcur().

  • io.c (setbinarymodewithseek_cur): fix improper seek cursor.
    Seeking file cursor with setting binary mode has possibility to
    cause infinite loop. Fixed the bug and refined error handling.
    Introduced at r34043.

    And cleanups as below.
    Remove unnecessary parentheses of fptr.
    Use return value of setmode().

  • test/ruby/testiom17n.rb
    (TestIOM17N#testseekwithsetting_binmode): add a test for abobe.
    [Bug #5714]

History

#1 Updated by Heesob Park over 2 years ago

I guess this issue is due to a bug of _read function of Microsoft Runtime library.

Here is a patch for workaround.

diff --git a/win32.c b/win32.c.new
index 67a392e..4d5a253 100644
--- a/win32.c
+++ b/win32.c.new
@@ -5451,7 +5451,8 @@ rbw32read(int fd, void *buf, size_t size)
return -1;
}

  • if (_osfile(fd) & FTEXT) {
  • isconsole = isconsole(osfhnd(fd));
  • if (!isconsole && (_osfile(fd) & FTEXT)) { return _read(fd, buf, size); }

@@ -5464,7 +5465,6 @@ rbw32read(int fd, void *buf, size_t size)
}

 ret = 0;
  • isconsole = isconsole(osfhnd(fd)); if (isconsole) { DWORD mode; GetConsoleMode((HANDLE)_osfhnd(fd),&mode);

#2 Updated by Usaku NAKAMURA over 2 years ago

  • ruby -v changed from ruby 2.0.0dev (2011-12-05 trunk 33955) [i386-mswin32_100] to -

Hello,

In message " [ruby-trunk - Bug #5714] Unexpected error of STDIN#read with non-ascii input on Windows XP"
on Dec.06,2011 15:57:36, phasis@gmail.com wrote:

I guess this issue is due to a bug of _read function of Microsoft Runtime library.

agree. (To say strictly, Win32 API has a bug and MSVC runtime is not
coping with it.)

Here is a patch for workaround.

This patch may cause newline code problem.

On my idea, for example, throw away XP :)

Regards,
--
U.Nakamura usa@garbagecollect.jp

#3 Updated by Luis Lavena over 2 years ago

On Tue, Dec 6, 2011 at 4:18 AM, U.Nakamura usa@garbagecollect.jp wrote:

On my idea, for example, throw away XP :)

WIN32WINNT

#4 Updated by Nobuyoshi Nakada over 2 years ago

Hi,

(11/12/06 22:34), Luis Lavena wrote:

BTW, I'm collecting stats about Ruby-users with Windows to determine
the best option for that, can you spread the word?

https://docs.google.com/spreadsheet/viewform?hl=en_US&formkey=dGp1VGthak50dDZSYzZtTG55c29scFE6MQ#gid=0

Why isn't ruby 2.0.0 there? :)

--
Nobu Nakada

#5 Updated by Hiroshi Shirosaki over 2 years ago

  •    if (_osfile(fd) & FTEXT) {
  •    isconsole

#6 Updated by Heesob Park over 2 years ago

Hi,

2011/12/7 Hiroshi Shirosaki h.shirosaki@gmail.com

  •    if (_osfile(fd) & FTEXT) {
  •    isconsole = isconsole(osfhnd(fd));
  •    if (!isconsole && (_osfile(fd) & FTEXT)) {        return _read(fd, buf, size);     }

I intended STDIN doesn't use text mode at #5562 patch. So STDIN
doesn't need to use read().
I thought (
osfile(fd) & FTEXT) should be false in the case of STDIN.

I found that the initial value of _osfile(STDIN) is 0xC1 and same for
_osfile(STDOUT) and _osfile(STDERR).
The value 0xC1 means (FTEXT | FDEV | FOPEN).

Here is another patch for

diff --git a/win32.c b/win32.c.new
index 67a392e..db9e222 100644
--- a/win32.c
+++ b/win32.c.new
@@ -2259,12 +2259,21 @@ initstdhandle(void)
if (fileno(stdin) < 0) {
stdin->
file = opennull(0);
}
+ else {
+ setmode(fileno(stdin), O
BINARY); // osfile(fileno(stdin)) &= ~FTEXT;
+ }
if (fileno(stdout) < 0) {
stdout->
file = opennull(1);
}
+ else {
+ setmode(fileno(stdout), O
BINARY); // osfile(fileno(stdout)) &= ~FTEXT;
+ }
if (fileno(stderr) < 0) {
stderr->
file = opennull(2);
}
+ else {
+ setmode(fileno(stderr), O
BINARY); // _osfile(fileno(stderr)) &= ~FTEXT;
+ }
if (nullfd >= 0 && !keep) close(nullfd);
setvbuf(stderr, NULL, _IONBF, 0);
}

Regards,
Park Heesob

#7 Updated by Hiroshi Shirosaki over 2 years ago

It seems read with length should always be binary mode. #5562 patch lacks this point.
I think this patch fixes above issue.

It seems OK that stdin is default binary mode, but if stdout and stderr are default binary mode, newline conversion breaks.
We can confirm this by the following tests.

make test-all TESTS="ruby/testiom17n.rb"

#8 Updated by Luis Lavena over 2 years ago

  • Category set to build
  • Assignee set to Usaku NAKAMURA

Usa, what do you think about attached read_binmode.patch?

#9 Updated by Hiroshi Shirosaki over 2 years ago

I modified the patch because rubygems didn't work properly.
However, I think it is consistent with _read() api behavior that read(length) does CRLF conversion.
Ruby 1.9 doesn't do CRLF conversion, but ruby 1.8 does.

ruby -ve "open('t1', 'wb') {|f| f.write(\"a\r\nb\")}; open ('t1', 'r') {|f| p f.read(4)}"
ruby 1.8.7 (2011-06-30 patchlevel 352) [i386-mingw32]
"a\nb"

ruby -ve "open('t1', 'wb') {|f| f.write(\"a\r\nb\")}; open ('t1', 'r') {|f| p f.read(4)}"
ruby 1.9.2p290 (2011-07-09) [i386-mingw32]
"a\r\nb"

ruby -ve "open('t1', 'wb') {|f| f.write(\"a\r\nb\")}; open ('t1', 'r') {|f| p f.read(4)}"
ruby 1.9.3p0 (2011-10-30) [i386-mingw32]
"a\r\nb"

#10 Updated by Usaku NAKAMURA over 2 years ago

Hello,

In message " [ruby-trunk - Bug #5714] Unexpected error of STDIN#read with non-ascii input on Windows XP"
on Dec.10,2011 06:49:55, luislavena@gmail.com wrote:

Category set to build

build?

Usa, what do you think about attached read_binmode.patch?

Difficult, difficult...

Of course, as Shirosaki-san says, IO#read with length should always
be binary mode.
But if setting binmode once, it's never canceled.
So, I doubt that such implicit mode setting in IO#read is right.

Regards,
--
U.Nakamura usa@garbagecollect.jp

#11 Updated by Luis Lavena over 2 years ago

On Sun, Dec 11, 2011 at 11:42 PM, U.Nakamura usa@garbagecollect.jp wrote:

Hello,

In message " [ruby-trunk - Bug #5714] Unexpected error of STDIN#read with non-ascii input on Windows XP"
   on Dec.10,2011 06:49:55, luislavena@gmail.com wrote:

Category set to build

build?

Sorry, meant to set core, but we don't have any category to describe
platform-specific issues.

Usa, what do you think about attached read_binmode.patch?

Difficult, difficult...

Of course, as Shirosaki-san says, IO#read with length should always
be binary mode.
But if setting binmode once, it's never canceled.
So, I doubt that such implicit mode setting in IO#read is right.

Agree.

Hiroshi and Heesob, do you guys think can solve this without reverting
the changes of #5562?

If not, then reverting seems the only alternative.
--
Luis Lavena
AREA 17
-
Perfection in design is achieved not when there is nothing more to add,
but rather when there is nothing more to take away.
Antoine de Saint-Exupéry

#12 Updated by Hiroshi Shirosaki over 2 years ago

Hiroshi and Heesob, do you guys think can solve this without reverting
the changes of #5562?

It's difficult, but it seems switching from binary mode to text mode would be possible. I added binary and text mixed test cases.
I use implicit mode setting at other positions.

File cursor seeking with setting binary mode had a bug. I fixed it to take care of '\r'.
Is there other test cases to check?

#13 Updated by Usaku NAKAMURA over 2 years ago

Hello,

In message " [ruby-trunk - Bug #5714] Unexpected error of STDIN#read with non-ascii input on Windows XP"
on Dec.13,2011 16:44:29, h.shirosaki@gmail.com wrote:

It's difficult, but it seems switching from binary mode to text mode would be possible. I added binary and text mixed test cases.
I use implicit mode setting at other positions.

File cursor seeking with setting binary mode had a bug. I fixed it to take care of '\r'.
Is there other test cases to check?

Although I don't want to trouble you...

str = "a\nb"
generatefile("tmp", str)
open("tmp", "r") do |f|
assert
equal(str, f.read(3))
end

Regards,
--
U.Nakamura usa@garbagecollect.jp

#14 Updated by Hiroshi Shirosaki over 2 years ago

str = "a\nb"
generatefile("tmp", str)
open("tmp", "r") do |f|
assert
equal(str, f.read(3))
end

Thank you. I added above test and some other tests.
I added one more mode setting for safe.

"make test" passed.
"make test-all" failures and errors were same as before patch.

Could you check this patch?

#15 Updated by Usaku NAKAMURA over 2 years ago

Hello,

In message " [ruby-trunk - Bug #5714] Unexpected error of STDIN#read with non-ascii input on Windows XP"
on Dec.14,2011 17:04:16, h.shirosaki@gmail.com wrote:

Thank you. I added above test and some other tests.
I added one more mode setting for safe.

"make test" passed.
"make test-all" failures and errors were same as before patch.

Great!
I've also retested and get same results.

Could you check this patch?

I've checked in your patches at r34043.
Thank you!

Regards,
--
U.Nakamura usa@garbagecollect.jp

#16 Updated by Hiroshi Shirosaki over 2 years ago

Hi Usaku NAKAMURA san,

I've checked in your patches at r34043.

Thank you for check and merge, and test fix.

This is a follow up fix.
Sorry for my mistake. I noticed an issue that file cursor seek does't work properly.
In the case, trying to seek cursor before the beginning of the file and it causes infinite loop.
I added the test case.

I've fixed this and refined error handling.

I checked "make test" passed and "make test-all" Errors/Failures were same as before patch.

#17 Updated by Usaku NAKAMURA over 2 years ago

Hello,

In message " [ruby-trunk - Bug #5714] Unexpected error of STDIN#read with non-ascii input on Windows XP"
on Dec.15,2011 14:58:07, h.shirosaki@gmail.com wrote:

Thank you for check and merge, and test fix.

Your welcome :)

This is a follow up fix.
Sorry for my mistake. I noticed an issue that file cursor seek does't work properly.
In the case, trying to seek cursor before the beginning of the file and it causes infinite loop.
I added the test case.

I've fixed this and refined error handling.

I checked "make test" passed and "make test-all" Errors/Failures were same as before patch.

I think you should check in your patches by yourself, if you
are not disagreeable.
Do you want to do so?

Regards,
--
U.Nakamura usa@garbagecollect.jp

#18 Updated by Hiroshi Shirosaki over 2 years ago

Hello,

I think you should check in your patches by yourself, if you
are not disagreeable.
Do you want to do so?

Yes. I'll check in that if that's acceptable.

Thanks,
--
Hiroshi Shirosaki

#19 Updated by Usaku NAKAMURA over 2 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r34043.
Heesob, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • win32/win32.c, include/ruby/win32.h (rbw32fdistext): new function.

  • win32/win32.c (init_stdhandle): set default mode of stdin as binmode.

  • io.c (setbinarymodewithseekcur): new function to replace
    SET
    BINARYMODEWITHSEEKCUR macro. now returns previous mode of the
    fd and take care of LF in rbuf.

  • io.c (do_writeconv): set text mode when needed.

  • io.c (io_read): need to change the mode of the IO to binmode
    temporally when the length for IO#read, because IO#read with length
    must behave so.

  • test/ruby/testiom17n.rb (TestIOM17N#est{readwithlength,
    readwithlengthbinmode,get[cs]andreadwithbinmode,
    read
    withbinmodeandget[cs],readwritewithbinmode}): tests for
    above changes.

all patches are written by Hiroshi Shirosaki.
[Feature #5714]

#20 Updated by Usaku NAKAMURA over 2 years ago

Hello,

In message " Re: [ruby-trunk - Bug #5714] Unexpected error of STDIN#read with non-ascii input on Windows XP"
on Dec.16,2011 14:07:18, h.shirosaki@gmail.com wrote:

I think you should check in your patches by yourself, if you
are not disagreeable.
Do you want to do so?

Yes. I'll check in that if that's acceptable.

So, send your SSH public key to cvs-admin at ruby-lang.org,
and wait the response.

Shugo-san, yoroshiku.

Regards,
--
U.Nakamura usa@garbagecollect.jp

#21 Updated by Hiroshi Shirosaki over 2 years ago

I updated the patch while waiting for the response.

Cleanups
* remove unnecessary parentheses of fptr
* use return value of setmode() which returns the previous translation mode if successful
http://msdn.microsoft.com/en-us/library/tw4k6df8.aspx

Test results of trunk r34120 mingw32 on Windows7 64bit

"make test" passed.
test-all failures and errors were as below. Result is same as before patch.

testraceexception(TestRequire)
testgeneratebinbindirwithuserinstallwarning(TestGemInstaller)
test
sopenerror(TestGDBM)
testsopencreatenew(TestGDBM)
testthreadtimerandinterrupt(TestThreadGroup) # This test was hung-up.
testconstants(OpenSSL::TestConfig)
test
reorganize(TestGDBM)
testfilenameasbytesextutf8(TestDirM17N)
test
filenameextutf8inteucjpunrepresentable(TestDirM17N)

Also available in: Atom PDF