Bug #6228

[mingw] Errno::EBADF in ruby/test_io.rb on ruby_1_9_3

Added by Jon Forums about 2 years ago. Updated over 1 year ago.

[ruby-core:43951]
Status:Closed
Priority:Normal
Assignee:Luis Lavena
Category:test
Target version:1.9.3
ruby -v:ruby 1.9.3p172 (2012-03-30 revision 35179) [i386-mingw32] Backport:

Description

On Win7 32bit built with MinGW 4.6.2 using the RubyInstaller build recipes, I get the following test error:

sh-3.1$ ruby --version
ruby 1.9.3p172 (2012-03-30 revision 35179) [i386-mingw32]

sh-3.1$ make test-all TESTS='openssl fiddle psych zlib io json mkmf pathname stringio erb
fileutils ruby/testio.rb ruby/testiom17n.rb ruby/testfile.rb'

7) Error:
testlines(TestIO):
Errno::EBADF: Bad file descriptor
c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test
io.rb:33:in close'
c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test_io.rb:33:in
rescue in block in pipe'
c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test_io.rb:30:in `block in pipe'

Associated revisions

Revision 35631
Added by Luis Lavena almost 2 years ago

  • test/ruby/test_io.rb (class TestIO): Disable GC during IO tests to avoid file descriptors being GC'ed. Suggestion by Tomoyuki Chikanaga [Bug #6228]

Revision 35637
Added by Tomoyuki Chikanaga almost 2 years ago

  • test/ruby/testio.rb (testflushinfinalizer1): don't use IO.for_fd to close IO objects. it create IO object with already closed fd, and cause occasional Errno::EBADF in following tests. [Bug #6228]

History

#1 Updated by Yui NARUSE about 2 years ago

Do you know which revision break it?

#2 Updated by Jon Forums about 2 years ago

Not yet.

#3 Updated by Jon Forums about 2 years ago

Haven't found root cause, but these commits also have Errno::EBADF errors:

commit revision patch


b1bcb82 r34865 p154
1b21d03 r34811 p146
7e2e3dc r34806 p144

In the above commits, the failures are at a different location in ruby/test_io.rb likely because I'm not building deps in order to speed things up:

5) Error:
testinvalidadvise(TestIO):
Errno::EBADF: Bad file descriptor - C:/Users/Jon/AppData/Local/Temp/testio20120330-6868-1lqi3p9
c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test
io.rb:1119:in make_tempfile'
c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test_io.rb:1848:in
block (2 levels) in testinvalidadvise'
c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/testio.rb:1847:in each'
c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test_io.rb:1847:in
block in test
invalidadvise'
c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test
io.rb:1846:in each'
c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test_io.rb:1846:in
testinvalidadvise'
../../../../Users/Jon/Documents/RubyDev/ruby-git/test/runner.rb:15:in `'

I suspect this testing difference isn't relevant since EBADF errors persist, but it is a difference.

Time permitting this weekend, I'll try again using git bisect

#4 Updated by Jon Forums about 2 years ago

In my configuration, one fix is to remove the recent -DFD_SETSEZE=32767 define from the RubyInstaller build recipe. Removing the define gives the following results on r35179:

  • no deps build using WinSDK 7.1 (without RubyInstaller) has no test_lines(TestIO) nor test_invalid_advise(TestIO) Errno::EBADF failures
  • no deps build using MinGW GCC 4.6.2 (with or without RubyInstaller) has no test_lines nor test_invalid_advise failures
  • full deps build using MinGW GCC 4.6.2 (with RubyInstaller recipe) has no test_lines nor test_invalid_advise failures

However, removing the define is not acceptable as it was added to fix the following user issue

https://github.com/oneclick/rubyinstaller/issues/104

And it get's more convoluted.

Running git bisect between commits a81c766/r34649 (good) and 7e2e3dc/r34806 (bad) with -DFDSETSEZE=32767 enabled in the RubyInstaller recipes, the "first bad commit" is Nobu's 35bf809/r34749 tweak to Enumerator in io.c. Building the "good" fa1a2ae/r34746 parent commit (without deps) resulted in _no test_invalid_advise failure when running make test-all TESTS='ruby/test_io.rb'. But strangely, test_lines failed while test_invalid_advise passed when running my original make test-all TESTS='openssl ... ruby/test_io.rb ...'

I suspect there are other lurking issues left to discover, so don't close this issue as a 3rd party issue until we know more.

#5 Updated by Jon Forums about 2 years ago

of course it's the FD_SETSIZE macro I'm talking about...

#6 Updated by Yui NARUSE about 2 years ago

  • Status changed from Open to Assigned
  • Assignee set to Luis Lavena

#7 Updated by Hiroshi Shirosaki about 2 years ago

This issue seems to be related to GC finalizer.
GC is triggered just before Errno::EBADF occurs, and finalizer closes the fd of maketempfile. Errno::EBADF is raised for trying to close again the closed fd of maketempfile.

This seems due to test_flush_in_finalizer1.

 500.times {
   f = File.open(path, "w")
   fds << f.fileno
   f.print "hoge"
 }

f is not closed. When GC sweeps f, the fd associated with f seems to be closed.
fds contains 3. And make_tempfile in test_invalid_advise creates fd of 3.

I confirmed the error disappeared with the following patch.
-DFD_SETSEZE=32767 and ruby 1.9.3p184 (2012-04-15 revision 35335) [i386-mingw32] on Windows 7.

diff --git a/test/ruby/testio.rb b/test/ruby/testio.rb
index edf449d..70ca240 100644
--- a/test/ruby/testio.rb
+++ b/test/ruby/test
io.rb
@@ -1800,6 +1800,7 @@ End
end
ensure
fds.each {|fd| IO.for_fd(fd).close rescue next}
+ GC.start
end

def testflushin_finalizer2

#8 Updated by Hiroshi Shirosaki about 2 years ago

I tried to create simplified test case of this issue.

cat gciotest.rb
f = open("testfile", "w")
p f.fileno
IO.for
fd(f.fileno).close

f = open("test_file", "w")
p f.fileno

GC.start # trigger GC

begin
f.close # => raise Errno::EBADF
rescue
p $!
end

File.unlink("test_file")

~/ruby-trunk/bin/ruby -v gciotest.rb
ruby 2.0.0dev (2012-04-17 trunk 35363) [x8664-darwin11.3.0]
5
5
#<Errno::EBADF: Bad file descriptor - test
file>

#9 Updated by Luis Lavena about 2 years ago

  • Status changed from Assigned to Open
  • Assignee deleted (Luis Lavena)

Who should be the responsible for GC? Seems to me that open file descriptor shouldn't be garbage collected while still open.

#10 Updated by Heesob Park about 2 years ago

I found that this issue is related with the auto-close flag of IO.

The following code fails with Errno::EBADF

f = open("testfile", "w")
IO.for
fd(f.fileno).autoclose = true # default value
GC.start
f.close

But, the following code works fine.

f = open("testfile", "w")
IO.for
fd(f.fileno).autoclose = false
GC.start
f.close

#11 Updated by Yusuke Endoh about 2 years ago

  • Status changed from Open to Assigned
  • Assignee set to Narihiro Nakamura

Who should be the responsible for GC?

Nari-san.

Yusuke Endoh mame@tsg.ne.jp

#12 Updated by Tomoyuki Chikanaga almost 2 years ago

Hi,

I think the root cause is wrong usage of IO.forfd in ensure clause of testflush_finalizer1,
and just call GC.start works fine for recycle fds.

This test was written by nobu.

nakada-san
Is the following patch seems ok?

diff --git a/test/ruby/testio.rb b/test/ruby/testio.rb
index ebf9b76..be5a333 100644
--- a/test/ruby/testio.rb
+++ b/test/ruby/test
io.rb
@@ -1891,7 +1891,7 @@ End
}
end
ensure
- fds.each {|fd| IO.for_fd(fd).close rescue next}
+ GC.start
end

def testflushin_finalizer2

#13 Updated by Jon Forums almost 2 years ago

Why is the following

ensure
fds.each {|fd| IO.for_fd(fd).close rescue next}

wrong usage of IO.for_fd?

In normal usage, I would likely use this style of explicit closure (by IO wrapping the fd) but not kick off a GC cycle with GC.start.

I'd like to understand why the usage is wrong. Specifically, is there another style of explicit recycling of fds that doesn't require starting a GC cycle?

#14 Updated by Yusuke Endoh almost 2 years ago

  • Assignee changed from Narihiro Nakamura to Nobuyoshi Nakada

+1 for nagachika-san.

jonforums (Jon Forums) wrote:

Why is the following

ensure
fds.each {|fd| IO.for_fd(fd).close rescue next}

wrong usage of IO.for_fd?

f = open("...") # this first IO has a fd X which is opened now.
IO.for_fd(f.fileno) # this second IO also has the fd X.
GC.start # the second IO is GC'ed; the fd X is auto-closed.
f.close # this attempts to close the fd X, but it is already closed => Errno::EBADF

Specifically, is there another style of explicit recycling of fds that doesn't require starting a GC cycle?

Call IO#close explicitly.

Yusuke Endoh mame@tsg.ne.jp

#15 Updated by Jon Forums almost 2 years ago

mame (Yusuke Endoh) wrote:

jonforums (Jon Forums) wrote:

Why is the following

ensure
fds.each {|fd| IO.for_fd(fd).close rescue next}

wrong usage of IO.for_fd?

f = open("...") # this first IO has a fd X which is opened now.
IO.for_fd(f.fileno) # this second IO also has the fd X.
GC.start # the second IO is GC'ed; the fd X is auto-closed.
f.close # this attempts to close the fd X, but it is already closed => Errno::EBADF

Specifically, is there another style of explicit recycling of fds that doesn't require starting a GC cycle?

Call IO#close explicitly.

Endoh-san, thank you. I understand.

Why do you think the same incorrect test code fails (consistently, not intermittently) on windows when only using -DFDSETSIZE, but doesn't appear to fail on *nix? On Arch, I've not been able to repro when building using -DFDSETSIZE=32767. This may indicate a lurking weakness in an _WIN32 MRI codepath (or winsock2 peculiarity) that doesn't exist (or does a better job of hiding) on *nix.

FYI, there's another recent issue on windows in which -DFDSETSIZE=32767 exposes a SEGV with EventMachine with a simple EM.run {}. By default, EM sets FDSETSIZE to 1024. Rebuilding ruby on windows without the FD_SETSIZE override fixes the SEGV. The EM issue is a separate issue than this one, but may be related. I haven't started investigating, so FYI.

I will test again on ruby19_3 and trunk with the suggested patch and reply if errors persist.

Jon

#16 Updated by Hiroshi Shirosaki almost 2 years ago

Why do you think the same incorrect test code fails (consistently, not intermittently) on windows when only using -DFDSETSIZE, but doesn't appear to fail on *nix? On Arch, I've not been able to repro when building using -DFDSETSIZE=32767. This may indicate a lurking weakness in an _WIN32 MRI codepath (or winsock2 peculiarity) that doesn't exist (or does a better job of hiding) on *nix.

This issue depends on the timing when IO object is GCed. GC of an object with the same fd must occur between open() and close().
I guess -DFD_SETSIZE=32767 would change memory usage and GC timing. MRI implementation is different between WIndows and Linux, so GC timing would differ.

FYI, there's another recent issue on windows in which -DFDSETSIZE=32767 exposes a SEGV with EventMachine with a simple EM.run {}. By default, EM sets FDSETSIZE to 1024. Rebuilding ruby on windows without the FD_SETSIZE override fixes the SEGV. The EM issue is a separate issue than this one, but may be related. I haven't started investigating, so FYI.

I think above issue is cause of fd_array buffer overflow.

typedef struct fdset
{
u
int fdcount;
SOCKET fd
array[FDSETSIZE];
} fd
set;

On EM, FDSETSIZE = 1024 and fdarray[1024].
EM uses FDSET() and FDSET() seems rbw32fdset() on Windows.

https://github.com/ruby/ruby/blob/trunk/win32/win32.c#L2458

In rbw32fdset(), FDSETSIZE = 32767 since rbw32fdset is compiled with -DFDSETSIZE=32767.

if (i == set->fd_count) {
    if (set->fd_count < FD_SETSIZE) { // FD_SETSIZE = 32767
        set->fd_array[i] = s;                 // `i` could be over 1023
        set->fd_count++;
    }
}

If above scenario is correct, FDSETSIZE of Ruby should be equal or less then FDSETSIZE of EM.

include/winsock2.h has FDSET macro on mingw, but MRI undef FDSET and uses rbw32fdset() function. It might be better that FD_SET() is macro instead of function.

#17 Updated by Luis Lavena almost 2 years ago

  • Assignee changed from Nobuyoshi Nakada to Usaku NAKAMURA

Usa-san, what do you think about Hiroshi suggestion about leave FD_SET() as macro instead?

#18 Updated by Usaku NAKAMURA almost 2 years ago

Hello,

In message " [ruby-trunk - Bug #6228] [mingw] Errno::EBADF in ruby/testio.rb on ruby193"
on Apr.24,2012 00:38:57, luislavena@gmail.com wrote:

Usa-san, what do you think about Hiroshi suggestion about leave FD_SET() as macro instead?

First, as Jon mentioned, the SEGV of EM is another issue.
Please split it from this ticket.

Second, I don't understand how Hiroshi want to change the code,
so I cannot say anything about this.

Regards,
--
U.Nakamura usa@garbagecollect.jp

#19 Updated by Luis Lavena almost 2 years ago

  • Assignee changed from Usaku NAKAMURA to Luis Lavena

usa (Usaku NAKAMURA) wrote:

First, as Jon mentioned, the SEGV of EM is another issue.
Please split it from this ticket.

Understood, creating another one to have FD_SET() discussion.

I'll commit suggested patch from nagachika-san.

Thank you.

#20 Updated by Luis Lavena almost 2 years ago

  • Status changed from Assigned to Feedback

luislavena (Luis Lavena) wrote:

I'll commit suggested patch from nagachika-san.

I seem unable to reproduce the exact exception on both trunk and ruby19_3 branches. The error seems randomly manifest.

Right now, against ruby193, it shows in testinvalid_advise

5) Error:
testinvalidadvise(TestIO):
Errno::EBADF: Bad file descriptor - C:/Users/Luis/AppData/Local/Temp/testio20120427-1432-1xmkrxh
C:/Users/Luis/Projects/oss/ruby/test/ruby/test
io.rb:1119:in make_tempfile'
C:/Users/Luis/Projects/oss/ruby/test/ruby/test_io.rb:1872:in
block (2 levels) in testinvalidadvise'
C:/Users/Luis/Projects/oss/ruby/test/ruby/testio.rb:1871:in each'
C:/Users/Luis/Projects/oss/ruby/test/ruby/test_io.rb:1871:in
block in test
invalidadvise'
C:/Users/Luis/Projects/oss/ruby/test/ruby/test
io.rb:1870:in each'
C:/Users/Luis/Projects/oss/ruby/test/ruby/test_io.rb:1870:in
testinvalidadvise'
../test/runner.rb:15:in `'

#21 Updated by Jon Forums almost 2 years ago

I seem unable to reproduce the exact exception on both trunk and ruby19_3 branches. The error seems randomly manifest.

Right now, against ruby193, it shows in testinvalid_advise

I've seen failures in both testinvalidadvise (see comment #3) and test_lines. Both disappeared when I tested Nagachika-san's suggested patch.

I'll try again with the latest trunk and ruby193, but IIRC, the testinvalid_advise failure occurred when I built without deps.

#22 Updated by Jon Forums almost 2 years ago

On Win7 32bit, using the following build recipe on both trunk@r35485 and ruby19_3@r35484 with MinGW GCC 4.6.2

sh -c "../configure --enable-shared --disable-install-doc CPPFLAGS='-DFDSETSIZE=32767'"
make
make test-all TESTS="ruby/test
io.rb"

I get only the test_invalid_advise(TestIO) error on ruby19_3, and a freeze (cleared by CTRL-C) on trunk. Nagachika-san's suggested patch fixes the errors as does adding TestIO#setup and TestIO#teardown containing GC.disable and GC.enable, respectively.

#23 Updated by Luis Lavena almost 2 years ago

  • Status changed from Feedback to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r35631.
Jon, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • test/ruby/test_io.rb (class TestIO): Disable GC during IO tests to avoid file descriptors being GC'ed. Suggestion by Tomoyuki Chikanaga [Bug #6228]

#24 Updated by Luis Lavena almost 2 years ago

  • Status changed from Closed to Assigned
  • % Done changed from 100 to 0

Reopening as reported in this breaks chkbuild on debian.

#25 Updated by Hiroshi Shirosaki almost 2 years ago

Hi Luis,

I think Nagachika-san's original patch as below is sufficient for this issue. GC.disable for all tests seems too far.
Did you test with this patch?

diff --git a/test/ruby/testio.rb b/test/ruby/testio.rb
index ebf9b76..be5a333 100644
--- a/test/ruby/testio.rb
+++ b/test/ruby/test
io.rb
@@ -1891,7 +1891,7 @@ End
}
end
ensure
- fds.each {|fd| IO.for_fd(fd).close rescue next}
+ GC.start
end

def testflushin_finalizer2

#26 Updated by Tomoyuki Chikanaga almost 2 years ago

Hi Luis,

Sorry for a lack of enough explanations about my patch.

The ensure clause of testflushinfinalizer1 seems to try close all IO objects opened above and not GC'ed & finalized using IO.forfd. But it's wrong usage of IO.for_fd. The scenario is as follows.

io = open("..")
fd = io.fileno
io = nil
IO.forfd(fd).close # <- at this moment, io (if not GC'ed yet) has invalid fd,
# because it's already closed by IO.for
fd(fd).close
newio = open("...") # newio might have same fd of io, because file descriptor can be recycled.
... # => io is GC'ed and finalized here. Now newio has closed fd!
new
io.gets # => raised Errno::EBADF

testflushin_finalizer1 is test for finalizer of IO (see #3910) and GC.disable breaks it's intent.
I'm sorry, but I'd like to revert r35631.
And I'll commit my patch. Please check it.

#27 Updated by Tomoyuki Chikanaga almost 2 years ago

  • Status changed from Assigned to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r35637.
Jon, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • test/ruby/testio.rb (testflushinfinalizer1): don't use IO.for_fd to close IO objects. it create IO object with already closed fd, and cause occasional Errno::EBADF in following tests. [Bug #6228]

#28 Updated by Luis Lavena almost 2 years ago

Hi Tomoyuki,

Please revert my change. As commented before, was unable to use your
suggested patch since the invalid file descriptor error 'moved' to a
different test.

I was unable to replicate the error from chkbuild.

Please revert and commit your change, will verify again later today.

Thank you.

Sorry for top posting. Sent from mobile.
On May 13, 2012 11:32 AM, "nagachika (Tomoyuki Chikanaga)" <
nagachika00@gmail.com> wrote:

Issue #6228 has been updated by nagachika (Tomoyuki Chikanaga).

Hi Luis,

Sorry for a lack of enough explanations about my patch.

The ensure clause of testflushinfinalizer1 seems to try close all IO
objects opened above and not GC'ed & finalized using IO.for
fd. But it's
wrong usage of IO.for_fd. The scenario is as follows.

io = open("..")
fd = io.fileno
io = nil
IO.forfd(fd).close # <- at this moment, io (if not GC'ed yet) has
invalid fd,
# because it's already closed by IO.for
fd(fd).close
newio = open("...") # newio might have same fd of io, because file
descriptor can be recycled.
... # => io is GC'ed and finalized here. Now newio has closed fd!
new
io.gets # => raised Errno::EBADF

testflushin_finalizer1 is test for finalizer of IO (see #3910) and
GC.disable breaks it's intent.
I'm sorry, but I'd like to revert r35631.

And I'll commit my patch. Please check it.

Bug #6228: [mingw] Errno::EBADF in ruby/testio.rb on ruby193
https://bugs.ruby-lang.org/issues/6228#change-26601

Author: jonforums (Jon Forums)
Status: Assigned
Priority: Normal
Assignee: luislavena (Luis Lavena)
Category: test
Target version: 1.9.3
ruby -v: ruby 1.9.3p172 (2012-03-30 revision 35179) [i386-mingw32]

On Win7 32bit built with MinGW 4.6.2 using the RubyInstaller build
recipes, I get the following test error:

sh-3.1$ ruby --version
ruby 1.9.3p172 (2012-03-30 revision 35179) [i386-mingw32]

sh-3.1$ make test-all TESTS='openssl fiddle psych zlib io json mkmf
pathname stringio erb
fileutils ruby/testio.rb ruby/testiom17n.rb ruby/testfile.rb'

7) Error:
testlines(TestIO):
Errno::EBADF: Bad file descriptor
c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test
io.rb:33:in
close'
c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test_io.rb:33:in
rescue in block in pipe'
c:/Users/Jon/Documents/RubyDev/ruby-git/test/ruby/test_io.rb:30:in
`block in pipe'

http://bugs.ruby-lang.org/

Also available in: Atom PDF