Project

General

Profile

Actions

Bug #11744

closed

Open files being GC'ed while still in use

Added by chancancode (Godfrey Chan) about 9 years ago. Updated about 9 years ago.

Status:
Closed
Assignee:
-
Target version:
-
[ruby-core:71697]

Description

Hello!

We recently added a feature on Rails that uses the listen gem under-the-hood. Since we enabled those tests, we have been seeing random failures on Travis CI like these:

  1) Error:
KeyGeneratorTest#test_Generating_a_key_of_an_alternative_length:
Errno::EBADF: Bad file descriptor @ fptr_finalize - /home/travis/build/rails/rails/activesupport/lib/active_support/key_generator.rb
    /home/travis/build/rails/rails/activesupport/test/key_generator_test.rb:16:in `setup'

What's happening here is that the setup code for this test is referencing an constant previously marked for autoload, and something went wrong during autoload (inside Ruby).

(By the way, I was referring to Ruby's autoload feature which is not specific to Rails, although we did enhance it to infer the filename.)

Because there are so many things going on in the test suite, I tried to narrow down the problem further by removing most of the work we do.

Eventually, I arrived at this reproduction script.

I disabled the rest of the test suite so that this file is the only test being ran, in three different configurations:

(1) LISTEN=1 LISTEN_GC_FIX=1 bundle exec rake test
(2) LISTEN=1 bundle exec rake test
(3) LISTEN=0 bundle exec rake test

In each of the tests, we tried to do a very simple File.open(..., 'r') { |f| ... } (reading in the first 10 bytes to make sure we are dealing with the correct file) up to 5000 times.

In (1), we set up 20 listeners, do no work, and properly tear down each of them, and disabling the GC before opening the files.

In (2), we also setup the listeners but do not disable the GC.

In (3), we do not set up the listeners nor disable the GC.

Scenario (2) reliably fails on Travis CI, and the rest reliably passes: https://travis-ci.org/rails/rails/builds/93402196

/home/travis/build/rails/rails/activesupport/test/file_evented_update_checker_test.rb:31:in `close': Bad file descriptor @ fptr_finalize - vendor/bundle/ruby/2.2.0/cache/bundler/git/mail-c03100656df0e86c13a2d6103ac408da7a4e728c/objects/34/fdd44ee4de34a2f05b0088562e38797db604a8 (Errno::EBADF)
	from /home/travis/build/rails/rails/activesupport/test/file_evented_update_checker_test.rb:31:in `open'
	from /home/travis/build/rails/rails/activesupport/test/file_evented_update_checker_test.rb:31:in `block (2 levels) in <top (required)>'
	from /home/travis/build/rails/rails/activesupport/test/file_evented_update_checker_test.rb:26:in `glob'
	from /home/travis/build/rails/rails/activesupport/test/file_evented_update_checker_test.rb:26:in `block in <top (required)>'
	from /home/travis/build/rails/rails/activesupport/test/file_evented_update_checker_test.rb:23:in `chdir'
	from /home/travis/build/rails/rails/activesupport/test/file_evented_update_checker_test.rb:23:in `<top (required)>'
	from /home/travis/.rvm/rubies/ruby-2.2.3/lib/ruby/2.2.0/rake/rake_test_loader.rb:10:in `require'
	from /home/travis/.rvm/rubies/ruby-2.2.3/lib/ruby/2.2.0/rake/rake_test_loader.rb:10:in `block (2 levels) in <main>'
	from /home/travis/.rvm/rubies/ruby-2.2.3/lib/ruby/2.2.0/rake/rake_test_loader.rb:9:in `each'
	from /home/travis/.rvm/rubies/ruby-2.2.3/lib/ruby/2.2.0/rake/rake_test_loader.rb:9:in `block in <main>'
	from /home/travis/.rvm/rubies/ruby-2.2.3/lib/ruby/2.2.0/rake/rake_test_loader.rb:4:in `select'
	from /home/travis/.rvm/rubies/ruby-2.2.3/lib/ruby/2.2.0/rake/rake_test_loader.rb:4:in `<main>'

Note that we are using the "block form" File.open in the test here, and the error is happening during the implicit close call inside Ruby. Also note that we are loading the listen gem in all three cases, so merely having it loaded does not trigger the issue.

I was not able to reproduce this locally on a Mac, so this is probably Linux specific. From what I can tell, listen gem uses rb-inotify (which in turns uses ffi) on Linux so that might be related too.


My hypothesis is that, so interaction between listen/rb-inotify/ffi during the listener setup/teardown phase is putting the Ruby VM/process into a bad state. In this state, somehow the VM is confused about what is being used. When the GC runs, it somehow thinks that the file we are currently reading (the f inside the File.open(...) { |f| ... } call) is no longer being used, causing it to be GC-ed and auto-closed. When the block finishes running (it did indeed open the file correctly, as it was able to read the first 10 bytes correctly), the finalizer code tries to close FD again, causing the OS to return a EBADF error.

While it is certainly possible that the listen/rb-inotify/ffi gems are doing something wrong, it seems quite strange that they could put the Ruby process in such a state, so I figured it is worth having Ruby core people look into this.

I tried to reproduce this outside of Rails but so far I cannot make it work, I'll keep trying. (Feel free to fork it.)

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0