Project

General

Profile

Actions

Bug #9115

open

Logger traps all exceptions; breaks Timeout

Added by cphoenix (Chris Phoenix) about 10 years ago. Updated 3 months ago.

Status:
Assigned
Priority:
Normal
Target version:
-
ruby -v:
ruby 2.0.0p247 (2013-06-27) [i386-mingw32]
Backport:
[ruby-core:58374]
Tags:

Description

Line 577-579 of logger.rb

  rescue Exception => ignored
    warn("log writing failed. #{ignored}")
  end

Thus, when the system times out in the middle of writing a log message, it warns "log writing failed. execution expired" and just keeps right on running.

This is true in 1.9.3 as well. I haven't looked at older versions.

Pardon me while I go grep "rescue Exception" in the entire Ruby codebase, and see whether I can reliably use Timeout at all...

OK, you might check out C:\Ruby200\lib\ruby\gems\2.0.0\gems\activerecord-3.2.13\lib\active_record\railties\databases.rake

All the other "rescue Exception" seem to re-raise it, except maybe C:\Ruby200\lib\ruby\2.0.0\xmlrpc\server.rb and C:\Ruby200\lib\ruby\gems\2.0.0\gems\activesupport-3.2.13\lib\active_support\callbacks.rb

Updated by cphoenix (Chris Phoenix) about 10 years ago

...I guess my search was broken. I did another and found a whole lot of "rescue exception" that don't seem to always re-raise. But the logger is a big one because it's supposed to be usable everywhere, and someone who logs a lot will run into this bug.

Updated by aaron@serendipity.cx (Aaron Stone) over 9 years ago

The behavior has changed in Ruby 2.1, and the default rescue no longer rescues Timeout unless Timeout was given an explicit exception class to raise. Could you try Ruby 2.1 and see if it behaves according to your expectation here?

Updated by wpaulson (Bill Paulson) about 9 years ago

The source code for 2.1.2 doesn't seem to have changed: the LogDevice::write method still does a rescue Exception and ignores the result.
Our application ran into this problem when one thread raises an exception in another thread that's doing a LogDevice::write(). The exception is supposed to be handled at a higher level, so write() rescuing it and ignoring it causes our application to hang.
A write() method without this problem looks like

  def write(message)
    @mutex.synchronize do
      if @shift_age and @dev.respond_to?(:stat)
        begin
          check_shift_log
        rescue IOError
          warn("log shifting failed. #{$!}")
        end
      end
      begin
        @dev.write(message)
      rescue IOError
        warn("log writing failed. #{$!}")
      end
    end
  end

Updated by astratto (Stefano Tortarolo) over 8 years ago

I confirm that this bug is reproducible up to 2.2 and it's pretty easy to trigger.

require 'timeout'
require 'logger'

class MyExc < RuntimeError; end

def long_call(logger)
  Timeout.timeout(3, MyExc) do
    a = 1
    loop do
      a += 1
      logger.warn "Test #{a}"
    end
  end
end

Aaron Stone is right though that using 2.1+ it would work if you don't provide a custom exception or if your exception inherits from TimeoutError.

Beware that it must inherit from Object::TimeoutError and not Timeout::Error.

Having said that, it'd be nice to have a proper fix at logger.rb level.

@Bill, I don't think your code would fix it though, because the underlying device might throw a different exception.

Updated by astratto (Stefano Tortarolo) over 8 years ago

Correction: providing an exception to Timeout#timeout would break it also in trunk.

It seems that by design inner blocks are able to rescue the exception provided to that method.

Updated by jeremyevans0 (Jeremy Evans) over 4 years ago

  • Status changed from Open to Assigned
  • Assignee set to sonots (Naotoshi Seo)
  • Backport deleted (1.9.3: UNKNOWN, 2.0.0: UNKNOWN)

It makes sense for Logger to handle all exceptions by default, as you don't want Logger breaking otherwise working code. So the default behavior is expected and not a bug.

I think the best way to handle this is to allow the user to specify exception classes they would like reraised instead of swallowed. I've added a pull request that implements this: https://github.com/ruby/logger/pull/37

Updated by ioquatix (Samuel Williams) over 4 years ago

It makes sense for Logger to handle all exceptions by default

I'm not sure I agree with this. There must be some well-defined set of exceptions that could be raised by the code it's executing that should be handled/ignored, e.g. IOError, EPERM, ENOSPC, etc. I think it's going to be much better to solve this problem by catching only specific errors that related to the operations that might fail.

Updated by naruse (Yui NARUSE) over 4 years ago

See the code again before discuss.

    def write(message)
      begin
        synchronize do
          if @shift_age and @dev.respond_to?(:stat)
            begin
              check_shift_log
            rescue
              warn("log shifting failed. #{$!}")
            end
          end
          begin
            @dev.write(message)
          rescue
            warn("log writing failed. #{$!}")
          end
        end
      rescue Exception => ignored
        warn("log writing failed. #{ignored}")
      end
    end

https://github.com/ruby/ruby/blob/5a384e2c08704dc7af9d8d3bdfc475eb8c0723aa/lib/logger/log_device.rb#L29-L48

With the code, the intention of rescue Exception is clear. It wants to rescue exceptions which is raised by synchronize.

And note that synchronize is Monitor's and it only raises ThreadError.
https://github.com/ruby/ruby/blob/master/lib/monitor.rb
I don't understand why it loosely rescues all exceptions.

Updated by naruse (Yui NARUSE) over 4 years ago

There're also rescue-clauses for check_shift_log and @dev.write(message).
They may raise various exceptions and logger.rb never knows that.

Explicitly passing those exceptions are general way, but are there any exceptions other than the family of Timeout::Error?

Updated by Eregon (Benoit Daloze) 3 months ago

"defensive" code like that is just broken, no one should rescue Exception.
Let's remove that outer rescue?
If synchronize raises there is a serious bug worth fixing and not ignoring.

Similarly I think the inner rescues should either be removed or only rescue IOError if really needed.

Updated by mame (Yusuke Endoh) 3 months ago

Discussed at the dev meeting.

  • It is clear that Logger has the intention not to interfere with the execution of the application on a log write failure.
  • On the other hand, it is also clear that Logger is not considering about asynchronous exceptions.

As a better-than-nothing mitigation, it is proposed to enclose the entire Logger::LogDevice#write in Thread.handle_interrupt(:never) { ... }.
It could be a problem if the method takes a very long time because Timeout.timeout cannot interrupt the execution, but such a case will be rare (hopefully).

What exception Timeout.timeout raises is user-configurable, so it would be difficult to reraise only a specific exception class.

It is ultimately up to the maintainer, @sonots (Naotoshi Seo), to decide.

Updated by Eregon (Benoit Daloze) 3 months ago

mame (Yusuke Endoh) wrote in #note-11:

As a better-than-nothing mitigation, it is proposed to enclose the entire Logger::LogDevice#write in Thread.handle_interrupt(:never) { ... }.
It could be a problem if the method takes a very long time because Timeout.timeout cannot interrupt the execution, but such a case will be rare (hopefully).

This sounds problematic, especially since that is doing IO, possibly even network IO (e.g. NFS, or logging over some API).

I believe all libraries should only catch specific errors they want to rescue and if that's hard to determine than StandardError at most, never Exception.
Rescuing Exception without re-raise is always a bug (e.g. NoMemoryError/SystemStackError can be silent and that can cause pretty serious inconsistencies and indirectly what looks like memory corruption).

Updated by shyouhei (Shyouhei Urabe) 3 months ago

Eregon (Benoit Daloze) wrote in #note-12:

mame (Yusuke Endoh) wrote in #note-11:

As a better-than-nothing mitigation, it is proposed to enclose the entire Logger::LogDevice#write in Thread.handle_interrupt(:never) { ... }.
It could be a problem if the method takes a very long time because Timeout.timeout cannot interrupt the execution, but such a case will be rare (hopefully).

This sounds problematic, especially since that is doing IO, possibly even network IO (e.g. NFS, or logging over some API).

Right. It's never an ultimate fix. However considering the way a log destination is designed generally, I think it's rare for that blocking IO to block indefinitely. Sure, with handle_interrupt a timeout can delay for seconds. That's better than never, and "never" is the way it is now.

I believe all libraries should only catch specific errors they want to rescue and if that's hard to determine than StandardError at most, never Exception.
Rescuing Exception without re-raise is always a bug (e.g. NoMemoryError/SystemStackError can be silent and that can cause pretty serious inconsistencies and indirectly what looks like memory corruption).

The Logger author seems hesitating to interface with its callers at any cost (understandable). It is unfortunate that we currently have no idiomatic ways to achieve their goals.

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0