Feature #6647

Exceptions raised in threads should be logged

Added by Charles Nutter almost 4 years ago. Updated about 14 hours ago.

Status:
Assigned
Priority:
Normal
[ruby-core:45864]

Description

Many applications and users I have dealt with have run into bugs due to Ruby's behavior of quietly swallowing exceptions raised in threads. I believe this is a bug, and threads should always at least log exceptions that bubble all the way out and terminate them.

The implementation should be simple, but I'm not yet familiar enough with the MRI codebase to provide a patch. The exception should be logged in the same way top-level exceptions get logged, but perhaps with information about the thread that was terminated because of the exception.

Here is a monkey patch that simulates what I'm hoping to achieve with this bug:

class << Thread
  alias old_new new

  def new(*args, &block)
    old_new(*args) do |*bargs|
      begin
        block.call(*bargs)
      rescue Exception => e
        raise if Thread.abort_on_exception || Thread.current.abort_on_exception
        puts "Thread for block #{block.inspect} terminated with exception: #{e.message}"
        puts e.backtrace.map {|line| "  #{line}"}
      end
    end
  end
end

Thread.new { 1 / 0 }.join
puts "After thread"   

Output:

system ~/projects/jruby $ ruby thread_error.rb 
Thread for block #<Proc:0x000000010d008a80@thread_error.rb:17> terminated with exception: divided by 0
  thread_error.rb:17:in `/'
  thread_error.rb:17
  thread_error.rb:7:in `call'
  thread_error.rb:7:in `new'
  thread_error.rb:5:in `initialize'
  thread_error.rb:5:in `old_new'
  thread_error.rb:5:in `new'
  thread_error.rb:17
After thread

History

#1 [ruby-core:45865] Updated by Charles Nutter almost 4 years ago

FWIW, precedent: Java threads log their exceptions by default. I have never found the feature to be a bother, and it makes it nearly impossible to ignore fatally-flawed thread logic that spins up and fails lots of threads.

#2 [ruby-core:45866] Updated by Eero Saynatkari almost 4 years ago

headius (Charles Nutter) wrote:

Many applications and users I have dealt with have run into bugs due to Ruby's behavior of quietly swallowing exceptions raised in threads. I believe this is a bug, and threads should always at least log exceptions that bubble all the way out and terminate them.

I have had to set .abort_on_exception more times than I care to remember.

  rescue Exception => e
    raise if Thread.abort_on_exception || Thread.current.abort_on_exception
    puts "Thread for block #{block.inspect} terminated with exception: #{e.message}"
    puts e.backtrace.map {|line| "  #{line}"}

$stderr/warn, but this would improve the current situation significantly.

Can significant upgrade problems be expected if .abort_on_exception defaulted to true? This would seem to be the behaviour to suit most users.
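
The $stderr variant suggested in this comment could look roughly like the sketch below. It is only an illustration: logged_thread is a hypothetical helper name, not something Ruby provides, and it wraps Thread.new instead of patching it. Like the monkey patch in the description, it rescues the exception inside the thread, so a later Thread#join will not re-raise it.

```ruby
# Hypothetical helper (not part of Ruby): starts a thread whose fatal
# exception is written to $stderr instead of disappearing silently.
def logged_thread(*args, &block)
  Thread.new(*args) do |*bargs|
    begin
      block.call(*bargs)
    rescue Exception => e
      # Keep the abort_on_exception contract, as the original patch does.
      raise if Thread.abort_on_exception || Thread.current.abort_on_exception
      $stderr.puts "#{Thread.current.inspect} terminated with exception: #{e.message} (#{e.class})"
      $stderr.puts Array(e.backtrace).map { |line| "  #{line}" }
    end
  end
end
```

Calling logged_thread { 1 / 0 }.join leaves the process running and puts the report on standard error, so it does not mix with normal program output.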

#3 [ruby-core:45881] Updated by Alex Young almost 4 years ago

On 25/06/12 23:44, rue (Eero Saynatkari) wrote:

Issue #6647 has been updated by rue (Eero Saynatkari).

headius (Charles Nutter) wrote:

Many applications and users I have dealt with have run into bugs due to Ruby's behavior of quietly swallowing exceptions raised in threads. I believe this is a bug, and threads should always at least log exceptions that bubble all the way out and terminate them.

I have had to set .abort_on_exception more times than I care to remember.

Agreed. It's one of the things I check for in code review. Consider
this a +1 from me.

   rescue Exception =>  e
     raise if Thread.abort_on_exception || Thread.current.abort_on_exception
     puts "Thread for block #{block.inspect} terminated with exception: #{e.message}"
     puts e.backtrace.map {|line| "  #{line}"}

$stderr/warn, but this would improve the current situation significantly.

Can significant upgrade problems be expected if .abort_on_exception defaulted to true? This would seem to be the behaviour to suit most users.

That sounds a little extreme, although I wouldn't object. I'd be happy
with them not being silently swallowed.

--
Alex



#4 [ruby-core:45906] Updated by Eric Wong almost 4 years ago

Alex Young alex@blackkettle.org wrote:

On 25/06/12 23:44, rue (Eero Saynatkari) wrote:

Issue #6647 has been updated by rue (Eero Saynatkari).

$stderr/warn, but this would improve the current situation significantly.

Can significant upgrade problems be expected if .abort_on_exception defaulted to true? This would seem to be the behaviour to suit most users.

That sounds a little extreme, although I wouldn't object. I'd be
happy with them not being silently swallowed.

I think aborting the whole process is extreme (though, I usually do
it myself).

I would very much like to see this via $stderr/warn, though.

#5 [ruby-core:45913] Updated by Eero Saynatkari almost 4 years ago

normalperson (Eric Wong) wrote:

Alex Young alex@blackkettle.org wrote:

On 25/06/12 23:44, rue (Eero Saynatkari) wrote:

Issue #6647 has been updated by rue (Eero Saynatkari).

$stderr/warn, but this would improve the current situation significantly.

Can significant upgrade problems be expected if .abort_on_exception defaulted to true? This would seem to be the behaviour to suit most users.

That sounds a little extreme, although I wouldn't object. I'd be
happy with them not being silently swallowed.

I think aborting the whole process is extreme (though, I usually do
it myself).

You are probably correct. Reconsidering the issue, the benefit of raising
is probably not enough to offset that, thus leaving the $stderr/warn as the
better choice.

--
Eero

#6 [ruby-core:46444] Updated by Hiroshi Nakamura almost 4 years ago

  • Category set to core
  • Target version set to 2.0.0
  • Assignee set to Yukihiro Matsumoto
  • Status changed from Open to Assigned

Discussion at the CRuby dev meeting on 14th July.

  • We can understand the requirement. (We understood that the requirement is dumping something without raising when Thread#abort_on_exception = false)
  • Writing to STDERR could cause problems with existing applications, so we should be careful about it.
  • rb_warn() instead of puts would be good because we already use rb_warn elsewhere.

Matz, do you mind if we dump the thread error with rb_warn if Thread#abort_on_exception = false?

#7 [ruby-core:47466] Updated by Charles Nutter over 3 years ago

Any update on this?

#8 [ruby-core:47988] Updated by Charles Nutter over 3 years ago

Ping! This came up in JEG's talk at Aloha RubyConf as a recommendation (specifically, set abort_on_exception globally to ensure failed threads don't quietly disappear). Ruby should not allow threads to quietly fail.

#9 [ruby-core:47994] Updated by Motohiro KOSAKI over 3 years ago

I think "exception raised" callback is better way because an ideal output (both format and output device) depend on an application. It should be passed a raised exception.

#10 [ruby-core:48000] Updated by Alex Young over 3 years ago

On 15/10/12 03:24, kosaki (Motohiro KOSAKI) wrote:

Issue #6647 has been updated by kosaki (Motohiro KOSAKI).

I think "exception raised" callback is better way because an ideal output (both format and output device) depend on an application. It should be passed a raised exception.

This, along with a sensible default that displays something to stderr,
would be absolutely ideal from my point of view.

--
Alex
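
A minimal sketch of the "exception raised" callback with a default that prints to stderr, as discussed in the two comments above. ThreadCallback, its handler attribute, and spawn are hypothetical names chosen for illustration; nothing like this exists in Ruby itself.

```ruby
# Hypothetical API sketch: one process-wide callback that receives the
# exception and the thread it terminated.
module ThreadCallback
  # Assumed default: print a one-line report to $stderr.
  @handler = lambda do |e, thread|
    $stderr.puts "#{thread.inspect} terminated with exception: #{e.message} (#{e.class})"
  end

  class << self
    # Any object responding to call(exception, thread).
    attr_accessor :handler

    def spawn(*args, &block)
      Thread.new(*args) do |*bargs|
        begin
          block.call(*bargs)
        rescue Exception => e
          # The handler is looked up when the exception happens.
          handler.call(e, Thread.current)
        end
      end
    end
  end
end
```

An application that wants its own format or output device would assign ThreadCallback.handler once at startup, for example a lambda that forwards to its logger.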

#11 [ruby-core:48057] Updated by Charles Nutter over 3 years ago

I started prototyping a callback version and ran into some complexities I could not easily resolve:

  • How does abort_on_exception= interact with a callback system?
      • I tried implementing abort_on_exception=true to use a builtin callback that raises in Thread.main, but should abort_on_exception=true blow away a previously-set callback?
      • Similarly: should abort_on_exception=false reset to a do-nothing callback?
      • If neither of these, how do we combine callback and abort_on_exception behavior?
  • Seems like there should be a Thread.default_exception_handler you can set once for all future threads.

My concern is that bikeshedding a callback API -- as useful as it might be -- will cause further delays in the more important behavior of having threads report that they terminated due to an exception.

#13 [ruby-core:50046] Updated by Charles Nutter over 3 years ago

Checking in on this again. Can we at least agree it should happen for 2.0.0? Perhaps Matz should review this?

#14 Updated by Yusuke Endoh over 3 years ago

  • Tracker changed from Bug to Feature

headius (Charles Nutter) wrote:

Can we at least agree it should happen for 2.0.0?

No, I object. To me this looks like nothing but a feature request.
I cannot estimate how writing to stderr would affect existing applications.
There is a workaround.
I don't think that your concern is so significant that we should address it by changing the spec at this point.

Moving to the feature tracker and setting to next minor.

--
Yusuke Endoh mame@tsg.ne.jp

#15 [ruby-core:50055] Updated by Yusuke Endoh over 3 years ago

  • Target version changed from 2.0.0 to next minor

#16 [ruby-core:57437] Updated by Charles Nutter over 2 years ago

So, can we do this for 2.1? I have heard from many other users that really would like exceptions bubbling out of threads to be reported in some way. We have had numerous bug reports relating to code where threads disappear without a trace.

#17 [ruby-core:57468] Updated by Avdi Grimm over 2 years ago

This would indeed eliminate a huge amount of confusion for people getting
started with threads. Or for people with years of experience with threads, for
that matter...

--
Avdi Grimm
http://avdi.org

I only check email twice a day. to reach me sooner, go to
http://awayfind.com/avdi

#18 [ruby-core:57473] Updated by Koichi Sasada over 2 years ago

  • Target version changed from next minor to 2.1.0

#19 [ruby-core:57472] Updated by Koichi Sasada over 2 years ago

(2013/09/27 20:18), headius (Charles Nutter) wrote:

So, can we do this for 2.1? I have heard from many other users that really would like exceptions bubbling out of threads to be reported in some way. We have had numerous bug reports relating to code where threads disappear without a trace.

I'll ask matz.

Does JRuby log it already?
Any problem on your experience if you have?

--
// SASADA Koichi at atdot dot net

#20 [ruby-core:57493] Updated by Charles Nutter over 2 years ago

We do not currently log it, but the patch to do so is trivial.

https://gist.github.com/6764310

I'm running tests now to confirm it doesn't break anything.

#21 [ruby-core:57494] Updated by Charles Nutter over 2 years ago

Testing seems to indicate this is a pretty safe change, and it just makes the debug-logged exception output be logged any time abort_on_exception is not true.

#22 [ruby-core:57576] Updated by Akira Tanaka over 2 years ago

In yesterday's meeting,
https://bugs.ruby-lang.org/projects/ruby/wiki/DevelopersMeeting20131001Japan
we discussed this issue.

We found that printing a message when a thread exits with an exception has a problem.
The thread can be joined after it exits, and the exception may be handled by the joining thread.

% ruby -e '
t = Thread.new {
  raise "foo"
}
sleep 1 # the thread exits with an exception.
begin
  t.join
rescue
  p $! # something to do with the exception
end
'
#<RuntimeError: foo>

If a thread exiting with an exception outputs a message,
there is no way to disable it.

So, the message should be delayed until Ruby is certain that
the thread is not joined.
This means the message should be output when the thread is collected by GC.

#23 [ruby-core:57586] Updated by Charles Nutter over 2 years ago

akr (Akira Tanaka) wrote:

In yesterday's meeting,
https://bugs.ruby-lang.org/projects/ruby/wiki/DevelopersMeeting20131001Japan
we discussed this issue.

We found that printing a message when a thread exits with an exception has a problem.
The thread can be joined after it exits, and the exception may be handled by the joining thread.
...
If a thread exiting with an exception outputs a message,
there is no way to disable it.

So, the message should be delayed until Ruby is certain that
the thread is not joined.
This means the message should be output when the thread is collected by GC.

GC is a pretty fuzzy time boundary, but it's not terrible. Handling it will mean some finalization requirement for threads to say "hey, I just GCed this thread that died due to an unhandled exception". I feel like something more explicit is needed.

I guess I need to think about this. Some of the cases I want to fix -- where threads are spun up and left to do their own work -- this might be acceptable. But many users will keep references to worker threads they start in order to explicitly stop them on shutdown or other events. In those cases, the thread will be hard referenced and never GCed...and there will be no indication that the thread has died.

Perhaps this could be an on-by-default flag? It would require very little work to add something like:

class Thread
  def report_on_exception=(report)
    # ...
  end
end

...where the default is true. Going forward, this would be like having the debug output of a thread-killing exception always happen, but you could turn it off. That would address your concern about not being able to silence it.

The workflow would go like this:

If you are spinning up a thread to do background work and don't plan to check on it...

  • Spin up the thread
  • Store it in a list if you like
  • A message will be reported if the thread dies in an exceptional way

If you are spinning up a thread you plan to join on at some time in the future...

  • Spin up the thread
  • Set Thread#report_on_exception = false
  • Join at your leisure...no message will be reported

This at least allows users to say "I mean to pick up this thread's results later...don't report an error" without having hard-referenced threads die silently.

Is this a reasonable compromise?
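
The two workflows above can be sketched as follows. This assumes a Ruby that provides Thread#report_on_exception= (2.4 or later); the helper names start_background_worker and start_joinable are made up for the example.

```ruby
# Case 1: a background worker nobody plans to join. Reporting stays on,
# so a fatal exception is written to $stderr when the thread dies.
def start_background_worker(&work)
  Thread.new do
    Thread.current.report_on_exception = true
    work.call
  end
end

# Case 2: a thread whose result (or exception) is picked up later through
# Thread#join or Thread#value. Reporting is switched off for this thread.
def start_joinable(&work)
  Thread.new do
    Thread.current.report_on_exception = false
    work.call
  end
end
```

In both cases Thread#join and Thread#value still re-raise the exception in the caller; the flag only controls whether a message is printed when the thread terminates.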

#24 [ruby-core:57594] Updated by Koichi Sasada over 2 years ago

FYI:
On pthread, there is pthread_detach(), which declares that nobody will join this thread.
In other words, pthread_detach() is the same as Thread#report_on_exception=true.

#25 [ruby-core:57595] Updated by Koichi Sasada over 2 years ago

Sorry, it is not the same, but we can consider that.

BTW, I think true as the default is a good idea.

IMO, inter-thread communication via exceptions with Thread#join is a bad idea.

#26 [ruby-core:57617] Updated by Charles Nutter over 2 years ago

ko1 (Koichi Sasada) wrote:

Sorry, it is not the same, but we can consider that.

BTW, I think true as the default is a good idea.

So to summarize:

  • Exceptions will log when they bubble out of a thread, as with -d, unless Thread#report_on_exception == false
  • Thread#report_on_exception defaults to true

Can we do this for 2.1?

IMO, inter-thread communication via exceptions with Thread#join is a bad idea.

+1

I had originally wanted something similar to Java, where you can set an "unhandled exception handler" for any thread. That would cover all cases, and the default case would be to just report the error. I was unsuccessful in specifying it because I wasn't sure how it should interact with abort_on_exception=.

#27 [ruby-core:60265] Updated by Hiroshi SHIBATA over 2 years ago

  • Target version changed from 2.1.0 to current: 2.2.0

#28 [ruby-core:60375] Updated by Koichi Sasada over 2 years ago

Restart for 2.2.
Matz, do you have any idea?

#29 [ruby-core:69107] Updated by Lin Jen-Shin about 1 year ago

Not sure if a +1 would do anything, but I like the idea of
Thread#report_on_exception defaults to true.

For quick and one time scripts, it's tedious to write
Thread.current.abort_on_exception = true all the time,
and it shouldn't be set to true by default, either.
So at least make debugging easier by default is a good idea,
and who doesn't like to see warnings anyway? :P

I was referred here from the yahns mailing list:
http://yhbt.net/yahns-public/m/20150508170311.GA1260%40dcvr.yhbt.net.html
There, some worker threads died silently, which is puzzling
when I don't even know that an exception was raised.

#30 [ruby-core:69109] Updated by Eric Wong about 1 year ago

I have an actual patch which is only 2 lines, but there's some test
failures and MANY warnings I don't feel motivated to fix just yet
unless matz approves the feature:

http://80x24.org/spew/m/0a12f5c2abd2dfc2f055922a16d02019ee707397.txt

#31 [ruby-core:69110] Updated by Charles Nutter about 1 year ago

Eric Wong wrote:

I have an actual patch which is only 2 lines, but there's some test
failures and MANY warnings I don't feel motivated to fix just yet
unless matz approves the feature:

Hot diggity! I bet there's several of these that indicate bugs to be fixed. At the very least, they indicate exceptions that are being raised and not dealt with.

I think this is great evidence that this IS the right change to make.

#32 [ruby-core:74743] Updated by Benoit Daloze 2 months ago

Akira Tanaka wrote:

In yesterday's meeting,
https://bugs.ruby-lang.org/projects/ruby/wiki/DevelopersMeeting20131001Japan
we discussed this issue.

We found that printing a message when a thread exits with an exception has a problem.
The thread can be joined after it exits, and the exception may be handled by the joining thread.

% ruby -e '
t = Thread.new {
  raise "foo"
}
sleep 1 # the thread exits with an exception.
begin
  t.join
rescue
  p $! # something to do with the exception
end
'
#<RuntimeError: foo>

If a thread exiting with an exception outputs a message,
there is no way to disable it.

So, the message should be delayed until Ruby is certain that
the thread is not joined.
This means the message should be output when the thread is collected by GC.

I am strongly in favor of having something like Thread#report_on_exception, defaulting to true.
If a Thread can support known exceptions, it can rescue them explicitly.
If a Thread is used as some sort of isolation, it can disable #report_on_exception.

Thread#join is not enough, and because of the lack of reporting it's very easy to end in a deadlock with no other way to notice than dumping the thread stacks.
Imagine a simple actor framework where actors are just Threads with a Queue.
Actor1 waits for a message from Actor2, and Actor2 crashes because for instance it calls a method which does not exist.
Actor1 is blocked, and the user has absolutely no knowledge of what's happening, unless Thread.abort_on_exception is set before creating any thread.

So, in summary, Thread.abort_on_exception is not always appropriate,
Thread#join is not enough,
and silently swallowing exceptions can lead to deadlocks that the programmer has a hard time to notice.
Let's give a chance to users to see problems in their code with Thread!
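
The actor scenario can be reproduced with a small sketch. Everything here is illustrative: blocked_actor_demo is a hypothetical name, and the timeout on Thread#join is there only so the demonstration terminates instead of hanging like the real case would.

```ruby
require "thread" # Queue; already loaded on current Rubies

# Actor2 dies on a NoMethodError before sending anything, and Actor1
# keeps waiting on the queue with no hint of what went wrong.
def blocked_actor_demo(timeout = 0.2)
  inbox = Queue.new

  actor2 = Thread.new do
    me = Thread.current
    # Emulate the silent behaviour discussed here on Rubies that report.
    me.report_on_exception = false if me.respond_to?(:report_on_exception=)
    inbox << Object.new.no_such_method # raises NoMethodError, nothing is sent
  end

  actor1 = Thread.new { inbox.pop } # waits for a message that never arrives

  # join(timeout) returns nil when the thread is still running.
  still_waiting = actor1.join(timeout).nil?
  actor1.kill
  actor1.join

  error = begin
    actor2.join
    nil
  rescue NoMethodError => e
    e
  end

  { actor1_blocked: still_waiting, actor2_error: error }
end
```

The exception only becomes visible because the sketch joins actor2 explicitly at the end; a program that never joins the crashed thread sees nothing.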

#33 [ruby-core:75235] Updated by Shyouhei Urabe about 1 month ago

I remember this topic was looked at in the developer meeting this month. Matz was positive to have Thread#report_on_exception, but not default true. Sorry I don't remember the reason why he was not comfortable with defaulting this.

#34 [ruby-core:75250] Updated by Benoit Daloze about 1 month ago

Shyouhei Urabe wrote:

I remember this topic was looked at in the developer meeting this month. Matz was positive to have Thread#report_on_exception, but not default true.

Thanks for the reply.

Sorry I don't remember the reason why he was not comfortable with defaulting this.

That's unfortunate.
The main reason to have it by default is to give a chance when developing with Threads to notice the error in the sub-thread.
When trying out threads, the program will of course have very little chance to have Thread.abort_on_exception or Thread.report_on_exception in it,
particularly if the author is not extremely familiar with current Ruby thread exception pitfalls.
Debug mode (-d) would help but it's also a feature most people ignore or do not think to use (and it outputs much more).

That's why I think report_on_exception by default is the only reasonable choice for people not extremely familiar with Ruby threads and their exception handling.

I would guess the argument is about Thread#join or Thread#value.
But threads in Ruby are OS threads, and using them for just one computation (like a future) is inefficient as it incurs spawning an OS thread every time.
For this and many other reasons, joining threads is often done much later than when the exception happens (if ever, for instance with a dead/livelock it would not).
Here is another example: Communication with threads is done most often with Queue, yet if the main Thread pops from the queue to get results from the sub-thread (producer) and the sub-thread throws an exception, the program will deadlock with no clue given to the programmer.

So relying on #join or #value is very brittle and I believe causes much more harm than a few extra exceptions printed on stdout, which can be easily handled with Thread.current.report_on_exception = false.

#35 [ruby-core:75313] Updated by Charles Nutter 28 days ago

I remember this topic was looked at in the developer meeting this month. Matz was positive to have Thread#report_on_exception, but not default true. Sorry I don't remember the reason why he was not comfortable with defaulting this.

I would guess it's for all the badly-behaved code out there that's just letting threads die silently when an error is raised.

Having it default to off defeats the purpose of this feature request. My request is that threads report when an exception bubbles out before being handled.

I understand the concern about Thread#join. If we report by default then we might have exceptions reported as unhandled when a subsequent Thread#join would handle them. The idea about reporting on GC of the thread is interesting but it might mean we still never get any indication if the thread never GCs. I'd expect this is the typical case, since most people don't fire off threads without having a hard reference to them.

Re: Thread#join

Thread#join always re-raises its exception no matter whether abort_on_exception is set, so I don't see this as an issue. If you expect you'll be handling a Thread's last exception via #join, you would just specify that it should be quiet.

Thread#join works this way right now for abort_on_exception:

2.3.0 :001 > Thread.abort_on_exception = true
 => true 
2.3.0 :002 > go = false
 => false 
2.3.0 :003 > t = Thread.new { Thread.pass until go; raise }
 => #<Thread:0x007fc8920abe98@(irb):3 run> 
2.3.0 :004 > begin; go = true; sleep; rescue Exception; p $!; end
RuntimeError
 => RuntimeError 
2.3.0 :005 > t.join
RuntimeError: 
    from (irb):3:in `block in irb_binding'

I also made the larger suggestion of having this all wire in as a per-thread exception handler API.

There would be at least four default handlers. In all cases I think #join and #value should still produce the original exception.

  • Silent: Do not propagate the exception and do not report it. This is current behavior.
  • Report: Do not propagate the exception but report that it ended a thread. This is my requested behavior.
  • Reraise: Propagate the exception to the main thread but do not otherwise report it. This is abort_on_exception = true.
  • Custom: Provide your own proc/block to handle any exception raised from the thread.

Note also we could blunt some compatibility concerns by making these settable as Thread.new keyword args:

t = Thread.new(exception: :raise)
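
The four handlers listed above could be prototyped in plain Ruby along these lines. This is a sketch under assumptions, not the proposed API: ThreadPolicy.spawn and its io keyword are hypothetical, and Thread.new itself takes no such keyword argument. The exception is re-raised inside the thread after the policy runs so that #join and #value still produce it.

```ruby
# Hypothetical helper; the policy names mirror the four handlers above.
module ThreadPolicy
  # policy: :silent, :report, :reraise, or any object responding to
  # call(exception, thread).
  def self.spawn(policy = :report, io: $stderr, &block)
    Thread.new do
      me = Thread.current
      # Turn off the interpreter's own report, where it exists, so that
      # only the chosen policy decides what gets printed.
      me.report_on_exception = false if me.respond_to?(:report_on_exception=)
      begin
        block.call
      rescue Exception => e
        case policy
        when :silent  then nil
        when :report  then io.puts("#{me.inspect} terminated with exception: #{e.message} (#{e.class})")
        when :reraise then Thread.main.raise(e)
        else policy.call(e, me)
        end
        raise # keep the exception available to #join and #value
      end
    end
  end
end
```

Choosing the policy per call keeps the decision next to the code that starts the thread, which is the point of the keyword-argument idea.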

#36 [ruby-core:75536] Updated by Hiroshi SHIBATA 14 days ago

  • Description updated (diff)

#37 [ruby-core:75537] Updated by Nobuyoshi Nakada 14 days ago

  • Description updated (diff)

#38 [ruby-core:75538] Updated by Akira Tanaka 14 days ago

How about introducing Thread[.#]report_on_exception=(bool) ?

false by default for compatibility.
message should be printed at thread exit (not GC).

#39 [ruby-core:75549] Updated by Yukihiro Matsumoto 14 days ago

I vote for Thread#report_on_exception

Matz.

#40 [ruby-core:75557] Updated by Benoit Daloze 14 days ago

Akira Tanaka wrote:

How about introducing Thread[.#]report_on_exception=(bool) ?

false by default for compatibility.
message should be printed at thread exit (not GC).

Agreed, except that it should be true by default otherwise it misses the whole point of Ruby Threads dying silently by default!
Extra output is very rarely a big problem and can be easily solved:

Thread.current.report_on_exception = false if Thread.current.respond_to? :report_on_exception

Not so nice, so maybe we could use keyword args on Thread.new as Charles suggested.
This could conflict with existing arguments to Thread.new but it seems mostly harmless
(it would just yield an extra argument to the block which would just get ignored on older versions).

But more importantly, it seems to always be wrong to ignore exceptions like that,
and Threads which can allow such exceptions should handle them explicitly by adding a begin/rescue/end around the whole body so they can perform a useful recovery strategy.

Ruby Threads are not "futures", they are heavy concurrent OS threads and therefore I believe the current behavior (dying silently) is counter productive for everyone.

#41 [ruby-core:75563] Updated by Shyouhei Urabe 14 days ago

Benoit, I advise you to compromise on the default value. Once this feature gets implemented, its default value could be flipped later I think. We technically can't do what is proposed here now, which is definitely worse than a wrong default. no?

#42 [ruby-core:75564] Updated by Akira Tanaka 14 days ago

Benoit Daloze wrote:

Agreed, except that it should be true by default otherwise it misses the whole point of Ruby Threads dying silently by default!

I have sympathy with you.
But it seems that it is difficult to persuade matz now.

However the class method, Thread.report_on_exception = true, is also accepted by matz.
This can be used to change the default.

I think that it is possible to change the default value
if we can show it doesn't cause problems.
Thread.report_on_exception = true can provide a way to experiment.

#43 [ruby-core:75566] Updated by Nobuyoshi Nakada 14 days ago

Will messages be printed on all Threads if Thread.report_on_exception is true?
Or is it just the default value for each Thread when it starts?

And is the initial value of Thread#report_on_exception always false,
the value of the parent thread, or Thread.report_on_exception?

#44 [ruby-core:75567] Updated by Daniel Ferreira 14 days ago

Why do we need Thread#report_on_exception ?
As I see it Thread.report_on_exception should be enough.

Is there a use case scenario for Thread#report_on_exception ?

The instance method brings added complexity that maybe we don't need to worry about.

#45 [ruby-core:75569] Updated by Akira Tanaka 14 days ago

Nobuyoshi Nakada wrote:

Will messages be printed on all Threads if Thread.report_on_exception is true?
Or is it just the default value for each Thread when it starts?

And is the initial value of Thread#report_on_exception always false,
the value of the parent thread, or Thread.report_on_exception?

I feel that the initial value of Thread#report_on_exception should be
Thread.report_on_exception.

The message is printed if a thread exits with an exception and
Thread#report_on_exception of that thread is true.

This means that
we can enable the message for each thread when Thread.report_on_exception = false and
we can disable the message for each thread when Thread.report_on_exception = true.

#46 [ruby-core:75571] Updated by Nobuyoshi Nakada 14 days ago

Akira Tanaka wrote:

This means that
we can enable the message for each thread when Thread.report_on_exception = false and
we can disable the message for each thread when Thread.report_on_exception = true.

Do you mean setting it at startup time, before starting other threads?
Or should the change affect the threads started before it?

#47 [ruby-core:75573] Updated by Akira Tanaka 14 days ago

Nobuyoshi Nakada wrote:

Do you mean setting it at startup time, before starting other threads?
Or should the change affect the threads started before it?

I expect that an assignment to Thread.report_on_exception doesn't affect
threads started before the assignment.
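
The semantics described in the last few comments can be checked with a short sketch. It assumes a Ruby that already provides Thread.report_on_exception and Thread#report_on_exception (2.4 or later); report_flag_demo is a hypothetical name.

```ruby
require "thread" # Queue; already loaded on current Rubies

# Shows that a new thread takes its initial flag from the class-level
# setting, that a thread can override it, and that a later assignment to
# the class-level flag leaves already-started threads alone.
def report_flag_demo
  saved = Thread.report_on_exception
  gate = Queue.new

  Thread.report_on_exception = false
  # Started while the class-level flag is false, then parked on the queue.
  early = Thread.new { gate.pop; Thread.current.report_on_exception }

  Thread.report_on_exception = true
  late = Thread.new { Thread.current.report_on_exception }
  opted_out = Thread.new do
    Thread.current.report_on_exception = false
    Thread.current.report_on_exception
  end

  gate << :go
  { early: early.value, late: late.value, opted_out: opted_out.value }
ensure
  Thread.report_on_exception = saved # restore the process-wide default
end
```

The early thread reads its flag only after the class-level value has been changed, which is what makes it a test of "threads started before the assignment".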

#48 [ruby-core:75576] Updated by B Kelly 13 days ago

Daniel Ferreira wrote:

Why do we need Thread#report_on_exception ?
As I see it Thread.report_on_exception should be enough.

Is there a use case scenario for Thread#report_on_exception ?

Yes, because Thread#value passes the exception to the caller.

Granted ruby recently no longer supports $SAFE=4 sandboxing,
the following was a common sandbox idiom:

result = Thread.new {$SAFE=4; do_some_sandbox_work()}.value

In such cases, it is intended by design that any exception be
passed out of the sandbox to the caller. Automatic logging
at that boundary would not be desired.

By the way, I'm personally fine with logging by default, and
I support this feature as way to catch unexpected unhandled
thread exceptions.

But I would argue there indeed needs to be a way to disable
logging on a per-thread basis, where exceptions are intended
to be passed via Thread#value by design.

Regards,

Bill

#49 [ruby-core:75580] Updated by Nobuyoshi Nakada 13 days ago

Akira Tanaka wrote:

I expect that an assignment to Thread.report_on_exception doesn't affect
threads started before the assignment.

Ok.

https://github.com/ruby/ruby/compare/trunk...nobu:feature/6647-report_on_exception

#50 [ruby-core:75581] Updated by Charles Nutter 13 days ago

Benoit, I advise you to compromise on the default value. Once this feature gets implemented, its default value could be flipped later I think. We technically can't do what is proposed here now, which is definitely worse than a wrong default. no?

I am confused what it is we "can't" do right now. Do you mean: there's no reporting currently and we're seeking to add it...off by default but at least better than we have now?

I guess it's a little better because you could turn it on to see what's happening to all those badly-behaved threads, but it seems like having it off by default means most people never see any benefit.

I would try to keep convincing you all, but it seems the majority do not want it on by default. I'll just reiterate that I believe most of the purpose of this PR is lost if threads continue to die silently by default.

What about having verbose mode set Thread.report_on_exception = true? At least then people could pass a command-line flag to look for exception-killing threads without using -d and logging all exceptions.

  1. User runs app with threads...threads disappear without reporting.
  2. User runs app with -v (or some other flag to enable thread exception logging) to see why threads are dying.
  3. User fixes app to properly handle exceptions in those threads.
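
The workflow above could be approximated from application code, assuming Thread.report_on_exception= is available as proposed. The method name enable_thread_reports_if_verbose is hypothetical; the suggestion in this comment is for the interpreter itself to do this when -v is passed.

```ruby
# Hypothetical helper: turn on thread exception reports only in verbose mode.
# `verbose` defaults to $VERBOSE, which -v and -w set to true.
def enable_thread_reports_if_verbose(verbose = $VERBOSE)
  Thread.report_on_exception = true if verbose
  Thread.report_on_exception
end

setting = enable_thread_reports_if_verbose
puts "thread exception reporting: #{setting.inspect}"
```

Threads created after this call inherit the global value, so the helper would need to run before the application starts its workers.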

#51 [ruby-core:75582] Updated by Shyouhei Urabe 13 days ago

Charles Nutter wrote:

Benoit, I advise you to compromise on the default value. Once this feature gets implemented, its default value could be flipped later I think. We technically can't do what is proposed here now, which is definitely worse than a wrong default. no?

I am confused what it is we "can't" do right now. Do you mean: there's no reporting currently and we're seeking to add it...off by default but at least better than we have now?

Yes. Sorry for my bad English. I wanted to say this is "better than nothing". I understand people want it with default on, and I myself can live with that. However, sticking to "default true, or no such feature" tactics can make this issue hard to land. Matz has not been comfortable with this for years. We have just partially persuaded him about the functionality. I think this is a step forward.

#52 [ruby-core:75591] Updated by Akira Tanaka 13 days ago

I tried test-all with Thread.report_on_exception = true using nobu's patch.
https://patch-diff.githubusercontent.com/raw/ruby/ruby/pull/1357.patch

But most of the reports are about joined threads.
They are not actual problems.

Now, I think report-on-exit (not on GC) should be disabled by default.

I recommend that people who want reports enabled by default try a similar test
with their own applications.

I think reports enabled by default should be only (or mostly) for actual problems.
Report-on-GC may be worth considering, I think.

#53 [ruby-core:75598] Updated by Benoit Daloze 13 days ago

Akira Tanaka wrote:

I tried test-all with Thread.report_on_exception = true using nobu's patch.
https://patch-diff.githubusercontent.com/raw/ruby/ruby/pull/1357.patch

But most of the reports are about joined threads.
They are not actual problems.

I also ran with nobu's patch and here is the result on my machine:
https://gist.github.com/eregon/811c8db1b91fac627b444ddc4c0f2760

What about DRb and these "can't alloc thread" for instance? It does not seem to be harmless.

There are of course tests relying on the fact that exceptions propagate to #join or #value, but that is very coarse-grained.
For example, in TestQueue#test_deny_pushers it is unclear whether pop or push throws the exception in
Thread.new{ synq.pop; q << i }, and the test would be more accurate by rescuing just the right error around the push:

thr = Thread.new{
  synq.pop
  q << i rescue $!
}
...
assert_kind_of ClosedQueueError, thr.value
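
A self-contained sketch of the narrowed rescue suggested above, using plain Queue objects instead of the test's fixtures (the pushed value is illustrative). It uses an explicit begin/rescue for ClosedQueueError rather than the rescue modifier, so any other error would still terminate the thread.

```ruby
q    = Queue.new   # the queue under test
synq = Queue.new   # used only to hold the thread until q is closed

thr = Thread.new do
  synq.pop         # returns nil once synq is closed and empty
  begin
    q << 1         # pushing to a closed queue raises ClosedQueueError
  rescue ClosedQueueError => e
    e              # hand the expected error back as the thread's value
  end
end

q.close
synq.close

outcome = thr.value
puts outcome.class
```

Because only the push is wrapped, an unexpected failure in synq.pop would surface through thr.value as an exception instead of being mistaken for the expected error.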

Can we have a branch where the thread exceptions printed here are fixed?
Then I think we can potentially make a better judgment of whether it should be the default.
We might find a couple bugs along the way.

#54 [ruby-core:75600] Updated by Charles Nutter 12 days ago

I wanted to say this is "better than nothing". I understand people want it with default on, and I myself can live with that. However, sticking to "default true, or no such feature" tactics can make this issue hard to land.

I agree. Whether it is on or off by default, I want to see the feature land. I'd like to see a way to enable it at command line (via -v or similar).

Can we have a branch where the thread exceptions printed here are fixed?
Then I think we can potentially make a better judgment of whether it should be the default.
We might find a couple bugs along the way.

I agree. I look at this output and my first thought is not that we should silently swallow exceptions...it's that we have a lot of buggy code in stdlib that's letting threads die without handling the error properly.

Some examples:

Perhaps this isn't supposed to be a NoMethodError in the child?

 3) Failure:
TestDigest::TestDigestParen#test_race_mixed [/home/eregon/code/ruby/test/digest/test_digest.rb:257]:
assert_separately failed with error message
pid 29378 exit 0
| 
| Thread terminated with
| -:9:in `block (2 levels) in <main>': undefined method `new' for nil:NilClass (NoMethodError)

This one comes up hundreds of times for all the threads it spins up. It's explicitly testing that a certain exception is thrown for each thread. Being able to do Thread.new(report: false) would solve all cases like this.

Thread terminated with
/home/eregon/code/ruby/test/thread/test_queue.rb:420:in `push': queue closed (ClosedQueueError)
    from /home/eregon/code/ruby/test/thread/test_queue.rb:420:in `block (4 levels) in test_deny_pushers'

Similar case here would be solved by report: false. This only comes up because it's testing Thread.abort_on_exception=, so it's an explicit case where it would not want reporting on.

  5) Failure:
TestThread#test_abort_on_exception [/home/eregon/code/ruby/test/ruby/test_thread.rb:297]:

1. [2/2] Assertion for "stderr"
   | <[]> expected but was
   | <["", "Thread terminated with", "-:3:in `block in <main>': unhandled exception"]>.

In every case I looked at, the warning is telling us something we can improve. Either it's an unexpected error getting swallowed and possibly breaking the test, or it's explicitly testing exceptions bubbling out of Thread#value. These are not cases I would want to see in my code.

#55 [ruby-core:75737] Updated by Koichi Sasada 4 days ago

any conclusion?

Another idea:

(1) Thread#report_on_exception = true
Show the exception and backtrace immediately (already proposed)

(2) Thread#report_on_exception = false (default)
Show the exception and backtrace at GC time if the exception is not handled.

I agree (1) is preferable, but it breaks compatibility.
I agree with #54: if people set report_on_exception = false (or specify report: false) on all threads that are joined later, that will solve the compatibility problem. But I'm not sure we can accept this approach.

(2) is not so good because the report will be delayed.
But it is not so bad compared with showing nothing.

#56 [ruby-core:75747] Updated by Charles Nutter 3 days ago

(1) Thread#report_on_exception = true
Show the exception and backtrace immediately (already proposed)

(2) Thread#report_on_exception = false (default)
Show the exception and backtrace at GC time if the exception is not handled.

I'm kinda coming around on the GC mechanism, if it's not going to be possible to report synchronously AND have this feature on by default.

I assume GC logging would only happen for threads that are not joined and on which value has not been called. So basically, by default any threads that you start up and walk away from will report that they've been terminated by an exception.

We will still have a few cases out there where the GC logging is unwanted, so I think there needs to be a way to turn it completely off. We have three states for VERBOSE and DEBUG, perhaps that works here?

  • Thread#report_on_exception = true -- immediately report when the thread terminates due to an exception
  • Thread#report_on_exception = nil (default) -- report when the thread object GCs without anyone looking at the exception
  • Thread#report_on_exception = false -- no reporting of exception-triggered thread death for this thread

And I assume we would have Thread::report_on_exception{=} as well, right?

#57 [ruby-core:75749] Updated by Charles Nutter 3 days ago

Here's an implementation in JRuby: https://github.com/jruby/jruby/pull/3937

From the primary commit:

Implement Thread{.,#}report_on_exception[=].

This impl works as follows:

  • Default global value is nil.
  • nil = report when the thread is GCed if nobody has captured the exception (i.e. called #join or #value or abort_on_exception logic has fired).
  • true = report when the thread terminates, regardless of capture.
  • false = never report.
  • New threads inherit the current global setting.
  • If a thread name has been set from Ruby, it will be combined with JRuby's internal name for the report. If it has not been set, we will just report the internal thread name.
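
The three states listed above can be modeled as a small decision function. This is a hypothetical sketch of the described semantics, not JRuby's actual code; the method name report_exception? and the :terminated / :collected event names are assumptions made for illustration.

```ruby
# Hypothetical model of the nil/true/false semantics described above.
# setting:  the thread's report_on_exception value (true, nil, or false)
# event:    :terminated (thread just died) or :collected (thread object was GCed)
# captured: whether #join, #value, or abort_on_exception already saw the error
def report_exception?(setting, event, captured)
  case setting
  when true then event == :terminated               # report at death, regardless of capture
  when nil  then event == :collected && !captured   # report at GC only if nobody looked
  else false                                        # false: never report
  end
end

decisions = {
  on_at_death:       report_exception?(true,  :terminated, true),
  default_at_death:  report_exception?(nil,   :terminated, false),
  default_gc_unseen: report_exception?(nil,   :collected,  false),
  default_gc_joined: report_exception?(nil,   :collected,  true),
  off_gc_unseen:     report_exception?(false, :collected,  false)
}

decisions.each { |name, report| puts "#{name}: #{report}" }
```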

There are some open questions for this feature:

  • If join/value are interrupted, should they still set the capture bit? My impl does not; they must complete.
  • Is the VERBOSE-like nil/true/false clear enough or should we use symbols like :off, :gc, :on?

#58 [ruby-core:75771] Updated by Benoit Daloze about 14 hours ago

Koichi Sasada wrote:

any conclusion?

(2) Thread#report_on_exception = false (default)
Show the exception and backtrace at GC time if the exception is not handled.

I am against reporting at GC time because:
* it might be surprising for the user to have exceptions shown at random times, possibly long after the thread dies
* it still shows nothing if the thread is still referenced (very easy, storing it in a local variable, in a constant, leaking by the VM, etc) or there is no GC
* in case of a thread dying and causing an application deadlock (so GC is unlikely), there is still no clue to the programmer and user of what happened
(without -d which modifies the application behavior in a much larger way).
