Bug #9480
closed
Ruby on windows can get into an infinite loop when exiting
Description
The Ruby VM can get into an infinite loop when exiting. I have attached two ruby programs (main.rb and watcher.rb). Main.rb spawns watcher.rb, and the watcher waits for main to exit. However, sometimes main.rb will get into an infinite loop trying to shutdown the VM.
To reproduce:
I have tried this on ruby 1.9.3p484 from rubyinstaller.org, which is the latest 19 version to date.
C:\>ruby --version ruby 1.9.3p484 (2013-11-22) [i386-mingw32] C:\>gem list bigdecimal (1.1.0) io-console (0.3) json (1.5.5) minitest (2.5.1) rake (0.9.2.2) rdoc (3.9.5)
Execute main.rb in a loop, e.g. from powershell:
C:\>for ($i = 0; $i -lt 1000; $i++) { ruby main.rb Start-Sleep 1 }
Sometimes I see the parent ruby process get into an infinite loop when trying to shutdown the ruby VM. If I remove the use of IO.pipe
in main.rb, then the problem doesn't seem to happen. I created a full crash dump, and see the following, note thread b8c
is running away:
0:000> !runaway User Mode Time Thread Time 0:b8c 0 days 0:00:11.528 2:564 0 days 0:00:00.000 1:a30 0 days 0:00:00.000
Looking at that thread's call stack:
0:000> kb ChildEBP RetAddr Args to Child 0028fd98 76f00fa2 000000b0 00000000 0028fe38 ntdll!ZwReleaseMutant+0x15 0028fda8 62e6285e 000000b0 62e5e128 00355128 KERNELBASE!ReleaseMutex+0x10 WARNING: Stack unwind information not available. Following frames may be wrong. 0028fe38 62d425d9 0035524c 00000000 0028fec8 msvcrt_ruby191!rb_thread_terminate_all+0x15a 0028fec8 62d42710 00000000 00351390 003525b4 msvcrt_ruby191!ruby_cleanup+0x2c1 0028fef8 0040136f 02557998 00352c38 0028ff28 msvcrt_ruby191!ruby_run_node+0x50 0028ff28 004010b9 00000002 00352c38 003516b0 ruby+0x136f 0028ff68 00401284 00000001 00000000 00000000 ruby+0x10b9 0028ff88 753f33aa 7efde000 0028ffd4 773f9f72 ruby+0x1284 0028ff94 773f9f72 7efde000 77dccdc5 00000000 kernel32!BaseThreadInitThunk+0xe 0028ffd4 773f9f45 0040126c 7efde000 ffffffff ntdll!__RtlUserThreadStart+0x70 0028ffec 00000000 0040126c 7efde000 00000000 ntdll!_RtlUserThreadStart+0x1b
where handle b0
is free:
0:000> !handle b0 f Handle 000000b0 Type Mutant Attributes 0 GrantedAccess 0x1f0001: Delete,ReadControl,WriteDac,WriteOwner,Synch QueryState HandleCount 2 PointerCount 4 Name <none> Object specific information Mutex is Free Mutant Owner 0.0
Unfortunately, it's hard to tell from that stack trace if the handle was just freed, or was already free before ReleaseMutex
was called. However, a stack trace from a different run shows something more interesting:
0:000> ~kb ChildEBP RetAddr Args to Child 0028fc38 76d315e9 00000001 0028fc88 00000001 ntdll!NtWaitForMultipleObjects+0x15 0028fcd4 75b71a2c 0028fc88 0028fcfc 00000000 KERNELBASE!WaitForMultipleObjectsEx+0x100 0028fd1c 75b74220 00000001 7efde000 00000000 kernel32!WaitForMultipleObjectsExImplementation+0xe0 0028fd38 62e5f913 00000001 0028fe1c 00000000 kernel32!WaitForMultipleObjects+0x18 WARNING: Stack unwind information not available. Following frames may be wrong. 0028fda8 62e6288b 00000000 62e5e128 0062c668 msvcrt_ruby191!rb_mutex_unlock+0x152b 0028fe38 62d425d9 0062c78c 00000000 0028fec8 msvcrt_ruby191!rb_thread_terminate_all+0x187 0028fec8 62d42710 00000006 006223c8 0062235c msvcrt_ruby191!ruby_cleanup+0x2c1 0028fef8 0040136f 015ad9b8 00622b70 0028ff28 msvcrt_ruby191!ruby_run_node+0x50 0028ff28 004010b9 00000005 00622b70 006214e8 ruby+0x136f 0028ff68 00401284 00000001 00000000 00000000 ruby+0x10b9 0028ff88 75b733aa 7efde000 0028ffd4 774b9f72 ruby+0x1284 0028ff94 774b9f72 7efde000 74b8f904 00000000 kernel32!BaseThreadInitThunk+0xe 0028ffd4 774b9f45 0040126c 7efde000 ffffffff ntdll!__RtlUserThreadStart+0x70 0028ffec 00000000 0040126c 7efde000 00000000 ntdll!_RtlUserThreadStart+0x1b
where the call to WaitForMultipleObjects has the following parameters:
DWORD nCount = 00000001 HANDLE *lpHandles = 0028fe1c BOOL bWaitAll = 00000000 DWORD dwMilliseconds = ffffffff
where lpHandles
points to 1c8
and the timeout is INFINITE. Looking at the handle, it is again free:
0:000> !handle 1c8 f Handle 000001c8 Type Mutant Attributes 0 GrantedAccess 0x1f0001: Delete,ReadControl,WriteDac,WriteOwner,Synch QueryState HandleCount 2 PointerCount 4 Name <none> Object specific information Mutex is Free Mutant Owner 0.0
But what is interesting is that the timeout is INFINITE, which makes me think ruby is trying to unlock a mutex that is already unlocked. In looking at the call stack, I see rb_thread_terminate_all
which calls rb_threadptr_unlock_all_locking_mutexes
:
void rb_threadptr_unlock_all_locking_mutexes(rb_thread_t *th) { const char *err; rb_mutex_t *mutex; rb_mutex_t *mutexes = th->keeping_mutexes; while (mutexes) { mutex = mutexes; /* rb_warn("mutex #<%p> remains to be locked by terminated thread", mutexes); */ mutexes = mutex->next_mutex; err = rb_mutex_unlock_th(mutex, th); if (err) rb_bug("invalid keeping_mutexes: %s", err); } }
Which makes me wonder if the thread's list of keeping_mutexes
has a cycle?
Files
Updated by joshc (Josh C) over 11 years ago
- File main.rb main.rb added
- File watcher.rb watcher.rb added
Updated by usa (Usaku NAKAMURA) over 7 years ago
- Status changed from Open to Feedback
- Assignee changed from usa (Usaku NAKAMURA) to windows
- Backport deleted (
1.9.3: UNKNOWN, 2.0.0: UNKNOWN, 2.1: UNKNOWN)
This is not reproducible on my machine long time.
Is this still occurs?
Updated by MSP-Greg (Greg L) over 7 years ago
I checked with the following, and couldn't reproduce...
ruby 2.5.0dev (2017-10-23 trunk 60368) [x64-mingw32] RI2
ruby 2.4.2p198 (2017-09-14 revision 59899) [x64-mingw32] RI2
ruby 2.3.3p222 (2016-11-21 revision 56859) [x64-mingw32] RI
Updated by jeremyevans0 (Jeremy Evans) almost 6 years ago
- Status changed from Feedback to Closed