Bug #9090

[BUG] object allocation during garbage collection phase

Added by Akira Tanaka almost 2 years ago. Updated almost 2 years ago.

[ruby-dev:47794]
Status:Closed
Priority:Normal
Assignee:Koichi Sasada
ruby -v:ruby 2.1.0dev (2013-11-06 trunk 43558) [x86_64-linux] Backport:1.9.3: UNKNOWN, 2.0.0: UNKNOWN

Description

昨日の夜、r43558 で make check を 300回動かしてみたところ、
なんかずいぶんと [BUG] が出ました。

% ./ruby -v
ruby 2.1.0dev (2013-11-06 trunk 43558) [x86_64-linux]
% fgrep '[BUG]' zd10/*/log
zd10/1/log:TestGem#test_looks_for_gemdeps_files_automatically_on_start = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/27/log:TestGem#test_gem_path_ordering = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/34/log:TestGemDependencyInstaller#test_install_version = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/45/log:TestGemSourceInstalled#test_spaceship = /home/akr/tst1/ruby/lib/rubygems/package.rb:419: [BUG] rb_gc_mark(): 0x002b969501c870 is T_ZOMBIE
zd10/51/log:TestWeakRef#test_weakref_finalize = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/52/log:TestGem#test_looks_for_gemdeps_files_automatically_on_start = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/56/log:TestGemInstaller#test_generate_bin_symlink_update_older = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/73/log:TestCSV::Features#test_gzip_writer_bug_fix = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/74/log:TestGemDependencyInstaller#test_install_minimal_deps = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/88/log:TestGemDependencyInstaller#test_install_dependency_old = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/99/log:TestGemCommandsInstallCommand#test_execute_prerelease_skipped_when_non_pre_available = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/102/log:TestGemUninstaller#test_uninstall_doesnt_prompt_when_other_gem_satifies_requirement = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/104/log:TestWeakRef#test_weakref_finalize = /home/akr/tst1/ruby/lib/webrick/utils.rb:165: [BUG] object allocation during garbage collection phase
zd10/120/log:TestGemSpecFetcher#test_available_specs_prerelease = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/127/log:TestGemDependencyInstaller#test_install_dependency_development_shallow = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/132/log:TestGemSourceSpecificFile#test_spec = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/157/log:TestGemUninstaller#test_remove_executables_user_format = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/172/log:TestGemSourceLocal#test_find_gem_prerelease = /home/akr/tst1/ruby/lib/webrick/utils.rb:165: [BUG] Segmentation fault
zd10/173/log:TestGem#test_looks_for_gemdeps_files_automatically_on_start = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/193/log:TestGemSourceLocal#test_find_gem_prerelease = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/195/log:TestGem#test_looks_for_gemdeps_files_automatically_on_start = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/196/log:TestGemUninstaller#test_remove_executables_user = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/197/log:TestGemUninstaller#test_uninstall_prompt_includes_dep_type = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/215/log:TestWeakRef#test_weakref_finalize = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/216/log:TestGem#test_looks_for_gemdeps_files_automatically_on_start = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/217/log:TestGemDependencyInstaller#test_install_dependency_development_deep = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/227/log:TestGemCommandsCleanupCommand#test_handle_options_dry_run = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/234/log:TestGemInstaller#test_install_extension_flat = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/239/log:TestGemUninstaller#test_uninstall_only_lists_unsatified_deps = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/246/log:TestGemDependencyInstaller#test_install_install_dir = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/255/log:TestWeakRef#test_weakref_finalize = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/270/log:TestGemDependencyInstaller#test_install_dependencies_satisfied = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/288/log:TestWeakRef#test_weakref_finalize = /home/akr/tst1/ruby/lib/rinda/tuplespace.rb:325: [BUG] object allocation during garbage collection phase
zd10/291/log:TestWeakRef#test_weakref_finalize = /home/akr/tst1/ruby/lib/webrick/utils.rb:165: [BUG] object allocation during garbage collection phase

一昨日の夜、r43539 でやったときは出なかったので、
r43539 から r43558 のあいだのどれかだとおもうのですが、
心当たりのある方はいませんか。

% svn log -r43539:43558

r43539 | tarui | 2013-11-05 20:15:21 +0900 (Tue, 05 Nov 2013) | 3 lines

* gc.c (is_live_object): A hidden object may be a live object.
   [Bug #9072]

r43540 | akr | 2013-11-05 21:25:15 +0900 (Tue, 05 Nov 2013) | 4 lines

  • test/ruby/test_thread.rb (test_thread_join_in_trap): Run the test in a different process.

r43541 | tenderlove | 2013-11-06 04:15:40 +0900 (Wed, 06 Nov 2013) | 4 lines

  • ext/psych/lib/psych/visitors/to_ruby.rb: process merge keys before reviving objects. Fixes GH psych #168
  • test/psych/test_merge_keys.rb: test for change

    https://github.com/tenderlove/psych/issues/168

    r43542 | svn | 2013-11-06 04:15:46 +0900 (Wed, 06 Nov 2013) | 1 line

* 2013-11-06

r43543 | ko1 | 2013-11-06 07:13:08 +0900 (Wed, 06 Nov 2013) | 3 lines

  • benchmark/gc/gcbench.rb: add some options to make quiet.

r43544 | ko1 | 2013-11-06 07:13:51 +0900 (Wed, 06 Nov 2013) | 3 lines

  • gc.c (Init_GC): add GC::OPTS to show options.

r43545 | eregon | 2013-11-06 07:27:27 +0900 (Wed, 06 Nov 2013) | 1 line

* benchmark/gc/gcbench.rb: fix typo of r43543

r43546 | ko1 | 2013-11-06 09:14:26 +0900 (Wed, 06 Nov 2013) | 3 lines

  • gc.c: define RGENGC_ESTIMATE_OLDSPACE == 0 if USE_RGENGC is 0.

r43547 | ko1 | 2013-11-06 09:17:14 +0900 (Wed, 06 Nov 2013) | 12 lines

  • benchmark/bm_vm1_gc_short_lived.rb: added. These GC benchmarks do not reflect practical applications. They are only for tuning.
  • benchmark/bm_vm1_gc_short_with_complex_long.rb: added.
  • benchmark/bm_vm1_gc_short_with_long.rb: added.
  • benchmark/bm_vm1_gc_short_with_symbol.rb: added.
  • benchmark/bm_vm1_gc_wb_ary.rb: added.
  • benchmark/bm_vm1_gc_wb_obj.rb: added.
  • benchmark/bm_vm_thread_queue.rb: added. Thie benchmark is added to know how fast C verion of thread.so.

r43548 | nobu | 2013-11-06 09:21:05 +0900 (Wed, 06 Nov 2013) | 1 line

* properties.

r43549 | glass | 2013-11-06 10:39:55 +0900 (Wed, 06 Nov 2013) | 1 line

* array.c (rb_ary_and): defer hash creation and some refactoring.

r43550 | glass | 2013-11-06 11:50:28 +0900 (Wed, 06 Nov 2013) | 2 lines

  • array.c (rb_ary_sample): use RARRAY_AREF() and RARRAY_PTR_USE() instead of RARRAY_PTR(). ------------------------------------------------------------------------ r43551 | naruse | 2013-11-06 15:23:27 +0900 (Wed, 06 Nov 2013) | 1 line

Suppress Warnings: instance variable @installed_by_version not initialized

r43552 | naruse | 2013-11-06 15:23:29 +0900 (Wed, 06 Nov 2013) | 1 line

Suppress warnings: File.exists? is a deprecated name

r43553 | naruse | 2013-11-06 15:25:18 +0900 (Wed, 06 Nov 2013) | 1 line

Suppress warnings: ambiguous first argument; put parentheses or even spaces

r43554 | duerst | 2013-11-06 16:32:56 +0900 (Wed, 06 Nov 2013) | 1 line

lib/open3.rb: tweaked grammar in comments

r43555 | tarui | 2013-11-06 19:32:40 +0900 (Wed, 06 Nov 2013) | 2 lines

* common.mk (help): add texts about gcbench.

r43556 | glass | 2013-11-06 21:09:31 +0900 (Wed, 06 Nov 2013) | 1 line

* array.c (rb_ary_reverse): use RARRAY_PTR_USE().

r43557 | glass | 2013-11-06 21:20:39 +0900 (Wed, 06 Nov 2013) | 2 lines

  • array.c (rb_ary_shift_m): use RARRAY_PTR_USE() without WB because

    there are not new relations.

    r43558 | tarui | 2013-11-06 21:36:34 +0900 (Wed, 06 Nov 2013) | 12 lines

    • gc.c (gc_before_sweep): Change algorithm of malloc_limit to conservative for closing to memory consumption of ruby 2.0.
    • gc.c (GC_MALLOC_LIMIT, GC_MALLOC_LIMIT_GROWTH_FACTOR): Adjust parameters for new algorithm.

    Example: make gcbench-rdoc on a pc
    time maxrss
    2.0.0p343 285.27 281853952
    trunk before patch 207.19 690405376
    trunk after patch 211.59 312500224


いちおう、失敗した場合のログを添付しておきます。

r43558-check-logs.tar.gz (2.35 MB) Akira Tanaka, 11/07/2013 09:54 AM

r43558-asm.txt.gz (4.67 MB) Akira Tanaka, 11/07/2013 12:08 PM

r43563-backtrace.txt Magnifier (161 KB) Akira Tanaka, 11/07/2013 11:22 PM

Associated revisions

Revision 43574
Added by Koichi Sasada almost 2 years ago

  • gc.c (vm_malloc_increase): check GVL before gc_rest_sweep(). vm_malloc_increase() can be called without GVL. However, gc_rest_sweep() assumes acquiring GVL. To avoid this problem, check GVL before gc_rest_sweep(). [Bug #9090] This workaround introduces possibility to set malloc_limit as wrong value (*1). However, this may be rare case. So I commit it. *1: Without rest_sweep() here, gc_rest_sweep() can decrease malloc_increase due to ruby_sized_xfree().

Revision 43574
Added by Koichi Sasada almost 2 years ago

  • gc.c (vm_malloc_increase): check GVL before gc_rest_sweep(). vm_malloc_increase() can be called without GVL. However, gc_rest_sweep() assumes acquiring GVL. To avoid this problem, check GVL before gc_rest_sweep(). [Bug #9090] This workaround introduces possibility to set malloc_limit as wrong value (*1). However, this may be rare case. So I commit it. *1: Without rest_sweep() here, gc_rest_sweep() can decrease malloc_increase due to ruby_sized_xfree().

History

#1 Updated by Masaya Tarui almost 2 years ago

C-level stack traceにシンボル情報がほとんど載ってないので、objdump -S ruby の結果も何処かに上げていただけるとありがたいです。

#2 Updated by Akira Tanaka almost 2 years ago

objdump -S ruby の結果をつけます。

あと、rubyci.org の結果をざっとみると、同じような失敗がいくつか見つかります。

r43540 http://fb32.rubyci.org/~chkbuild/ruby-trunk/log/20131105T170302Z.diff.html.gz
r43558 http://c5632.rubyci.org/~chkbuild/ruby-trunk/log/20131106T170301Z.diff.html.gz
r43558 http://rbci.lakewood.privs.net/ruby-trunk/log/20131106T201002Z.diff.html.gz
r43560 http://fb.rubyci.org/~chkbuild/ruby-trunk/log/20131106T220301Z.diff.html.gz
r43560 http://rbci.lakewood.privs.net/ruby-trunk/log/20131106T211002Z.diff.html.gz
r43560 http://ruby-mswin.cloudapp.net/vc10-x64/ruby-trunk/log/20131106T221741Z.log.html.gz

r43540 がいちばん早いようです。

% svn log -r43539:43540

r43539 | tarui | 2013-11-05 20:15:21 +0900 (Tue, 05 Nov 2013) | 3 lines

* gc.c (is_live_object): A hidden object may be a live object.
   [Bug #9072]

r43540 | akr | 2013-11-05 21:25:15 +0900 (Tue, 05 Nov 2013) | 4 lines

  • test/ruby/test_thread.rb (test_thread_join_in_trap): Run the test in a different process.

r43539 では起きなかったので残りは r43540 で、えーと、それは私の commit ですが、
これはさすがに違うような気がするので r43540 が fb32 で起こしたのは別件かなぁ。
そうすると r43558 ですかねぇ。

% svn log -r43558

r43558 | tarui | 2013-11-06 21:36:34 +0900 (Wed, 06 Nov 2013) | 12 lines

* gc.c (gc_before_sweep): Change algorithm of malloc_limit to
  conservative for closing to memory consumption of ruby 2.0.

* gc.c (GC_MALLOC_LIMIT, GC_MALLOC_LIMIT_GROWTH_FACTOR):
  Adjust parameters for new algorithm.

Example: make gcbench-rdoc on a pc
                          time       maxrss
2.0.0p343             285.27     281853952
trunk before patch    207.19     690405376
trunk after patch     211.59     312500224

#3 Updated by Masaya Tarui almost 2 years ago

r43558もGC起動のスレッシュルドを変えただけなので、
ここで発生となるとこれまで発生確率が低くて眠ってたバグを掘り起こしたって事に。
とりあえず見てみます。

#4 Updated by Akira Tanaka almost 2 years ago

mswin の CI だと backtrace がまともに出ていますね。

r43558 http://ruby-mswin.cloudapp.net/vc10-x64/ruby-trunk/log/20131106T191452Z.log.html.gz
r43560 http://ruby-mswin.cloudapp.net/vc10-x64/ruby-trunk/log/20131106T221741Z.log.html.gz
r43564 http://ruby-mswin.cloudapp.net/vc10-x64/ruby-trunk/log/20131107T101335Z.log.html.gz

あと、手元でも gdb の上で以下のように (くりかえしくりかえしくりかえし) 動かして
なんとか失敗させて、backtrace をとってみました。
(gdb) run -Ilib test/runner.rb -v test/rubygems test/rinda test/webrick test/test_weakref.rb test/csv

r43563-backtrace.txt として添付しておきます。

#5 Updated by Masaya Tarui almost 2 years ago

  • Assignee set to Koichi Sasada

=begin
手元で再現させた結果GVL無しでsweepしてる部分がみつかりました。
ここはささださんの部分らしいので担当を設定しておきます。
#これを修正しても他の部分でエラーならその時にreopenと言う事で。

(gdb) thread 1
[Switching to thread 1 (Thread 0x7ffff7fe3740 (LWP 17027))]
#0 0x0000000000435cfa in gc_page_sweep (sweep_page=0x953440, heap=0x6b7a00, objspace=) at ../gc.c:2674
2674 if ((bitset & 1) && BUILTIN_TYPE(p) != T_ZOMBIE) {
(gdb) bt
#0 0x0000000000435cfa in gc_page_sweep (sweep_page=0x953440, heap=0x6b7a00, objspace=) at ../gc.c:2674
#1 gc_heap_lazy_sweep (heap=0x6b7a00, objspace=0x6b79e0) at ../gc.c:2930
#2 gc_heap_rest_sweep (heap=0x6b7a00, objspace=0x6b79e0) at ../gc.c:2955
#3 gc_rest_sweep (objspace=0x6b79e0) at ../gc.c:2965
#4 vm_malloc_increase (objspace=objspace@entry=0x6b79e0, do_gc=, old_size=, new_size=) at ../gc.c:5451
#5 0x0000000000437ee2 in vm_malloc_increase (do_gc=, old_size=, new_size=, objspace=) at ../gc.c:5444
#6 vm_malloc_prepare (size=, objspace=0x6b79e0) at ../gc.c:5472
#7 vm_xmalloc (objspace=0x6b79e0, size=size@entry=32768) at ../gc.c:5503
=end

#6 Updated by Koichi Sasada almost 2 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r43574.
Akira, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • gc.c (vm_malloc_increase): check GVL before gc_rest_sweep(). vm_malloc_increase() can be called without GVL. However, gc_rest_sweep() assumes acquiring GVL. To avoid this problem, check GVL before gc_rest_sweep(). [Bug #9090] This workaround introduces possibility to set malloc_limit as wrong value (*1). However, this may be rare case. So I commit it. *1: Without rest_sweep() here, gc_rest_sweep() can decrease malloc_increase due to ruby_sized_xfree().

#7 Updated by Akira Tanaka almost 2 years ago

r43577 で 100回ほど make check してみましたがこの問題は発生しませんでした。
ありがとうございます。

Also available in: Atom PDF