Bug #18731: Parallel test-all sometimes does not run at all some tests - Ruby - Ruby Issue Tracking System

Actions

Copy link

Bug #18731

closed

Parallel test-all sometimes does not run at all some tests

Added by Eregon (Benoit Daloze) about 3 years ago. Updated over 2 years ago.

Status:

Closed

Assignee:

Target version:

ruby -v:

Backport:

2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN

[ruby-core:<unknown>]

Description

In TruffleRuby I've noticed that some CRuby tests sometimes run or not, non-deterministically.
The TruffleRuby CI currently runs CRuby tests with -j4.

Today I investigated and I think I've found the reason for it.

One occurrence of this bug is for:

Both define a test class TestMethod.

The parallel runner distributes files across processes.

If the same worker process gets both files, then all tests in the second file are not run at all! (the file is still loaded).
The reason seems this line:
https://github.com/ruby/ruby/blob/8751c5c2672d1391c73d9dec590063d27bed7e4c/tool/lib/test/unit/parallel.rb#L128
(it's also possible to reproduce with just these 2 files, -j2 and not the full test suite, and having one worker start very slowly so the same worker gets both files)

Test::Unit::TestCase.test_suites is an array of classes, and so here because TestMethod was already defined by the first file loaded, the result of Test::Unit::TestCase.test_suites-suites is empty, and nothing gets run.

This doesn't seem an issue when not running in parallel/using -j.
But that's rare/impractical because test-all is very slow without -j.

For this specific case it seems we should rename the class in test_inlinecache.rb, but I suspect there are more name conflicts and the parallel runner should be fixed to handle this.

<rant>
FWIW, this test/unit code seems pretty messy, very long lines, duplication, hard to follow with the mix of minitest/test-unit, etc. I don't understand how this code is any better than mspec. It seems far more complex and hacky.
And that's probably why Ruby implementations except CRuby run MRI tests only when they have no choice, it's so annoying to work with, so many unnecessary subprocesses (very slow to run, annoying to debug), many CRuby-specific tests, so many unreliable tests, many metaprogramming-defined tests which are very brittle, lots of global state and coupling, etc.
</rant>

Actions

Copy link

Updated by Eregon (Benoit Daloze) about 3 years ago

Subject changed from Parallel test-all skips some tests to Parallel test-all sometimes does not run at all some tests

Actions

Copy link

#2 [ruby-core:108240]

Updated by Eregon (Benoit Daloze) about 3 years ago

Here is a quick search of test classes with the same name in different places:

class Test_Bignum
class TestDateParse
class TestEmojiBreaks
class TestFileUtils
class TestFuncall
class TestGraphemeBreaksFromFile
class TestMethod
class TestMkmf
class TestOptionParser

From ack '^class \w+ < Test::Unit::TestCase' test/mri/tests | grep -o -P 'class \w+' | sort > test.txt +

puts File.readlines("tests.txt", chomp: true).select { |t| tests.count(t) > 1 }.uniq

Actions

Copy link

#3 [ruby-core:108295]

Updated by headius (Charles Nutter) about 3 years ago

Ruby implementations except CRuby run MRI tests only when they have no choice

JRuby has run CRuby tests as part of our regular suite since the mid 2000s and will continue to do so as long as new tests and assertions continue to be added there. We don't really have a choice in the matter if we want to keep up with changes.

That said, I'd definitely love to see the tests cleaned up or migrated into ruby/spec.

Actions

Copy link

#4 [ruby-core:108376]

Updated by jeremyevans0 (Jeremy Evans) about 3 years ago

I submitted a pull request to make it so the same test class is not used in multiple files, for the classes identified as problems: https://github.com/ruby/ruby/pull/5839

Actions

Copy link

#5 [ruby-core:108379]

Updated by Eregon (Benoit Daloze) about 3 years ago

Thank you.

I think we should also fix the underlying issue of the parallel runner though.
This is otherwise bound to happen again in the future.
Some ideas:

Raise if the check Test::Unit::TestCase.test_suites-suites (which is the bug) returns an empty Array. It won't catch all issues though when e.g. 2nd test class is defined in the same file.
Use the :class TracePoint instead of the inherited hook which does not trigger for reopening a class.
Run whatever new methods were added via the method_added hook.

Finally there is the question of what could we backport, so that running tests on the 2.6/2.7/3.0/3.1 branch with -j doesn't run into this bug and actually skips some tests?

Actions

Copy link

Updated by nagachika (Tomoyuki Chikanaga) over 2 years ago

Status changed from Open to Closed

Applied in changeset git|5d4bfaccabe66d89460739ee682f1f78698c93a3.

merge revision(s) ab3cb29bd9bff9c16cfb9d19cc02026998282c12:

    Avoid defining the same test class in multiple files

    Should fix issues with parallel testing sometimes not running all
    tests.

    This should be viewed skipping whitespace changes.

    Fixes [Bug #18731]
    ---
     test/-ext-/bignum/test_big2str.rb        |  38 +-
     test/-ext-/bignum/test_bigzero.rb        |  20 +-
     test/-ext-/bignum/test_div.rb            |  38 +-
     test/-ext-/bignum/test_mul.rb            | 260 ++++++------
     test/-ext-/bignum/test_pack.rb           | 653 +++++++++++++++----------------
     test/-ext-/bignum/test_str2big.rb        |  52 ++-
     test/-ext-/funcall/test_funcall.rb       |  11 -
     test/-ext-/funcall/test_passing_block.rb |   5 +
     test/date/test_date_ractor.rb            |   2 +-
     test/fileutils/clobber.rb                |   5 +-
     test/fileutils/test_dryrun.rb            |   2 +-
     test/fileutils/test_nowrite.rb           |   2 +-
     test/fileutils/test_verbose.rb           |   2 +-
     test/fileutils/visibility_tests.rb       |   5 +-
     test/mkmf/base.rb                        | 225 ++++++-----
     test/mkmf/test_config.rb                 |  16 +-
     test/mkmf/test_constant.rb               |  56 ++-
     test/mkmf/test_convertible.rb            |  48 ++-
     test/mkmf/test_egrep_cpp.rb              |  14 +-
     test/mkmf/test_find_executable.rb        |  82 ++--
     test/mkmf/test_flags.rb                  |  92 +++--
     test/mkmf/test_framework.rb              |  70 ++--
     test/mkmf/test_have_func.rb              |  18 +-
     test/mkmf/test_have_library.rb           |  84 ++--
     test/mkmf/test_have_macro.rb             |  46 ++-
     test/mkmf/test_install.rb                |  38 +-
     test/mkmf/test_libs.rb                   | 156 ++++----
     test/mkmf/test_mkmf.rb                   |  14 +-
     test/mkmf/test_pkg_config.rb             |  98 +++--
     test/mkmf/test_signedness.rb             |  38 +-
     test/mkmf/test_sizeof.rb                 |  74 ++--
     test/optparse/test_acceptable.rb         |   2 +-
     test/optparse/test_autoconf.rb           |   4 +-
     test/optparse/test_bash_completion.rb    |   4 +-
     test/optparse/test_cclass.rb             |   2 +-
     test/optparse/test_did_you_mean.rb       |   2 +-
     test/optparse/test_getopts.rb            |   4 +-
     test/optparse/test_kwargs.rb             |   4 +-
     test/optparse/test_noarg.rb              |   6 +-
     test/optparse/test_optarg.rb             |   2 +-
     test/optparse/test_placearg.rb           |   2 +-
     test/optparse/test_reqarg.rb             |  10 +-
     test/optparse/test_summary.rb            |   2 +-
     test/optparse/test_zsh_completion.rb     |   4 +-
     test/ruby/enc/test_emoji_breaks.rb       | 207 +++++-----
     test/ruby/enc/test_grapheme_breaks.rb    | 115 +++---
     test/ruby/test_inlinecache.rb            |   2 +-
     47 files changed, 1280 insertions(+), 1356 deletions(-)
     delete mode 100644 test/-ext-/funcall/test_funcall.rb

Actions

Copy link

#7 [ruby-core:112621]

Updated by luke-gru (Luke Gruber) over 2 years ago

I've been working a bit on the parallel runner and I've noticed this problem too.

Raise if the check Test::Unit::TestCase.test_suites-suites (which is the bug) returns an empty Array. It won't catch all issues though when e.g. 2nd test class is defined in the same file.

Unfortunately we can't do this because some test classes are guarded by if statements, and the array can be empty in those cases too.

I'll take a look at the method_added solution, incrementing a number on the class and checking it matches the previous one.

Edit: PR here: https://github.com/ruby/ruby/pull/7395

Actions

Copy link

Also available in: Atom PDF

Like0

Like0Like0Like0Like0Like0Like0Like0

Project

General

Profile

Ruby

Tags

Custom queries

Bug #18731

Parallel test-all sometimes does not run at all some tests

Updated by Eregon (Benoit Daloze) about 3 years ago

Updated by Eregon (Benoit Daloze) about 3 years ago

Updated by headius (Charles Nutter) about 3 years ago

Updated by jeremyevans0 (Jeremy Evans) about 3 years ago

Updated by Eregon (Benoit Daloze) about 3 years ago

Updated by nagachika (Tomoyuki Chikanaga) over 2 years ago

Updated by luke-gru (Luke Gruber) over 2 years ago