Project

General

Profile

Actions

Feature #19008

closed

Introduce coverage support for `eval`.

Added by ioquatix (Samuel Williams) 3 months ago. Updated 2 months ago.

Status:
Closed
Priority:
Normal
Target version:
-
[ruby-core:109937]

Description

I'd like to introduce coverage support for eval. I mostly only care about the case where an explicit path is given, and I'd even be okay to only handle the case where the line number is the default (0).

https://github.com/ruby/ruby/pull/6396

This is an incredibly useful feature for computing coverage of ERB templates and other similar things.

Updated by mame (Yusuke Endoh) 3 months ago

Basically I am okay to this proposal. When I first created coverage.so, I tried to support eval. But @ko1 (Koichi Sasada) objected to it because the source code for eval is not retained and there is no clear need for it. But I just talked to @ko1 (Koichi Sasada), and he seems not to be negative about this change now. I think it is great if it works great with ERB (I don't confirm it myself, though).

But note that this change would be an incompatible change.

$ cat t.rb
require "coverage"

Coverage.start

load "tt.rb"

p Coverage.result

$ cat tt.rb
eval(<<END)
p 1
END

$ ./local/bin/ruby t.rb
1
{"tt.rb"=>[1, nil, nil], "(eval)"=>[1]}

For real-world programs, Coverage.result would include many "eval"=>....

Even worse, it may refer to a wrong file:

$ cat tt.rb
eval(<<END, nil, "notexist.rb")
p 1
END

$ ./local/bin/ruby t.rb
1
{"tt.rb"=>[1, nil, nil], "notexist.rb"=>[1]}

It could corrupt the result of an existing file:

$ cat tt.rb
# comment line

eval(<<END, nil, "tt.rb")
x = 1
END

$ ./local/bin/ruby t.rb
{"tt.rb"=>[1, nil, 1, nil, nil]}

It wrongly records Line 1. Note that there is only a comment in Line 1 of tt.rb.

I have no idea how it should behave when the fourth argument of eval is supplied:

$ cat tt.rb
eval(<<END, nil, "notexist.rb", 100)
x = 1
END

$ ./local/bin/ruby t.rb
{"tt.rb"=>[1, nil, nil], "notexist.rb"=>[nil]}

I'd like to hear from the users of coverage.so, like the authors of coverage tools.

Updated by jeremyevans0 (Jeremy Evans) 3 months ago

In terms of ERB template coverage, I recently added support to Tilt for it. Tilt is the library used by Sinatra, Roda, Hanami, dry-view and most non-Rails web frameworks to render templates. Since eval doesn't support coverage, the approach taken is to write a file with the compiled ERB code, and then use load (https://github.com/rtomayko/tilt/commit/384553f256cb5b37ef3f63979432edd55e4264eb).

I'm not against eval supporting coverage, but as @mame (Yusuke Endoh) indicates, there are numerous issues. Therefore, I think coverage support for eval should be opt-in, ideally on a per-eval basis (e.g. coverage: true keyword argument).

Updated by ioquatix (Samuel Williams) 3 months ago

I'm against introducing a keyword argument to eval. coverage implementation should not leak into unrelated code.

Updated by ioquatix (Samuel Williams) 3 months ago

@mame (Yusuke Endoh) thanks for all your feedback and edge cases. I have this working now (proof of concept).

(1) How to deal with (eval)? We can disable coverage for eval called without path (current implementation). This feels like the correct behaviour for "anonymous eval".
(2) eval(..., "notexist.rb") I am not sure this is a problem. In any case, someone can load(path) and then delete it, so this is essentially already a possible state.
(3) eval(..., "some_other.rb") I think this is the expected behaviour, and is also possible if someone deletes and recreates the file.
(4) If you specify a line offset, that's added to the instruction sequence line offset, so it triggers coverage for that line. Of course, if the line doesn't exist in the actual code, we can either add it (extend lines array) or ignore it (current implementation).

I'm the author of a coverage tool (covered) and these problems already exist, because people are already writing coverage tools to intercept eval and compute coverage. I found some PRs in Rails which fix the above issues in the past, so people are aware of the issues w.r.t. eval and incorrect path/file/line offsets, and the need for eval in those contexts to match the same number of lines in the source file, or coverage would be incorrect.

e.g. Previously in Rails, I saw something like eval("def foo\nbar\nend") but when I looked at it recently, it was expanded correctly, so it doesn't cause coverage bugs. If you want, I'll go looking for the PR, but it was a long time ago, so if you can trust me that this exists, it will save me some time and effort (to go searching through Rails code).

Updated by mame (Yusuke Endoh) 3 months ago

We discussed this proposal at the dev meeting, and @matz (Yukihiro Matsumoto) decided to give it a try.

@ioquatix (Samuel Williams) has been merged the PR, but I noticed two additional problems:

1. eval may discard existing file coverage data

require "coverage"

Coverage.start

load "tt.rb"

# the content of tt.rb
#
#   1: 100.times do
#   2:   1
#   3: end

# The coverage of tt.rb is gathered correctly
p Coverage.result(reset: false) #=> {"tt.rb"=>[1, 100, nil]}

# Call eval with a file "tt.rb"
eval(<<SRC, nil, "tt.rb", 1)
p 1
p 2
p 3
SRC

# The gathered coverage is discarded, and overwritten with a new coverage data for eval 
p Coverage.result #=> {"tt.rb"=>[1, 1, 1]}

I am unsure what behavior is desirable here, but at least I think it should not discard the old coverage for a file. I think a file coverage is more important than an eval coverage.

I propose three possible solutions:

    1. Do not record an eval coverage when a file coverage already exists. (And if a file is required after eval, overwrite an eval coverage with a file coverage)
    1. Accumulate the two data (It is inaccurate, but it is maybe better than current)
    1. Record the data for eval as another key (This will break some coverage tools because it changes the type of keys)
# Solution 1
p Coverage.result #=> {"tt.rb"=>[1, 100, nil]}

# Solution 2
p Coverage.result #=> {"tt.rb"=>[2, 101, 1]}

# Solution 3
p Coverage.result #=> {"tt.rb"=>[1, 100, nil], [:eval, "tt.rb"]=>[1, 1, 1]}

Personally, I like Solution 1.

2. branch coverage with ERB does not work well

Branch coverage includes column information, but this will not work well with ERB, because ERB generate code to keep row numbers, but it does not keep column numbers (it is almost impossible).

I think this would be difficult to resolve. So I guess the only thing we can do is to document it.

@ioquatix (Samuel Williams) What do you think? Can you create an additional patch?

Updated by mame (Yusuke Endoh) 2 months ago

@ioquatix (Samuel Williams) Also I'd like to ask you:

  • Please add a document to ext/coverage/coverage.c about the new spec for eval
  • Please write this spec change to NEWS

Updated by jeremyevans0 (Jeremy Evans) 2 months ago

Considering the issues, could we make eval coverage optional, maybe by a keyword to Coverage.start? I'd prefer it be opt-in, but at the minimum I think we need an option to disable it if it is enabled by default.

mame (Yusuke Endoh) wrote in #note-5:

I propose three possible solutions:

    1. Do not record an eval coverage when a file coverage already exists. (And if a file is required after eval, overwrite an eval coverage with a file coverage)
    1. Accumulate the two data (It is inaccurate, but it is maybe better than current)
    1. Record the data for eval as another key (This will break some coverage tools because it changes the type of keys)

I would prefer 3. It should always be possible to inspect the coverage data to determine whether the coverage is for eval or load/require. I think a unique coverage key should be generated for each eval. With this example:

3.times do |i|
  eval <<-END, binding, __FILE__, __LINE__+1
    case i
    when 0
      p "a"
    when 1
      p "2"
    else
      p "b"
    end
  END
end

You should have 3 separate coverage entries, one for each eval, each showing some branches/lines covered and others not.

Updated by ioquatix (Samuel Williams) 2 months ago

I would prefer 3.

@jeremyevans0 (Jeremy Evans) what's the use-case for your proposal, because it would be much harder to digest with the coverage tool. As it stands, with this PR, I can swap out my own coverage implementation and everything just works. But with your proposal, it won't work. By the way, if you wanted 3 separate coverage entries, you can do this:

3.times do |i|
  eval <<-END, binding, __FILE__ + ":#{i}", __LINE__+1
    case i
    when 0
      p "a"
    when 1
      p "2"
    else
      p "b"
    end
  END
end

I think we need an option to disable it if it is enabled by default.

Can you explain the use case? In what situations will it be a problem? I'm not against introducing a keyword, but I want to know what problem it addresses, because in my case such a usage is not desirable.

Updated by jeremyevans0 (Jeremy Evans) 2 months ago

ioquatix (Samuel Williams) wrote in #note-8:

I would prefer 3.

@jeremyevans0 (Jeremy Evans) what's the use-case for your proposal, because it would be much harder to digest with the coverage tool. As it stands, with this PR, I can swap out my own coverage implementation and everything just works. But with your proposal, it won't work. By the way, if you wanted 3 separate coverage entries, you can do this:

3.times do |i|
  eval <<-END, binding, __FILE__ + ":#{i}", __LINE__+1
    case i
    when 0
      p "a"
    when 1
      p "2"
    else
      p "b"
    end
  END
end

I think we need an option to disable it if it is enabled by default.

Can you explain the use case? In what situations will it be a problem? I'm not against introducing a keyword, but I want to know what problem it addresses, because in my case such a usage is not desirable.

Your use case (ERB coverage) is apparently limited to files that aren't required. However, eval methods are used with __FILE__ on a regular basis for metaprogramming.

In many cases, I don't care about evaled code, and it would be nice to be able to turn off coverage in such cases, instead of having to filter coverage results to specifically exclude them. There should be an easy way to get backwards compatibility with Ruby 3.1 and earlier versions, which is what a keyword would enable.

In the cases where I do care and I want to check coverage of evaled code, it would be best to be able to look at all evals separately. It looks like your current approach is to just override any existing coverage entry for each new eval, which does not allow for that.

Your recommendation to modify the third argument to eval breaks various things, such as __FILE__ inside the evaled code, and changing backtrace lines.

It's likely a keyword argument approach to Coverage.start would allow for more configurable behavior. Maybe an callable keyword that is passed the file and line of the eval, and returns the coverage key to use (and have nil not record coverage)?

Updated by mame (Yusuke Endoh) 2 months ago

@ioquatix (Samuel Williams) I found yet another issue. It does not work with oneshot coverage.

require "coverage"

Coverage.start(oneshot_lines: true)

# the content of tt.rb
#
#   1: 100.times do
#   2:   1
#   3: end

load "tt.rb"

eval(<<SRC, nil, "dummy.rb", 1)
p 1
p 2
p 3
SRC

p Coverage.result
#=> expected: {"tt.rb"=>{:oneshot_lines=>[1, 2]}, "dummy.rb"=>{:oneshot_lines=>[1, 2, 3]}}
#=> actual:   {"tt.rb"=>{:oneshot_lines=>[1, 2]}}

TBH I'm unsure which behavior is better. But anyway could you please investigate why it does not work?

Updated by mame (Yusuke Endoh) 2 months ago

@jeremyevans0 (Jeremy Evans) Thank you for joining the discussion!

jeremyevans0 (Jeremy Evans) wrote in #note-7:

Considering the issues, could we make eval coverage optional, maybe by a keyword to Coverage.start? I'd prefer it be opt-in, but at the minimum I think we need an option to disable it if it is enabled by default.

Now I am a bit keen to have Coverage.start(eval: false) because there may be some issues that we don't know yet.

In the cases where I do care and I want to check coverage of evaled code, it would be best to be able to look at all evals separately. It looks like your current approach is to just override any existing coverage entry for each new eval, which does not allow for that.

It is a fair point, but I am afraid that recording per call to eval will bloat the coverage data without limit. How about accumulating all coverage data of eval'ed code?

10000.times do |i|
  eval <<-END, binding, __FILE__, __LINE__+1
    1
  END
end

Coverage.result #=> { __FILE__ => [1, nil, ...], [:eval, __FILE__] => [10000] }

I am very afraid if the key [:eval, __FILE__] will break some existing coverage tools like simplecov. If we choose this design, we should introduce Coverage.start(eval: false) and disable it by default.

Updated by jeremyevans0 (Jeremy Evans) 2 months ago

mame (Yusuke Endoh) wrote in #note-11:

In the cases where I do care and I want to check coverage of evaled code, it would be best to be able to look at all evals separately. It looks like your current approach is to just override any existing coverage entry for each new eval, which does not allow for that.

It is a fair point, but I am afraid that recording per call to eval will bloat the coverage data without limit. How about accumulating all coverage data of eval'ed code?

10000.times do |i|
  eval <<-END, binding, __FILE__, __LINE__+1
    1
  END
end

Coverage.result #=> { __FILE__ => [1, nil, ...], [:eval, __FILE__] => [10000] }

I am very afraid if the key [:eval, __FILE__] will break some existing coverage tools like simplecov. If we choose this design, we should introduce Coverage.start(eval: false) and disable it by default.

I agree that there should be a keyword to enable this support, and it should be disabled by default.

If you are ignoring __LINE__ when computing coverage (I'm guessing you are, since you have a value of [10000], even though __LINE__ should be 3 and not 1), and trying to accumulate multiple calls, I think it would be problematic to use [:eval, __FILE__] as the key, because it wouldn't correctly handle something like:

  eval <<-END, binding, __FILE__, __LINE__+1
    if true
      something
    end
  END
  eval <<-END, binding, __FILE__, __LINE__+1
    if false
      something
    end
  END

You could probably work around that with [:eval, __FILE__, __LINE__] as the key. That wouldn't handle correctly handle cases where you aren't passing a literal string and the string can vary per call:

  3.times do |i|
    eval codes[i], binding, __FILE__, __LINE__
  end

However, I expect such calls are rarer, and maybe we don't need to handle them.

What are your thoughts on a callable keyword argument for setting the coverage key?:

  Coverage.start(eval_key: ->(file, line){"eval-#{file}:#{line}"})

If this returned a key that already existed, coverage information could be accumulated (either by default or as an option).

Updated by ioquatix (Samuel Williams) 2 months ago

Thanks Jeremy.

Your use case (ERB coverage) is apparently limited to files that aren't required. However, eval methods are used with FILE on a regular basis for metaprogramming.

My use case is general coverage, including files that are required, and include eval. I'm not sure why this is a problem, can you show a use case where this produces undesirable results (excepting code which is already invalid, like file and line numbers that don't make sense)?

it would be best to be able to look at all evals separately

Why? I want per-file coverage, this is how coverage tools today work.

Updated by ioquatix (Samuel Williams) 2 months ago

@mame (Yusuke Endoh) thanks for your test cases, we should add them to the tests, I'll figure out why it's not working.

Updated by ioquatix (Samuel Williams) 2 months ago

Coverage.start(eval_key: ->(file, line){"eval-#{file}:#{line}"})

What problem are you trying to solve?

Updated by mame (Yusuke Endoh) 2 months ago

mame (Yusuke Endoh) wrote in #note-10:

@ioquatix (Samuel Williams) I found yet another issue. It does not work with oneshot coverage.

I am sorry, ignore this. It works as expected. I think my testing was wrong. (Maybe I used ruby 3.1 instead of master)

Updated by jeremyevans0 (Jeremy Evans) 2 months ago

ioquatix (Samuel Williams) wrote in #note-13:

Thanks Jeremy.

Your use case (ERB coverage) is apparently limited to files that aren't required. However, eval methods are used with FILE on a regular basis for metaprogramming.

My use case is general coverage, including files that are required, and include eval. I'm not sure why this is a problem, can you show a use case where this produces undesirable results (excepting code which is already invalid, like file and line numbers that don't make sense)?

Consider this code:

class A
  %w'foo bar'.each do |meth|
    class_eval <<-RUBY, __FILE__, __LINE__+1
      def #{meth}(v)
        if v
          do_#{meth}(v)
        else
          not_#{meth}
        end
      end
    RUBY
  end
end

With eval coverage, you want to be sure that you can detect which evaled code is covered and which is not. For example, in the above code, you would want the coverage to be able to tell you if all four branches are taken (if/else for foo and if/else for bar). Accumulating results for multiple eval runs into a single coverage entry does not do this.

ioquatix (Samuel Williams) wrote in #note-15:

Coverage.start(eval_key: ->(file, line){"eval-#{file}:#{line}"})

What problem are you trying to solve?

Current coverage tools expect the coverage result hash key is a string. This type of API would allow the user to control the hash key used. You could use it to get separate coverage per-eval (forcing all keys distinct), or to combine coverage for multiple evals (by using the same key more than once).

Updated by ioquatix (Samuel Williams) 2 months ago

@jeremyevans0 (Jeremy Evans) now I understand your motivation.

To summarise:

  • Coverage is always going to be imprecise.
  • More precise coverage can be useful, like branch coverage, etc, because we can see whether specific lexical logic is tested or not.
  • We already get validation of all lexical execution via this PR, i.e. line coverage of eval blocks.
  • Your proposal gives us separate coverage for separate dynamically defined methods/attributes/etc, on the same lines (so no improvement on lexical coverage).

Here are the issues as I see them:

  1. There is no user-facing flag which we can use to detect this feature. e.g. Coverage::SUPPORTED = [:lines, :branches, :methods, :oneshot_lines, :eval] or something. @mame (Yusuke Endoh) can :oneshot_lines be used as the same time as :lines or are they mutually exclusive?
  2. Introduce a flag to switch this feature on and off. e.g. Coverage.start(eval: true/false). I am okay with introducing this if there is a strong desire for it. However, I think the normal Coverage.start(:all) should default to true for eval.
  3. Introduce eval key as per @jeremyevans0's request.

I think introducing (2) is reasonable given everything else supports this style, but the default is to enable :all as per the documentation. Let me check how complex it is to add a new flag.

The other issues probably need separate feature requests.

Updated by mame (Yusuke Endoh) 2 months ago

@znz (Kazuhiro NISHIYAMA) told me that this change broke our continuous coverage measurement.

https://github.com/ruby/actions/actions/runs/3104935882/jobs/5029973324#step:11:10

Error: Artifact path is not valid: /ruby/tmp/test_rubygems_20220922-21188-1ddkra/gemhome/specifications/a#{raise %<improper escaping>}-2.gemspec.func-sort-c.html. Contains the following character:  Less than <

Invalid characters include:  Double quote ", Colon :, Less than <, Greater than >, Vertical bar |, Asterisk *, Question mark ?, Carriage return \r, Line feed \n
          
The following characters are not allowed in files that are uploaded due to limitations with certain file systems such as NTFS. To maintain file system agnostic behavior, these characters are intentionally not allowed to prevent potential problems with downloads on different file systems.

Here is my analysis:

  1. Some tests of rubygems uses Kernel#eval with a filename a#{raise %<improper escaping>}-2.gemspec
  2. coverage.so records the filename and lcov creates a file named a#{raise %<improper escaping>}-2.gemspec.func-sort-c.html
  3. It fails to upload the file to GitHub Actions' artifact

I would like to temporarily revert the commit to restore the coverage measurement.

I am now convinced that it is necessary to disable this feature by default. @ioquatix (Samuel Williams) Could you create a patch of Coverage.start(eval: false)?

Updated by mame (Yusuke Endoh) 2 months ago

@ioquatix (Samuel Williams) said he could implement Coverage.start(eval: true) soon, so I would like to wait for him instead of reverting the commit in question.

ioquatix (Samuel Williams) wrote in #note-18:

However, I think the normal Coverage.start(:all) should default to true for eval.

I talked with @ioquatix (Samuel Williams) about this. I think "false" is safer, but AFAIK not so many people are using :all option, so I hope "true" is acceptable.

Updated by ioquatix (Samuel Williams) 2 months ago

Here is the PR to introduce a eval: true/false flag to Coverage.setup: https://github.com/ruby/ruby/pull/6462

Updated by ioquatix (Samuel Williams) 2 months ago

  • Status changed from Open to Closed
  • Assignee set to ioquatix (Samuel Williams)

(1) and (2) are implemented, (3) is a separate issue.

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0