Project

General

Profile

Actions

Feature #18159

open

Integrate functionality of dead_end gem into Ruby

Added by duerst (Martin Dürst) 3 months ago. Updated 3 months ago.

Status:
Open
Priority:
Normal
Target version:
[ruby-core:105191]

Description

Missing 'end' errors are difficult to fix. We should integrate the functionality of the dead_end gem (https://github.com/zombocom/dead_end) into Ruby similar to how we integrated did_you_mean. It would greatly help programming Ruby, in particular for beginners.

See also Ruby Kaigi Takeout 2021 talk by Richard Schneeman https://rubykaigi.org/2021-takeout/presentations/schneems.html.


Related issues

Related to Ruby master - Feature #14244: Better error messages for scripts with non-matching end statementsOpenmame (Yusuke Endoh)Actions
Actions #1

Updated by mame (Yusuke Endoh) 3 months ago

  • Related to Feature #14244: Better error messages for scripts with non-matching end statements added

Updated by schneems (Richard Schneeman) 3 months ago

Thank you for creating the issue, Martin 🙏. For additional context for anyone who could not see the presentation, I uploaded my RubyKaigi talk as unlisted on YouTube https://youtu.be/DfGG77zVVR4. You can view it when it is done uploading soon.

Skip ahead for a demo of dead_end parsing source code at 12:51. And then, I do a walk-through of how the algorithm internals works from 15:08 until the end.

The first half of the talk sets up the problem, explains the tools (Ripper with lexing and parsing), gives a high-level intro to "AI", and explains why syntax errors are more challenging than standard errors in Ruby. The final algorithm is conceptually based on Dijkstra's algorithm for uniform cost search, hence the artificial intelligence tie-in.

The critical insight with dead_end was that removing invalid syntax lines from a Ruby document makes it parsable with ripper. We can use this as our goal condition and recursively search the source code using indentation and lexing to determine logical code "blocks".

Updated by duerst (Martin Dürst) 3 months ago

  • Target version set to 3.1
  • Assignee set to matz (Yukihiro Matsumoto)

schneems (Richard Schneeman) wrote in #note-2:

For additional context for anyone who could not see the presentation, I uploaded my RubyKaigi talk as unlisted on YouTube https://youtu.be/DfGG77zVVR4.

This is now also available 'officially' at https://youtu.be/oL_yxJN8534.

Skip ahead for a demo of dead_end parsing source code at 12:51. And then, I do a walk-through of how the algorithm internals works from 15:08 until the end.

I think in this issue, what's more important than the parser or algorithm used is how to integrate it into Ruby. My guess is that the easiest way would be to integrate it in the same way as the 'did_you_mean' and 'error_highlight' gems.

schneems (Richard Schneeman): Can you look at whether an integration similar to did_you_mean' and 'error_highlight' is easy/possible? I've put this on the agenda of this week's dev meeting (issue #18122, Thursday Japan time), so any additional information we can get before would be great.

Updated by schneems (Richard Schneeman) 3 months ago

Can you look at whether an integration similar to did_you_mean' and 'error_highlight' is easy/possible?

I've got some time on my calendar for exploration. I didn't previously budget for it so I can't guarantee how far I'll be able to get, but I'll take a look and try to post back Wednesday noon central-ish time.

Actions #5

Updated by jeremyevans0 (Jeremy Evans) 3 months ago

  • Backport deleted (2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN)
  • Tracker changed from Bug to Feature

Updated by schneems (Richard Schneeman) 3 months ago

I've made a draft PR with a working spike of getting this into Ruby. https://github.com/ruby/ruby/pull/4845 it should be good enough for discussion. I used the same interface you suggested here. It was very easy.

The only limitation of my approach is listed in the PR (dead_end hooks into require so will not fire in every context). But I believe we can work through this limitation.

I invite anyone to try out this branch or to include dead_end in their own gem in their own codebases.

Please let me know how the meeting goes.

Updated by mame (Yusuke Endoh) 3 months ago

In today's dev meeting, we discussed this. Basically, no one was opposed to the concept. But some concerns were discussed.

ko1 (Koichi Sasada) and knu (Akinori MUSHA) were concerned about whether the heuristic algorithm was mature enough. If dead_end prints a misguided summary, it may rather confuse users. They wanted to take an experimental period after dead_end is integrated to the ruby core. There are only three months until Ruby 3.1 release, so they suggested to try the integration towards Ruby 3.2.

nobu (Nobuyoshi Nakada) and I briefly read the source code, and had some concerns about the implementation details:

  • Currently, there are no good way to allow dead_end to hook a syntax error and extend its message. The redefinition of Kernel#require is never a preferable way. (discussed later)
  • The implementation uses timeout, five seconds by default. The heuristic algorithm may take so long time?
  • The gem includes "exe/dead_end" command. Should it be shipped with the ruby tarball, too?
  • dead_end seems to extend NoMethodError to show a code snippet where the error occurs (if it is not running in production?). Does this conflict with error_highlight?

If dead_end is integrated to the core, we want to avoid the redefinition of Kernel#require and require_relative. It does not work well against an syntax error of an entrypoint file (a file passed the ruby interpreter). But unfortunately, there is no good API for the feature. did_you_mean and error_highlight extend the error message of NameError by defining NameError#message. It would be good if dead_end can do the same (define SyntaxError#message), but unfortunately it does not work because the current parser outputs an error message directly to tty, not via SyntaxError#message. nobu (Nobuyoshi Nakada) is now trying to sort it out to allow the following API design. schneems (Richard Schneeman) What do you think?

class SyntaxError
  attr_reader :file_path # the file path that caused this syntax error

  def message
    dead_end_error_message = DeadEnd.call(source: File.read(file_path), ...)

    "#{ super }\n\nDeadEnd: Unmatched `end` detected ..." + dead_end_error_message
  end
end

Updated by schneems (Richard Schneeman) 3 months ago

ko1 (Koichi Sasada) and knu (Akinori MUSHA) were concerned about whether the heuristic algorithm was mature enough. If dead_end prints a misguided summary, it may rather confuse users.

I certainly understand and agree. If you (or anyone else) finds a strange case please let me know.

The implementation uses timeout, five seconds by default. The heuristic algorithm may take so long time?

I added the timeout because if the algorithm has a bug that causes a slowdown or infinite loop, I still want the developer to know there is a syntax error without pretty output. In practice, it is speedy. The user should not notice any time between running their program and getting output.

The only performance problem I've had since making the library was creating many combinations of a large array. I fixed it in https://github.com/zombocom/dead_end/pull/35. On further review, I can be lazier with this combination generation which will help.

I do have a performance test just in case of regressions https://github.com/zombocom/dead_end/blob/5656caa3116f38b1d1b3ef139dc4c692d93d04b9/spec/perf/perf_spec.rb#L30.

I could likely reduce the timeout value to 1 second comfortably. It is already configurable via an environment variable for flexibility while debugging.

The gem includes "exe/dead_end" command. Should it be shipped with the ruby tarball, too?

I added this to debug so someone can run the algorithm without having to execute any code or put dead_end in their Gemfile. I don't think it is necessary to ship the executable with Ruby.

dead_end seems to extend NoMethodError to show a code snippet where the error occurs (if it is not running in production?). Does this conflict with error_highlight?

After seeing highlight_error (thank you for your work), I can remove this code from dead_end and encourage developers to add your gem instead.

unfortunately it does not work because the current parser outputs an error message directly to tty, not via SyntaxError#message

I tried to hook into SyntaxError as my first attempt and learned that I could not patch it, but I didn't know why.

Nobu (Nobuyoshi Nakada) is now trying to sort it out to allow the following API design. schneems (Richard Schneeman) What do you think?

I like this API. I wonder if there are other cases we want to support where the source is not on disk. The two cases I can think of are eval and streaming ruby code via STDIN:

$ echo "puts 1+1" | ruby
2

I think this feature is an enhancement that is not strictly necessary in all cases, but if I had the source contents instead of just the file path, then I think I'd be able to handle these additional edge cases.

There are only three months until Ruby 3.1 release, so they suggested to try the integration towards Ruby 3.2.

Targeting 3.2 seems safer. I can begin making adjustments and preparations. Thank you for the opportunity 🙏

Updated by duerst (Martin Dürst) 3 months ago

Some additional comments during the meeting:

  • We want to make sure there's no (or very low) overhead for correct programs. So possibly load dead_end code only when the error occurs, because it's not needed otherwise.

  • What happens when indenting is not correct? Dead_end is most valuable to beginners, but beginners often have problems with correct indenting. (Well, if dead_end (indirectly) helps them improving their indents, then that would be good, too.)

  • The best time to introduce the gem would be once the 3.1 release branch is forked from the master branch. That would be in January next year, or possibly somewhat earlier. Then we would have a lot of time for gaining experience.

  • The name dead_end may be somewhat confusing for the release version. But maybe it wouldn't actually appear in the error messages.

  • There are some more details in the minutes of the developers' meeting, but I'm not sure they are public.

Updated by schneems (Richard Schneeman) 3 months ago

We want to make sure there's no (or very low) overhead for correct programs. So possibly load dead_end code only when the error occurs, because it's not needed otherwise.

Hooking into SyntaxError#message should make it zero overhead when there are no errors.

What happens when indenting is not correct? Dead_end is most valuable to beginners, but beginners often have problems with correct indenting. (Well, if dead_end (indirectly) helps them improving their indents, then that would be good, too.)

The code doesn't require perfect indentation. I'm testing against several bad indentation cases intentionally https://github.com/zombocom/dead_end/blob/15d0b93f684e8ac2fabb4affb092349d3392c0d9/spec/unit/code_search_spec.rb#L260-L337. If indentation is correct it will have a higher chance of providing better output. If it's incorrect it still should produce a decent result (within reason). This is possible because the process is lexically aware.

The best time to introduce the gem would be once the 3.1 release branch is forked from the master branch. That would be in January next year, or possibly somewhat earlier. Then we would have a lot of time for gaining experience.

Sounds good

The name dead_end may be somewhat confusing for the release version. But maybe it wouldn't actually appear in the error messages.

We can keep the name out of the error messages for sure.

Actions

Also available in: Atom PDF