Feature #20210

closed

Invalid source encoding raises ArgumentError, not SyntaxError

Added by kddnewton (Kevin Newton) 11 months ago. Updated 10 months ago.

Status:
Closed
Assignee:
-
Target version:
-
[ruby-core:116435]

Description

I was hoping we could change the error that is raised when an invalid source encoding is found from an ArgumentError to a SyntaxError.

First let me say, if this isn't possible for backward compatibility, I understand. Please do not take this as me not caring about backward compatibility.

Right now, if you have the script "# encoding: foo\n\"bar\"" (an unknown encoding in the magic comment, followed by a string literal), it will raise an ArgumentError, not a SyntaxError. If there are other syntax errors in the file, there's no way to concatenate them together to give feedback to the user. If a user wants to consistently handle the errors coming back from a parse, they currently have to rescue both ArgumentError and SyntaxError.

Ideally it would all be SyntaxError, so we could handle them consistently and append all errors together.
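To illustrate the double rescue described above, here is a minimal sketch. It assumes CRuby (RubyVM::InstructionSequence is CRuby-specific), and parse_error_for is a hypothetical helper name, not part of any library.

```ruby
# Hypothetical helper: compile (but do not run) `source` and return the
# exception raised by the parser, or nil if the source compiles cleanly.
# Assumes CRuby, since RubyVM::InstructionSequence is implementation-specific.
def parse_error_for(source)
  RubyVM::InstructionSequence.compile(source)
  nil
rescue ArgumentError, SyntaxError => e
  # SyntaxError is a ScriptError, not a StandardError, so it must be
  # named explicitly; a bare `rescue` would not catch it.
  e
end

bad_encoding = parse_error_for("# encoding: foo\n\"bar\"")
bad_syntax   = parse_error_for("1 +")

# Per this issue, the first is expected to be an ArgumentError and the
# second a SyntaxError, so callers need both classes in the rescue list.
puts bad_encoding.class
puts bad_syntax.class
```

Because the two failures arrive as different exception classes, a tool that wants to report every problem in a file has to merge them itself.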

Updated by nobu (Nobuyoshi Nakada) 11 months ago

  • Tracker changed from Misc to Feature

I don't remember why ArgumentError was chosen; SyntaxError feels more reasonable.
https://github.com/ruby/ruby/pull/9701

Updated by yui-knk (Kaneko Yuichiro) 11 months ago

I'm wondering which encoding should be used if the parser hits an invalid source encoding like # coding: foo. I think this ticket needs to clarify which encoding is assumed in that case.

Updated by naruse (Yui NARUSE) 11 months ago

Parsing the entire source code with the wrong encoding is not reasonable. In some encodings, including SJIS (Windows-31J), the trailing byte of a multibyte character may be the same as an ASCII character, so the parse result won't be valid. The developer needs to fix the encoding first.
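A small sketch of the trailing-byte problem mentioned above. The choice of character is mine for illustration: U+30BD (katakana "so") encodes in Windows-31J as the bytes 0x83 0x5C, and 0x5C is the ASCII backslash.

```ruby
# Katakana "so" (U+30BD), written as an escape so this file's own
# encoding does not matter.
so = "\u30BD".encode("Windows-31J")

bytes = so.bytes
trailing = bytes.last

# The trailing byte of this two-byte character equals ASCII backslash.
trailing_is_backslash = (trailing == "\\".ord)

# Read as raw bytes (the wrong encoding), the character appears to
# contain a backslash, which a lexer would treat as an escape.
misread = so.dup.force_encoding("ASCII-8BIT")
looks_like_escape = misread.include?("\\".b)

puts bytes.map { |b| format("0x%02X", b) }.join(" ")
puts trailing_is_backslash
puts looks_like_escape
```

A parser that guessed a single-byte encoding here would see an escape sequence inside what is really one character, which is why continuing with a guessed encoding can produce a misleading parse.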

Updated by Anonymous 11 months ago

Hi. One question:

When parsing begins, what encoding do Prism and parse.y use by default?

Updated by kddnewton (Kevin Newton) 11 months ago

@naruse (Yui NARUSE) I'm fine exiting immediately, I was just hoping to make it a syntax error.

@Edwing123 By default Ruby source assumes UTF-8 unless told otherwise by a magic comment or a command line option.

Updated by mame (Yusuke Endoh) 10 months ago

Discussed at the dev meeting.

We need a good reason to introduce an incompatibility. You say you are fine with the current behavior (exiting immediately), so we can't see any reason to change it.

This is just my idea: it would be great for prism, as a library, to continue parsing even with an invalid encoding magic comment (maybe as ASCII-8BIT?), but it would be good to keep the behavior of the Ruby interpreter unchanged as far as possible.

Updated by kddnewton (Kevin Newton) 10 months ago

  • Status changed from Open to Closed

I think that makes sense! Let's keep it as an argument error. Prism will keep parsing for now, but raise the right error.
