In the /\//
and %r/\//
cases, the regexp source is transformed from \/
to /
in the lexer (tokadd_string
) before it even hits the parser, let alone the regexp engine. From a regexp perspective, /\//
and %r/\//
are treated as Regexp.new('/')
, and %r{\/}
as Regexp.new('\/')
.
Regexp#source should provide the source of the regexp, not necessarily the source as given in the source code. The statement escape sequences are retained as is
refers to Regexp escape sequences, and the \
in /\//
and %r/\//
is not a regexp escape sequence, but a lexer escape sequence (similar to %/\//
or %s/\//
). This issue is not related to /
specifically, it occurs for most terminators: %r,\,,.source # => ","
Note that in cases where escaping would actually change regexp behavior, the lexer doesn't swallow the escape character: %r$\$$.source # => "\\$"
It's fairly simple to remove this behavior from the lexer just by deleting code:
diff --git a/parse.y b/parse.y
index 167f064b31..523d5a85b3 100644
--- a/parse.y
+++ b/parse.y
@@ -7130,19 +7130,6 @@ tokadd_mbchar(struct parser_params *p, int c)
return c;
}
-static inline int
-simple_re_meta(int c)
-{
- switch (c) {
- case '$': case '*': case '+': case '.':
- case '?': case '^': case '|':
- case ')': case ']': case '}': case '>':
- return TRUE;
- default:
- return FALSE;
- }
-}
-
static int
parser_update_heredoc_indent(struct parser_params *p, int c)
{
@@ -7277,10 +7264,6 @@ tokadd_string(struct parser_params *p,
}
}
- if (c == term && !simple_re_meta(c)) {
- tokadd(p, c);
- continue;
- }
pushback(p, c);
if ((c = tokadd_escape(p, enc)) < 0)
return -1;
However, it breaks 3 tests in test_regexp.rb
: test_source_unescaped
, test_source
, and test_equal
. It also breaks a couple of specs, Literal Regexps supports escaping characters when used as a terminator
and Regexp#source will remove escape characters
.
Since the current behavior is clearly by design in the tests and specs, I can safely conclude this is not a bug, or at most, it is a minor documentation bug (I'll update the documentation). Switching to feature request. I'll add this the list of tickets to review at the next developer meeting, since while I'm not in favor of making the change, I do think this issue warrants discussion.