the regexp variable is an ascii-8bit regular expression with the byte interpolated into the middle. However, if you inline that interpolation:
# encoding: us-ascii
regexp=/#{"\x80"}/
you get a syntax error, saying it's an invalid multi-byte character. I'm not sure what the rule is here, as it seems inconsistent. Is this the correct behavior?
I would prefer that it create an ascii-8bit regular expression like the first example, which would be consistent.
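A minimal sketch for comparing the two forms side by side. It assumes that eval honors a magic encoding comment at the top of the evaluated string, so both snippets are parsed as us-ascii source. The constant names and the regexp_encoding helper are hypothetical, introduced only for illustration. The result for the inline form depends on the Ruby version, so no output is claimed for it here.

```ruby
# Both snippets are kept in non-interpolating heredocs so that the
# backslash escape and the #{...} reach the parser untouched.
VARIABLE_FORM = <<~'RUBY'
  # encoding: us-ascii
  byte = "\x80"
  /#{byte}/
RUBY

INLINE_FORM = <<~'RUBY'
  # encoding: us-ascii
  /#{"\x80"}/
RUBY

# Hypothetical helper: returns the encoding of the regexp the snippet
# evaluates to, or SyntaxError if the snippet does not parse.
def regexp_encoding(source)
  eval(source).encoding
rescue SyntaxError => e
  e.class
end

variable_result = regexp_encoding(VARIABLE_FORM)
inline_result = regexp_encoding(INLINE_FORM)

puts "variable form: #{variable_result}"
puts "inline form:   #{inline_result}"
```

If the two lines printed differ, the Ruby in use still treats the inlined interpolation differently from the local-variable case.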
Agreed, the current behavior breaks referential transparency and unexpectedly analyzes string literals inside interpolated parts.
This causes extra confusion and, I would think, has no value in real-world uses of interpolated regexps, because it raises an error where there would otherwise be none.
So I think this is a bug: the implementation should not analyze those parts, and the behavior should consequently be the same as with the extra local variable.
I'm fine with it analyzing the string literals; I would just prefer that it take the same code path as the interpolated variable case, producing an ascii-8bit regular expression rather than raising an error.
Discussed at the dev meeting, and @matz (Yukihiro Matsumoto) said /#{"\x80"}/ should not raise a SyntaxError but return a binary encoded regexp object.