Three functions relate directly to this bug:
- match_op_gen() in parse.y [1], which compiles two expressions separated by a =~ into some kind of MATCH node. When it sees that the right side is a literal Regexp, it returns a NODE_MATCH3 node.
- iseq_compile_each() in compile.c [2], which upon seeing NODE_MATCH3 flips the receiver and the value. Then, if the InstructionSequence option specialized_instruction is true [3] (which it is, by default), the code falls through to line 4804, which creates an instruction for:
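- opt_regexpmatch2(), in insns.def [4]. In this function, we optimize the instruction to just rb_reg_match(obj1, obj2) after finding that RB_TYPE_P(obj2, T_STRING) is true. Ah ha! That test also passes for instances of String subclasses, so a subclass's own =~ is bypassed.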
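To see the path described above from the Ruby side, here is a minimal sketch. The class name MyString and the :foo return value are only illustrative assumptions; the RubyVM call is MRI-specific, so it is guarded. On an interpreter with this bug, the literal form would bypass the override, while the explicit send would not; on a fixed interpreter both agree.

```ruby
# Hypothetical String subclass that overrides =~ (illustrative name).
class MyString < String
  def =~(other)
    :foo
  end
end

s = MyString.new("abc")

# Literal Regexp on the right-hand side: the parser builds NODE_MATCH3,
# which the compiler may turn into the specialized opt_regexpmatch2.
literal_result = s =~ /abc/

# Explicit dispatch always goes through ordinary method lookup.
explicit_result = s.public_send(:=~, /abc/)

puts "literal:  #{literal_result.inspect}"
puts "explicit: #{explicit_result.inspect}"

# On MRI, dump the compiled instructions to check which one was emitted.
if defined?(RubyVM::InstructionSequence)
  puts RubyVM::InstructionSequence.compile('s =~ /abc/').disasm
end
```

If the two printed results differ, the specialized instruction skipped the subclass method.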
The solution is a different test to answer "is obj2 a String?":
diff --git a/insns.def b/insns.def
index 7942804..7ef4c4c 100644
--- a/insns.def
+++ b/insns.def
@@ -2154,7 +2154,7 @@ opt_regexpmatch2
 (VALUE obj2, VALUE obj1)
 (VALUE val)
 {
-    if (RB_TYPE_P(obj2, T_STRING) &&
+    if (CLASS_OF(obj2) == rb_cString &&
         BASIC_OP_UNREDEFINED_P(BOP_MATCH, STRING_REDEFINED_OP_FLAG)) {
         val = rb_reg_match(obj1, obj2);
     }
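I don't think this represents a significant slowdown: the new test is still a cheap comparison, although instances of String subclasses now always take the ordinary method-call path, even when they do not override =~.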
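As a rough Ruby-level analogy for the two C tests (an assumption for illustration, not how the VM is implemented): RB_TYPE_P(obj, T_STRING) behaves like is_a?(String), which accepts subclasses, while CLASS_OF(obj) == rb_cString behaves like an exact class comparison. PlainSubclass is a hypothetical name.

```ruby
# Hypothetical subclass used only to contrast the two kinds of test.
class PlainSubclass < String
end

plain = "abc"
sub   = PlainSubclass.new("abc")

# Loose test, roughly like RB_TYPE_P(obj, T_STRING): true for subclasses too.
loose_plain = plain.is_a?(String)
loose_sub   = sub.is_a?(String)

# Exact test, roughly like CLASS_OF(obj) == rb_cString: only plain Strings.
exact_plain = plain.class.equal?(String)
exact_sub   = sub.class.equal?(String)

puts "is_a?(String):   plain=#{loose_plain} sub=#{loose_sub}"
puts "class == String: plain=#{exact_plain} sub=#{exact_sub}"
```

Only the exact test tells the two objects apart, which is why the patch lets subclass instances fall back to normal dispatch.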
[1] http://svn.ruby-lang.org/cgi-bin/viewvc.cgi/trunk/parse.y?view=markup#l8535
[2] http://svn.ruby-lang.org/cgi-bin/viewvc.cgi/trunk/compile.c?view=markup#l4770
[3] As a temporary workaround in today's Ruby, the specialization can be switched off:
RubyVM::InstructionSequence.compile_option = { specialized_instruction: false }
s =~ /abc/ #=> :foo
[4] http://svn.ruby-lang.org/cgi-bin/viewvc.cgi/trunk/insns.def?view=markup#l2151
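A slightly fuller sketch of the workaround in [3], under some assumptions: the class name OverridingString is hypothetical, and the match is wrapped in eval because compile options only affect code compiled after they are set, so the line has to be compiled once the option is already off. The option is restored afterwards so the rest of the program keeps the default.

```ruby
# Hypothetical String subclass that overrides =~ (illustrative name).
class OverridingString < String
  def =~(other)
    :foo
  end
end

mri = defined?(RubyVM::InstructionSequence)

begin
  # Disable specialized instructions for code compiled from here on (MRI only).
  RubyVM::InstructionSequence.compile_option = { specialized_instruction: false } if mri

  # eval compiles this source now, with the option above in effect.
  workaround_result = eval('OverridingString.new("abc") =~ /abc/')
ensure
  # Put the default back for any code compiled later.
  RubyVM::InstructionSequence.compile_option = { specialized_instruction: true } if mri
end

puts workaround_result.inspect
```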