Bug #21824

open

performance regression in regexp matching after update to 4.0

Added by mackuba (Kuba Suder) 2 days ago. Updated 2 days ago.

Status:
Open
Assignee:
-
Target version:
-
ruby -v:
ruby 4.0.0 (2025-12-25 revision 553f1675f3) +PRISM [arm64-darwin24]
[ruby-core:<unknown>]

Description

TLDR: a change to the regexp algorithm in regcomp.c, merged around October, was supposed to speed up matching, but in practice it seems to have slowed it down more than it sped it up.

===

I'm running a service written in Ruby which serves a number of Bluesky feeds. As part of this service, I have a process that connects to a websocket API (with EventMachine) and reads all new posts made on the network in real time (on the order of ~100 posts per second on average lately). It passes each post through a #post_matches? method in about 20 feed objects, each of which has a number of regexps configured: most have around 20-30, but one has a few hundred, for a total of probably around 900 regexps. If a given post's text (up to 300 Unicode characters) matches any of the regexps of the given feed (and/or matches some additional conditions), it's added to this feed in the database.
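A rough sketch of the matching setup described above. The Feed class, the feed names and the sample regexps are hypothetical stand-ins, not the actual service code; only the #post_matches? method name comes from the description.

```ruby
# Hypothetical sketch: a feed that owns a list of regexps.
class Feed
  attr_reader :name, :regexps

  def initialize(name, regexps)
    @name = name
    @regexps = regexps
  end

  # True if the post text matches at least one of the feed's regexps.
  def post_matches?(text)
    @regexps.any? { |regexp| regexp.match?(text) }
  end
end

# Made-up feeds; the real service has about 20 of them.
FEEDS = [
  Feed.new("linux", [/\bslackware\b/i, /\bkde plasma\b/i]),
  Feed.new("ruby", [/\bruby\b/i])
]

# Returns the names of all feeds that a given post should be added to.
def matching_feeds(text)
  FEEDS.select { |feed| feed.post_matches?(text) }.map(&:name)
end

p matching_feeds("Trying out KDE Plasma on Slackware…")
```

Every incoming post goes through every regexp of every feed until one matches, so the cost of a single regexp match is multiplied by roughly (posts per second) x (total regexps).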

I often use a slightly older version of this project with a fixed pre-downloaded chunk of the posts stream, with ~100k random but real posts in one large binary file, as a benchmark on different versions of Ruby. So I ran it again on 4.0 after it was released a week ago, and I got this: https://bsky.app/profile/mackuba.eu/post/3mb2quhdoqs23

The numbers are the calculated speed of processing, in events per second, after going through the whole sample of 100k events and a full set of feeds. So there was a 5-6% slowdown on Ruby 4.0 vs. Ruby 3.4 (with or without YJIT), putting it below any version of Ruby since 3.0.

I started profiling which specific parts of the code got slower, and I narrowed it down to regexp matching (the full loop also does e.g. decoding of data from a binary format from the file). I made a demo project reproducing this here (note, linking to a slightly earlier commit): https://tangled.org/mackuba.eu/ruby4.0-regexp-test/tree/63e208de944debdc0a6997de0169cd0603a0b441. There are 940 posts here in the sample and 17 regexps, and running 1000 loops takes 7 seconds on Ruby 3.4, but 13.5s on 4.0 (so almost 2x longer).
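The benchmark loop in that demo has roughly the following shape. This is a minimal sketch with placeholder data and a hypothetical run_benchmark helper, not the code from the linked repository, which loads 940 real posts and 17 regexps.

```ruby
require "benchmark"

# Placeholder data; the real demo uses 940 posts and 17 regexps.
POSTS = ["Installing Slackware again", "It’s a nice day… isn’t it?"]
REGEXPS = [/\bslackware\b/i, /\bkde plasma\b/i]

# Runs every regexp against every post `loops` times.
# Returns [elapsed_seconds, number_of_matches].
def run_benchmark(posts, regexps, loops)
  matches = 0
  seconds = Benchmark.realtime do
    loops.times do
      posts.each do |text|
        regexps.each { |regexp| matches += 1 if regexp.match?(text) }
      end
    end
  end
  [seconds, matches]
end

seconds, matches = run_benchmark(POSTS, REGEXPS, 1000)
puts "Ruby #{RUBY_VERSION}: #{seconds.round(3)}s, #{matches} matches"
```

Running the same script under two Ruby versions and comparing the printed times gives the kind of comparison quoted above.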

Then, I started digging into the details to see if some specific posts or specific regexps cause the slowdown. It turned out that most of the 17 regexps, if tested alone, work about the same on 3.4 & 4.0, but two work around 4x slower: /\bslackware\b/i and /\bkde plasma\b/i. (This seems to be not only about the length of the regexp, but also specific characters, because e.g. /\blongstring\b/i does not trigger this effect.) It also seems that this only happens if the searched string contains non-ASCII characters.
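Isolating the slow regexps can be done by timing each one separately over the same posts. A sketch of that approach follows; the time_each_regexp helper and the sample data are hypothetical, and no timings are shown because they depend on the Ruby version and machine.

```ruby
require "benchmark"

# Times each regexp on its own over the same posts.
# Returns a hash of { regexp source => elapsed seconds }.
def time_each_regexp(posts, regexps, loops)
  regexps.to_h do |regexp|
    seconds = Benchmark.realtime do
      loops.times { posts.each { |text| regexp.match?(text) } }
    end
    [regexp.source, seconds]
  end
end

# Hypothetical sample; note the non-ASCII apostrophe and ellipsis.
posts = ["It’s a nice day… isn’t it?", "Installing Slackware again"]
regexps = [/\bslackware\b/i, /\bkde plasma\b/i, /\blongstring\b/i]

# Print the slowest regexps first.
time_each_regexp(posts, regexps, 10_000)
  .sort_by { |_source, seconds| -seconds }
  .each { |source, seconds| puts format("%-20s %.4fs", source, seconds) }
```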

An updated, more minimal version of the demo is here: https://tangled.org/mackuba.eu/ruby4.0-regexp-test/tree/d9e145c1e9f39fca6ddcc9042dd54fa4fac838a3. It has one regexp and one string. With a million loops, it takes 0.75s on Ruby 3.4, but 2.7s on Ruby 4.0. (There is no change on latest master, ruby 4.1.0dev (2026-01-05T17:18:47Z master 7e81bf5c0c), i.e. it's still much slower than on 3.4.) If the few non-ASCII characters (apostrophe, en dash, ellipsis, TM) are removed from the string, the slowdown goes away.
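The ASCII vs. non-ASCII comparison can be sketched like this. The sample text is made up (it is not the string from the demo repository) and only imitates the kind of characters mentioned above; the loop count is reduced from a million to keep the run short.

```ruby
require "benchmark"

REGEXP = /\bslackware\b/i

# Hypothetical sample text containing a curly apostrophe, en dash, TM sign and ellipsis.
NON_ASCII_TEXT = "It’s here – the new release of Foo™ is out…"

# Same text with every non-ASCII character stripped.
ASCII_TEXT = NON_ASCII_TEXT.gsub(/[^[:ascii:]]/, "")

# Returns the time in seconds needed to run `loops` match attempts.
def time_matching(regexp, text, loops)
  Benchmark.realtime { loops.times { regexp.match?(text) } }
end

LOOPS = 100_000

puts "Ruby #{RUBY_VERSION}"
puts format("non-ASCII: %.3fs", time_matching(REGEXP, NON_ASCII_TEXT, LOOPS))
puts format("ASCII:     %.3fs", time_matching(REGEXP, ASCII_TEXT, LOOPS))
```

Neither string matches the regexp here, so the loop measures only the cost of a failed search over the text.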

Finally, with some help from Codex, I tried to track down which specific change between 3.4 and 4.0 caused this slowdown. It turned out to be this commit: https://github.com/ruby/ruby/commit/981ee02c7c664f19b983662d618d5e6bd87d1739, authored in November 2017 (!), but merged around 31st October 2025. It says it "fixes performance problem with /k/i and /s/i". I'm not familiar with these string search algorithms, so I can't evaluate the change in detail or say how it could be fixed, but from the description I'm assuming that it improves performance for some combinations of text and regexp, while apparently degrading it for others like mine. The updated version in 4.0 is slower both in my main test of a practical scenario, which matches ~100k posts against ~900 regexps, and in the earlier demo, which matches 940 posts against 17 regexps (5% and 47% slower, respectively). So my guess is that the net result will generally be negative, i.e. that this change makes more cases slower than it makes faster. But like I said, I don't know enough about regexp algorithms to analyze this deeper; I can only report my results.
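Some background that may explain why "k" and "s" are singled out in that commit message: under Unicode case folding, these two ASCII letters each have a non-ASCII counterpart (KELVIN SIGN U+212A folds to "k", LATIN SMALL LETTER LONG S U+017F folds to "s"), so a case-insensitive match on them cannot be handled as a pure ASCII comparison. That this is the reason behind the commit is an assumption here, not something verified against its code. The folding behavior itself can be checked directly:

```ruby
KELVIN_SIGN = "\u212A" # "K", case-folds to "k"
LONG_S      = "\u017F" # "ſ", case-folds to "s"

# With the /i flag, the ASCII letters also match their non-ASCII counterparts.
p(/k/i.match?(KELVIN_SIGN))
p(/s/i.match?(LONG_S))

# Without /i there is no folding, so these do not match.
p(/k/.match?(KELVIN_SIGN))
p(/s/.match?(LONG_S))
```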

Updated by mackuba (Kuba Suder) 2 days ago #1

To clarify: when testing on a Ruby version built from the commit before 981ee02c7c664f19b983662d618d5e6bd87d1739 (i.e. a6379032ee98bc43fb68ce7a6c186f3512558ce0), the test runs as fast as on 3.4. When testing on 981ee02c7c664f19b983662d618d5e6bd87d1739, it runs as slow as on 4.0.
