Feature #19694
Open
Add Regexp#timeout= setter
Description
Abstract
In addition to allowing for a Regexp timeout to be set on individual instances by setting a timeout argument in Regexp.new, I'm proposing that we also allow setting the timeout on Regexp objects with a #timeout= setter.
Background
When rolling out a global Regexp timeout in a large application, there are inevitably some individual regexes for which a different timeout is appropriate. While the timeout keyword argument was added to Regexp.new, this isn't always a viable option.
In the case of regex literal syntax (/ab*/ or %r{ab*}, for instance), it's not possible to set a timeout at all right now without converting to Regexp.new, which may be awkward depending on the contents of the regex.
It is also sometimes desirable to set a timeout on a regex object after it has been initialized.
Finally, because we offer a Regexp#timeout getter, for consistency it would be nice to also offer a setter.
The introduction of a Regexp#timeout= setter was mentioned as a possible way to set individual timeouts in https://bugs.ruby-lang.org/issues/19104#Specification.
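For reference, the Regexp.new conversion described above can be sketched as follows. This is only an illustration of the workaround available today, assuming Ruby 3.2 or later (where the timeout keyword exists); the helper name rebuild_with_timeout is hypothetical and not part of Ruby or of this proposal.

```ruby
# Hypothetical helper: rebuilds a regex literal through Regexp.new so that a
# per-instance timeout can be attached. Assumes Ruby 3.2+.
def rebuild_with_timeout(regexp, seconds)
  # source and options carry over the pattern text and flags (i, x, m).
  Regexp.new(regexp.source, regexp.options, timeout: seconds)
end

literal = /ab*/i
limited = rebuild_with_timeout(literal, 1.0)

limited.timeout        # per-instance timeout, in seconds
limited.match?("ABB")  # still matches like the original literal
```

The awkward part is that every literal has to go through a wrapper like this, which is what the proposed setter would avoid.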
Proposal
I propose that we add the method Regexp#timeout=. It works the same way the timeout argument works in Regexp.new, taking either a float or nil.
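As a point of comparison, here is a small sketch of how the existing timeout keyword behaves with a float and with nil, assuming Ruby 3.2 or later. The proposed setter is meant to accept the same two kinds of value; the variable names are only illustrative.

```ruby
# Existing behaviour of the timeout keyword in Regexp.new (Ruby 3.2+).
with_limit    = Regexp.new("ab*", timeout: 1.5)
without_limit = Regexp.new("ab*", timeout: nil)

with_limit.timeout     # a Float number of seconds for this instance
without_limit.timeout  # nil: no per-instance timeout, the global Regexp.timeout applies
```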
This makes it relatively easy to add timeouts to specific regex literals (regex literals are frozen by default so you do have to dup them first):
emoji_filter_pattern = %r{
(?<!#{Regexp.quote(ZERO_WIDTH_JOINER)})
#{EmojiFilter.unicodes_pattern}
(?!#{Regexp.union(EmojiFilter::MODIFIER_CHAR_MAP.keys.map { |k| Regexp.quote k })})
}x.dup
emoji_filter_pattern.timeout = 1.0
emoji_filter_pattern.freeze
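The dup step can be checked in isolation with a simpler pattern. The sketch below assumes Ruby 3.0 or later, where regex literals are frozen. Because Regexp#timeout= is only proposed here, the call is guarded with respond_to? so the snippet also runs on a Ruby without the setter.

```ruby
pattern = /ab*/
literal_frozen = pattern.frozen?    # regex literals are frozen (Ruby 3.0+)

copy = pattern.dup                  # dup returns an unfrozen copy
copy_frozen_before = copy.frozen?

# Proposed API: only call the setter if this Ruby build provides it.
copy.timeout = 1.0 if copy.respond_to?(:timeout=)

copy.freeze                         # freeze again once configured
```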
Implementation
This setter has been implemented in https://github.com/ruby/ruby/pull/7847.
Evaluation
It's just a setter, so pretty straightforward in terms of implementation and use.
Discussion
It's worth considering other options for overriding Regexp.timeout. I'd love to see something like the following for overriding regexp timeouts as well:
Regexp.timeout = 1.0
Regexp.with_timeout(5.0) do
  evaluate_slower_regexes
end
It's possible to implement something like Regexp.with_timeout, but it's not thread-safe by default, since it would involve overwriting the process-wide Regexp.timeout.
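To make the thread-safety concern concrete, here is a minimal sketch of the naive approach, assuming Ruby 3.2 or later. The module name RegexpTimeoutSketch is hypothetical; nothing like it ships with Ruby. It temporarily overwrites the global Regexp.timeout and restores it afterwards.

```ruby
# Hypothetical, naive sketch: NOT thread-safe.
module RegexpTimeoutSketch
  def self.with_timeout(seconds)
    previous = Regexp.timeout
    # Regexp.timeout is process-wide, so every other thread running a match
    # during the block also sees this temporary value.
    Regexp.timeout = seconds
    yield
  ensure
    # Restore the earlier value even if the block raises.
    Regexp.timeout = previous
  end
end

Regexp.timeout = 1.0
RegexpTimeoutSketch.with_timeout(5.0) do
  "abbb".match?(/ab*/)   # runs under the temporary 5.0 second limit
end
```

If two threads enter with_timeout at the same time, each can restore the other's temporary value, so the global setting may end up wrong once both blocks finish.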
Summary
Regexp instances already have a timeout getter. Adding a corresponding setter improves consistency and makes it easier for developers to adopt a global Regexp.timeout, because timeouts can then be adjusted on a regex-by-regex basis.
It's a minor change but the added consistency and flexibility help us optimize for developer happiness.