Bug #8287
closed
Regexp performance issue
Description
ADDRESS = # RFC-5322 : http://tools.ietf.org/html/rfc5322
/
(?
(?<name_addr>
(?<display_name>
(?
(?
(?
#\g?
#\g+
#\g?
)
|
\g<quoted_string>
)+
)
)?
(?<angle_addr>
\g?
<
\g<addr_spec>
>
\g?
)
)
|
(?<addr_spec>
(?<local_part>
(?<dot_atom>
(?
(?:
(?:
(?
(
\g*
(?
\x0d \x0a
)
)?
(?
\x09 | \x20
)+
)?
(?
(
(?:
\g?
(?
(?
[\x21-\x27] | [\x2a-\x5b] | [\x5d-\x7e]
)
|
(?<quoted_pair>
\
(?:
(?
[\x21-\x7e]
)
|
\g
)
)
|
\g
)
)*
\g?
)
)
)+
\g?
)
|
\g
)?
(?<dot_atom_text>
(?
[-\w!#$%&'+/=?^`{|}~]
)+
(?:
.
\g+
)
)
\g?
)
|
(?<quoted_string>
\g?
(? " )
(?:
\g?
(?
(?
\x21 | [\x23-\x5b] | [\x5d-\x7e]
)
|
\g<quoted_pair>
)
)*
\g?
\g
\g?
)
)
@
(?
\g<dot_atom>
|
(?<domain_literal>
\g?
[
(
\g?
(?
[\x21-\x5a] | [\x5e-\x7e]
)
)*
\g?
]
\g?
)
)
)
)
|
(?
\g<display_name>
:
(?<group_list>
(?<mailbox_list>
\g
(?:
,
\g
)*
)
|
\g
)?
;
\g?
)
/x
puts "start = #{start = Time.now}"
puts 'dH3GFaWn5nqgxtYAiTyG@eu.tv'[ADDRESS]
puts "stop = #{stop = Time.now}"
puts "#{stop - start} seconds"
=begin
C:>err
start = 2013-04-18 12:34:02 +0800
dH3GFaWn5nqgxtYAiTyG@eu.tv
stop = 2013-04-18 12:34:04 +0800
1.662166 seconds
After uncommenting lines 9~11:
C:>err
start = 2013-04-18 12:34:14 +0800
dH3GFaWn5nqgxtYAiTyG@eu.tv
stop = 2013-04-18 12:34:14 +0800
0.003001 seconds
=end
Updated by ko1 (Koichi Sasada) over 11 years ago
- Category set to core
- Assignee set to naruse (Yui NARUSE)
Updated by jeremyevans0 (Jeremy Evans) over 5 years ago
- Status changed from Open to Rejected
- Backport deleted (1.9.3: UNKNOWN, 2.0.0: UNKNOWN)
From the general problem statement, and looking at the regexp's nested use of * and + along with \g, this regexp probably exhibits exponential backtracking. See https://docs.ruby-lang.org/en/2.6.0/Regexp.html#class-Regexp-label-Performance. You would need to fix the regexp to avoid the backtracking, possibly using (?> for some capture groups. I can't confirm that because, as displayed, the regexp is not valid Ruby code (I'm guessing a previous Redmine update broke the syntax used).
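
To illustrate the kind of backtracking described above, here is a minimal sketch using two hypothetical patterns (NESTED and ATOMIC are made up for this example and are not taken from the ADDRESS regexp, which cannot be run as displayed). It compares a + nested inside a + against the same pattern with the inner part wrapped in an atomic group (?>...), on an input that cannot match:

```ruby
require 'benchmark'

# Hypothetical patterns for illustration only; they are not taken from ADDRESS.
# NESTED: a + nested inside a +, so one run of letters can be split in many ways.
NESTED = /\A(?:[a-z]+)+\z/
# ATOMIC: once the inner [a-z]+ has matched, it never gives characters back.
ATOMIC = /\A(?>[a-z]+)+\z/

# A short run of letters followed by a character that forces the match to fail.
# The length is kept small on purpose so the script finishes quickly everywhere.
subject = 'a' * 18 + '!'

[NESTED, ATOMIC].each do |re|
  matched = nil
  elapsed = Benchmark.realtime { matched = re.match?(subject) }
  puts format('%-22s matched=%-5s %.6f s', re.source, matched, elapsed)
end
```

Both patterns accept and reject the same strings here; only the amount of work on a failing input can differ. The size of the timing gap depends on the Ruby version: releases from 3.2 onward include matching optimizations and a Regexp.timeout setting, so the nested form may not be noticeably slower there.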