Bug #8287

closed

Regexp performance issue

Added by mghomn (Justin Peal) about 9 years ago. Updated almost 3 years ago.

Status:
Rejected
Priority:
Normal
Target version:
-
ruby -v:
ruby 1.9.3p374 (2013-01-15) [i386-mingw32]
Backport:
[ruby-core:54424]

Description

ADDRESS = # RFC-5322 : http://tools.ietf.org/html/rfc5322
/
(?
(?<name_addr>
(?<display_name>
(?
(?
(?
#\g?
#\g+
#\g?
)
|
\g<quoted_string>
)+
)
)?
(?<angle_addr>
\g?
<
\g<addr_spec>
>
\g?
)
)
|
(?<addr_spec>
(?<local_part>
(?<dot_atom>
(?
(?:
(?:
(?
(
\g*
(?
\x0d \x0a
)
)?
(?
\x09 | \x20
)+
)?
(?
(
(?:
\g?
(?
(?
[\x21-\x27] | [\x2a-\x5b] | [\x5d-\x7e]
)
|
(?<quoted_pair>
\
(?:
(?
[\x21-\x7e]
)
|
\g
)
)
|
\g
)
)*
\g?
)
)
)+
\g?
)
|
\g
)?
(?<dot_atom_text>
(?
[-\w!#$%&'+/=?^`{|}~]
)+
(?:
.
\g+
)

)
\g?
)
|
(?<quoted_string>
\g?
(? " )
(?:
\g?
(?
(?
\x21 | [\x23-\x5b] | [\x5d-\x7e]
)
|
\g<quoted_pair>
)
)*
\g?
\g
\g?
)
)
@
(?
\g<dot_atom>
|
(?<domain_literal>
\g?
[
(
\g?
(?
[\x21-\x5a] | [\x5e-\x7e]
)
)*
\g?
]
\g?
)
)
)
)
|
(?
\g<display_name>
:
(?<group_list>
(?<mailbox_list>
\g
(?:
,
\g
)*
)
|
\g
)?
;
\g?
)
/x

puts "start = #{start = Time.now}"
puts ''[ADDRESS]
puts "stop = #{stop = Time.now}"
puts "#{stop - start} seconds"

=begin
C:>err
start = 2013-04-18 12:34:02 +0800

stop = 2013-04-18 12:34:04 +0800
1.662166 seconds

After uncommenting lines 9~11:

C:>err
start = 2013-04-18 12:34:14 +0800

stop = 2013-04-18 12:34:14 +0800
0.003001 seconds
=end

Updated by ko1 (Koichi Sasada) about 9 years ago

  • Category set to core
  • Assignee set to naruse (Yui NARUSE)

Updated by jeremyevans0 (Jeremy Evans) almost 3 years ago

  • Backport deleted (1.9.3: UNKNOWN, 2.0.0: UNKNOWN)
  • Status changed from Open to Rejected

From the general problem statement, and looking at the regexp's nested use of * and + along with \g, this regexp probably exhibits exponential backtracking. See https://docs.ruby-lang.org/en/2.6.0/Regexp.html#class-Regexp-label-Performance. You would need to fix the regexp to avoid the backtracking, possibly by making some of the groups atomic with (?>...). I can't confirm that because, as displayed, the regexp is not valid Ruby code (I'm guessing a previous Redmine update broke the syntax used).
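
To illustrate the kind of backtracking described above, here is a minimal sketch. It does not use the reporter's ADDRESS regexp, which cannot be recovered from this page. The patterns NESTED and ATOMIC and the helper time_match are hypothetical names chosen for this example. Timings depend on the Ruby version (newer releases include optimizations that can hide the slowdown), so no expected timings are shown.

```ruby
# Nested quantifiers: (?:a+)+ can split a run of "a" in many ways,
# so a failing match may retry every split before giving up.
NESTED = /\A(?:a+)+\z/

# Same pattern with an atomic group: once (?>a+) has consumed a run,
# the engine will not go back into it to try shorter matches.
ATOMIC = /\A(?>a+)+\z/

# Hypothetical helper (not from the report): elapsed seconds for one match attempt.
def time_match(pattern, input)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  pattern.match?(input)
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Input that almost matches and then fails at the last character.
input = "a" * 18 + "!"

puts "nested: #{time_match(NESTED, input)} seconds"
puts "atomic: #{time_match(ATOMIC, input)} seconds"

# Atomic groups change what matches, not only how fast the match fails:
puts(/a+ab/.match?("aaab"))      # true: a+ gives back one "a" for the literal "ab"
puts(/(?>a+)ab/.match?("aaab"))  # false: (?>a+) keeps every "a" it consumed
```

The last two lines show why atomic groups have to be applied with care: they are only a safe fix where the pattern never needs the group to give characters back.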
