Feature #8948

Frozen regex

Added by Tsuyoshi Sawada 7 months ago. Updated 7 months ago.

[ruby-core:57353]
Status:Feedback
Priority:Normal
Assignee:Yukihiro Matsumoto
Category:syntax
Target version:next minor

Description

=begin
I see that frozen strings were accepted for Ruby 2.1, and frozen arrays and hashes are proposed in https://bugs.ruby-lang.org/issues/8909. I feel there is even more of a use case for a frozen regex, i.e., a regex literal that generates a regex only once. It is common to have a regex within a frequently repeated portion of code, and generating the same regex each time is a waste of resources. At the moment, we can write code like:

class Foo
  RE1 = /pattern1/
  RE2 = /pattern1/
  RE3 = /pattern1/
  def classify
    case self
    when RE1 then 1
    when RE2 then 2
    when RE3 then 3
    else 4
    end
  end
end

but suppose we had a frozen Regexp literal //f. Then we could write:

class Foo
  def classify
    case self
    when /pattern1/f then 1
    when /pattern1/f then 2
    when /pattern1/f then 3
    else 4
    end
  end
end

=end

History

#1 Updated by Tsuyoshi Sawada 7 months ago

Sorry, there was a mistake in the above. The three regexes with the same content /pattern1/ (or /pattern1/f) in the respective examples are supposed to represent different patterns.

#2 Updated by Benoit Daloze 7 months ago

We already have immutable (created only once) regexps: it is always the case for literal regexps and for dynamic regexps you need the 'o' flag: /a#{2}b/o.

So they are in practice immutable, but currently not #frozen?. Do you want to request it? I think it makes sense.

You can check with #object_id to know if 2 references reference the same object.

def r; /ab/; end
r.object_id
=> 2160323760
r.object_id
=> 2160323760

def s; /a#{2}b/; end
s.object_id
=> 2153197860
s.object_id
=> 2160163740

def t; /a#{2}b/o; end
t.object_id
=> 2160181200
t.object_id
=> 2160181200
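
The same check can be written without comparing printed ids by hand. The following is only a sketch: the method names static_re, dynamic_re and once_re are made up for illustration, and the interpolated value is passed as an argument so that the pattern is clearly dynamic.

```ruby
# Static literal: compiled once, so every call returns the same object.
def static_re; /ab/; end

# Interpolated literal without the o flag: a new Regexp is built on each call.
def dynamic_re(n = 2); /a#{n}b/; end

# Interpolated literal with the o flag: built on the first call, then reused.
def once_re(n = 2); /a#{n}b/o; end

# equal? compares object identity, like comparing object_id values.
p static_re.equal?(static_re)    # => true
p dynamic_re.equal?(dynamic_re)  # => false
p once_re.equal?(once_re)        # => true
```

Note that equal? only says whether the two references point at the same object; it says nothing about whether that object is #frozen?.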

#3 Updated by Tsuyoshi Sawada 7 months ago

Eregon, thank you for the information.

#4 Updated by Benoit Daloze 7 months ago

  • Status changed from Open to Feedback

sawa: do you want to request Regexp to always be #frozen? or should the issue be closed?

#5 Updated by Jens Wille 7 months ago

=begin
besides regexps being frozen, there might still be a use case for regexp literals that would only be allocated once:

def r1; /ab/; end; r1.object_id #=> 70043421664620
def r2; /ab/; end; r2.object_id #=> 70043421398060

def r3; /ab/f; end; r3.object_id #=> 70043421033140
def r4; /ab/f; end; r4.object_id #=> 70043421033140

i think it's in the same vein as #8579 and #8909.
=end

#6 Updated by Tsuyoshi Sawada 7 months ago

jwille, I agree with the use case, but it would be difficult to tell which regexes are intended to be the same, so I would not request that feature.

Probably, it makes sense to have all static regexes frozen, and have the f flag freeze dynamic regexes as well. I can't think of a use case for a regex that is immutable but not frozen. I am actually not clear about the difference.

#7 Updated by Jens Wille 7 months ago

=begin
((> but it would be difficult to tell which regexes are intended to be the same))

i'm not sure i understand. how is

def r1; /ab/f; end
def r2; /ab/f; end

different from

def s1; 'ab'f; end
def s2; 'ab'f; end

?
=end

#8 Updated by Tsuyoshi Sawada 7 months ago

jwille,

My understanding of the string case in your example is that the two strings would count as different strings, but the respective method calls would not create new strings each time. It would mean one of the strings can be "ab" and the other a different string such as "cd".

If that is what you intended for your regex examples, then there is no difference.

#9 Updated by Koichi Sasada 7 months ago

  • Category set to syntax
  • Assignee set to Yukihiro Matsumoto
  • Target version set to next minor

I'd like to freeze normal regexp literals, as Eregon said.

2.2 matter?

Anyone set instance variable for each regexp? :)
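
To illustrate what freezing would rule out, here is a minimal sketch. It uses Regexp.new to obtain an unfrozen regexp, since whether a literal is already frozen may depend on the Ruby version; the instance variable name @label is hypothetical.

```ruby
# Regexp.new is used so the example starts from an unfrozen regexp.
re = Regexp.new("ab")
re.instance_variable_set(:@label, "demo")  # allowed while unfrozen
re.freeze

begin
  re.instance_variable_set(:@label, "changed")
rescue RuntimeError => e  # on newer Rubies this is FrozenError, a RuntimeError subclass
  frozen_error = e
end

p re.frozen?                          # => true
p frozen_error.nil?                   # => false (the second assignment raised)
p re.instance_variable_get(:@label)   # => "demo"
p !!(re =~ "cab")                     # => true (matching still works when frozen)
```

So the only code that would break is code that mutates the regexp object itself, such as setting instance variables on it; matching is unaffected.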

#10 Updated by Benoit Daloze 7 months ago

ko1 (Koichi Sasada) wrote:

> 2.2 matter?

2.1 would make sense to me, so it goes along with other frozen literals.

> Anyone set instance variable for each regexp? :)

I highly doubt it.
