Feature #8948

Frozen regex

Added by sawa (Tsuyoshi Sawada) over 3 years ago. Updated over 1 year ago.

Status:
Assigned
Priority:
Normal
Target version:
-
[ruby-core:57353]

Description

=begin
I see that frozen strings were accepted for Ruby 2.1, and that frozen arrays and hashes are proposed in https://bugs.ruby-lang.org/issues/8909. I feel there are even more use cases for a frozen regex, i.e., a regex literal that generates a regex only once. It is common to have a regex within a frequently executed portion of code, and generating the same regex each time is a waste of resources. At the moment, we can write code like:

class Foo
  RE1 = /pattern1/
  RE2 = /pattern1/
  RE3 = /pattern1/
  def classify
    case self
    when RE1 then 1
    when RE2 then 2
    when RE3 then 3
    else 4
    end
  end
end

but suppose we had a frozen Regexp literal //f. Then we could write:

class Foo
  def classify
    case self
    when /pattern1/f then 1
    when /pattern1/f then 2
    when /pattern1/f then 3
    else 4
    end
  end
end

=end

History

#1 [ruby-core:57354] Updated by sawa (Tsuyoshi Sawada) over 3 years ago

Sorry, there was a mistake in the above. The three regexes with the same content /pattern1/ (or /pattern1/f) in the respective examples are supposed to represent different patterns.

#2 [ruby-core:57355] Updated by Eregon (Benoit Daloze) over 3 years ago

We already have immutable (created only once) regexps: it is always the case for literal regexps and for dynamic regexps you need the 'o' flag: /a#{2}b/o.

So they are in practice immutable, but currently not #frozen?. Do you want to request it? I think it makes sense.

You can check with #object_id to know if 2 references reference the same object.

def r; /ab/; end
r.object_id
=> 2160323760
r.object_id
=> 2160323760

def s; /a#{2}b/; end
s.object_id
=> 2153197860
s.object_id
=> 2160163740

def t; /a#{2}b/o; end
t.object_id
=> 2160181200
t.object_id
=> 2160181200
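The same check can be written without comparing raw object ids by using #equal?. This is only a minimal sketch: the method names and the parameter n are hypothetical, and n is passed in so that the interpolation really is dynamic.

```ruby
# Literal regexp: the same object is returned on every call.
def literal; /ab/; end

# Interpolated regexp: a new object is built on each call.
def dynamic(n); /a#{n}b/; end

# Interpolated regexp with the 'o' flag: interpolated once, then reused.
def once(n); /a#{n}b/o; end

puts literal.equal?(literal)        # true
puts dynamic(2).equal?(dynamic(2))  # false
puts once(2).equal?(once(3))        # true
puts once(3).source                 # a2b (the first interpolation wins)
```

Note that with the 'o' flag later arguments are ignored, which is easy to overlook when the interpolated value can change.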

#3 [ruby-core:57359] Updated by sawa (Tsuyoshi Sawada) over 3 years ago

Eregon, thank you for the information.

#4 [ruby-core:57371] Updated by Eregon (Benoit Daloze) over 3 years ago

  • Status changed from Open to Feedback

sawa: do you want to request Regexp to always be #frozen? or should the issue be closed?

#5 [ruby-core:57373] Updated by jwille (Jens Wille) over 3 years ago

besides regexps being frozen, there might still be a use case for regexp literals that would only be allocated once:

def r1; /ab/; end; r1.object_id  #=> 70043421664620
def r2; /ab/; end; r2.object_id  #=> 70043421398060

def r3; /ab/f; end; r3.object_id  #=> 70043421033140
def r4; /ab/f; end; r4.object_id  #=> 70043421033140

i think it's in the same vein as #8579 and #8909.
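The /ab/f syntax above is hypothetical. As a rough sketch of the intended behaviour in plain Ruby, a small cache can hand out one frozen Regexp per source and options. The FrozenRegexp module and its fetch method are invented for illustration and are not part of Ruby.

```ruby
# Hypothetical helper: returns one shared, frozen Regexp per (source, options).
module FrozenRegexp
  CACHE = {}

  def self.fetch(source, options = 0)
    CACHE[[source, options]] ||= Regexp.new(source, options).freeze
  end
end

def r3; FrozenRegexp.fetch("ab"); end
def r4; FrozenRegexp.fetch("ab"); end

puts r3.equal?(r4)  # true: both methods return the same object
puts r3.frozen?     # true
puts r3 =~ "cab"    # 1
```

Unlike a literal flag, this pays for a hash lookup on every call, so it only shows the sharing semantics, not the performance.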

#6 [ruby-core:57384] Updated by sawa (Tsuyoshi Sawada) over 3 years ago

jwille, I agree with the use case, but it would be difficult to tell which regexes are intended to be the same, so I would not request that feature.

Probably, it makes sense to have all static regexes frozen, and have the f flag freeze dynamic regexes as well. I can't think of a use case for a regex that is immutable but not frozen. I am actually not clear about the difference.

#7 [ruby-core:57434] Updated by jwille (Jens Wille) over 3 years ago

but it would be difficult to tell which regexes are intended to be the same

i'm not sure i understand. how is

def r1; /ab/f; end
def r2; /ab/f; end

different from

def s1; 'ab'f; end
def s2; 'ab'f; end

?

#8 [ruby-core:57464] Updated by sawa (Tsuyoshi Sawada) over 3 years ago

jwille,

My understanding of the string case in your example is that the two literals would count as different strings, but repeated calls to each method would not create new strings. That means one of the strings could be "ab" and the other a different string such as "cd".

If that is what you intended for your regex examples, then there is no difference.

#9 [ruby-core:57471] Updated by ko1 (Koichi Sasada) over 3 years ago

  • Category set to syntax
  • Assignee set to matz (Yukihiro Matsumoto)
  • Target version set to next minor

I would like to freeze normal regexp literals, as Eregon suggested.

2.2 matter?

Anyone set instance variable for each regexp? :)

#10 [ruby-core:57475] Updated by Eregon (Benoit Daloze) over 3 years ago

ko1 (Koichi Sasada) wrote:

2.2 matter?

2.1 would make sense to me, so it goes along with other frozen literals.

Anyone set instance variable for each regexp? :)

I highly doubt it.

#11 Updated by ko1 (Koichi Sasada) over 1 year ago

  • Status changed from Feedback to Assigned

There are two options:

  1. Freeze only literal regexps
  2. Freeze all regexps

I prefer (2) because I can't think of any reason to modify regexp objects.
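To make concrete what freezing would change for user code, here is a small sketch that freezes a regexp explicitly. Matching is unaffected, while attaching state to the object raises. What frozen? prints for an unfrozen literal or for Regexp.new depends on the Ruby version, so no output is claimed for those lines.

```ruby
re = Regexp.new("a(b+)").freeze  # built at run time, then frozen explicitly

# Matching does not modify the regexp, so it works on a frozen one.
m = re.match("xabbby")
puts m[1]  # bbb

# Attaching state to a frozen regexp raises.
error = begin
  re.instance_variable_set(:@note, "cached")
  nil
rescue RuntimeError => e  # FrozenError is a RuntimeError subclass in Ruby 2.5+
  e
end
puts error.class

# Version-dependent: whether these are frozen by default.
puts(/ab/.frozen?)
puts(Regexp.new("ab").frozen?)
```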

History of "Frozen":

#12 Updated by ko1 (Koichi Sasada) over 1 year ago

The patch is small.

Index: re.c
===================================================================
--- re.c    (revision 51650)
+++ re.c    (working copy)
@@ -2548,6 +2548,8 @@
     if (!re->ptr) return -1;
     RB_OBJ_WRITE(obj, &re->src, rb_fstring(rb_enc_str_new(s, len, enc)));
     RB_GC_GUARD(unescaped);
+
+    rb_obj_freeze(obj);
     return 0;
 }

But I got many failures on rubyspec.
https://gist.github.com/ko1/da52575de115c928ce4a
