Bug #19932

Regexp o modifier freeze interpolation between method calls

Added by noraj-acceis (Alexandre ZANNI) 9 months ago. Updated 9 months ago.

Status: Closed
Assignee: -
Target version: -
ruby -v: ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
[ruby-core:115082]

Description

Take the following PoC:

def poc(regexp)
  hs = [
    'azerty',
    'azertyui',
    'azertyuiop'
  ]
  out = []
  hs.each do |h|
    out << h if /#{regexp}/.match?('azerty')
  end
  out
end

p poc('az')
p poc('wxc')

I have the following output:

["azerty", "azertyui", "azertyuiop"]
[]

Now, because the regexp never changes inside the method once set, I could add the o modifier so that the #{} interpolation is computed only once.

So the PoC becomes:

def poc(regexp)
  hs = [
    'azerty',
    'azertyui',
    'azertyuiop'
  ]
  out = []
  hs.each do |h|
    out << h if /#{regexp}/o.match?('azerty')
  end
  out
end

p poc('az')
p poc('wxc')

Output:

["azerty", "azertyui", "azertyuiop"]
["azerty", "azertyui", "azertyuiop"]

So each future call to the method returns the same result as the first call, because the regexp is "frozen".

I was expecting the regexp to be "frozen" only for the current call, for the sake of performance, but instead it's "frozen" for future calls too.

Another example of the o modifier in use is here: https://learnbyexample.github.io/Ruby_Regexp/modifiers.html#o-modifier

So is that a side effect, or is it the expected behavior?

Updated by noraj-acceis (Alexandre ZANNI) 9 months ago

I'm not sure I was clear.

It's not that clear from reading the documentation that the "freeze" continues to live outside the local scope. All the examples are outside a method definition. As with local variables, I would expect myregexp = /#{myvar}/o to be re-evaluated between method calls. From reading the docs, I understood that the regexp object would be frozen in a local context (e.g. in a loop), but that if I put it in a method, it would be frozen only for the duration of that method call, then dynamically re-evaluated on the next call and re-frozen for the duration of that new call.

If that's still the expected behavior, it would be nice to introduce another modifier that unfreezes the regexp object when leaving the method call, so that a use case like the one in the PoC would work. There are many use cases where you want a regexp object to be frozen, for example because you have a loop over 1M items where the regexp is evaluated on each item, but you still want to be able to call the method again with different arguments.

Updated by jeremyevans0 (Jeremy Evans) 9 months ago

I don't think the documentation could be made significantly clearer, though potentially it could benefit from an additional example. It's expected that the o modifier works the way it does, and does not generate a new regexp with each call to the poc method in your example.
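As a rough illustration of that behavior, here is a minimal sketch. The method names with_o and without_o are hypothetical and not taken from the documentation. It shows that a regexp literal with the o modifier interpolates only the first time it is evaluated and hands back the same Regexp object on every later call:

```ruby
# Hypothetical helper names, chosen only for this illustration.
def with_o(s)
  /#{s}/o   # interpolated once, the first time this literal is evaluated
end

def without_o(s)
  /#{s}/    # interpolated again on every call
end

first  = with_o('az')
second = with_o('wxc')      # the argument is ignored: the cached regexp is reused

p second.source             # "az"
p first.equal?(second)      # true (the very same Regexp object)
p without_o('wxc').source   # "wxc"
```

The cache belongs to the literal's position in the source code, not to a particular method call, which is why it survives across calls.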

There isn't a modifier that operates like o but generates a new regexp for each call to poc and not a new regexp per iteration inside poc. Such a modifier would be hard to reason about, because the o modifier changes how the code is compiled. What if the code was changed to the following?

poc = lambda do |regexp|
  hs = [
    'azerty',
    'azertyui',
    'azertyuiop'
  ]
  out = []
  hs.each do |h|
    out << h if /#{regexp}/o.match?('azerty')
  end
  out
end

Should that still generate a new regexp per call to poc? Remember that lambda is not a keyword, it is a normal method call, so you can redefine it to use define_method, or to be the same as Array#each.

For what you want in your example, you should create a regexp literal without the o modifier before the loop, assign it to a local variable, and use that local variable inside the loop.
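A minimal sketch of that suggestion, applied to the PoC from the description. The method name poc_local and the variable name re are hypothetical. The regexp is built once per call, outside the loop, so the interpolation is not repeated per iteration but still reflects the argument of each call:

```ruby
# Hypothetical rewrite of the PoC; poc_local and re are illustrative names.
def poc_local(regexp)
  hs = [
    'azerty',
    'azertyui',
    'azertyuiop'
  ]
  re = /#{regexp}/   # no o modifier: built once per call, before the loop
  out = []
  hs.each do |h|
    # Same test as in the original PoC, which matches against 'azerty'.
    out << h if re.match?('azerty')
  end
  out
end

p poc_local('az')    # ["azerty", "azertyui", "azertyuiop"]
p poc_local('wxc')   # []
```

This gives the same output as the first PoC (the one without the o modifier) while evaluating the interpolation only once per method call.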

Updated by noraj-acceis (Alexandre ZANNI) 9 months ago

My apologies for my misunderstanding. Thanks for the detailed explanation. I now understand that it's hard to do anything other than the current behavior.
