Feature #12086

using: option for instance_eval etc.

Added by shugo (Shugo Maeda) almost 3 years ago. Updated over 2 years ago.

Status: Open
Priority: Normal
Assignee: -
Target version: -
[ruby-core:73886]

Description

Currently refinements can be activated only in toplevel or class/module definitions.
If they could be activated at block level, that would be useful for implementing internal DSLs.

How about adding a new option, using:, for Kernel#instance_eval and Module#{class,module}_eval?

module FixnumDivExt
  refine Fixnum do
    def /(other)
      quo(other)
    end
  end
end

p 1 / 2 #=> 0
instance_eval(using: FixnumDivExt) do
  p 1 / 2 #=> (1/2)
end
p 1 / 2 #=> 0

Proof-of-concept implementation is available at https://github.com/shugo/ruby/tree/eval_using.

In my previous proposal before Ruby 2.0, refinements used in a class or module were
implicitly activated by instance_eval and class_eval, but now I think it's better to
explicitly specify the refinements to be activated.
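
For comparison, here is a minimal sketch of what already works without the proposed option: a refinement activated with Module#using inside a module body stays confined to that body. It uses Integer rather than Fixnum, since Fixnum was merged into Integer in later Ruby versions. IntegerDivExt, Plain, and Refined are names made up for this illustration and are not part of the proposal.

```ruby
# Refinement in the spirit of FixnumDivExt above, written against Integer.
module IntegerDivExt
  refine Integer do
    def /(other)
      quo(other)
    end
  end
end

# No refinement is active in this module body: plain integer division.
module Plain
  def self.half
    1 / 2
  end
end

# Module#using activates the refinement for the rest of this module body only.
module Refined
  using IntegerDivExt

  def self.half
    1 / 2
  end
end

p Plain.half
p Refined.half
```

The proposal would narrow that scope further, from a whole module body down to a single block.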

Considerations:

  • In the PoC implementation, refined methods are not cached inline, which slows down refined method calls. If there were a way to guarantee that a block is never evaluated in different environments, refined methods could be cached inline.
  • {instance,class,module}_exec cannot be extended in the same way, because they take arbitrary arguments and there's no way to distinguish an option hash from the last argument hash.
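
The second consideration can be seen with instance_exec as it exists today. The sketch below only shows that a trailing hash is already an ordinary block argument; the receiver and the contents of the hash are placeholders chosen for illustration.

```ruby
# Placeholder hash that merely looks like the proposed option.
opts = { using: :SomeRefinement }

# instance_exec forwards every argument to the block, so the trailing hash
# arrives in the block as a normal argument.
received = Object.new.instance_exec(1, opts) { |n, h| [n, h] }

p received
```

Because the hash reaches the block unchanged, an implementation could not tell whether the caller meant it as data for the block or as a using: option.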

Related issues

Related to Ruby trunk - Feature #12281: Allow lexically scoped use of refinements with `using {}` block syntax (Assigned)

History

Updated by shugo (Shugo Maeda) almost 3 years ago

  • Tracker changed from Bug to Feature

Updated by nobu (Nobuyoshi Nakada) over 2 years ago

I'm against libraries using instance_eval under the hood.

Updated by matz (Yukihiro Matsumoto) over 2 years ago

  • Related to Feature #12281: Allow lexically scoped use of refinements with `using {}` block syntax added

Updated by matz (Yukihiro Matsumoto) over 2 years ago

I like the idea, but I understand this makes implementation harder (especially for performance).
Feel free to comment for or against this idea.

Matz.

Updated by shugo (Shugo Maeda) over 2 years ago

Nobuyoshi Nakada wrote:

I'm against libraries using instance_eval under the hood.

I used to be against it too, but it's common now whether using: is available or not.

Let me talk about a use case.

I wrote a packrat parser library called radd_djur (https://github.com/shugo/radd_djur) before.
radd_djur uses refinements to implement an internal DSL to define grammar rules.

require "radd_djur"

using RaddDjur::DSL

calc = RaddDjur::Grammar.new(:expr) {
  define :expr do
    [:int, "+", :int].bind { |x, *, y|
      ret x + y
    } /
    [:int, "-", :int].bind { |x, *, y|
      ret x - y
    }
  end

  define :int do
    (?0..?9).one_or_more.bind { |xs|
      ret xs.foldl1(&:+).to_i
    }
  end
}
p calc.parse("123+456")

define and ret are provided by instance_eval, and bind, /, and one_or_more
are provided by refinements.

It looks cool, but there are two problems here:

  1. We have to write using RaddDjur::DSL explicitly.
  2. Refinements for the DSL are available outside the grammar rules.

If instance_eval(using: refinement) is introduced, RaddDjur::DSL can be activated
only in the block given to RaddDjur::Grammar.new, and these two problems will be solved.

Note that instance_eval and refinement activation have to be done atomically in this case.
That's why I proposed this feature as a new option of instance_eval.
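
The two problems can be reproduced with a much smaller DSL that runs on current Ruby. This is only a sketch: TinyDSL, Builder, WithUsing, and WithoutUsing are hypothetical names, not part of radd_djur, and Module#using in a module body stands in for the top-level using in the example above.

```ruby
# Refinement that supplies a DSL helper on Array (hypothetical).
module TinyDSL
  refine Array do
    def shout
      map { |s| s.to_s.upcase }
    end
  end
end

# The library side: `define` is made available through instance_eval.
class Builder
  attr_reader :rules

  def initialize(&block)
    @rules = {}
    instance_eval(&block)
  end

  def define(name, value)
    @rules[name] = value
  end
end

# Problem 1: the caller has to activate the refinement explicitly.
# Problem 2: it then stays active for the rest of this module body,
# not just inside the block handed to Builder.new.
module WithUsing
  using TinyDSL
  RULES = Builder.new { define :greeting, [:hi, :there].shout }.rules
end

# Without `using`, the same block fails when it calls Array#shout.
module WithoutUsing
  def self.build
    Builder.new { define :greeting, [:hi, :there].shout }.rules
  end
end

p WithUsing::RULES
```

With the proposed option, Builder#initialize would be the one place that names TinyDSL, and the caller's block would need no using line at all.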

Updated by shugo (Shugo Maeda) over 2 years ago

Yukihiro Matsumoto wrote:

I like the idea, but I understand this makes implementation harder (especially for performance).

Speaking of performance, inline cache cannot store refined methods because the same block can be
executed with different refinements with this feature.

However, there is no performance degradation for code without refinements.

Updated by nobu (Nobuyoshi Nakada) over 2 years ago

Shugo Maeda wrote:

It looks cool, but there are two problems here:

  1. We have to write using RaddDjur::DSL explicitly.
  2. Refinements for the DSL are available outside the grammar rules.

These seem irrelevant to instance_eval.

If instance_eval(using: refinement) is introduced, RaddDjur::DSL can be activated
only in the block given to RaddDjur::Grammar.new, and these two problems will be solved.

Note that instance_eval and refinement activation have to be done atomically in this case.
That's why I proposed this feature as a new option of instance_eval.

I don't think evaluating a proc under an unpredictable context is a good idea.
using(refinement, &block) feels even better than it.

Updated by shugo (Shugo Maeda) over 2 years ago

Nobuyoshi Nakada wrote:

Shugo Maeda wrote:

It looks cool, but there are two problems here:

  1. We have to write using RaddDjur::DSL explicitly.
  2. Refinements for the DSL are available outside the grammar rules.

These seem irrelevant to instance_eval.

So, do you have another solution?

If instance_eval(using: refinement) is introduced, RaddDjur::DSL can be activated
only in the block given to RaddDjur::Grammar.new, and these two problems will be solved.

Note that instance_eval and refinement activation have to be done atomically in this case.
That's why I proposed this feature as a new option of instance_eval.

I don't think evaluating a proc under an unpredictable context is a good idea.
using(refinement, &block) feels even better than it.

I don't catch your point.
using(refinement, &block) can also be used to evaluate a proc under unpredictable context,
can't it?

I think features using instance_eval(using:) are extraordinary, and used refinements
should be described in documentation.

Updated by shyouhei (Shyouhei Urabe) over 2 years ago

We looked at this issue at yesterday's developer meeting.

About performance, Matz wanted to hear the opinions of the JRuby implementors. The cost might be negligible for MRI, but the situation might be different for other implementations.

Updated by enebo (Thomas Enebo) over 2 years ago

What is the scope of instance_eval here? Can I do:

instance_eval(using: MyRefinements, &a_block_from_somewhere)

Or how about?

instance_eval(using: MyRefinements, &objectWhichhasToProc)

Either of these essentially makes a lexically defined feature into a non-lexical one. It also means absolutely any code in the system may potentially be refined.
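
To make the lexical property concrete, here is a small sketch of how current Ruby treats a block that comes from somewhere else. Shout, Outside, and Inside are hypothetical names used only for this illustration.

```ruby
# A refinement adding String#shout (hypothetical).
module Shout
  refine String do
    def shout
      upcase + "!"
    end
  end
end

# This block is written where no refinement is active.
module Outside
  BLOCK = proc { "hi".shout }
end

module Inside
  using Shout

  # Code written inside this module body sees the refinement.
  def self.own
    "hi".shout
  end

  # A block passed in from elsewhere keeps the refinements of the place
  # where it was written, even when evaluated here via instance_eval.
  def self.run(&block)
    instance_eval(&block)
  end
end

p Inside.own

begin
  Inside.run(&Outside::BLOCK)
rescue NoMethodError => e
  puts "block from elsewhere is not refined: #{e.class}"
end
```

Today the reader of Outside::BLOCK can tell from its surrounding source which methods it calls; an instance_eval that accepted using: for arbitrary blocks would remove that guarantee.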

Updated by headius (Charles Nutter) over 2 years ago

I'll echo Tom's comments...this is dynamically-scoped refinements all over again, which we discussed heavily. There are two big reasons why this is a risk:

  • Performance. We decided that refinements would be lexical only in order to limit the impact of refinements on non-refined code. There's no way to treat a block as non-refined code since it might be refined at any time. I am interested to see how MRI can refine any block anywhere without impacting non-refined performance.
  • Readability. Now EVERY block in the system could potentially get refined. You will NEVER again be able to look at a piece of code in a block and know it's calling the methods you want it to call.

Back when we were first putting refinements together, we all agreed to keep them lexical. This is not lexical anymore.

I guess I'm very confused why this feature is needed for the DSL example. rspec implements a very similar DSL and does not use refinements today.

We could make this work and still keep refinements lexically scoped if we allowed using to happen within a block:

require "radd_djur"

calc = RaddDjur::Grammar.new(:expr) {
  using RaddDjur::DSL

  define :expr do # refined call to "define"
    [:int, "+", :int].bind { |x, *, y|  # refined "bind"
      ret x + y # refined "ret" and "+"
    } / # refined "/"
    [:int, "-", :int].bind { |x, *, y| # etc
      ret x - y
    }  
...

I can't recall why we wanted to avoid "using" within a block or method body.

I still believe this should be a lower-level language feature, as a keyword or similar. The more dynamic you make it, the more unpredictable and unreliable code is going to become.

I'll have a look at the proposed implementation.

Updated by headius (Charles Nutter) over 2 years ago

Is this thread-safe? Would it be possible for two threads to refine the same block in different ways and step on each other?

I see that instance_eval (yield_under) creates a new cref for each instance_eval call...but if I'm reading it right it shares the refinements collection with prev_cref, right? So it seems to me a given block could have its refinements change across threads.

rb_using_module impacts the method cache. Not sure if that's a concern or not, but one thread doing dynamic refinements could impact every unrefined call on other threads, right?

Cross-thread refinement changes could impact work MRI folks are doing on optimization and deoptimization.

I have other questions to understand how MRI reduces the impact of refinements on unrefined code...

Does MRI throw away the block's method cache every time instance_eval is called? Can method caches be set per-activation, rather than just sourced from the iseq?

Does MRI currently check for refinements before every method call?

Updated by headius (Charles Nutter) over 2 years ago

I have threading concerns.

module X; refine Fixnum do; def +(x); puts "X refined"; super; end; end; end

module Y; refine Fixnum do; def +(y); puts "y refined"; super; end; end; end

def eval_with_my_refinements(refinements, &block)
  instance_eval(using: refinements, &block)
end

Thread.new { eval_with_my_refinements(X) { 1 + 1 } }
Thread.new { eval_with_my_refinements(Y) { 1 + 1 } }

I don't believe you can predict which + will be called in each case. It is non-deterministic because it depends on which thread mutates the refinements collection last before the + calls happen.

Updated by headius (Charles Nutter) over 2 years ago

Yes, it appears that every call to instance_eval(using: Foo ...) blows away the global method cache by calling rb_using_module. So one library using instance_eval+using will hurt performance for every method call, in the same way that Object#extend does. This is more an MRI concern, since JRuby and JRuby+Truffle invalidate on a much smaller scale.

New Ruby features should not hurt code that never uses those features, right?

Updated by shugo (Shugo Maeda) over 2 years ago

Thomas Enebo wrote:

What is the scope of instance_eval here? Can I do:

The answer is yes, in my original proposal. But it may be possible to prohibit these uses.

If we add such a restriction, the following way suggested by Charles might be better:

require "radd_djur"

calc = RaddDjur::Grammar.new(:expr) {
  using RaddDjur::DSL

  define :expr do # refined call to "define"
    [:int, "+", :int].bind { |x, *, y|  # refined "bind"
      ret x + y # refined "ret" and "+"
    } / # refined "/"
    [:int, "-", :int].bind { |x, *, y| # etc
      ret x - y
    }  
...

Either of these essentially makes a lexically defined feature into a non-lexical one. It also means absolutely any code in the system may potentially be refined.

Yes.

Updated by shugo (Shugo Maeda) over 2 years ago

Charles Nutter wrote:

I have threading concerns.

module X; refine Fixnum do; def +(x); puts "X refined"; super; end; end; end

module Y; refine Fixnum do; def +(y); puts "y refined"; super; end; end; end

def eval_with_my_refinements(refinements, &block)
  instance_eval(using: refinements, &block)
end

Thread.new { eval_with_my_refinements(X) { 1 + 1 } }
Thread.new { eval_with_my_refinements(Y) { 1 + 1 } }

Do you mean the following case?

b = Proc.new { 1 + 1 }
Thread.new { eval_with_my_refinements(X, &b) }
Thread.new { eval_with_my_refinements(Y, &b) }

I don't believe you can predict which + will be called in each case. It is non-deterministic because it depends on which thread mutates the refinements collection last before the + calls happen.

Yes, you'll get unexpected results in this case.

Updated by shugo (Shugo Maeda) over 2 years ago

Charles Nutter wrote:

Yes, it appears that every call to instance_eval(using: Foo ...) blows away the global method cache by calling rb_using_module. So one library using instance_eval+using will hurt performance for every method call, in the same way that Object#extend does. This is more an MRI concern, since JRuby and JRuby+Truffle invalidate on a much smaller scale.

It's true that the cache is invalidated by instance_eval(using:), but global method caching has been improved in MRI, and I don't know how different it is compared to JRuby.

New Ruby features should not hurt code that never uses those features, right?

I don't know whether that should apply to code used with the new features.

Updated by headius (Charles Nutter) over 2 years ago

Yes, you'll get unexpected results in this case.

I think you'd get unexpected results in my original case too, wouldn't you? Both of those two blocks still have the same prev_cref, which is where the refinements collection comes from. Am I wrong?

In any case, even with the single Proc, it seems like a showstopper in this design. There's no way to avoid two threads refining the same block incompatibly.

It's true that the cache is invalidated by instance_eval(using:), but global method caching has been improved in MRI, and I don't know how different it is compared to JRuby.

What is the impact of that invalidation? I am not familiar with how MRI globally invalidates these days and how it reduces the impact.

I don't know whether that should apply to code used with the new features.

But if I have chosen not to use this feature, for the performance of my application, I'll also have to check every library I depend on to see if they use the feature. If I don't, such a library might impact the performance of my entire application. I don't think we want that.

So summarizing my concerns up to this point:

  • The current design is not thread-safe. It might be possible to make it thread-safe at the cost of additional complexity, which may mean further reducing performance. (design issue)
  • If any code in the system uses the current implementation of this feature, that impacts the performance of unrelated code by invalidating global caches. (implementation issue)
  • All blocks everywhere in the system will now be suspect; you will not be able to tell what method will be called unless you control everywhere that block will be passed (usability issue, in my opinion)

Updated by shugo (Shugo Maeda) over 2 years ago

Charles Nutter wrote:

Yes, you'll get unexpected results in this case.

I think you'd get unexpected results in my original case too, wouldn't you? Both of those two blocks still have the same prev_cref, which is where the refinements collection comes from. Am I wrong?

Ah, I was wrong...in both cases you get expected results because cref is newly created by each call of instance_eval(using:).

I tried the following code, and Thread X always returned "refined by X" and Thread Y always returned "refined by Y".

module X; refine Fixnum do; def +(x); "refined by X"; end; end; end

module Y; refine Fixnum do; def +(y); "refined by Y"; end; end; end

def eval_with_my_refinements(refinements, &block)
  instance_eval(using: refinements, &block)
end

b = Proc.new { 100.times { p [Thread.current.name, 1 + 1]; Thread.pass } }
[
  Thread.new { Thread.current.name = "X"; eval_with_my_refinements(X, &b) },
  Thread.new { Thread.current.name = "Y"; eval_with_my_refinements(Y, &b) },
].each(&:join)

It's true that the cache is invalidated by instance_eval(using:), but global method caching has been improved in MRI, and I don't know how different it is compared to JRuby.

What is the impact of that invalidation? I am not familiar with how MRI globally invalidates these days and how it reduces the impact.

In MRI, global cache is invalidated per class.

I don't know whether that should apply to code used with the new features.

But if I have chosen not to use this feature, for the performance of my application, I'll also have to check every library I depend on to see if they use the feature. If I don't, such a library might impact the performance of my entire application. I don't think we want that.

So summarizing my concerns up to this point:

  • The current design is not thread-safe. It might be possible to make it thread-safe at the cost of additional complexity, which may mean further reducing performance. (design issue)
  • If any code in the system uses the current implementation of this feature, that impacts the performance of unrelated code by invalidating global caches. (implementation issue)
  • All blocks everywhere in the system will now be suspect; you will not be able to tell what method will be called unless you control everywhere that block will be passed (usability issue, in my opinion)

Anyway, I understand your concerns. Thanks for your feedback.
