Feature #19832
Closed
Method#destructive?, UnboundMethod#destructive?
Description
I propose to add a destructive? property to Method and UnboundMethod instances, which shall behave like:
String.instance_method(:<<).destructive? # => true
String.instance_method(:+).destructive? # => false
One main purpose of these classes is to inspect and confirm how a certain method behaves. Besides arity and owner, whether a method is destructive is an important piece of information, but currently you cannot obtain it from Method or UnboundMethod instances.
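For reference, a minimal sketch of what these instances expose today, assuming a stock Ruby in which no destructive? method has been defined:

```ruby
# Introspection that Method/UnboundMethod already offer for String#<<.
m = String.instance_method(:<<)

owner = m.owner   # the class or module that defines the method
arity = m.arity   # number of required arguments

# Assumption: stock Ruby, where the proposed predicate does not exist.
has_flag = m.respond_to?(:destructive?)

puts "owner=#{owner} arity=#{arity} destructive? available: #{has_flag}"
```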
The problem is how to implement this. It is best if this information (whether or not a method is destructive) can be extracted automatically from the method definition.
Unlike owner and arity, this may or may not be straightforward to determine by statically analyzing the code. I think that if a method defined at the Ruby level does not call a destructive method anywhere within its own definition, and makes no dynamic method calls (send, eval, etc.), then we can say that the method is non-destructive. If it does call one, then the method is most likely destructive. (It would not be destructive if the internally called destructive method is applied to a different object. Or we could still call it destructive in the sense that it has a destructive side effect.)
If doing that turns out to be difficult for some or all cases, then a practical approach for the difficult cases is to label the methods as destructive or not manually. We could perhaps have methods Module#destructive and Module#non_destructive, which take one or more symbol/string arguments and return the method name(s) as symbols, so that they can be used like:
class A
  destructive private def some_destructive_private_method
    ...
  end
end
or
class A
  def foo; ... end
  def bar; ... end
  def baz; ... end
  non_destructive :foo, :baz
  destructive :bar
end
or
class A
  non_destructive
  def foo; ... end
  def baz; ... end
  destructive
  def bar; ... end
end
When a method has not (yet) been labeled as destructive or not, the return value can be "unknown" (or :unknown or nil) by default.
String.instance_method(:<<).destructive? # => "unknown"
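The manual labeling idea could be sketched roughly as follows. Everything here is hypothetical: the DestructiveLabels module, the destructive_label lookup, and the choice of :unknown as the default are assumptions made for illustration, not an existing Ruby API. The sketch only covers the argument-taking forms; the no-argument "mode switch" form is not handled.

```ruby
# Hypothetical sketch: per-class labels kept in a hash on the class.
# None of these names exist in Ruby itself.
module DestructiveLabels
  def destructive(*names)
    label(names, true)
  end

  def non_destructive(*names)
    label(names, false)
  end

  # Returns true, false, or :unknown for an instance method name.
  def destructive_label(name)
    destructive_labels.fetch(name.to_sym, :unknown)
  end

  private

  def destructive_labels
    @destructive_labels ||= {}
  end

  # Records the label and returns the name(s) as symbols, so the call
  # can be chained the way `private def ...` is.
  def label(names, value)
    names = names.flatten.map(&:to_sym)
    names.each { |n| destructive_labels[n] = value }
    names.length == 1 ? names.first : names
  end
end

class A
  extend DestructiveLabels

  def foo; end
  def baz; end
  non_destructive :foo, :baz

  destructive def bar; end

  def qux; end # never labeled
end

puts A.destructive_label(:foo).inspect
puts A.destructive_label(:bar).inspect
puts A.destructive_label(:qux).inspect
```

Keeping the labels on the class rather than on the method object means a redefinition of the method keeps a stale label, which is one of the weaknesses of manual labeling.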
Updated by baweaver (Brandon Weaver) over 1 year ago
This may have some very interesting use-cases around Ractor of forbidding any state mutation methods inside of Ractor contexts.
Many of us who program Ruby enough probably know what is destructive, sure, but having that officially written down and standardized could unlock a lot of interesting potential.
Updated by shyouhei (Shyouhei Urabe) over 1 year ago
Well yes this property could be very useful for Ractor, JIT, and many more... except it's impossible.
Consider for instance Array#map. Is this method destructive? Well it could be used in a destructive way. But who knows.
Also, because methods can be (and actually tend to be) redefined on the fly, any method that calls other methods inside (~99.99% of cases) cannot tell whether it is destructive or not. The situation changes from time to time.
Whether a method call modifies its receiver or not is ultimately not determined until that method call actually modifies its receiver. This is how the language is designed to be.
Updated by Dan0042 (Daniel DeLorme) over 1 year ago
This is a great idea. No doubt there are many cases where it's impossible to know for sure if a method is destructive or not, but it should be possible to make it good enough to be useful.
Consider for instance Array#map. Is this method destructive? Well it could be used in a destructive way. But who knows.
Can you explain that? I don't see how Array#map could be destructive; even if the array is modified in the block, that is not a property of the #map method itself.
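The two readings of Array#map being discussed can be illustrated with a small sketch (the variable names are only for illustration):

```ruby
arr = [1, 2, 3]

# Array#map itself returns a new array and leaves the receiver alone...
doubled = arr.map { |x| x * 2 }

# ...but the block it is given is free to mutate the receiver.
arr.map { |x| arr[0] = 99; x }

# On a frozen receiver, map still works as long as the block does not mutate it.
frozen_result = [1, 2, 3].freeze.map { |x| x + 1 }

puts doubled.inspect
puts arr.inspect
puts frozen_result.inspect
```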
Whether a method call modifies its receiver or not is ultimately not determined until that method call actually modifies its receiver. This is how the language is designed to be.
I must ask, is that really relevant? "x".freeze.concat("") raises a FrozenError even though the receiver would not have been modified, because String#concat is considered a destructive method whether or not a particular call actually modifies its receiver. We want to know if the method is generally destructive, not if a particular call is destructive.
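A rough way to observe this "fails on a frozen receiver" notion at run time is sketched below. The helper name fails_when_frozen? is hypothetical, and the approach actually invokes the method on a frozen copy, so it is a probe for one particular call, not a static property of the method.

```ruby
# Hypothetical helper: true if calling the method on a frozen copy of
# the receiver raises FrozenError.
def fails_when_frozen?(receiver, method_name, *args)
  receiver.dup.freeze.public_send(method_name, *args)
  false
rescue FrozenError
  true
end

concat_fails = fails_when_frozen?("x", :concat, "") # the example from the post
plus_fails   = fails_when_frozen?("x", :+, "y")
push_fails   = fails_when_frozen?([1], :<<, 2)

puts "concat: #{concat_fails}, +: #{plus_fails}, <<: #{push_fails}"
```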
For core class methods I think this 'destructive' flag can be inferred from the presence of rb_check_frozen. In fact this would mesh very well with the way method signatures are defined in pseudo-ruby with the body defined via Primitive. Then rb_check_frozen could be pulled out of the C code and expressed as part of the method signature. For example something like
def concat(ary)
  Primitive.modify! #=> call rb_check_frozen, and also mark this method as destructive
  Primitive.rb_ary_concat(ary)
end
For regular ruby code, probably the only way to know if a method is destructive is to check for instance variable assignments. It's not perfect but it should serve as a good enough definition of 'destructive'. It's almost certainly impossible to propagate the 'destructive' flag transitively (#foo would be considered non-destructive even if it calls the destructive method #bar). But if a method has the super keyword it may be possible to inherit the 'destructive' flag from up the call chain.
Updated by janosch-x (Janosch Müller) over 1 year ago
Dan0042 (Daniel DeLorme) wrote in #note-10:
For regular ruby code, probably the only way to know if a method is destructive is to check for instance variable assignments.
A lot of everyday Ruby code seems to be destructive, not so much by setting instance variables, but rather by modifying them (e.g. Arrays or Hashes).
It's almost certainly impossible to propagate the 'destructive' flag transitively (#foo would be considered non-destructive even if it calls #bar destructive method).
You mean it would be impossible at load time? Maybe it's worth exploring a "static analysis" variant of this feature? Could it be tied in to RBS? Such an approach might allow for transitivity, which would make it much easier to provide this information for most existing code outside the stdlib. I guess these generated method attributes would still need to be able to change at runtime, e.g. in case of method overrides, and these changes would need to be propagated down all known call chains. (Using send and such might need to propagate unknown destructiveness down all call chains.)
Without transitivity, this feature might still be nice for inspecting the stdlib, adding visual hints to the docs etc.
@sawa (Tsuyoshi Sawada) Did you have a particular use case in mind for this feature?
Updated by shyouhei (Shyouhei Urabe) over 1 year ago
Asserting that a method is destructive even though it does not modify its receiver is kind of safe. The problem is to prove that a method marked as non-destructive actually never does so. This is arguably impossible.
Dan0042 (Daniel DeLorme) wrote in #note-10:
Consider for instance Array#map. Is this method destructive? Well it could be used in a destructive way. But who knows.
Can you explain that? I don't see how Array#map could be destructive; even if the array is modified in the block, that is not a property of the #map method itself.
Do you mean IO#printf is not destructive because everything destructive is implemented by IO#write and printf is merely calling it? That sounds counter-intuitive to me.
Updated by Dan0042 (Daniel DeLorme) over 1 year ago
janosch-x (Janosch Müller) wrote in #note-11:
A lot of everyday Ruby code seems to be destructive, not so much by setting instance variables, but rather by modifying them (e.g. Arrays or Hashes).
shyouhei (Shyouhei Urabe) wrote in #note-12:
Do you mean IO#printf is not destructive because everything destructive is implemented by IO#write and printf is merely calling it? That sounds counter-intuitive to me.
Ok, I think we have different ideas of what is "destructive", because to me it's not about side-effects that would require a Monad if we were coding in Haskell. Because we are coding in Ruby, I would define a "destructive" operation as something that fails if the object is frozen. And @buf << 42 does not fail if self is frozen. This is how the frozen flag has always worked and I can't really imagine introducing a new way of defining/handling destructive operations. So in that sense neither IO#printf nor IO#write are destructive.
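The distinction between reassigning an instance variable and mutating the object it refers to can be sketched like this (the Recorder class is hypothetical, made up for illustration):

```ruby
# Hypothetical class used only to illustrate the frozen-flag behaviour.
class Recorder
  attr_reader :buf

  def initialize
    @buf = []
  end

  # Mutates the array that @buf refers to; self's ivar table is untouched.
  def record(value)
    @buf << value
  end

  # Reassigns the instance variable, which modifies self.
  def reset
    @buf = []
  end
end

r = Recorder.new.freeze
r.record(42) # succeeds: freezing self does not freeze @buf

reset_raised =
  begin
    r.reset
    false
  rescue FrozenError
    true
  end

puts r.buf.inspect
puts reset_raised
```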
janosch-x (Janosch Müller) wrote in #note-11:
Maybe it's worth exploring a "static analysis" variant of this feature? Could it be tied in to RBS? Such an approach might allow for transitivity, which would make it much easier to provide this information for most existing code outside the stdlib.
That sounds very cool and very hard to implement. Also it opens a big can of worms in terms of where do you draw the line for how the 'destructive' flag propagates? Which of those foo methods would you consider to be destructive?
def mut! = @x = 42 #destructive
def foo1 = mut! #call a destructive method on self
def foo2(buf) = buf << 42 #call a destructive method on an argument
def X.indirect(v) = v.mut!
def foo3 = X.indirect(self) #indirectly call a destructive method on self
def foo4 = (buf = []; buf << 42) #call a destructive method on a new object created in the method
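Under the "fails if the receiver is frozen" definition, the methods above can be probed at run time. The sketch below wraps them in a hypothetical Sample class and a hypothetical frozen_failure? helper; it reports what happens for one call on a frozen instance, which is not the same as a static guarantee.

```ruby
module X
  def self.indirect(v)
    v.mut!
  end
end

# Hypothetical container for the methods from the post.
class Sample
  def mut!; @x = 42; end                # assigns an ivar on self
  def foo1; mut!; end                   # destructive call on self
  def foo2(buf); buf << 42; end         # destructive call on an argument
  def foo3; X.indirect(self); end       # indirect destructive call on self
  def foo4; buf = []; buf << 42; end    # destructive call on a fresh object
end

# Hypothetical helper: does the call raise FrozenError on a frozen Sample?
def frozen_failure?(name, *args)
  Sample.new.freeze.public_send(name, *args)
  false
rescue FrozenError
  true
end

results = %i[mut! foo1 foo3 foo4].to_h { |m| [m, frozen_failure?(m)] }
results[:foo2] = frozen_failure?(:foo2, [])

results.each { |name, failed| puts "#{name}: #{failed}" }
```

Only the methods that end up assigning an instance variable on the frozen receiver fail; foo2 and foo4 mutate other objects and pass through.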
shyouhei (Shyouhei Urabe) wrote in #note-12:
Asserting that a method is destructive even though it does not modify its receiver is kind of safe. The problem is to prove that a method marked as non-destructive actually never does so. This is arguably impossible.
I definitely understand what you mean about safety, but the opposite can also be said. There can be value in knowing that a method is provably destructive. We could fail early if the object is frozen. Maybe the JIT can avoid speculative optimizations (and the cost of de-optimization) that don't hold for destructive methods. etc.
Updated by shyouhei (Shyouhei Urabe) over 1 year ago
Dan0042 (Daniel DeLorme) wrote in #note-13:
shyouhei (Shyouhei Urabe) wrote in #note-12:
Do you mean IO#printf is not destructive because everything destructive is implemented by IO#write and printf is merely calling it? That sounds counter-intuitive to me.
Ok, I think we have different ideas of what is "destructive", because to me it's not about side-effects that would require a Monad if we were coding in Haskell. Because we are coding in Ruby, I would define a "destructive" operation as something that fails if the object is frozen. And @buf << 42 does not fail if self is frozen. This is how the frozen flag has always worked and I can't really imagine introducing a new way of defining/handling destructive operations. So in that sense neither IO#printf nor IO#write are destructive.
STDOUT.freeze.printf("Hello, World!")
But yes, I agree that the word "destructive" here is kind of vague, and we are seeing different things. Maybe @sawa (Tsuyoshi Sawada) could show us his intended usage of the proposed functionality for more pragmatic discussions.
Updated by janosch-x (Janosch Müller) over 1 year ago
Dan0042 (Daniel DeLorme) wrote in #note-13:
it opens a big can of worms in terms of where do you draw the line for how the 'destructive' flag propagates? Which of those foo methods would you consider to be destructive?
I was really thinking all of them – treating "destructive" as an opposite of "functionally pure".
In addition to the pitfalls mentioned in #note-11, this would probably also require differentiating between objects that existed before a method call and those created within it, almost like a borrow checker, because even in the stdlib, a lot of "outwardly non-mutating" code does mutate freshly created ruby objects internally before returning them, and we wouldn't want to mark such cases as "destructive".
Thus I admit that automatic transitivity is a bridge too far.
Updated by kddnewton (Kevin Newton) over 1 year ago
I don't understand how you would go about statically detecting instance variable mutations. Consider
class Foo
  def bar = instance_variable_set(:@baz, 1)
end
Is bar "functionally pure" or not? Is it "destructive"? Very difficult to say. Now do:
module Wat
  def instance_variable_set(name, value) = "wat"
end
Foo.include(Wat)
I don't think it's viable to do this statically, certainly not reliably.
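Run as a script, the example behaves roughly as sketched below. The Foo and Wat names come from the post; the before/after variables are added for illustration, and the method bodies use the conventional def form so the sketch also runs on Rubies without endless methods.

```ruby
class Foo
  def bar
    instance_variable_set(:@baz, 1)
  end
end

before = Foo.new.bar # the core method sets @baz and returns the value

frozen_before =
  begin
    Foo.new.freeze.bar
    :no_error
  rescue FrozenError
    :frozen_error
  end

module Wat
  def instance_variable_set(name, value)
    "wat"
  end
end

# Wat now sits between Foo and Object in the ancestor chain, so the
# same call inside #bar resolves to Wat's version.
Foo.include(Wat)

after = Foo.new.bar
frozen_after = Foo.new.freeze.bar # no longer touches the receiver

puts [before, frozen_before, after, frozen_after].inspect
```

The source text of bar never changes, yet whether it modifies its receiver flips once Wat is included, which is the difficulty for any load-time analysis.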
Updated by Dan0042 (Daniel DeLorme) over 1 year ago
STDOUT.freeze.printf("Hello, World!")
Well look at that! You learn something new every day :-D
Updated by Dan0042 (Daniel DeLorme) over 1 year ago
kddnewton (Kevin Newton) wrote in #note-16:
I don't understand how you would go about statically detecting instance variable mutations.
Actually I'm starting to think this is doable. If we restrict ourselves to methods invoked on self only, and we compute at the time the #destructive? method is called (so it's not statically at load time), it should be possible to figure out which methods are destructive based on the inheritance graph at that time, and propagate the 'destructive' flag to the calling method. So if a method calls instance_variable_set or self.instance_variable_set we can find out if we're calling the original destructive method or an overridden version. For Method#destructive? we can even take into account singleton methods. If the method calls via x=self; x.instance_variable_set or send(:instance_variable_set) then it's not possible, but that's a limitation I could live with.
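The "resolve against the inheritance graph at query time" part can be approximated with existing introspection. This is only a sketch of that one step: resolves_to_core? and the Plain, Shadow, and Overridden names are hypothetical, and it does not analyze method bodies at all.

```ruby
# Hypothetical check: would `name`, called on an instance of klass,
# resolve to the same definition a plain Object would use?
def resolves_to_core?(klass, name)
  klass.instance_method(name).owner == Object.instance_method(name).owner
end

class Plain; end

module Shadow
  def instance_variable_set(name, value)
    "shadowed"
  end
end

class Overridden
  include Shadow
end

plain_core      = resolves_to_core?(Plain, :instance_variable_set)
overridden_core = resolves_to_core?(Overridden, :instance_variable_set)

# With a Method (bound to an object), singleton methods are visible too.
obj = Plain.new
def obj.instance_variable_set(name, value)
  "singleton"
end

core_owner = Object.instance_method(:instance_variable_set).owner
singleton_core = obj.method(:instance_variable_set).owner == core_owner

puts [plain_core, overridden_core, singleton_core].inspect
```

Because the lookup happens when the question is asked, a later include or redefinition changes the answer, which matches the "not statically at load time" caveat.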
Updated by rubyFeedback (robert heiler) over 1 year ago
I like sawa's idea from an "awesome introspection" point of view, that is, if ruby is
able to determine which methods are "destructive" (or "dangerous", however we want to
call it actually), then I believe having access to that additional information can
be useful, as others have pointed out here in this discussion. Personally I have not
needed this feature/information yet, though.
I understand the rationale used by sawa for ad-hoc method calls, but the more elegant
solution would be to have ruby yield this information automatically, if this would
be at all possible, as Dan0042 opined.
Updated by luke-gru (Luke Gruber) over 1 year ago
IMO this feature wouldn't really add anything new to ractors that I can think of, because ractors already work with either new "moved" objects or deeply frozen objects, which can still have methods called on them inside the ractors. For JIT it could be useful but like others have said, it's not viable (too costly) to track all this information at run-time and statically it's just not possible.
Updated by matz (Yukihiro Matsumoto) over 1 year ago
- Status changed from Open to Rejected
It should be fundamentally covered by the reference document.
Matz.