Feature #19000
closedData: Add "Copy with changes method" [Follow-on to #16122 Data: simple immutable value object]
Description
As requested: extracted a follow-up to #16122 Data: simple immutable value object from this comment
Proposal: Add a "Copy with changes" method to Data¶
Assume the proposed Data.define
exists.
Seeing examples from the [Values gem]:
require "values"
# A new class
Point = Value.new(:x, :y)
# An immutable instance
Origin = Point.with(x: 0, y: 0)
# Q: How do we make copies that change 1 or more values?
right = Origin.with(x: 1.0)
up = Origin.with(y: 1.0)
up_and_right = right.with(y: up.y)
# In loops
movements = [
[ :x, +0.5 ],
[ :x, +0.5 ],
[ :y, -1.0 ],
[ :x, +0.5 ],
]
# position = Point(x: 1.5, y: -1.0)
position = movements.inject(Origin) do |p, (field, delta)|
p.with(field => p.send(field) + delta)
end
Proposed detail: Call this method: #with
Money = Data.define(:amount, :currency)
account = Money.new(amount: 100, currency: 'USD')
transactions = [+10, -5, +15]
account = transactions.inject(account) { |a, t| a.with(amount: a.amount + t) }
#=> Money(amount: 120, currency: "USD")
Why add this "Copy with changes" method to the Data simple immutable value class?¶
Called on an instance, it returns a new instance with only the provided parameters changed.
This API affordance is now widely adopted across many languages for its usefulness. Why is it so useful? Because copying immutable value object instances, with 1 or more discrete changes to specific fields, is the proper and ubiquitous pattern that takes the place of mutation when working with immutable value objects.
Other languages
C# Records: “immutable record structs — Non-destructive mutation” — is called with { ... }
https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/builtin-types/record#nondestructive-mutation
Scala Case Classes — is called #copy
https://docs.scala-lang.org/tour/case-classes.html
Java 14+ Records — Brian Goetz at Oracle is working on adding a with copy constructor inspired by C# above as we speak, likely to be called #with
https://mail.openjdk.org/pipermail/amber-spec-experts/2022-June/003461.html
Rust “Struct Update Syntax” via ..
syntax in constructor
https://doc.rust-lang.org/book/ch05-01-defining-structs.html#creating-instances-from-other-instances-with-struct-update-syntax
Alternatives¶
Without a copy-with-changes method, one must construct entirely new instances using the constructor. This can either be (a) fully spelled out as boilerplate code, or (b) use a symmetrical #to_h
to feed the keyword-args constructor.
(a) Boilerplate using constructor
Point = Data.define(:x, :y, :z)
Origin = Point.new(x: 0.0, y: 0.0, z: 0.0)
change = { z: -1.5 }
# Have to use full constructor -- does this even work?
point = Point.new(x: Origin.x, y: Origin.y, **change)
(b) Using a separately proposed #to_h
method and constructor symmetry
Point = Data.define(:x, :y, :z)
Origin = Point.new(x: 0.0, y: 0.0, z: 0.0)
change = { z: -1.5 }
# Have to use full constructor -- does this even work?
point = Point.new(**(Origin.to_h.merge(change)))
Notice that the above are not ergonomic -- leading so many of our peer language communities to adopt the #with
method to copy an instance with discrete changes.
Updated by bkuhlmann (Brooke Kuhlmann) about 2 years ago
💡 In case it's of interest, I've partially solved this problem in the Refinements gem several years back by adding the #merge
and #merge!
methods to Struct
as a refinement. Here's the source code. I'm mostly posting this here in case it helps provide more example solutions to the problem mentioned in Alternative B above:
point = Point.new(**(Origin.to_h.merge(change)))
I've found that using keyword arguments, regardless of whether the struct was constructed using positionals or keywords, has been very effective in practice.
Updated by bdewater (Bart de Water) about 2 years ago
Example from Sorbet which uses with
as well:
class Point < T::Struct
const :x, Numeric
const :y, Numeric
end
# An immutable instance
Origin = Point.new(x: 0, y: 0)
right = Origin.with(x: 1.0)
up = Origin.with(y: 1.0)
up_and_right = right.with(y: up.y)
Updated by RubyBugs (A Nonymous) about 2 years ago
Example from value_semantics gem which uses with
as well:
Point = ValueSemantics::Struct.new do
x Numeric
y Numeric
end
# An immutable instance
Origin = Point.new(x: 0, y: 0)
right = Origin.with(x: 1.0)
up = Origin.with(y: 1.0)
up_and_right = right.with(y: up.y)
Updated by ufuk (Ufuk Kayserilioglu) about 2 years ago
Since Ruby already has the Object#dup
method, I think we don't need to create a new method to duplicate a Data
instance with potentially slightly different values.
I think allowing the dup
method on Data
instances to take optional keyword arguments would be the ideal API for this:
Point = Data.define(:x, :y, :z)
ORIGIN = Point.new(x: 0.0, y: 0.0, z: 0.0)
right = ORIGIN.dup(x: 1.0)
up = ORIGIN.dup(y: 1.0)
up_and_right = right.dup(y: up.y)
Updated by jeremyevans0 (Jeremy Evans) about 2 years ago
ufuk (Ufuk Kayserilioglu) wrote in #note-4:
I think allowing the
dup
method onData
instances to take optional keyword arguments would be the ideal API for this:
I agree. This is the pattern that Sequel uses for datasets (which are always frozen, as Data
instances are), except Sequel uses clone
instead of dup
(for historical reasons and because it also needs to copy singleton classes).
Updated by Dan0042 (Daniel DeLorme) about 2 years ago
ufuk (Ufuk Kayserilioglu) wrote in #note-4:
I think allowing the
dup
method onData
instances to take optional keyword arguments would be the ideal API for this
I had a strong visceral "no" reaction to this. To a certain extent it's a logical idea, we are after all creating a kind of copy. But the dup/clone API is so old, so well established and so consistent throughout ruby, changing/extending it would be a mistake.
Updated by ufuk (Ufuk Kayserilioglu) about 2 years ago
... Sequel uses
clone
instead ofdup
(for historical reasons and because it also needs to copy singleton classes).
I think copying singleton classes into such copies might NOT be desired behaviour for Data
objects. So, it might make more sense to use dup
for this. That means, it should work in the following way:
Point = Data.define(:x, :y, :z)
ORIGIN = Point.new(x: 0.0, y: 0.0, z: 0.0)
ORIGIN.singleton_class.define_method(:distance_from_origin) { 0 }
right = ORIGIN.dup(x: 1.0)
up = ORIGIN.dup(y: 1.0)
up_and_right = right.dup(y: up.y)
ORIGIN.distance_from_origin # => 0
right.distance_from_origin # => undefined method `distance_from_origin' for #<data Point x=1.0, y=0.0, z=0.0> (NoMethodError)
up.distance_from_origin # => undefined method `distance_from_origin' for #<data Point x=0.0, y=1.0, z=0.0> (NoMethodError)
up_and_right.distance_from_origin # => undefined method `distance_from_origin' for #<data Point x=1.0, y=1.0, z=0.0> (NoMethodError)
Dan0042 (Daniel DeLorme) wrote in #note-6:
ufuk (Ufuk Kayserilioglu) wrote in #note-4:
... But the dup/clone API is so old, so well established and so consistent throughout ruby, changing/extending it would be a mistake.
I am not sure we would be extending the API in any incompatible way. Calling dup
with no arguments would still work as it should, by creating a copy of the original object. If one supplies, optional, keyword arguments to dup
, though, then one would get the special copying that is suitable for Data
objects. BTW, it should be an error to supply a keyword argument that the Data class does not have as members.
Point = Data.define(:x, :y, :z)
ORIGIN = Point.new(x: 0.0, y: 0.0, z: 0.0)
cloned_origin = ORIGIN.dup # => works as expected
illegal = ORIGIN.dup(w: 1.0) # => ArgumentError
Updated by RubyBugs (A Nonymous) about 2 years ago
it should be an error to supply a keyword argument that the Data class does not have as members
+1 Agree!
Please note, performance work should be planned to ensure that such checking of the keyword args is as fast as possible and avoids unnecessary allocations.
For example, here's a real life pull request which sped up this error checking the Values gem #with
method by 2.29x:
https://github.com/ms-ati/Values/commit/c3f00e3e8d67ff5d6470315ad137166177f079e2
Updated by tomstuart (Tom Stuart) about 2 years ago
While it’s undeniable that this method duplicates its receiver in some sense, I’d like to add my support for the #with
naming choice.
I don’t have any technical objection to using #dup
, it’s just a stylistic preference: in the interests of programmer happiness, to me it’s more natural for the name to communicate the intent of the sender (“I want an object just like ORIGIN
, except #with
an x
of 1.0
”) rather than the lower-level implementation detail of what that operation involves (“I want a shallow clone of ORIGIN
via #allocate
& #initialize_copy
, except assigning 1.0
as the value of the @x
instance variable”).
Perhaps another way of putting this is that I’d estimate most Ruby programmers rarely call Object#dup
in the course of their work, whereas I would expect users of Data
to need this new “copy with changes” operation somewhat frequently, especially if they’re migrating existing code from Struct
. But of course I have no data to support that guess!
Updated by p8 (Petrik de Heus) about 2 years ago
I'm seeing some overlap with "Object#with to set and restore attributes around a block" https://bugs.ruby-lang.org/issues/18951
Should we have a single method Object#with_attrs
that:
- returns a copy with the attributes set by default
- sets and restores the attributes when passed a block? (although that doesn't make sense for the immutable Data)
Origin = Point.new(x: 0, y: 0)
right = Origin.with_attrs(x: 1.0)
right.x # => 1.0
SomeLibrary.with_attrs(enabled: true) do
SomeLibrary.enabled # => true
end
SomeLibrary.enabled # => false
Updated by nobu (Nobuyoshi Nakada) about 2 years ago
- Description updated (diff)
At the developers meeting today, someone proposed something like this.
point = Point.new(**(Origin.to_h.merge(change)))
point = Point.new(**Origin.to_h, **change)
Updated by tomstuart (Tom Stuart) about 2 years ago
nobu (Nobuyoshi Nakada) wrote in #note-11:
At the developers meeting today, someone proposed something like this.
point = Point.new(**Origin.to_h, **change)
Agreed — this is what I do at the moment. There’s no need to #merge
explicitly because “last keyword wins”, and although repeating a literal keyword will generate a warning, the same isn’t true when double-splatting the result of #to_h
.
Updated by mame (Yusuke Endoh) about 2 years ago
@RubyBugs (A Nonymous) I cannot understand the following parts of your first motivative example
# In loops
movements = [
{ x: +0.5 },
{ x: +0.5 },
{ y: -1.0 },
{ x: +0.5 },
]
# position = Point(x: 1.5, y: -1.0)
position = movements.inject(Origin) { |p, move| p.with(**move) }
According to the comment, do you really expect Point.new(x: 0.0+0.5+0.5+0.5, y: 0.0-1.0)
? I don't think the proposed operation should mean addition implicitly.
If it is a mistake, could you update the code example?
Updated by ufuk (Ufuk Kayserilioglu) about 2 years ago
I just put up a PR implementing this using the dup
proposal: https://github.com/ruby/ruby/pull/6766
Here are my objections to other proposals:
-
point = Point.new(**Origin.to_h, **change)
works but is very verbose and not intention revealing, in my opinion. For the reader of this code, it is very hard to understand what the intended operation is. -
with_attrs
(from https://bugs.ruby-lang.org/issues/18951): That proposal is explicitly about mutating the receiver ofwith_attrs
, which is the exact opposite of what this "copy with changes" operation is trying to do. For that reason, I don't think we should be conflating that with this proposal. -
with
: While I understand that there is precedent for usingwith
from other languages, I have always found the name to not make it clear that a copy is being made. Sincewith
implies doing something "with" original receiver, it leads less experienced folks to think that the receiver might be modified in the process. I think we have an opportunity to make the operation more clear by giving it a better name.
My reasoning for dup
is to make it explicit that a copy is being created (just as how a string copy is created when "foo".dup
is called) but that the copy has updated values for the supplied member fields. dup
is already an existing protocol for creating shallow copies of objects, and can already be used on Data
class instance even without this change, of course without being able to affect the copy in any way. Extending the existing copying protocol of Ruby to optionally add the ability to affect the copy feels like the best way forward to me.
I would be happy to hear feedback from the community on my PR and if a different name than dup
is picked in the end, also would be happy to update the PR to conform.
Updated by bdewater (Bart de Water) almost 2 years ago
I like dup
as the method name 👍
tomstuart (Tom Stuart) wrote in #note-9:
Perhaps another way of putting this is that I’d estimate most Ruby programmers rarely call
Object#dup
in the course of their work, whereas I would expect users ofData
to need this new “copy with changes” operation somewhat frequently, especially if they’re migrating existing code fromStruct
. But of course I have no data to support that guess!
I don't think dup
is that rare, IME most Ruby programmers are familiar with it even if they don't use it daily. Some unscientific data points of usage in Rails apps:
Updated by RubyBugs (A Nonymous) almost 2 years ago
bdewater (Bart de Water) wrote in #note-15:
I like
dup
as the method name 👍
Is there a way we could get more active Rubyists to weigh in? My sense is that there is a real tension in that:
- Nearly every major value objects gem use
#with
- Most other language with value objects use
#with
On the other hand, there seem to be a number of voices on this Bug thread, who while they don't necessarily currently work with code that uses this exact pattern, feel strongly that overloading the meaning of #dup
is a better choice for Ruby. Even the amazing and incomparable @jeremyevans, whose Sequel gem our team depends on as well :)
While it might not generally be the practice of the Ruby community, would we consider a way to get more "working Rubyist" eyes on this question? Perhaps by getting this thread in to the Ruby Weekly News for example?
In the end, having the method is more important than the name. But it does seem important to let voices be heard?
Updated by mame (Yusuke Endoh) almost 2 years ago
@RubyBugs (A Nonymous) Please check my comment https://bugs.ruby-lang.org/issues/19000#note-13 . A wrong motivation example raises the suspicion that this API is actually confusing to users.
Updated by p8 (Petrik de Heus) almost 2 years ago
If dup
is chosen would it make sense to always allow dup
methods to take arguments for consistency?
For example, it would allow the following code in Rails
firm = Firm.first.dup
firm.account = Account.first
assert_queries(2) { firm.save! }
to be written as:
firm = Firm.first.dup(account: Account.first)
assert_queries(2) { firm.save! }
Updated by Eregon (Benoit Daloze) almost 2 years ago
p8 (Petrik de Heus) wrote in #note-18:
If
dup
is chosen would it make sense to always allowdup
methods to take arguments for consistency?
I'm rather against that, it makes the standard dup
so much more complicated, and it would most likely be pretty slow in CRuby.
If it's only for Data it's a much smaller concern.
Updated by Dan0042 (Daniel DeLorme) almost 2 years ago
p8 (Petrik de Heus) wrote in #note-18:
If
dup
is chosen would it make sense to always allowdup
methods to take arguments for consistency?
It would make sense, but it would require everyone who wrote a custom #dup
to extend their version to support this new API pattern. Not likely to happen.
On the other hand with #with
it would be possible to implement a general version for all objects
def with(attr={})
attr.each_with_object(self.dup) do |(k,v),obj|
obj.send("#{k}=", v)
end
end
(but let's keep in mind this proposal is about Data#with, not Object#with)
Updated by RubyBugs (A Nonymous) almost 2 years ago
mame (Yusuke Endoh) wrote in #note-17:
@RubyBugs (A Nonymous) Please check my comment https://bugs.ruby-lang.org/issues/19000#note-13 . A wrong motivation example raises the suspicion that this API is actually confusing to users.
Quite right @mame (Yusuke Endoh), thank you for catching that! I provided two examples. The first example is intended to use the previous values in creating the next values, and that part was broken.
How is this?
##
# Example of proposed Data#with
#
# To try this out:
#
# ruby-install ruby-3.2.0-preview3
#
##
Point = Data.define(:x, :y) do
# Example only, too slow. Real implementation should eliminate allocations.
def with(**args)
raise ArgumentError unless args.keys.all? { |k| members.include? k }
self.class.new(**(to_h.merge(args)))
end
end
Origin = Point.new(x: 0, y: 0)
# Q: How do we make copies that change 1 or more values?
right = Origin.with(x: 1.0)
up = Origin.with(y: 1.0)
up_and_right = right.with(y: up.y)
begin
Origin.with(x: 1, z: 0)
rescue ArgumentError
puts "#with(z: 0) raised ArgumentError as expected"
end
# In loops
movements = [
[ :x, +0.5 ],
[ :x, +0.5 ],
[ :y, -1.0 ],
[ :x, +0.5 ],
]
position = movements.inject(Origin) do |p, (field, delta)|
p.with(field => p.send(field) + delta)
end
# position = Point(x: 1.5, y: -1.0)
Updated by RubyBugs (A Nonymous) almost 2 years ago
For comparison, you can plug in the following example code for a faster implementation of #with
and it should work the same:
Point = Data.define(:x, :y) do
# Example only, too slow. Real implementation should eliminate allocations.
# def with(**args)
# raise ArgumentError unless args.keys.all? { |k| members.include? k }
# self.class.new(**(to_h.merge(args)))
# end
# A trick to avoid `nil` for defaults, so that `nil` can be a valid value
DEFAULT = Class.new.new.freeze
# An example of a faster implementation that could inspire auto-generation
def with(x: DEFAULT, y: DEFAULT)
self.class.new(
x: x.equal?(DEFAULT) ? self.x : x,
y: y.equal?(DEFAULT) ? self.y : y,
)
end
end
Updated by mame (Yusuke Endoh) almost 2 years ago
RubyBugs (A Nonymous) wrote in #note-21:
How is this?
Thanks for the update.
Now I have a question. Do you really want to write p.with(field => p.send(field) + delta)
? I don't think it is very elegant. It is not very convincing (at least, to me) as a first motivation example.
Also, do you need the ability to update multiple fields at once? Both motivation examples only update a single field. This may be the result of simplifying the motivation example, though.
Looking at these motivation examples, there may be room to consider an API like p.with(field) {|old_value| old_value + delta }
or something.
Updated by mame (Yusuke Endoh) almost 2 years ago
Discussed at the dev meeting.
Regarding method names, @matz (Yukihiro Matsumoto) likes Data#update
. However, @ko1 (Koichi Sasada) objects because "update" is used as a destructive vocabulary in Hash#update
. @matz (Yukihiro Matsumoto) said Data#with is acceptable; Data#dup
is not acceptable.
Updated by ufuk (Ufuk Kayserilioglu) almost 2 years ago
Thanks for the discussion and the name suggestion. I updated the PR https://github.com/ruby/ruby/pull/6766 to use Data#with
. I would be grateful if it could get a review.
However, that raises a question: How should Data#with
behave if it is passed no arguments? Semantically, that should mean "create a copy of the current data object with no fields changed", but then that ends up being identical to what Kernel#dup
does. Also, with
with no arguments does not read that well.
@matz (Yukihiro Matsumoto) and @mame (Yusuke Endoh) Should it be an error?
Example:
Point = Data.define(:x, :y)
origin = Point.new(x: 0, y: 0)
new_origin = origin.with # should this be an error?
Updated by matz (Yukihiro Matsumoto) almost 2 years ago
I think it should be an error. I am still not 100% sure with
is the best name.
Matz.
Updated by ufuk (Ufuk Kayserilioglu) almost 2 years ago
Thank you @matz (Yukihiro Matsumoto). I will update the implementation to make the no-args case an error, and point people to use dup
to make identical shallow clones instead in the error message.
I read the dev meeting notes and saw that you made a point about the main of the operation not being about duplication. While I agree with that, I also feel that makes people think that they are making a cheap operation, when they are actually creating a clone with modified values. I had suggested dup
to make that explicit, but happy to go with another name that is chosen that gives a similar sense of an extra clone operation happening under the hood. Another name that was suggested was a dup_with
that I also find acceptable personally.
Along the same lines, I feel like update
gives the wrong impression that the receiver instance will be updated, so my vote would be against it.
Updated by RubyBugs (A Nonymous) almost 2 years ago
ufuk (Ufuk Kayserilioglu) wrote in #note-28:
Thank you @matz (Yukihiro Matsumoto). I will update the implementation to make the no-args case an error, and point people to use
dup
to make identical shallow clones instead in the error message.
Hi Matz and Ufuk, please consider that making no-args an error makes programming around this API dynamically more complicated — to generate a set of changes as a Hash and pass then goes from:
changes = {} # calculated somehow
p = Origin.with(**changes) # p == Origin is ok
to:
changes = {} # calculated somehow
p = unless changes.empty
Origin.with(**changes)
else
Origin
end
Because dynamically generated changes is an anticipated use for such an API, would we consider allowing calling #with
with no arguments to return self
?
Semantically one could describe this API behavior as: “an immutable value object with no changes is itself”?
As another reference point: in Scala case class
this operation is called #copy
, and I think it returns the original object instance in the no-args case.
https://docs.scala-lang.org/tour/case-classes.html
Updated by tomstuart (Tom Stuart) almost 2 years ago
RubyBugs (A Nonymous) wrote in #note-29:
please consider that making no-args an error makes programming around this API dynamically more complicated
I second this request. In the interests of regularity, it should be fine to ask for a Data
instance #with
“no changes”, otherwise the programmer has to anticipate & handle that as an edge case unnecessarily if they intend to generically handle changes which originate from elsewhere rather than supplying keyword arguments inline.
Whether doing so returns self
or a shallow copy shouldn’t be very important since the instances are immutable, but I agree that returning self
would be more semantically predictable, slightly more efficient, and a nice way to differentiate it from the behaviour of #dup
.
Updated by ufuk (Ufuk Kayserilioglu) almost 2 years ago
RubyBugs (A Nonymous) wrote in #note-29:
Hi Matz and Ufuk, please consider that making no-args an error makes programming around this API dynamically more complicated
Indeed, that is true. However, personally, I find a with
call with no arguments very hard to read and reason about. It just reads badly as an English construct. It is like saying, "can I have tea with".
Semantically one could describe this API behavior as: “an immutable value object with no changes is itself”?
Indeed this is a good idea for what the behaviour should be but does not alleviate the naming/readability concern that I've raised above.
As another reference point: in Scala
case class
this operation is called#copy
, and I think it returns the original object instance in the no-args case.
https://docs.scala-lang.org/tour/case-classes.html
Indeed, it was along these lines that I'd suggested dup
originally. Like I've said before, I am flexible on what the name should be and copy
actually addresses the naming concern equally as well as dup
in my opinion. It also makes sense to ask for a "copy" with no values changed and readability is not hurt in any way.
Updated by austin (Austin Ziegler) almost 2 years ago
ufuk (Ufuk Kayserilioglu) wrote in #note-31:
RubyBugs (A Nonymous) wrote in #note-29:
As another reference point: in Scala
case class
this operation is called#copy
, and I think it returns the original object instance in the no-args case.
https://docs.scala-lang.org/tour/case-classes.htmlIndeed, it was along these lines that I'd suggested
dup
originally. Like I've said before, I am flexible on what the name should be andcopy
actually addresses the naming concern equally as well asdup
in my opinion. It also makes sense to ask for a "copy" with no values changed and readability is not hurt in any way.
I think that #copy
is OK, but this is more of a (checked?) #merge
, right? I think that would be where my question comes in. If origin
is a Point
containing x
and y
values, then origin#copy(x: 1, y: 2, z: 3)
should do…what? Should it error or just take the values that it knows? Because the shape provided to #copy
is no longer a Point
shape.
Updated by ufuk (Ufuk Kayserilioglu) almost 2 years ago
austin (Austin Ziegler) wrote in #note-32:
If
origin
is aPoint
containingx
andy
values, thenorigin#copy(x: 1, y: 2, z: 3)
should do…what? Should it error or just take the values that it knows?
The current implementation in the PR treats unknown keywords passed to this method as an error. I had already made it clear that I thought it should be an error at the end of my message earlier in this thread https://bugs.ruby-lang.org/issues/19000#note-7 and I think we are all in agreement on this point in this thread.
Updated by RubyBugs (A Nonymous) almost 2 years ago
…treats unknown keywords passed to this method as an error. I had already made it clear that I thought it should be an error at the end of my message earlier in this thread … and I think we are all in agreement on this point in this thread.
+1 agreement on treating unknown keywords as error! :-)
Updated by RubyBugs (A Nonymous) almost 2 years ago
Hi @mame (Yusuke Endoh)! Thank you for your questions.
mame (Yusuke Endoh) wrote in #note-24:
Thanks for the update.
Now I have a question. Do you really want to write
p.with(field => p.send(field) + delta)
? I don't think it is very elegant. It is not very convincing (at least, to me) as a first motivation example.
The challenge is that this operation "make a copy of an immutable value object with 0 or more fields changed" is so fundamental, it's hard to find examples that strip away every other concern.
Staying with the Point
example, would translate and invert work better as examples that change multiple fields? In this example, let's introduce a 3 dimensional Point, and then 2-dimensional operations on it, as follows:
##
# Example of proposed Data#with
#
# To try this out:
#
# ruby-install ruby-3.2.0-preview3
#
##
Point3d = Data.define(:x, :y, :z) do
# Example only, too slow due to allocations
def with(**args)
raise ArgumentError unless args.keys.all? { |k| members.include? k }
self.class.new(**(to_h.merge(args)))
end
def translate_2d(dx: 0, dy: 0)
with(x: x + dx, y: y + dy)
end
def invert_2d
with(x: -x, y: -y)
end
end
Origin3d = Point3d.new(x: 0, y: 0, z: 0)
north = Origin3d.translate_2d(dy: 1.0)
south = north.invert_2d
east = Origin3d.translate_2d(dx: 1.0)
west = east.invert_2d
west == Point3d.new(x: -1.0, y: 0, z: 0)
# => true
mountain_height = 2.0
western_mountain = west.with(z: mountain_height)
# => #<data Point3d x=-1.0, y=0, z=2.0>
eastern_mountain = western_mountain.invert_2d
# => #<data Point3d x=1.0, y=0, z=2.0>
Also, do you need the ability to update multiple fields at once? Both motivation examples only update a single field. This may be the result of simplifying the motivation example, though.
Yes! Definitely need to update multiple fields at once
Looking at these motivation examples, there may be room to consider an API like
p.with(field) {|old_value| old_value + delta }
or something.
Agreed that this probably comes out of a motivating example which is trying to calculate the new value for a single field based on only the previous value of that field.
Updated by Eregon (Benoit Daloze) almost 2 years ago
I also agree the 0-kwargs case should be supported for general use like data.with(**changes)
, whether changes
is empty or not. And that reads fine of course.
It's also like Hash#merge
with no arguments.
I think #with
with empty kwargs should do the same as dup
. So this depends on what Data#dup
does.
If Data#dup
returns self
like immutable objects (Fixnum/Symbol/true/false/nil/etc) then let's do that too, if not let's dup in that case.
Note that Data
is not truly immutable (nor always Ractor-shareable) because it can refer to mutable objects.
Currently Data#dup
creates a new instance.
Updated by k0kubun (Takashi Kokubun) almost 2 years ago
Summary of the discussion so far:
- Interface
- Method name
-
with
- Proposal: [Feature #19000] by @RubyBugs (A Nonymous)
- Other languages: C#, Java (WIP),
- Existing gems: sorbet, value_semantics
- Pull request: https://github.com/ruby/ruby/pull/6766
- Matz #note-25 said it's acceptable.
-
update
-
dup
- Proposal: #note-4 by @ufuk (Ufuk Kayserilioglu)
- Matz #note-25 said it's not acceptable.
-
copy
- Proposal: #note-29 by @RubyBugs (A Nonymous)
- Other languages: Scala, Kotlin
-
clone
- Proposal: #note-5 by @jeremyevans0 (Jeremy Evans)
- Existing gems: sequel
-
- Alternatives
Point.new(foo: Origin.foo, **change)
Point.new(**(Origin.to_h.merge(change)))
-
Point.new(**Origin.to_h, **change)
- Suggested at the Ruby developers' meeting in November
My personal notes:
I'm late to the party, but in my experience of using data classes in other languages, this feels like a must-have for data classes. It'd be unfortunate if Data were released without it.
Also, do you need the ability to update multiple fields at once? Both motivation examples only update a single field.
Yes! Definitely need to update multiple fields at once
I second that. In my previous company, we had an immutable class that represented a state/snapshot for stream-processing multiple events at once (with Amazon Kinesis), which is used by a function that takes a state and returns the next state, therefore not mutating it in the middle of the function. It groups multiple events and dispatches a batch operation when a certain threshold is met. Because we had multiple thresholds (e.g. max size, max duration), we naturally needed to create a copy with multiple updates when we create the next state.
I also agree the 0-kwargs case should be supported for general use like data.with(**changes), whether changes is empty or not. And that reads fine of course.
It's also like Hash#merge with no arguments.
+1. I'm not sure if it's possible to distinguish them in Ruby, but ideally data.with
or data.with()
should be rejected even if we accept data.with(**changes)
when changes
are empty. If it's feasible, then it might clear Matz's concern at #note-27.
Updated by Eregon (Benoit Daloze) almost 2 years ago
k0kubun (Takashi Kokubun) wrote in #note-37:
+1. I'm not sure if it's possible to distinguish them in Ruby, but ideally
data.with
ordata.with()
should be rejected even if we acceptdata.with(**changes)
whenchanges
are empty.
That's not possible to differentiate and also it shouldn't be (it would be very confusing if it was different).
Because foo(**empty_hash)
should always be exactly the same as foo()
.
If it's feasible, then it might clear Matz's concern at #note-27.
@matz (Yukihiro Matsumoto) Is it really that big a concern?
I think it's far far more important that with(**changes)
works, whether changes
is empty or not.
Honestly I think there is no point to prevent that, it's necessary for the with(**changes)
use case and it seems pretty much harmless.
Updated by ufuk (Ufuk Kayserilioglu) almost 2 years ago
Status update
@alanwu (Alan Wu) and I reworked the PR at https://github.com/ruby/ruby/pull/6766 to ensure the following:
- Calls to
.with
with no arguments is valid and returns the receiver as-is. As @Eregon (Benoit Daloze) said, there is no way to differentiate between no args and empty kwargs, so the ability to do.with(**hash)
freely won the discussion over. - The implementation works correctly with Data objects being frozen now.
- The implementation checks explicitly for keyword arguments and raises argument error for hashes passed as positional argument, as well.
Apart from the decision on the name, the PR should be good to merge. As @k0kubun (Takashi Kokubun) said above, I also feel like this should be a part of the Ruby 3.2 release.
If the only blocker is the name, @matz (Yukihiro Matsumoto) can you pick the final name in the next few days, so that we can ship this with 3.2?
Thank you everyone for the discussion and ideas.
Updated by matz (Yukihiro Matsumoto) almost 2 years ago
with
seems slightly better than other candidates. Data
is (or can be considered as) a value object without identity. So names with identity implication (dup
, copy
, clone
) are not ideal. I like update
but I understand @ko1's concern (Hash#update
modifies the receiver).
Regarding error-on-no-argument, I thought passing no argument (even via double splat **
) indicates some kind of logic error, and honored fail-early principle, but as @Eregon (Benoit Daloze) said, it's not a big deal.
I am not sure @naruse (Yui NARUSE) accepts this last minute change, but I am OK with the last PR from @ufuk (Ufuk Kayserilioglu).
Matz.
Updated by naruse (Yui NARUSE) almost 2 years ago
I'm also ok to merge!
This will ship as a part of Ruby 3.2!
Updated by ufuk (Ufuk Kayserilioglu) almost 2 years ago
- Status changed from Open to Closed
Applied in changeset git|99cee85775bc5656b942bf4e1b0568a8f40addcd.
Add copy with changes functionality for Data objects (#6766)
Implements [Feature #19000]
This commit adds copy with changes functionality for Data
objects
using a new method Data#with
.
Since Data objects are immutable, the only way to change them is by
creating a copy. This PR adds a with
method for Data
class instances
that optionally takes keyword arguments.
If the with
method is called with no arguments, the behaviour is the
same as the Kernel#dup
method, i.e. a new shallow copy is created
with no field values changed.
However, if keyword arguments are supplied to the with
method, then
the copy is created with the specified field values changed. For
example:
Point = Data.define(:x, :y)
point = Point.new(x: 1, y: 2)
point.with(x: 3) # => #<data Point x: 3, y: 2>
Passing positional arguments to with
or passing keyword arguments to
it that do not correspond to any of the members of the Data class will
raise an ArgumentError
.
Co-authored-by: Alan Wu XrXr@users.noreply.github.com
Updated by ufuk (Ufuk Kayserilioglu) almost 2 years ago
Thank you @matz (Yukihiro Matsumoto) and @naruse (Yui NARUSE). I hate that this ended up being a last-minute change, but it seemed to be little risk adding a method to a new interface and the benefit seemed to greatly outweigh the risk. Otherwise, I would not have asked for consideration.
Thanks again for your understanding. 🙇♂️
Updated by Eregon (Benoit Daloze) almost 2 years ago
- Related to Bug #19259: `Data#with` doesn't call `initialize` nor `initialize_copy` added