Misc #10560
Open
confusion between x=x+y, x+=y, x.concat(y) and y.each{|z| x<<z}
Description
While discussing a ticket I noticed that there is no documentation for +=.
I was expecting += to behave as concat, but instead it behaves as x=x+y, which for every operation clones the array and updates the variable with the new value, so it behaves similarly to x=x.dup.concat(y) and is slightly faster. Using plain x.concat(y), however, is a lot faster than both each with << and +=.
I would either like to get:
- updated docs that describe the concept of += and show the difference from concat
- or a change of += to use concat, which is faster, plus added docs ;) (I would expect += to use concat when available)
here is a test:
require 'benchmark'

rep = 10_000

Benchmark.bmbm do |x|
  {
    1..25 => [],
    "a".."z" => "",
  }.each do |base, storage|
    base = base.to_a
    basej = base
    class_name = storage.class.to_s
    x.report(class_name+'#concat') do
      a = storage.dup
      basej = base.join if storage == ""
      rep.times { a.concat(basej) }
    end
    x.report(class_name+'#<<') do
      a = storage.dup
      basej = base.join if storage == ""
      rep.times { base.each { |e| a << e } }
    end
    x.report(class_name+'#+=') do
      a = storage.dup
      basej = base.join if storage == ""
      rep.times { a += basej }
    end
    x.report(class_name+'#dup.concat') do
      a = storage.dup
      basej = base.join if storage == ""
      rep.times { a = a.dup.concat(basej) }
    end
  end
end
and here are results on my machine:
                         user     system      total        real
Array#concat         0.000000   0.000000   0.000000 (  0.001422)
Array#<<             0.020000   0.000000   0.020000 (  0.014356)
Array#+=             1.270000   0.230000   1.500000 (  1.498558)
Array#dup.concat     2.720000   0.190000   2.910000 (  2.915701)
String#concat        0.000000   0.000000   0.000000 (  0.001072)
String#<<            0.030000   0.000000   0.030000 (  0.025828)
String#+=            0.130000   0.010000   0.140000 (  0.135143)
String#dup.concat    0.210000   0.020000   0.230000 (  0.227470)
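One way to read the gap between += and concat is as a copying cost. The sketch below is a simplified model, not a measurement: it only counts how many elements get written if every += builds a fresh array holding the old contents plus the new chunk, while concat writes just the new chunk. The method names are hypothetical, and the model ignores buffer regrowth and allocator behaviour.

```ruby
# Hypothetical cost model (an assumption for illustration, not a benchmark):
# count elements written when a chunk is appended `reps` times.

def elements_written_by_plus_equals(chunk, reps)
  # After the i-th "a += b" the new array holds i * chunk elements,
  # all of which had to be written into the freshly built array.
  (1..reps).sum { |i| i * chunk }
end

def elements_written_by_concat(chunk, reps)
  # concat appends in place, so only the new chunk is written each time.
  chunk * reps
end

# Same sizes as the Array case in the test above: 25 elements, 10_000 repetitions.
plus_equals_total = elements_written_by_plus_equals(25, 10_000)
concat_total      = elements_written_by_concat(25, 10_000)

puts "+=     writes #{plus_equals_total} elements"
puts "concat writes #{concat_total} elements"
```

Under this model the += total grows with the square of the number of repetitions, while the concat total grows linearly.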
Updated by recursive-madman (Recursive Madman) almost 10 years ago
+= doesn't change the object itself.
For strings for example:
x = y = 'foo'
x += 'bar'
x #=> 'foobar'
y #=> 'foo'
As well as for integers:
x = y = 7
x += 3
x #=> 10
y #=> 7
That is, x += y is semantically identical to x = x + y.
concat on the other hand does change the object itself (and is specific to arrays, not everything that has + defined):
x = y = [1,2,3]
x.concat([4, 5, 6])
x #=> [1,2,3,4,5,6]
y #=> [1,2,3,4,5,6]
Updated by nobu (Nobuyoshi Nakada) almost 10 years ago
- Category set to doc
- Status changed from Open to Assigned
- Assignee set to zzak (zzak _)
- Target version set to 2.2.0
Updated by chrisseaton (Chris Seaton) almost 10 years ago
I disagree with making the proposed change to +=. I would find it extremely surprising for += to modify an existing Array object. I really can't imagine any mental model of Ruby where it would make intuitive sense to do that. It goes against existing Ruby semantics and would have to be taught as a special case. It will likely break existing Ruby code. It increases the complexity of Ruby semantics and Ruby implementations. It introduces implicit mutation, which is probably something we want less of, not more.
However I do agree that we need better documentation for things like +=. I'm not sure where I would look for documentation of something like that. We don't really have language documentation, do we?
I would also be in favour of transparently implementing Array#+ as something similar to concat where through escape analysis it can be determined that the original object is never needed again, but that is a lot to ask of Ruby implementations.
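A small sketch of the kind of existing code that relies on += not mutating its receiver. The constant and variable names are made up for illustration. Today the frozen array is left alone because += builds a new object, whereas an in-place concat on the same array raises.

```ruby
# Hypothetical names, for illustration only.
DEFAULT_TAGS = %w[ruby docs].freeze

tags = DEFAULT_TAGS      # an alias, not a copy
tags += ["misc"]         # builds a new array and rebinds the local variable

# The shared default is untouched, and tags now points at a different object.
same_object = tags.equal?(DEFAULT_TAGS)

# Mutating in place is refused because the array is frozen.
error_class =
  begin
    DEFAULT_TAGS.concat(["misc"])
    nil
  rescue => e
    e.class
  end

puts "tags:         #{tags.inspect}"
puts "DEFAULT_TAGS: #{DEFAULT_TAGS.inspect}"
puts "same object?  #{same_object}"
puts "concat on the frozen array raised #{error_class}"
```

If += were redefined to call concat, the second line of this pattern would either raise or silently change the shared default, depending on whether the array is frozen.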
Updated by mpapis (Michal Papis) almost 10 years ago
Recursive Madman: that is what I said in the ticket (but unwound).
Chris Seaton: I would assume the doc categorization means only the first part, updating the docs, was approved.
The rest is optimization and does not have to be part of the specification ... or would it have to be part of specification that cloning object can be dropped if it's not used anywhere else?
Updated by chrisseaton (Chris Seaton) almost 10 years ago
Ah right sorry I didn't see the 'doc' note. I think we could (theoretically) implement this optimisation without any visible change to Ruby - so MRI, JRuby, Rbx etc could still implement the optimisation.
Updated by duerst (Martin Dürst) almost 10 years ago
I added some explanation to the documentation of Array#+ in r48682. I haven't been able to make RDoc create a separate entry for Array#+=, but I'm not an expert on RDoc.
Updated by javawizard (Alex Boyd) almost 10 years ago
Of note, Python does implement += this way. += is (almost) an alias for list.extend:
>>> x = []
>>> y = x
>>> x += [42]
>>> y
[42]
I think Ruby's behavior is more sensible, but I can imagine this causing confusion for Ruby users with a Python background - I, for one, was mildly surprised when I first found this out.
Updated by david_macmahon (David MacMahon) almost 10 years ago
I imagine that Python also lets one override the += method. I think this is impossible in Ruby because I think it is the parser that interprets a += b to be a = a + b, so only the #+ method will be called (with coercion) regardless of which way it was expressed in the code. This was the conclusion I reached after trying to add a #+= method to a class (in a C extension for MRI) and discovering that it was never called by a += b. FWIW, it was callable via #send.
Maybe this could become a feature request to allow <OP>= to be methods instead of just syntactic sugar?
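The same observation can be sketched in plain Ruby rather than a C extension. The Counter class below is hypothetical. It uses define_method because, as far as I know, def does not accept += as a method name, and it records which method actually ran.

```ruby
# Hypothetical class for illustration; it records which methods were called.
class Counter
  attr_reader :value, :calls

  def initialize(value, calls = [])
    @value = value
    @calls = calls
  end

  # Non-mutating addition: returns a new Counter and notes that #+ ran.
  def +(other)
    Counter.new(@value + other, @calls + [:+])
  end

  # A method literally named "+=", defined through define_method.
  define_method(:"+=") do |other|
    @calls << :"+="
    @value += other
    self
  end
end

a = Counter.new(1)
original = a
a += 2                         # treated as a = a + 2, so Counter#+ runs
via_operator = a.calls
rebound = !a.equal?(original)  # a now refers to a new object

b = Counter.new(1)
b.send(:"+=", 2)               # explicit dispatch reaches the "+=" method
via_send = b.calls

puts "a += 2 called:       #{via_operator.inspect}"
puts "send(:\"+=\") called:  #{via_send.inspect}"
puts "a was rebound:       #{rebound}"
```

In this sketch the operator form never reaches the "+=" method; only the explicit send does.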
Updated by nobu (Nobuyoshi Nakada) almost 10 years ago
David MacMahon wrote:
Maybe this could become a feature request to allow <OP>= to be methods instead of just syntactic sugar?
Of course you can.
I believe it will be rejected soon, however.
Updated by naruse (Yui NARUSE) almost 7 years ago
- Target version deleted (2.2.0)