Misc #10560

confusion between x=x+y, x+=y, x.concat(y) and y.each{|z| x<<z}

Added by mpapis (Michal Papis) almost 5 years ago. Updated almost 2 years ago.



While discussing a ticket I noticed that there is no documentation for +=.

I was expecting += to behave like concat, but instead it behaves like x=x+y: every operation clones the array and assigns the new value to the variable. So it behaves similarly to x=x.dup.concat(y) and is slightly faster, but using plain x.concat(y) is a lot faster than both each<< and +=.

I would either like to get:

  • updated docs that describe the concept of += and show how it differs from concat
  • or a change to += so that it uses concat, which is faster - and docs added ;) (I would expect += to use concat when available)

here is a test:

require 'benchmark'

rep = 10_000

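Benchmark.bmbm do |x|
  {
    1..25 => [],
    "a".."z" => "",
  }.each do |base, storage|

    base = base.to_a
    basej = base
    class_name = storage.class.to_s

    # mutate the receiver in place with a single call
    x.report(class_name+'#concat') do
      a = storage.dup
      basej = base.join if storage == ""
      rep.times { a.concat(basej) }
    end

    # mutate the receiver in place, one element at a time
    x.report(class_name+'#<<') do
      a = storage.dup
      basej = base.join if storage == ""
      rep.times { base.each { |e| a << e } }
    end

    # build a new object and reassign the variable on every iteration
    x.report(class_name+'#+=') do
      a = storage.dup
      basej = base.join if storage == ""
      rep.times { a += basej }
    end

    # explicit copy followed by an in-place concat
    x.report(class_name+'#dup.concat') do
      a = storage.dup
      basej = base.join if storage == ""
      rep.times { a = a.dup.concat(basej) }
    end

  end
end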


and here are results on my machine:

                        user     system      total        real
Array#concat        0.000000   0.000000   0.000000 (  0.001422)
Array#<<            0.020000   0.000000   0.020000 (  0.014356)
Array#+=            1.270000   0.230000   1.500000 (  1.498558)
Array#dup.concat    2.720000   0.190000   2.910000 (  2.915701)
String#concat       0.000000   0.000000   0.000000 (  0.001072)
String#<<           0.030000   0.000000   0.030000 (  0.025828)
String#+=           0.130000   0.010000   0.140000 (  0.135143)
String#dup.concat   0.210000   0.020000   0.230000 (  0.227470)


Updated by recursive-madman (Recursive Madman) almost 5 years ago

+= doesn't change the object itself.

For strings for example:

x = y = 'foo'
x += 'bar'
x #=> 'foobar'
y #=> 'foo'

As well as for integers:

x = y = 7
x += 3
x #=> 10
y #=> 7

That is x += y is semantically identical to x = x + y.

concat on the other hand does change the object itself (and is only defined on some classes, such as Array and String, not on everything that has + defined):

x = y = [1,2,3]
x.concat([4, 5, 6])
x #=> [1,2,3,4,5,6]
y #=> [1,2,3,4,5,6]
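
The same distinction can be checked by object identity rather than by value. The following is only a minimal sketch (the variable names are illustrative) using `equal?`, which is true only when both names refer to the very same object; it also covers String#concat, which the benchmark in the ticket uses.

```ruby
# Array: += builds a new array and rebinds x; y keeps the old object.
x = y = [1, 2, 3]
x += [4]
plus_same_object = x.equal?(y)       # false

# Array: concat appends to the existing object, so both names see it.
a = b = [1, 2, 3]
a.concat([4])
concat_same_object = a.equal?(b)     # true

# String: concat also mutates in place (dup avoids a frozen literal).
s = t = "foo".dup
s.concat("bar")
string_same_object = s.equal?(t)     # true
```

After running this, y is still [1, 2, 3], while b and t reflect the appended data.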

Updated by nobu (Nobuyoshi Nakada) almost 5 years ago

  • Category set to doc
  • Status changed from Open to Assigned
  • Assignee set to zzak (Zachary Scott)
  • Target version set to 2.2.0

Updated by chrisseaton (Chris Seaton) almost 5 years ago

I disagree with making the proposed change to +=. I would find it extremely surprising for += to modify an existing Array object. I really can't imagine any mental model of Ruby where it would make intuitive sense to do that. It goes against existing Ruby semantics and would have to be taught as a special case. It will likely break existing Ruby code. It increases the complexity of Ruby semantics and Ruby implementations. It introduces implicit mutation, which is probably something we want less of, not more.

However I do agree that we need better documentation for things like +=. I'm not sure where I would look for documentation of something like that. We don't really have language documentation, do we?

I would also be in favour of transparently implementing Array#+ as something similar to concat where escape analysis can determine that the original object is never needed again, but that is a lot to ask of Ruby implementations.
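
To make that condition concrete, here is a rough sketch of one case where swapping + for an in-place append would be unobservable and one where it would not. The method names are hypothetical and exist only for this illustration; no Ruby implementation is claimed to work this way.

```ruby
# Hypothetical helpers, for illustration only.

# The intermediate arrays never escape: nothing else references acc,
# so appending in place would give the caller the same result.
def join_with_plus(chunks)
  acc = []
  chunks.each { |chunk| acc += chunk }
  acc
end

# What the optimised form would effectively do.
def join_with_concat(chunks)
  acc = []
  chunks.each { |chunk| acc.concat(chunk) }
  acc
end

chunks = [[1, 2], [3], [4, 5]]
same_result = join_with_plus(chunks) == join_with_concat(chunks)

# Here the original array is still reachable through snapshot, so an
# in-place append would be visible and must not happen.
acc = [1, 2]
snapshot = acc
acc += [3]
```

In the second case snapshot has to remain [1, 2], which is why the rewrite is only safe when the implementation can prove no other reference exists.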

Updated by mpapis (Michal Papis) almost 5 years ago

Recursive Madman, that's what I said in the ticket (just spelled out).

Chris Seaton, I would assume the doc categorization means that only the first part, updating the docs, was approved.
The rest is an optimization and does not have to be part of the specification ... or would the specification have to say that cloning the object can be skipped if it's not used anywhere else?

Updated by chrisseaton (Chris Seaton) almost 5 years ago

Ah right sorry I didn't see the 'doc' note. I think we could (theoretically) implement this optimisation without any visible change to Ruby - so MRI, JRuby, Rbx etc could still implement the optimisation.

Updated by duerst (Martin Dürst) almost 5 years ago

I added some explanation to the documentation of Array#+ in r48682. I haven't been able to make RDoc create a separate entry for Array#+=, but I'm not an expert on RDoc.

Updated by javawizard (Alex Boyd) almost 5 years ago

Of note, Python does implement += this way. += is (almost) an alias for list.extend:

>>> x = []
>>> y = x
>>> x += [42]
>>> y
[42]

I think Ruby's behavior is more sensible, but I can imagine this causing confusion for Ruby users with a Python background - I, for one, was mildly surprised when I first found this out.

Updated by david_macmahon (David MacMahon) almost 5 years ago

I imagine that Python also lets one override the += method. I think this is impossible on Ruby because I think it is the parser that interprets a += b to be a = a + b, so only the #+ method will be called (with coercion) regardless of which way it was expressed in the code. This was the conclusion I reached after trying to add a #+= method to a class (in a C extension for MRI) and discovering that it was never called by a += b. FWIW, it was callable via #send.

Maybe this could become a feature request to allow <OP>= to be methods instead of just syntactic sugar?
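
The observation above can be reproduced in plain Ruby without a C extension. This is only a sketch: Counter is a hypothetical class, and it uses define_method because, as far as I know, the parser does not accept `def +=`.

```ruby
# Counter is a hypothetical class used only for this illustration.
class Counter
  attr_reader :value, :calls

  def initialize(value, calls = [])
    @value = value
    @calls = calls
  end

  # Non-mutating addition: returns a new Counter and records the call.
  def +(other)
    Counter.new(@value + other, @calls + [:+])
  end

  # A method literally named "+=", defined by symbol.
  define_method(:"+=") do |other|
    @calls << :"+="
    @value += other
    self
  end
end

a = Counter.new(1)
original = a
a += 2                 # expands to a = a + 2, so only Counter#+ runs

b = Counter.new(1)
b.send(:"+=", 2)       # the "+=" method is reachable only this way
```

Afterwards a.calls contains only :+ and a is a different object from original, whereas b.calls contains :"+=" and b was updated in place.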

Updated by nobu (Nobuyoshi Nakada) almost 5 years ago

David MacMahon wrote:

Maybe this could become a feature request to allow <OP>= to be methods instead of just syntactic sugar?

Of course you can.

I believe it will be rejected soon, however.


Updated by naruse (Yui NARUSE) almost 2 years ago

  • Target version deleted (2.2.0)
