Feature #8259

Atomic attributes accessors

Added by Yura Sokolov about 1 year ago. Updated 7 months ago.

[ruby-core:54218]
Status:Open
Priority:Normal
Assignee:-
Category:-
Target version:Ruby 2.1.0

Description

=begin
Motivated by this gist ((URL:https://gist.github.com/jstorimer/5298581)) and atomic gem

I propose Class.attr_atomic which will add methods for atomic swap and CAS:

class MyNode
  attr_accessor :item
  attr_atomic :successor

  def initialize(item, successor)
    @item = item
    @successor = successor
  end
end
node = MyNode.new(i, other_node)

# attr_atomic ensures that at least the #{attr} reader method exists. Maybe it
# should also guarantee that the read is volatile.
node.successor

# #{attr}_cas(old_value, new_value) does CAS: an atomic compare-and-swap
if node.successor_cas(other_node, new_node)
  print "there was no interleaving with other threads"
end

# #{attr}_swap atomically swaps the value and returns the old one.
# It ensures that no other thread interleaves between getting the old value and
# setting the new one, using CAS (or another primitive if one exists, as in Java 8)
node.successor_swap(new_node)

It will be very simple for MRI because of the GIL, and other implementations
will use atomic primitives.

Note: both (({#{attr}_swap})) and (({#{attr}_cas})) should raise an error if the instance variable was not explicitly set beforehand.

Example for nonblocking queue: ((URL:https://gist.github.com/funny-falcon/5370416))

Something similar should be proposed for Structs, perhaps by overriding the same method as (({Struct.attr_atomic})).

Open questions for the reader:
Should (({attr_atomic :my_attr})) ensure that the #my_attr reader method exists?
Should it guarantee that (({#my_attr})) provides 'volatile' access?
Maybe (({attr_reader :my_attr})) already ought to provide 'volatile' semantics?
Maybe (({@my_attr})) itself should have volatile semantics (I doubt it)?
=end
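As a rough illustration of the proposed behaviour, here is a minimal pure-Ruby sketch. It is not the proposed implementation: it emulates atomicity with a single global Mutex instead of hardware primitives, the module name AtomicAttr is hypothetical, and raising ArgumentError for an unset instance variable is only one possible reading of "should raise an error".

```ruby
# Hypothetical emulation of the proposed attr_atomic (not an MRI API).
# One global lock stands in for real atomic primitives.
module AtomicAttr
  LOCK = Mutex.new

  def attr_atomic(*names)
    names.each do |name|
      ivar = :"@#{name}"

      # Reader; taking the lock stands in for a "volatile" read.
      define_method(name) do
        LOCK.synchronize { instance_variable_get(ivar) }
      end

      # <name>_cas(old, new): store new only if the current value is the
      # very same object as old (reference equality). Returns true or false.
      define_method(:"#{name}_cas") do |old, new|
        LOCK.synchronize do
          unless instance_variable_defined?(ivar)
            raise ArgumentError, "#{ivar} was never set"
          end
          next false unless instance_variable_get(ivar).equal?(old)
          instance_variable_set(ivar, new)
          true
        end
      end

      # <name>_swap(new): store new and return the previous value.
      define_method(:"#{name}_swap") do |new|
        LOCK.synchronize do
          unless instance_variable_defined?(ivar)
            raise ArgumentError, "#{ivar} was never set"
          end
          old = instance_variable_get(ivar)
          instance_variable_set(ivar, new)
          old
        end
      end
    end
  end
end

class MyNode
  extend AtomicAttr
  attr_accessor :item
  attr_atomic :successor

  def initialize(item, successor)
    @item = item
    @successor = successor
  end
end
```

With this sketch, node.successor_cas(other_node, new_node) returns true only when node.successor is still other_node, and node.successor_swap(new_node) returns whatever was stored before.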

History

#1 Updated by Charles Nutter about 1 year ago

Great to see this proposed officially!

I implemented something very much like this for JRuby as a proof-of-concept. It was in response to thedarkone's recent work on making Rails truly thread-safe/thread-aware.

My feature was almost exactly like yours, with an explicit call to declare an attribute as "volatile" (the characteristic that makes atomic operations possible). Doing so created a _cas method, made accessors do volatile operations, and I may also have had a simple atomic swap (getAndSet). CAS is probably enough to add, though.

thedarkone had concerns about my proposed API. I believe he wanted to be able to treat any variable access as volatile (even direct access via @foo = 1) or perhaps he simply didn't like having to go through a special method. I'll try to get him to comment here.

One concern about CAS (which has actually become an issue for my "atomic" gem): treatment of numerics. Specifically, what does it mean to CAS a numeric value when numeric idempotence varies across implementations:

MRI 1.9.x and lower only have idempotent signed fixnums up to 31 bits on 32-bit builds and 63 bits on 64-bit builds. Rubinius and MacRuby follow suit.

MRI 2.0.0 has idempotent floats only on 64-bit and only up to some number of bits of precision (is that correct?). MacRuby does something similar.

I believe MagLev has fixnums but not flonums. Unsure.

JRuby has idempotent fixnums only up to 8 bits (signed) due to the cost of caching Fixnum objects (JVM does not have fixnums, so we have to mitigate the cost of objects).

Topaz does not have fixnums or flonums and relies on escape analysis/detection to eliminate Fixnum objects.

IronRuby does something similar to JRuby, but could potentially make Fixnums and Floats be value types; I'm not sure if this would make them idempotent or not.

And this all ignores the fact that Fixnum transparently overflows into Bignum, which is represented as a full, non-idempotent object on all implementations.

So we've got a case where this code would start to fail at different times on different implementations:

number = obj.number
success = obj.number_cas(number, number + 1)
fail unless success

In the atomic gem, I'm going to be adding AtomicInteger and AtomicFloat for this purpose that either use value equality rather than reference equality (at potentially greater cost) or limit the value range of integers to 64 bits.
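The difference between value equality and reference equality that makes this fail can be seen directly. The snippet below is only an illustration; the results noted in the comments are what MRI gives, and other implementations may differ for the small-integer case, which is exactly the concern.

```ruby
base = 2

small_a = 5
small_b = base + 3
big_a = base**70   # beyond the immediate-integer range
big_b = base**70

# On MRI, small integers are immediates, so equal values are the same object.
small_same_object = small_a.equal?(small_b)

# Large integers are ordinary heap objects: equal by value, but on MRI each
# computation yields a distinct object.
big_same_value  = big_a == big_b
big_same_object = big_a.equal?(big_b)

puts "small same object: #{small_same_object}"
puts "big same value:    #{big_same_value}"
puts "big same object:   #{big_same_object}"
```

A reference-based CAS keyed on big_a would therefore not match a stored big_b, even though the two compare equal with ==.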

Other concerns:

  • The JVM does not, until Java 8, have a way to insert an explicit memory barrier into Java code without having a volatile field access or a lock acquisition (which does volatile-like things). Even in Java 8, it is via a non-standard "fences" API. JRuby currently uses it to improve volatility guarantees of instance variables. On Java 6 and 7 we fall back on a slower implementation that uses explicit volatile operations on a larger scale.

  • The JVM also does not provide a way to make only a single element of an array be volatile, but you can use nonstandard back-door APIs to simulate it (which is what AtomicReferenceArray and friends do).

  • JVM folks have introduced the concept of a "lazy set" which is intended to mean you don't really expect full volatile semantics for this write (and don't want to pay for volatile semantics every time).

  • Optimizing implementations may get to a point where they can optimize away repeated accesses of instance variables. In the Java world, these optimizations are limited by the volatile field modifier and the Java Memory Model, which inserts explicit ordering and visibility constraints on volatile accesses. It would seem to me that Ruby needs to more formally define volatile semantics along with adding this feature.

That's all I have for now :-)

#2 Updated by Yura Sokolov about 1 year ago

I think @ivar access should not be volatile, as in any other language,
but obj.ivar could be volatile if attr_atomic :ivar were called.

Numeric idempotence should not be a big problem, because most of the time the
same old object is used for CAS. But yes, we could treat numbers as a special
case and do a two-step CAS (Ruby-like pseudocode):

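def ivar_cas(old, new)
  if Numeric === old
    # Numerics: compare by value first, then CAS against the stored object.
    stored = @ivar
    if stored == old
      ivar_hardware_cas(stored, new)
    end
  else
    # Everything else: plain reference CAS.
    ivar_hardware_cas(old, new)
  end
end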

But I could not help with JVM internals :(

#3 Updated by Charles Nutter about 1 year ago

funny_falcon (Yura Sokolov) wrote:

I think @ivar access should not be volatile, as in any other language,
but obj.ivar could be volatile if attr_atomic :ivar were called.

Agreed. The dynamic nature by which @ivar can be instantiated makes marking them as volatile very tricky, on any implementation.

Numeric idempotence should not be a big problem, because most of the time the
same old object is used for CAS. But yes, we could treat numbers as a special
case and do a two-step CAS (Ruby-like pseudocode):

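def ivar_cas(old, new)
  if Numeric === old
    # Numerics: compare by value first, then CAS against the stored object.
    stored = @ivar
    if stored == old
      ivar_hardware_cas(stored, new)
    end
  else
    # Everything else: plain reference CAS.
    ivar_hardware_cas(old, new)
  end
end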

This logic would be sufficient in JRuby as well, but comes with a fairly high cost: an === call even when the value is non-numeric.

The same logic implemented natively in the atomic accessors would probably be simple enough to optimize (e.g. in JRuby it would be an instanceof RubyNumeric check).

#4 Updated by Charles Nutter about 1 year ago

FYI, link to a current issue with the atomic gem I'm fixing using a loop + == + CAS: https://github.com/headius/ruby-atomic/issues/19

#5 Updated by Nobuyoshi Nakada about 1 year ago

  • Description updated (diff)

#6 Updated by Nobuyoshi Nakada about 1 year ago

Why do you consider comparison atomic?

#7 Updated by Yura Sokolov about 1 year ago

Comparison is not atomic. It is only used to ensure that we can use the value stored in @ivar for the real CAS. The semantics of the method as a whole do not change, because if the comparison fails, then the CAS would fail as well.

#8 Updated by Charles Nutter about 1 year ago

Comparison of two numeric values should be consistent and unchanging, or else I feel that various contracts of numbers are being violated. In Java, this is handled by having numeric values be primitives, and therefore all representations of equality are consistent. In Ruby, where some numerics are idempotent and some are not, I think it is reasonable to extend the CAS operation to do a value equality check. So, the contract would be:

  • For non-numeric types, CAS checks only reference equality (hardware CAS).
  • For numeric types, CAS checks value equality (using reference equality -- hardware CAS -- to ensure nothing has changed while checking value equality).

This is how version 1.1.8 of the atomic gem will work, once I (or someone else) implements value equality CAS for the C ext.

#9 Updated by Charles Nutter about 1 year ago

I have completed adding the numeric logic to the atomic gem and pushed 1.1.8.

The version for JRuby is here: https://github.com/headius/ruby-atomic/blob/master/ext/org/jruby/ext/atomic/AtomicReferenceLibrary.java#L129

The version for MRI, Rubinius, and others is here: https://github.com/headius/ruby-atomic/blob/master/lib/atomic/numeric_cas_wrapper.rb

#10 Updated by Yura Sokolov about 1 year ago

Charles, I'm quite sure there is no need for "while true" in your numeric-handling CAS: the nature of CAS is "change only if no one else has changed it yet", so your "while true" violates the nature of CAS.



#11 Updated by Charles Nutter about 1 year ago

The "while true" loop is there in order to re-check if the value is == after a change. My justification is that the only atomic part of this is the final CAS, but we want to pretend that the whole == + CAS is atomic; so this loops until either the current value is non-numeric, non-equal, or numeric + equal + has not changed since we last got it.

This pattern is used fairly often in the concurrency utilities on JVM for performing non-atomic logic surrounding an atomic update. Without the loop, an update that happens after the == check and before the CAS but which does not change the value of the currently-referenced object would fail. I don't think it should.
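The loop described above can be sketched roughly as follows. This is not the code from the atomic gem: the Ref class and its method names are hypothetical, and reference_cas uses a Mutex as a stand-in for the hardware primitive.

```ruby
# Hypothetical reference cell used only to illustrate the retry loop.
class Ref
  def initialize(value)
    @value = value
    @lock = Mutex.new
  end

  def get
    @lock.synchronize { @value }
  end

  # Reference-equality CAS, standing in for the hardware primitive.
  def reference_cas(old, new)
    @lock.synchronize do
      next false unless @value.equal?(old)
      @value = new
      true
    end
  end

  # Value CAS for numerics: loop until the outcome is decided.
  def compare_and_set(old, new)
    return reference_cas(old, new) unless old.is_a?(Numeric)

    loop do
      current = get
      # Non-numeric or unequal current value: a genuine CAS failure.
      return false unless current.is_a?(Numeric) && current == old
      # Equal value: swap against the object actually stored. If another
      # thread replaced it in between, re-check instead of giving up.
      return true if reference_cas(current, new)
    end
  end
end
```

The loop only repeats when the stored object was replaced between the == check and the reference CAS; if the replacement still compares equal to old, the next iteration succeeds, and otherwise it returns false.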

#12 Updated by Dirkjan Bussink about 1 year ago

What I'm wondering is, do we want to enforce the overhead of numeric CAS for all applications of CAS? Also in the case of numeric handling, the pattern in which I've used CAS most often is that I base the old value on the existing one, which of course still works fine for CAS operations on references.

What I see from this discussion is perhaps two APIs: one that is basically identity based and one that is equality based. Wouldn't it be a better idea to provide these two APIs separately? That way we don't have to special-case numeric handling, and people also get equality-like handling for non-Numeric classes, which would work like the numeric logic here. People can then decide which kind of CAS they need based on which kind of comparison they need.

#13 Updated by Charles Nutter about 1 year ago

dbussink (Dirkjan Bussink) wrote:

What I'm wondering is, do we want to enforce the overhead of numeric CAS for all applications of CAS? Also in the case of numeric handling, the pattern in which I've used CAS most often is that I base the old value on the existing one, which of course still works fine for CAS operations on references.

To be clear, in the Ruby impls of numeric CAS, the only additional cost for non-numerics is a kind_of? check.

In JRuby, it's an instanceof check, which is pretty darn fast.

What I see from this discussion is perhaps two APIs: one that is basically identity based and one that is equality based. Wouldn't it be a better idea to provide these two APIs separately? That way we don't have to special-case numeric handling, and people also get equality-like handling for non-Numeric classes, which would work like the numeric logic here. People can then decide which kind of CAS they need based on which kind of comparison they need.

That would certainly avoid overhead in the non-numeric case, but I worry it would lead to too much confusion. People would forget about the equality CAS and use the other one and get weird bugs because they didn't have the same numeric object in hand. It's also hard to track whether you actually have the same object, since most impls emulate the same value / same object_id behavior from MRI.

In the Atomic gem, I still think it's valid to explicitly have AtomicInteger and AtomicFloat to speed up how those are handled (we can use native CAS against 64-bit long and double rather than against the object reference), but this is a case where it seems like everyone would expect numbers to CAS based on equality rather than reference identity, and not doing it will lead to neverending complaints. I could be wrong.

#14 Updated by Dirkjan Bussink about 1 year ago

I highly doubt the neverending complaints case, since I think people using CAS would usually know what they are doing (at least in my experience with constructs like this). The overhead is actually bigger for the numeric case, where a CAS would work, for example, for Fixnum on MRI and Rubinius without the extra checks. That could of course be optimized in implementations for those platforms.

If you're worried about confusion between equality and identity, we could also have an equality based CAS be the default and have the possibility of using an identity based version if people know that is what they want.

#15 Updated by Charles Nutter about 1 year ago

dbussink (Dirkjan Bussink) wrote:

I highly doubt the neverending complaints case, since I think people using CAS would usually know what they are doing (at least in my experience with constructs like this). The overhead is actually bigger for the numeric case, where a CAS would work, for example, for Fixnum on MRI and Rubinius without the extra checks. That could of course be optimized in implementations for those platforms.

You may be right about complaints...I use atomics all the time but I'm unusual. Skewed viewpoint, perhaps.

I'm not sure what you mean by "the overhead is actually bigger". It's a kind_of? type check at worst...is that expensive in Rubinius?

Also, Fixnum will not work consistently on MRI or Rubinius without extra checks if any of the values are close to the Fixnum boundary. Optimization specific to those impls would still have to confirm the Fixnum is within a certain range. Perhaps working with Fixnums that are at the failover point into Bignum is not common, but you can't just omit those checks. And you have to know you're dealing with a Fixnum anyway...so you have to check every time.

My justification for doing it unconditionally for all numerics is largely because of the overflow into Bignum. Ruby pretends that integers are one continuum, but only part of that continuum (varying across impls and architectures) is actually idempotent. As a result, all integers would need some portion of the equality checking logic on every implementation.

If you're worried about confusion between equality and identity, we could also have an equality based CAS be the default and have the possibility of using an identity based version if people know that is what they want.

That's not a bad option, I guess. The main problem here is that I feel like people expect numerics of equal value to essentially be identical, and that's not the case for most numerics on most implementations. If people think they're essentially identical, they might expect CAS to work properly. I don't believe people would have that expectation of non-numerics, so extending equality CAS to all types seems like overkill.

#16 Updated by Charles Nutter about 1 year ago

Having some discussions with dbussink on IRC...

I think the most universally correct option is to have two different paths for reference CAS and value CAS, and rely upon users to choose the right one. As dbussink points out, the people who are going to use atomic ivars are much more likely to know what they're doing.

We don't want to have an additional method for every single atomic ivar, so perhaps a parameter?

class MyClass
  atomic_accessor :foo

  # generates CAS accessor similar to...
  def foo_cas(old, new, compare_value = false)
    ...
  end
end

mc = MyClass.new
mc.foo = 5 # fixnum, so we'll want value CAS
...

# true causes value comparison CAS like I implemented in atomic gem
# false does reference CAS as normal, with variable behavior for some numeric values.
mc.foo_cas(5, 6, true)

I can't imagine two separate methods getting approved but this form seems like it would be acceptable.
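To make the two paths concrete, here is one way such a generated accessor might behave. It is a sketch under assumptions: the accessor is written by hand rather than generated by atomic_accessor, and a per-object Mutex (the hypothetical @foo_lock) replaces a real atomic primitive, which also makes the value comparison trivially atomic here.

```ruby
class MyClass
  def initialize(foo)
    @foo = foo
    @foo_lock = Mutex.new # stand-in for a hardware CAS
  end

  def foo
    @foo_lock.synchronize { @foo }
  end

  # compare_value = false: reference CAS (same object required).
  # compare_value = true:  value CAS (== is enough).
  def foo_cas(old, new, compare_value = false)
    @foo_lock.synchronize do
      matches = compare_value ? @foo == old : @foo.equal?(old)
      @foo = new if matches
      matches
    end
  end
end

base = 2
mc = MyClass.new(base**70)

# A freshly computed, equal integer is a different object on MRI,
# so the reference CAS does not match but the value CAS does.
by_reference = mc.foo_cas(base**70, 0)
by_value     = mc.foo_cas(base**70, 0, true)
puts "reference CAS: #{by_reference}, value CAS: #{by_value}"
```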

#17 Updated by Charles Nutter 7 months ago

  • Target version set to Ruby 2.1.0

Trying to wake this one up in hopes of getting it into 2.1. Is there any chance?

Forgive me if I'm breaking process somehow, but ko1 told me to mark the issues I want in 2.1 with Target version=2.1, so I've been doing that.
