Feature #17763
Implement cache for cvars (closed)
Description
Introduce inline cache for class variable reads
@tenderlove and I would like to introduce an inline cache for class variable reads. We've attached a patch that introduces the cache. Class variable reads are common in Rails applications; for example, Active Record's #logger.
GitHub PR: https://github.com/ruby/ruby/pull/4340
Cache Design
This patch introduces a hash table that's stored on the same class as the class variable value.
For example:
class A
  @@foo = 1
end
class B < A
  def self.read_foo
    @@foo
  end
end
The above code stores the value for @@foo on the A class and stores an inline cache value on the A class as well. The instruction sequences for the read_foo method point at the CVAR inline cache entry stored on class A.
The lifecycle of these caches is similar to that of instance variable inline caches.
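To make the storage location concrete, here is a small check using only public reflection methods. It shows where the value lives, not the cache itself, which is internal to the VM and not visible from Ruby code.

```ruby
class A
  @@foo = 1
end

class B < A
  def self.read_foo
    @@foo
  end
end

# Passing false lists only the class variables defined directly on the receiver.
p A.class_variables(false) # => [:@@foo]
p B.class_variables(false) # => []
p B.read_foo               # => 1
```

The read in B resolves to the variable owned by A, which is why the cache entry can live next to the value on A.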
[Diagram of the cache]
Performance Characteristics
When class variables are read, Ruby needs to check each class in the inheritance tree to ensure that the class variable isn't set on any other classes in the tree. If the same cvar is set on a class in the inheritance tree then a "cvar overtaken" error will be raised.
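A minimal sketch of the "overtaken" case, assuming Ruby 3.0 or later (where this situation raises a RuntimeError). The class names Base and Derived are made up for illustration.

```ruby
class Base
end

class Derived < Base
  @@setting = :derived

  def self.setting
    @@setting
  end
end

first_read = Derived.setting # only Derived defines @@setting so far
p first_read # => :derived

# The same name is now defined higher up the inheritance tree.
class Base
  @@setting = :base
end

begin
  Derived.setting
  outcome = :no_error
rescue RuntimeError => e
  outcome = e.message
end
puts outcome
```

Because a later definition anywhere in the tree can change the result of a read, an uncached read has to re-check the whole tree every time.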
Because of how cvar reads work, the more classes there are in the inheritance tree, the more expensive a cvar read is. To demonstrate this, here is a benchmark that reads a cvar from a class with 1 module, 30 modules, and 100 modules in the inheritance chain. On Ruby master, a read with 100 modules is 8.5x slower than a read with 1 module. With the cache, there is no performance difference between including 1 module and including 100 modules.
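As a rough pure-Ruby model of why the cost grows, the sketch below scans every ancestor on each read and counts how many it visits. The helper uncached_cvar_lookup and the classes Small and Large are hypothetical; the real lookup is implemented in C inside the VM.

```ruby
# Hypothetical helper: models an uncached read by scanning every ancestor.
def uncached_cvar_lookup(klass, name)
  visited = 0
  owners = klass.ancestors.select do |mod|
    visited += 1
    mod.class_variables(false).include?(name)
  end
  raise NameError, "uninitialized class variable #{name}" if owners.empty?
  raise "class variable #{name} is defined on more than one ancestor" if owners.size > 1

  [owners.first.class_variable_get(name), visited]
end

class Small
  @@foo = 1
end

class Large
  @@foo = 1
  100.times { include Module.new } # 100 extra entries in the ancestor chain
end

small_value, small_steps = uncached_cvar_lookup(Small, :@@foo)
large_value, large_steps = uncached_cvar_lookup(Large, :@@foo)
puts "Small: value=#{small_value}, ancestors visited=#{small_steps}"
puts "Large: value=#{large_value}, ancestors visited=#{large_steps}"
```

The number of ancestors visited grows by exactly the number of included modules, which mirrors the linear slowdown in the benchmark.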
Benchmark script:
require "benchmark/ips"
MODULES = ["B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", "AA", "BB", "CC", "DD", "EE", "FF", "GG", "HH", "II", "JJ", "KK", "LL", "MM", "NN", "OO", "PP", "QQ", "RR", "SS", "TT", "UU", "VV", "WW", "XX", "YY", "ZZ", "AAA", "BBB", "CCC", "DDD", "EEE", "FFF", "GGG", "HHH", "III", "JJJ", "KKK", "LLL", "MMM", "NNN", "OOO", "PPP", "QQQ", "RRR", "SSS", "TTT", "UUU", "VVV", "WWW", "XXX", "YYY", "ZZZ", "AAAA", "BBBB", "CCCC", "DDDD", "EEEE", "FFFF", "GGGG", "HHHH", "IIII", "JJJJ", "KKKK", "LLLL", "MMMM", "NNNN", "OOOO", "PPPP", "QQQQ", "RRRR", "SSSS", "TTTT", "UUUU", "VVVV", "WWWW"]
class A
  @@foo = 1
  def self.foo
    @@foo
  end
  eval <<-EOM
    module #{MODULES.first}
    end
    include #{MODULES.first}
  EOM
end
class Athirty
  @@foo = 1
  def self.foo
    @@foo
  end
  MODULES.take(30).each do |module_name|
    eval <<-EOM
      module #{module_name}
      end
      include #{module_name}
    EOM
  end
end
class Ahundred
  @@foo = 1
  def self.foo
    @@foo
  end
  MODULES.each do |module_name|
    eval <<-EOM
      module #{module_name}
      end
      include #{module_name}
    EOM
  end
end
Benchmark.ips do |x|
  x.report "1 module" do
    A.foo
  end
  x.report "30 modules" do
    Athirty.foo
  end
  x.report "100 modules" do
    Ahundred.foo
  end
  x.compare!
end
Ruby 3.0 master:
Warming up --------------------------------------
1 module 1.231M i/100ms
30 modules 432.020k i/100ms
100 modules 145.399k i/100ms
Calculating -------------------------------------
1 module 12.210M (± 2.1%) i/s - 61.553M in 5.043400s
30 modules 4.354M (± 2.7%) i/s - 22.033M in 5.063839s
100 modules 1.434M (± 2.9%) i/s - 7.270M in 5.072531s
Comparison:
1 module: 12209958.3 i/s
30 modules: 4354217.8 i/s - 2.80x (± 0.00) slower
100 modules: 1434447.3 i/s - 8.51x (± 0.00) slower
Ruby 3.0 with cvar cache:
Warming up --------------------------------------
1 module 1.641M i/100ms
30 modules 1.655M i/100ms
100 modules 1.620M i/100ms
Calculating -------------------------------------
1 module 16.279M (± 3.8%) i/s - 82.038M in 5.046923s
30 modules 15.891M (± 3.9%) i/s - 79.459M in 5.007958s
100 modules 16.087M (± 3.6%) i/s - 81.005M in 5.041931s
Comparison:
1 module: 16279458.0 i/s
100 modules: 16087484.6 i/s - same-ish: difference falls within error
30 modules: 15891406.2 i/s - same-ish: difference falls within error
Rails Application Benchmarks
We also benchmarked ActiveRecord::Base.logger since logger is a cvar and there are 63 modules in the inheritance chain. This is an example of a real-world improvement to Rails applications.
Benchmark:
require "benchmark/ips"
require_relative "config/environment"
Benchmark.ips do |x|
  x.report "logger" do
    ActiveRecord::Base.logger
  end
end
Ruby 3.0 master:
Warming up --------------------------------------
logger 155.251k i/100ms
Calculating -------------------------------------
Ruby 3.0 with cvar cache:
Warming up --------------------------------------
logger 1.546M i/100ms
Calculating -------------------------------------
logger 14.857M (± 4.8%) i/s - 74.198M in 5.006202s
We also measured database queries in Rails and with the cvar cache they are about ~9% faster.
Benchmark code:
class BugTest < Minitest::Test
  def test_association_stuff
    post = Post.create!
    Benchmark.ips do |x|
      x.report "query" do
        Post.first
      end
    end
  end
end
Ruby 3.0 master / Rails 6.1:
Warming up --------------------------------------
query 790.000 i/100ms
Calculating -------------------------------------
query 7.601k (± 3.8%) i/s - 38.710k in 5.100534s
Ruby 3.0 cvar cache / Rails 6.1:
Warming up --------------------------------------
query 731.000 i/100ms
Calculating -------------------------------------
query 7.089k (± 3.3%) i/s - 35.819k in 5.058215s
Updated by eileencodes (Eileen Uchitelle) almost 4 years ago
This is the missing benchmark result that I copied and pasted incorrectly.
Ruby master / Rails 6.1:
```
Warming up --------------------------------------
logger 155.251k i/100ms
Calculating -------------------------------------
logger 1.502M (± 4.5%) i/s - 7.607M in 5.076869s
```
This branch / Rails 6.1:
```
Warming up --------------------------------------
logger 1.546M i/100ms
Calculating -------------------------------------
logger 14.857M (± 4.8%) i/s - 74.198M in 5.006202s
```
Updated by Eregon (Benoit Daloze) almost 4 years ago
Nice work.
I guess using a global serial here is the only way to handle overtaking without redoing the lookup every time like before.
I wonder, could the serial be global (not per module) but per name? (not asking to change anything, just curious)
Personally I wish we would just deprecate class variables, because I believe nobody wants the strange semantics, and they are inherently less efficient than instance variables on modules.
Or potentially make them equivalent in behavior to instance variables on modules (#14541).
But that is probably never going to happen.
Updated by tenderlovemaking (Aaron Patterson) almost 4 years ago
Eregon (Benoit Daloze) wrote in #note-2:
Nice work.
I guess using a global serial here is the only way to handle overtaking without redoing the lookup every time like before.
I wonder, could the serial be global (not per module) but per name? (not asking to change anything, just curious)
I think we probably could do that. Keeping a global counter just seemed like the easiest solution at the moment. Also, cvars seem very unpopular (compared with ivars) so I'm not sure adding complexity would be worth while.
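The trade-off being discussed can be sketched with a toy model: each read site remembers the owning module together with the value of one global counter, and trusts the cached owner only while the counter is unchanged. Everything here (CvarCacheModel, Entry, define, read, the site key) is hypothetical and stores values in a plain Hash; it is not the data structure used in the patch.

```ruby
# Hypothetical model of an inline cache guarded by one global counter.
# None of these names come from the patch; storage is a plain Hash.
class CvarCacheModel
  Entry = Struct.new(:owner, :serial)

  attr_reader :slow_lookups

  def initialize
    @global_serial = 0 # bumped whenever a cvar appears on a new module
    @storage = {}      # module => { name => value }
    @entries = {}      # read site => Entry
    @slow_lookups = 0
  end

  def define(mod, name, value)
    table = (@storage[mod] ||= {})
    @global_serial += 1 unless table.key?(name)
    table[name] = value
  end

  def read(site, klass, name)
    entry = @entries[site]
    return @storage[entry.owner][name] if entry && entry.serial == @global_serial

    # Slow path: walk every ancestor, as an uncached read would.
    @slow_lookups += 1
    owners = klass.ancestors.select { |mod| @storage[mod]&.key?(name) }
    raise NameError, "uninitialized class variable #{name}" if owners.empty?
    raise "class variable #{name} is overtaken" if owners.size > 1

    @entries[site] = Entry.new(owners.first, @global_serial)
    @storage[owners.first][name]
  end
end

model = CvarCacheModel.new
base = Class.new
derived = Class.new(base)

model.define(base, :@@foo, 1)
3.times { model.read(:read_foo, derived, :@@foo) }
p model.slow_lookups # => 1

# Defining an unrelated variable on an unrelated class still bumps the counter.
model.define(Class.new, :@@bar, 2)
p model.read(:read_foo, derived, :@@foo) # => 1
p model.slow_lookups # => 2
```

In this model a per-name counter would avoid the second slow lookup, at the cost of keeping one counter per variable name instead of a single integer.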
Updated by Eregon (Benoit Daloze) almost 4 years ago
Eregon (Benoit Daloze) wrote in #note-2:
they are inherently less efficient than instance variables on modules.
@chrisseaton told me that it's not necessarily the case, and I think they can be exactly the same for a regular read (Shape check + load field in TruffleRuby).
I was thinking of the situation before this optimization, where CRuby would do an ancestor lookup every time for the overtaking logic, IIRC.
Also, cvars seem very unpopular (compared with ivars) so I'm not sure adding complexity would be worth while.
Yeah, definitely. Which is why I've been hesitant about optimizing class variables, it's unclear if the cost of the not-so-trivial optimization would pay off.
It clearly seems worth it if code like the Rails logger keeps using class variables though.
Updated by duerst (Martin Dürst) almost 4 years ago
Eregon (Benoit Daloze) wrote in #note-4:
Eregon (Benoit Daloze) wrote in #note-2:
Also, cvars seem very unpopular (compared with ivars) so I'm not sure adding complexity would be worth while.
Yeah, definitely. Which is why I've been hesitant about optimizing class variables, it's unclear if the cost of the not-so-trivial optimization would pay off.
It clearly seems worth it if code like the Rails logger keeps using class variables though.
I'm not sure that "better optimize, because some important code keeps using this, but don't really optimize all the way, because it's not so popular" makes sense. (I'm not blaming Eregon, nor Aaron, nor Eileen, nor anybody else.)
What would it take e.g. to switch Rails logger to something else?
If the reason that some places are keeping class variables is convenience, then maybe we need to come up with a more convenient syntax for class instance variables.
Just trying to think out loud, sorry.
Updated by tenderlovemaking (Aaron Patterson) almost 4 years ago
duerst (Martin Dürst) wrote in #note-5:
What would it take e.g. to switch Rails logger to something else?
If the reason that some places are keeping class variables is convenience, then maybe we need to come up with a more convenient syntax for class instance variables.
Class variables have different semantics than class instance variables. It's possible to switch some things to use class instance variables without breaking existing behavior, but in other places we would have to basically re-implement class variable behavior (and at that point, you may as well use class variables).
I think the logger is one of the places where we depend on class variable semantics and it's unlikely to change.
Also in this case, it makes me feel weird to change the implementation of Rails when we can just make Ruby perform better. Changing Rails to suit Ruby seems like the tail wagging the dog (obviously not all cases are clear cut though)
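A small illustration of the semantic difference mentioned above, using made-up classes Config and SubConfig: a class variable is a single slot shared by the whole inheritance tree, while a class instance variable belongs to one class object only.

```ruby
class Config
  @@shared = :from_config   # class variable: one slot for the whole tree
  @per_class = :from_config # class instance variable: belongs to Config only

  def self.shared; @@shared; end
  def self.shared=(value); @@shared = value; end
  def self.per_class; @per_class; end
  def self.per_class=(value); @per_class = value; end
end

class SubConfig < Config
end

SubConfig.shared = :from_sub
SubConfig.per_class = :from_sub

p Config.shared       # => :from_sub    (the write through the subclass is visible everywhere)
p Config.per_class    # => :from_config (the subclass got its own @per_class)
p SubConfig.per_class # => :from_sub
```

Code that relies on the first behavior cannot switch to class instance variables without adding its own sharing logic.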
Updated by Eregon (Benoit Daloze) almost 4 years ago
tenderlovemaking (Aaron Patterson) wrote in #note-6:
Also in this case, it makes me feel weird to change the implementation of Rails when we can just make Ruby perform better. Changing Rails to suit Ruby seems like the tail wagging the dog (obviously not all cases are clear cut though)
I have a different view of this: before this change, class variables were always extremely slow compared to class instance variables (10x in the case of ActiveRecord::Base.logger).
So changing it in Rails would fix it for all Ruby versions.
The logger seems to be handled by mattr_accessor:
https://github.com/rails/rails/blob/d612542336d9a61381311c95a27d801bb4094779/activerecord/lib/active_record/core.rb#L20
And mattr_accessor is what uses class variables:
https://github.com/rails/rails/blob/d612542336d9a61381311c95a27d801bb4094779/activesupport/lib/active_support/core_ext/module/attribute_accessors.rb
But it seems like it could use an instance variable on the class instead: attr_accessor for the class-level accessor, plus, for instance methods, def foo; self.class.foo; end and the corresponding writer.
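A minimal sketch of that suggestion, assuming no inheritance of the value is needed. The class name Service is made up, and this is not how Rails implements its accessors.

```ruby
class Service
  class << self
    attr_accessor :logger # stored as @logger on the Service class object
  end

  # Instance-level reader and writer delegate to the class.
  def logger
    self.class.logger
  end

  def logger=(value)
    self.class.logger = value
  end
end

Service.logger = :stdout_logger
p Service.new.logger # => :stdout_logger

# Unlike a class variable, the value is not visible through a subclass.
p Class.new(Service).logger # => nil
```

The last line is where the semantics diverge from mattr_accessor, so a drop-in replacement would need extra work wherever subclasses are expected to see the parent's value.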
but in other places we would have to basically re-implement class variable behavior (and at that point, you may as well use class variables).
That might be worth it for performance, i.e., using class methods + class instance variables instead of class variables.
Also https://github.com/rails/rails/blob/d612542336d9a61381311c95a27d801bb4094779/activesupport/lib/active_support/core_ext/class/attribute.rb#L85 seems to have some inheritance and yet already does not use class variables.
My personal point of view is that class variables are de facto deprecated syntax due to their strange and often unexpected semantics, which just about every Ruby tutorial/book recommends against using.
The fewer usages of class variables we have, the better off we are, IMHO.
I once went through removing many class variables in the CRuby repo and it was fairly easy for most cases IIRC, in part because most of these cases did not want any inheritance.
To clarify, my point is that this is good work, and now that it's done I think we should merge it,
but it seems worthwhile to look in Rails and other places if class variables are really needed.
If they can be replaced with class ivars, we could speed those accesses on all Ruby versions, not just 3.1+.
Updated by Eregon (Benoit Daloze) over 3 years ago
FWIW, ActiveRecord::Base.logger no longer uses class variables since
https://github.com/rails/rails/commit/dcc2530af74cf6355a9206bb1d0b084a734fae3e (nice!)
Maybe we should really just deprecate class variables, since their semantics are confusing (see below), their performance is not great and they are complicated to optimize, and the advice not to use them has been around in Ruby for a long time (de facto deprecated).
That could start with a $VERBOSE=true warning, then a $VERBOSE=false warning, and then removal.
Example of weird semantics, which I would think no one would ever want:
class C; @@a = 1; end
class D < C; end
class D; @@a = 2; end
class C; p @@a; end # => 2
Updated by byroot (Jean Boussier) over 3 years ago
FWIW, ActiveRecord::Base.logger no longer uses class variables since https://github.com/rails/rails/commit/dcc2530af74cf6355a9206bb1d0b084a734fae3e
Yes, we've talked about changing it since; it's not really related to this ticket. IMHO the two are orthogonal, AR::Base.logger just happened to be a good example of the potential performance impact.
Also note that it was replaced by class_attribute, which is a "Railsism" and has a non-trivial implementation. Non-Rails projects with similar problems don't really have a solution for this.
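To give a rough idea of what such a replacement involves outside Rails, here is a heavily simplified sketch of an inheritable class-level attribute that redefines a singleton reader on write. InheritableAttribute, inheritable_attribute, and the Component classes are hypothetical names; this is not Active Support's implementation and omits instance accessors, predicates, and other options.

```ruby
# Hypothetical helper, not Active Support's class_attribute.
module InheritableAttribute
  def inheritable_attribute(name, default = nil)
    define_singleton_method(name) { default }
    define_singleton_method("#{name}=") do |value|
      # self is the class the writer was called on, so only that class
      # (and its subclasses) see the new reader.
      define_singleton_method(name) { value }
    end
  end
end

class Component
  extend InheritableAttribute
  inheritable_attribute :logger, :base_logger
end

class SubComponent < Component; end
class OtherComponent < Component; end

SubComponent.logger = :sub_logger
p Component.logger      # => :base_logger
p SubComponent.logger   # => :sub_logger
p OtherComponent.logger # => :base_logger

Component.logger = :new_base_logger
p OtherComponent.logger # => :new_base_logger
p SubComponent.logger   # => :sub_logger
```

Subclasses inherit the parent's value until they assign their own, and a write on a subclass never leaks upward, which is the behavior class variables do not provide.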
Maybe we should really just deprecate class variables since their semantics are confusing
The problem is that in some cases there's not really any alternative, so I'm all for deprecating them, but IMHO a replacement with better semantics is needed.
Or maybe their behavior could be changed with some kind of switch:
class MyClass
  new_class_variable_semantic
  @@foo = 1
end
Which would open the door to changing their semantics over the course of a few releases.
Updated by ko1 (Koichi Sasada) over 3 years ago
We also benchmarked ActiveRecord::Base.logger since logger is a cvar and there are 63 modules in the inheritance chain. This is an example of a real-world improvement to Rails applications.
Calculating -------------------------------------
It seems you forgot to include the result.
We also measured database queries in Rails and with the cvar cache they are about ~9% faster.
It seems w/o cache is faster. Could you check it? (opposite results?)
Do you have application benchmark (request per second, for example)?
I understand the performance benefit, but it can introduce complexity so I want to confirm the value of this proposal.
Thanks,
Koichi
Updated by byroot (Jean Boussier) over 3 years ago
It seems you forgot to include the result.
It's in the first comment: https://bugs.ruby-lang.org/issues/17763#note-1
Ruby master / Rails 6.1:
Warming up --------------------------------------
logger 155.251k i/100ms
Calculating -------------------------------------
logger 1.502M (± 4.5%) i/s - 7.607M in 5.076869s
This branch / Rails 6.1:
Warming up --------------------------------------
logger 1.546M i/100ms
Calculating -------------------------------------
logger 14.857M (± 4.8%) i/s - 74.198M in 5.006202s
Updated by eileencodes (Eileen Uchitelle) over 3 years ago
I ran benchmarks using railsbench and the branch with the CVAR cache is a lot faster: 657 requests per second, versus 615 requests per second on master.
Master:
$ RAILS_ENV=production INTERVAL=100 WARMUP=1 BENCHMARK=10000 ruby bin/bench
ruby 3.1.0dev (2021-06-04T00:24:57Z master 91c542ad05) [x86_64-darwin19]
Warming up...
Warmup: 1 requests
Benchmark: 10000 requests
Request per second: 615.1 [#/s] (mean)
Percentage of the requests served within a certain time (ms)
50% 1.57
66% 1.68
75% 1.74
80% 1.78
90% 1.91
95% 2.06
98% 2.36
99% 2.67
100% 35.15
CVAR Branch
$ RAILS_ENV=production INTERVAL=100 WARMUP=1 BENCHMARK=10000 ruby bin/bench
ruby 3.1.0dev (2021-06-04T17:40:20Z add-cache-for-clas.. 37c96af98b) [x86_64-darwin19]
Warming up...
Warmup: 1 requests
Benchmark: 10000 requests
Request per second: 657.1 [#/s] (mean)
Percentage of the requests served within a certain time (ms)
50% 1.46
66% 1.56
75% 1.63
80% 1.68
90% 1.82
95% 2.01
98% 2.28
99% 2.50
100% 35.13
Updated by eileencodes (Eileen Uchitelle) over 3 years ago
This is the script/app I used to measure: https://github.com/k0kubun/railsbench
Updated by ko1 (Koichi Sasada) over 3 years ago
Thank you for the benchmark.
How about that?
We also measured database queries in Rails and with the cvar cache they are about ~9% faster.
It seems w/o cache is faster. Could you check it? (opposite results?)
Updated by eileencodes (Eileen Uchitelle) over 3 years ago
I re-ran the benchmarks. I accidentally switched them when I pasted them into here.
Before: Rails 6.1 / Ruby master
ruby 3.1.0dev (2021-06-04T00:24:57Z master 91c542ad05) [x86_64-darwin19]
-- create_table(:posts, {:force=>true})
-> 0.0067s
-- create_table(:comments, {:force=>true})
-> 0.0003s
Run options: --seed 33069
# Running:
Warming up --------------------------------------
query 708.000 i/100ms
Calculating -------------------------------------
query 7.177k (± 5.6%) i/s - 36.108k in 5.047561s
After: Rails 6.1 / CVAR cache branch
ruby 3.1.0dev (2021-06-04T17:40:20Z add-cache-for-clas.. 37c96af98b) [x86_64-darwin19]
last_commit=Add a cache for class variables
-- create_table(:posts, {:force=>true})
-> 0.0067s
-- create_table(:comments, {:force=>true})
-> 0.0003s
Run options: --seed 45750
# Running:
Warming up --------------------------------------
query 837.000 i/100ms
Calculating -------------------------------------
query 8.343k (± 1.3%) i/s - 41.850k in 5.017218s
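For reference, the before/after ratios implied by the figures posted in this thread can be computed directly. The numbers below are copied from the benchmark outputs above; the labels are only descriptive.

```ruby
# [before, after] in iterations (or requests) per second, copied from the thread.
results = {
  "ActiveRecord::Base.logger" => [1_502_000.0, 14_857_000.0],
  "Post.first query"          => [7_177.0, 8_343.0],
  "railsbench requests"       => [615.1, 657.1],
}

speedups = results.transform_values { |(before, after)| (after / before).round(2) }
speedups.each { |name, ratio| puts "#{name}: #{ratio}x" }
```

These are single runs with error margins of a few percent, so the ratios should be read as approximate.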
Updated by matz (Yukihiro Matsumoto) over 3 years ago
It seems great. The code complexity is my concern. But I agree with trying it.
Matz.
Updated by ko1 (Koichi Sasada) about 3 years ago
- Status changed from Open to Closed
merged: b91b3bc7717a97f4f1cdf6131b1688e1958dcfed
Updated by byroot (Jean Boussier) almost 2 years ago
- Related to Bug #19394: cvars in instance of cloned class point to source class's cvars even after class_variable_set on clone added