Feature #17763
Implement cache for cvars (closed)
Description
Introduce inline cache for class variable reads
@tenderlove and I would like to introduce an inline cache for class variable reads. We've attached a patch that implements the cache. Class variable reads are common in Rails applications, for example in Active Record's #logger.
GitHub PR: https://github.com/ruby/ruby/pull/4340
Cache Design
This patch introduces a hash table of inline cache entries that is stored on the same class as the class variable's value.
For example:
class A
  @@foo = 1
end

class B < A
  def self.read_foo
    @@foo
  end
end
The above code stores the value for @@foo on the A class and stores an inline cache value on the A class as well. The instruction sequences for the read_foo method point at the CVAR inline cache entry stored on class A.
The lifecycle of these caches is similar to that of instance variable inline caches.
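To make the storage location concrete, here is a minimal sketch that uses only public reflection methods (Module#class_variables). The class names CvarOwner and CvarReader are illustrative stand-ins for A and B. The sketch shows where the value lives; the inline cache entry itself is internal to the VM and is not visible from Ruby code.

```ruby
# Illustrative names; they mirror A and B from the example above.
class CvarOwner
  @@foo = 1
end

class CvarReader < CvarOwner
  def self.read_foo
    @@foo # found by searching the ancestors of CvarReader
  end
end

# Passing false restricts the list to cvars defined directly on the receiver.
owner_vars  = CvarOwner.class_variables(false)
reader_vars = CvarReader.class_variables(false)
# Without the argument, inherited cvars are included as well.
visible     = CvarReader.class_variables

puts "defined on owner:  #{owner_vars.inspect}"
puts "defined on reader: #{reader_vars.inspect}"
puts "visible to reader: #{visible.inspect}"
puts "value read:        #{CvarReader.read_foo}"
```

The subclass can read the variable even though nothing is stored on the subclass itself, which is why both the value and its cache entry can be kept on the owning class.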
Diagram of the cache:
Performance Characteristics
When class variables are read, Ruby needs to check each class in the inheritance tree to ensure that the class variable isn't set on any other classes in the tree. If the same cvar is set on a class in the inheritance tree then a "cvar overtaken" error will be raised.
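The conflict described above can be reproduced with a short sketch. It assumes Ruby 3.0 or later, where the conflict raises a RuntimeError (older versions only emitted a warning). The class names are hypothetical.

```ruby
class OvertakenParent
end

class OvertakenChild < OvertakenParent
  @@setting = :child # defined on the subclass first

  def self.setting
    @@setting
  end
end

# Defining the same cvar higher up the tree afterwards creates the conflict.
OvertakenParent.class_variable_set(:@@setting, :parent)

overtaken_message =
  begin
    OvertakenChild.setting
    nil # reached only if no error was raised
  rescue RuntimeError => e
    e.message
  end

puts overtaken_message.inspect
```

Because this check has to consider every class in the tree, an uncached read cannot stop at the first class that defines the variable.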
Because of how cvar reads work, the more classes there are in the inheritance tree, the more expensive a cvar read is. To demonstrate this, here is a benchmark that reads a cvar from a class with 1 module, 30 modules, and 100 modules in the inheritance chain. On Ruby master, a read with 100 modules included is 8.5x slower than a read with 1 module included. With the cache, there is no measurable performance difference between including 1 module and including 100 modules.
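As a rough model of why the cost grows, the sketch below counts how many ancestors a full walk has to visit. The cvar_walk helper is hypothetical and written in plain Ruby for illustration; the real lookup happens in C inside the VM, so only the relative growth is meaningful here.

```ruby
# Hypothetical helper: models an uncached read by inspecting every
# entry in the ancestor chain and collecting the ones that define the cvar.
def cvar_walk(klass, name)
  chain  = klass.ancestors
  owners = chain.select { |mod| mod.class_variables(false).include?(name) }
  { visited: chain.size, owners: owners }
end

class OneModule
  @@foo = 1
  include Module.new
end

class HundredModules
  @@foo = 1
  100.times { include Module.new }
end

one     = cvar_walk(OneModule, :@@foo)
hundred = cvar_walk(HundredModules, :@@foo)

puts "1 module:    #{one[:visited]} ancestors visited"
puts "100 modules: #{hundred[:visited]} ancestors visited"
puts "extra steps: #{hundred[:visited] - one[:visited]}"
```

Each included module adds one more entry to visit, even though none of the modules defines the variable. A cache that remembers the owning class removes that per-read walk.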
Benchmark script:
require "benchmark/ips"
MODULES = ["B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", "AA", "BB", "CC", "DD", "EE", "FF", "GG", "HH", "II", "JJ", "KK", "LL", "MM", "NN", "OO", "PP", "QQ", "RR", "SS", "TT", "UU", "VV", "WW", "XX", "YY", "ZZ", "AAA", "BBB", "CCC", "DDD", "EEE", "FFF", "GGG", "HHH", "III", "JJJ", "KKK", "LLL", "MMM", "NNN", "OOO", "PPP", "QQQ", "RRR", "SSS", "TTT", "UUU", "VVV", "WWW", "XXX", "YYY", "ZZZ", "AAAA", "BBBB", "CCCC", "DDDD", "EEEE", "FFFF", "GGGG", "HHHH", "IIII", "JJJJ", "KKKK", "LLLL", "MMMM", "NNNN", "OOOO", "PPPP", "QQQQ", "RRRR", "SSSS", "TTTT", "UUUU", "VVVV", "WWWW"]
# One module in the inheritance chain.
class A
  @@foo = 1

  def self.foo
    @@foo
  end

  eval <<-EOM
    module #{MODULES.first}
    end

    include #{MODULES.first}
  EOM
end

# Thirty modules in the inheritance chain.
class Athirty
  @@foo = 1

  def self.foo
    @@foo
  end

  MODULES.take(30).each do |module_name|
    eval <<-EOM
      module #{module_name}
      end

      include #{module_name}
    EOM
  end
end

# One hundred modules in the inheritance chain.
class Ahundred
  @@foo = 1

  def self.foo
    @@foo
  end

  MODULES.each do |module_name|
    eval <<-EOM
      module #{module_name}
      end

      include #{module_name}
    EOM
  end
end

Benchmark.ips do |x|
  x.report "1 module" do
    A.foo
  end

  x.report "30 modules" do
    Athirty.foo
  end

  x.report "100 modules" do
    Ahundred.foo
  end

  x.compare!
end
Ruby 3.0 master:
Warming up --------------------------------------
            1 module     1.231M i/100ms
          30 modules   432.020k i/100ms
         100 modules   145.399k i/100ms
Calculating -------------------------------------
            1 module     12.210M (± 2.1%) i/s -     61.553M in   5.043400s
          30 modules      4.354M (± 2.7%) i/s -     22.033M in   5.063839s
         100 modules      1.434M (± 2.9%) i/s -      7.270M in   5.072531s
Comparison:
            1 module: 12209958.3 i/s
          30 modules:  4354217.8 i/s - 2.80x  (± 0.00) slower
         100 modules:  1434447.3 i/s - 8.51x  (± 0.00) slower
Ruby 3.0 with cvar cache:
Warming up --------------------------------------
            1 module     1.641M i/100ms
          30 modules     1.655M i/100ms
         100 modules     1.620M i/100ms
Calculating -------------------------------------
            1 module     16.279M (± 3.8%) i/s -     82.038M in   5.046923s
          30 modules     15.891M (± 3.9%) i/s -     79.459M in   5.007958s
         100 modules     16.087M (± 3.6%) i/s -     81.005M in   5.041931s
Comparison:
            1 module: 16279458.0 i/s
         100 modules: 16087484.6 i/s - same-ish: difference falls within error
          30 modules: 15891406.2 i/s - same-ish: difference falls within error
Rails Application Benchmarks
We also benchmarked ActiveRecord::Base.logger, since logger is a cvar and there are 63 modules in the inheritance chain. This is an example of a real-world improvement to Rails applications.
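For readers unfamiliar with class-level accessors backed by class variables, the following sketch shows the general shape. The cvar_accessor macro, RecordBase, and Article are hypothetical names invented for illustration; this is not Active Record's actual code, only an assumption about how such an accessor can be built so that every call is a cvar read.

```ruby
# Hypothetical macro: defines class-level reader and writer methods
# that are backed by a class variable on the class that calls it.
module CvarAccessor
  def cvar_accessor(name)
    class_eval(<<~RUBY, __FILE__, __LINE__ + 1)
      @@#{name} = nil

      def self.#{name}
        @@#{name}
      end

      def self.#{name}=(value)
        @@#{name} = value
      end
    RUBY
  end
end

class RecordBase
  extend CvarAccessor
  cvar_accessor :logger
end

class Article < RecordBase
end

RecordBase.logger = :stdout_logger
shared_logger = Article.logger # the subclass sees the same value

puts shared_logger.inspect
puts Article.class_variables(false).inspect
```

Every call to the reader executes a class variable read, so the cost of that read is paid each time the accessor is used.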
Benchmark:
require "benchmark/ips"
require_relative "config/environment"
Benchmark.ips do |x|
  x.report "logger" do
    ActiveRecord::Base.logger
  end
end
Ruby 3.0 master:
Warming up --------------------------------------
              logger   155.251k i/100ms
Calculating -------------------------------------
Ruby 3.0 with cvar cache:
Warming up --------------------------------------
              logger     1.546M i/100ms
Calculating -------------------------------------
              logger     14.857M (± 4.8%) i/s -     74.198M in   5.006202s
We also measured database queries in Rails, and with the cvar cache they are about 9% faster.
Benchmark code:
class BugTest < Minitest::Test
  def test_association_stuff
    post = Post.create!

    Benchmark.ips do |x|
      x.report "query" do
        Post.first
      end
    end
  end
end
Ruby 3.0 master / Rails 6.1:
Warming up --------------------------------------
               query   790.000  i/100ms
Calculating -------------------------------------
               query      7.601k (± 3.8%) i/s -     38.710k in   5.100534s
Ruby 3.0 cvar cache / Rails 6.1:
Warming up --------------------------------------
               query   731.000  i/100ms
Calculating -------------------------------------
               query      7.089k (± 3.3%) i/s -     35.819k in   5.058215s