Feature #17763

Implement cache for cvars

Added by eileencodes (Eileen Uchitelle) 4 months ago. Updated about 1 month ago.

Status: Open
Priority: Normal
Assignee: -
Target version: -
[ruby-core:103105]

Description

Introduce inline cache for class variable reads

@tenderlove and I would like to introduce an inline cache for class variable reads. We've attached a patch that introduces the cache. Class variable reads are common in Rails applications; one example is Active Record's #logger.

GitHub PR: https://github.com/ruby/ruby/pull/4340

Cache Design

This patch introduces a hash table that's stored on the same class as the class variable value.

For example:

class A
  @@foo = 1
end

class B < A
  def self.read_foo
    @@foo
  end
end

The above code stores the value for @@foo on the A class and stores an inline cache value on the A class as well. The instruction sequences for the read_foo method point at the CVAR inline cache entry stored on class A.
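
The storage location described above can be observed from plain Ruby with Module#class_variables(false), which lists only the class variables defined directly on the receiver. This is a minimal sketch: the class names StoreA and StoreB are made up for illustration, and the inline cache itself is internal to the VM and is not visible this way.

```ruby
class StoreA
  @@foo = 1
end

class StoreB < StoreA
  def self.read_foo
    @@foo
  end
end

# Collect every ancestor of StoreB that defines @@foo directly.
owners = StoreB.ancestors.select { |mod| mod.class_variables(false).include?(:@@foo) }
value = StoreB.read_foo

puts "owners=#{owners.inspect} value=#{value}"
```

Only StoreA shows up as an owner, even though the read happens in a method defined on StoreB.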

The lifecycle of these caches is similar to that of instance variable inline caches.

Diagram of the cache:

[Figure: cvar cache]

Performance Characteristics

When class variables are read, Ruby needs to check each class in the inheritance tree to ensure that the class variable isn't set on any other classes in the tree. If the same cvar is set on a class in the inheritance tree then a "cvar overtaken" error will be raised.
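
A small sketch of the overtaking case, using hypothetical class names. It assumes Ruby 3.0 or later, where an overtaken class variable raises a RuntimeError on read; earlier versions only warned in verbose mode.

```ruby
class OvertakeParent
end

class OvertakeChild < OvertakeParent
  @@setting = :child   # defined on the child first

  def self.setting
    @@setting
  end
end

class OvertakeParent
  @@setting = :parent  # later defined on an ancestor as well
end

# The read has to walk the ancestors to notice the second definition.
begin
  OvertakeChild.setting
  overtaken_error = nil
rescue RuntimeError => e
  overtaken_error = e
end

puts overtaken_error ? overtaken_error.message : "no error raised"
```

Detecting this situation is the reason an uncached read cannot stop at the first class that defines the variable.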

Because of how cvar reads work, the more classes in the inheritance tree, the more expensive a cvar read is. To demonstrate this, here is a benchmark that reads a cvar from a class with 1 module, 30 modules, and 100 modules in the inheritance chain. On Ruby master, the read with 100 modules is 8.5x slower than the read with 1 module. With the cache, there is no measurable performance difference between including 1 module and including 100 modules.

Benchmark script:

require "benchmark/ips"

MODULES = ["B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", "AA", "BB", "CC", "DD", "EE", "FF", "GG", "HH", "II", "JJ", "KK", "LL", "MM", "NN", "OO", "PP", "QQ", "RR", "SS", "TT", "UU", "VV", "WW", "XX", "YY", "ZZ", "AAA", "BBB", "CCC", "DDD", "EEE", "FFF", "GGG", "HHH", "III", "JJJ", "KKK", "LLL", "MMM", "NNN", "OOO", "PPP", "QQQ", "RRR", "SSS", "TTT", "UUU", "VVV", "WWW", "XXX", "YYY", "ZZZ", "AAAA", "BBBB", "CCCC", "DDDD", "EEEE", "FFFF", "GGGG", "HHHH", "IIII", "JJJJ", "KKKK", "LLLL", "MMMM", "NNNN", "OOOO", "PPPP", "QQQQ", "RRRR", "SSSS", "TTTT", "UUUU", "VVVV", "WWWW"]
class A
  @@foo = 1

  def self.foo
    @@foo
  end

  eval <<-EOM
    module #{MODULES.first}
    end

    include #{MODULES.first}
  EOM
end

class Athirty
  @@foo = 1

  def self.foo
    @@foo
  end

  MODULES.take(30).each do |module_name|
    eval <<-EOM
      module #{module_name}
      end

      include #{module_name}
    EOM
  end
end

class Ahundred
  @@foo = 1

  def self.foo
    @@foo
  end

  MODULES.each do |module_name|
    eval <<-EOM
      module #{module_name}
      end

      include #{module_name}
    EOM
  end
end

Benchmark.ips do |x|
  x.report "1 module" do
    A.foo
  end

  x.report "30 modules" do
    Athirty.foo
  end

  x.report "100 modules" do
    Ahundred.foo
  end

  x.compare!
end

Ruby 3.0 master:

Warming up --------------------------------------
            1 module     1.231M i/100ms
          30 modules   432.020k i/100ms
         100 modules   145.399k i/100ms
Calculating -------------------------------------
            1 module     12.210M (± 2.1%) i/s -     61.553M in   5.043400s
          30 modules      4.354M (± 2.7%) i/s -     22.033M in   5.063839s
         100 modules      1.434M (± 2.9%) i/s -      7.270M in   5.072531s

Comparison:
            1 module: 12209958.3 i/s
          30 modules:  4354217.8 i/s - 2.80x  (± 0.00) slower
         100 modules:  1434447.3 i/s - 8.51x  (± 0.00) slower

Ruby 3.0 with cvar cache:

Warming up --------------------------------------
            1 module     1.641M i/100ms
          30 modules     1.655M i/100ms
         100 modules     1.620M i/100ms
Calculating -------------------------------------
            1 module     16.279M (± 3.8%) i/s -     82.038M in   5.046923s
          30 modules     15.891M (± 3.9%) i/s -     79.459M in   5.007958s
         100 modules     16.087M (± 3.6%) i/s -     81.005M in   5.041931s

Comparison:
            1 module: 16279458.0 i/s
         100 modules: 16087484.6 i/s - same-ish: difference falls within error
          30 modules: 15891406.2 i/s - same-ish: difference falls within error

Rails Application Benchmarks

We also benchmarked ActiveRecord::Base.logger since logger is a cvar and there are 63 modules in the inheritance chain. This is an example of a real-world improvement to Rails applications.

Benchmark:

require "benchmark/ips"
require_relative "config/environment"

Benchmark.ips do |x|
  x.report "logger" do
    ActiveRecord::Base.logger
  end
end

Ruby 3.0 master:

Warming up --------------------------------------
              logger   155.251k i/100ms
Calculating -------------------------------------

Ruby 3.0 with cvar cache:

Warming up --------------------------------------
              logger     1.546M i/100ms
Calculating -------------------------------------
              logger     14.857M (± 4.8%) i/s -     74.198M in   5.006202s

We also measured database queries in Rails and with the cvar cache they are about ~9% faster.

Benchmark code:

class BugTest < Minitest::Test
  def test_association_stuff
    post = Post.create!

    Benchmark.ips do |x|
      x.report "query" do
        Post.first
      end
    end
  end
end

Ruby 3.0 master / Rails 6.1:

Warming up --------------------------------------
               query   790.000  i/100ms
Calculating -------------------------------------
               query      7.601k (± 3.8%) i/s -     38.710k in   5.100534s

Ruby 3.0 cvar cache / Rails 6.1:

Warming up --------------------------------------
               query   731.000  i/100ms
Calculating -------------------------------------
               query      7.089k (± 3.3%) i/s -     35.819k in   5.058215s

Updated by eileencodes (Eileen Uchitelle) 4 months ago

This is the missing benchmark, which I copied and pasted incorrectly.

Ruby master / Rails 6.1:

```
Warming up --------------------------------------
              logger   155.251k i/100ms
Calculating -------------------------------------
              logger      1.502M (± 4.5%) i/s -      7.607M in   5.076869s
```

This branch / Rails 6.1:

```
Warming up --------------------------------------
              logger     1.546M i/100ms
Calculating -------------------------------------
              logger     14.857M (± 4.8%) i/s -     74.198M in   5.006202s
```

Updated by Eregon (Benoit Daloze) 4 months ago

Nice work.

I guess using a global serial here is the only way to handle overtaking without redoing the lookup every time like before.
I wonder, could the serial be global (not per module) but per name? (not asking to change anything, just curious)

Personally I wish we would just deprecate class variables, because I believe nobody wants the strange semantics, and they are inherently less efficient than instance variables on modules.
Or potentially make them equivalent in behavior to instance variables on modules (#14541).
But that is probably never going to happen.

Updated by tenderlovemaking (Aaron Patterson) 4 months ago

Eregon (Benoit Daloze) wrote in #note-2:

Nice work.

I guess using a global serial here is the only way to handle overtaking without redoing the lookup every time like before.
I wonder, could the serial be global (not per module) but per name? (not asking to change anything, just curious)

I think we probably could do that. Keeping a global counter just seemed like the easiest solution at the moment. Also, cvars seem very unpopular (compared with ivars) so I'm not sure adding complexity would be worthwhile.
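
To make the global-counter idea concrete, here is a toy model in plain Ruby. It is not CRuby's implementation (that lives in C inside the VM); ToyCvarCache, its methods, and the :site_1 key are hypothetical names used only for illustration. Each cache entry remembers which class owns the variable and the counter value at the time it was filled; any new class variable definition bumps the counter, which forces the next read to redo the ancestor walk.

```ruby
# Toy model only: a cache keyed by call site, invalidated by one global counter.
class ToyCvarCache
  Entry = Struct.new(:serial, :owner)

  attr_reader :slow_lookups

  def initialize
    @global_serial = 0
    @entries = {}
    @slow_lookups = 0
  end

  # Called whenever any class variable is newly defined anywhere.
  def cvar_defined!
    @global_serial += 1
  end

  def read(site, klass, name)
    entry = @entries[site]
    unless entry && entry.serial == @global_serial
      # Slow path: walk the ancestors to find the class that owns the variable.
      @slow_lookups += 1
      owner = klass.ancestors.find { |mod| mod.class_variables(false).include?(name) }
      raise NameError, "uninitialized class variable #{name} in #{klass}" unless owner
      entry = @entries[site] = Entry.new(@global_serial, owner)
    end
    # Fast path: read straight from the cached owner.
    entry.owner.class_variable_get(name)
  end
end

class ToyBase
  @@setting = :initial
end

class ToyChild < ToyBase
end

cache = ToyCvarCache.new
3.times { cache.read(:site_1, ToyChild, :@@setting) }
lookups_after_reads = cache.slow_lookups

# Changing the value does not move the variable, so the entry stays valid.
ToyBase.class_variable_set(:@@setting, :changed)
value_after_write = cache.read(:site_1, ToyChild, :@@setting)
lookups_after_write = cache.slow_lookups

# A new definition somewhere invalidates every entry at once.
cache.cvar_defined!
cache.read(:site_1, ToyChild, :@@setting)
lookups_after_invalidation = cache.slow_lookups
```

A per-name serial, as asked about above, would replace the single counter with one counter per variable name, so that defining @@a would not invalidate cached reads of @@b.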

Updated by Eregon (Benoit Daloze) 4 months ago

Eregon (Benoit Daloze) wrote in #note-2:

they are inherently less efficient than instance variables on modules.

chrisseaton (Chris Seaton) told me that it's not necessarily the case, and I think they can be exactly the same for a regular read (Shape check + load field in TruffleRuby).
I was thinking of the situation before this optimization, where CRuby would do an ancestor lookup every time for the overtaking logic, IIRC.

Also, cvars seem very unpopular (compared with ivars) so I'm not sure adding complexity would be worthwhile.

Yeah, definitely. Which is why I've been hesitant about optimizing class variables: it's unclear if the cost of the not-so-trivial optimization would pay off.
It clearly seems worth it if code like the Rails logger keeps using class variables though.

Updated by duerst (Martin Dürst) 4 months ago

Eregon (Benoit Daloze) wrote in #note-4:

Eregon (Benoit Daloze) wrote in #note-2:

Also, cvars seem very unpopular (compared with ivars) so I'm not sure adding complexity would be worthwhile.

Yeah, definitely. Which is why I've been hesitant about optimizing class variables: it's unclear if the cost of the not-so-trivial optimization would pay off.
It clearly seems worth it if code like the Rails logger keeps using class variables though.

I'm not sure that "better optimize, because some important code keeps using this, but don't really optimize all the way, because it's not so popular" makes sense. (I'm not blaming Eregon, nor Aaron, nor Eileen, nor anybody else.)

What would it take e.g. to switch Rails logger to something else?

If that's the reason some places keep using class variables, then maybe we need to come up with a more convenient syntax for class instance variables.

Just trying to think out loud, sorry.

Updated by tenderlovemaking (Aaron Patterson) 4 months ago

duerst (Martin Dürst) wrote in #note-5:

What would it take e.g. to switch Rails logger to something else?

If that's the reason some places keep using class variables, then maybe we need to come up with a more convenient syntax for class instance variables.

Class variables have different semantics than class instance variables. It's possible to switch some things to use class instance variables without breaking existing behavior, but in other places we would have to basically re-implement class variable behavior (and at that point, you may as well use class variables).
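
A short sketch of the difference in semantics, with hypothetical class names: a class variable is shared across the whole inheritance tree, while a class instance variable belongs to a single class object and is not seen by subclasses.

```ruby
class SemBase
  @@shared = :base_cvar  # class variable: one slot for the whole hierarchy
  @own = :base_ivar      # class instance variable: belongs to SemBase only

  def self.shared
    @@shared
  end

  def self.shared=(value)
    @@shared = value
  end

  def self.own
    @own
  end
end

class SemSub < SemBase
end

sub_sees_cvar = SemSub.shared   # the subclass reads the parent's class variable
sub_sees_ivar = SemSub.own      # the subclass has its own, unset, instance variable

SemSub.shared = :written_via_sub
base_after_write = SemBase.shared  # the write through the subclass is visible on the parent
```

Code that relies on the shared slot, including writes made through a subclass, would need extra plumbing to get the same behavior from class instance variables.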

I think the logger is one of the places where we depend on class variable semantics and it's unlikely to change.

Also in this case, it makes me feel weird to change the implementation of Rails when we can just make Ruby perform better. Changing Rails to suit Ruby seems like the tail wagging the dog (obviously not all cases are clear cut though)

Updated by Eregon (Benoit Daloze) 4 months ago

tenderlovemaking (Aaron Patterson) wrote in #note-6:

Also in this case, it makes me feel weird to change the implementation of Rails when we can just make Ruby perform better. Changing Rails to suit Ruby seems like the tail wagging the dog (obviously not all cases are clear cut though)

I have a different view of this: before this change, class variables were always extremely slow compared to class instance variables (10x in the case of ActiveRecord::Base.logger).
So changing it in Rails would fix it for all Ruby versions.

The logger seems handled by mattr_accessor:
https://github.com/rails/rails/blob/d612542336d9a61381311c95a27d801bb4094779/activerecord/lib/active_record/core.rb#L20

And mattr_accessor is what uses class variables:
https://github.com/rails/rails/blob/d612542336d9a61381311c95a27d801bb4094779/activesupport/lib/active_support/core_ext/module/attribute_accessors.rb
But it seems like it could use an instance variable on the class instead: attr_accessor on the singleton class, plus instance methods that delegate to the class (def foo; self.class.foo; end, and likewise for the writer).
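
A minimal sketch of that suggestion, assuming no inheritance is needed. ReportService and its logger attribute are hypothetical; this is not the Rails implementation of mattr_accessor.

```ruby
class ReportService
  class << self
    # Stores the value in an instance variable on the class object itself.
    attr_accessor :logger
  end

  # Instance-level reader and writer delegate to the class.
  def logger
    self.class.logger
  end

  def logger=(value)
    self.class.logger = value
  end
end

class SpecialReportService < ReportService
end

ReportService.logger = :stdout_logger
from_instance = ReportService.new.logger
from_subclass = SpecialReportService.logger  # not shared with subclasses
```

The subclass does not see the value set on the parent, which is the behavioral difference from class variables that any replacement would have to account for.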

but in other places we would have to basically re-implement class variable behavior (and at that point, you may as well use class variables).

That might be worth it for performance, i.e., using class methods + class instance variables instead of class variables.
Also https://github.com/rails/rails/blob/d612542336d9a61381311c95a27d801bb4094779/activesupport/lib/active_support/core_ext/class/attribute.rb#L85 seems to have some inheritance and yet already does not use class variables.

My personal point of view is that class variables are de facto deprecated syntax due to their strange and often unexpected semantics, which just about every Ruby tutorial/book recommends against using.
The fewer usages of class variables we have, the better off we are, IMHO.
I once went through removing many class variables in the CRuby repo and it was fairly easy for most cases IIRC, in part because most of these cases did not want any inheritance.


To clarify, my point is that this is good work, and now that it's done I think we should merge it,
but it seems worthwhile to look in Rails and other places if class variables are really needed.
If they can be replaced with class ivars, we could speed those accesses on all Ruby versions, not just 3.1+.

Updated by Eregon (Benoit Daloze) 2 months ago

FWIW, ActiveRecord::Base.logger no longer uses class variables since
https://github.com/rails/rails/commit/dcc2530af74cf6355a9206bb1d0b084a734fae3e (nice!)

Maybe we should really just deprecate class variables, since their semantics are confusing (see below), their performance is not great / they are complicated to optimize, and Ruby users have been advised against them for a long time (de facto deprecated).
That could start with a $VERBOSE=true warning, then $VERBOSE=false warning and then removal.

Example of weird semantics, which I would think no one would ever want:

class C; @@a = 1; end
class D < C; end
class D; @@a = 2; end
class C; p @@a; end # => 2

Updated by byroot (Jean Boussier) 2 months ago

FWIW, ActiveRecord::Base.logger no longer uses class variables since https://github.com/rails/rails/commit/dcc2530af74cf6355a9206bb1d0b084a734fae3e

Yes, we've talked about changing it since, it's not really related to this ticket. IMHO the two are orthogonal, AR::Base.logger just happened to be a good example of the potential performance impact.

Also note that it was replaced by class_attribute, which is a "Railsism" and has a non-trivial implementation. Non-Rails projects with similar problems don't really have a solution for this.
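
For illustration, an inheritable class-level attribute can be sketched in plain Ruby by redefining a singleton method on whichever class is assigned to. The InheritableAttribute module and inheritable_attribute method below are hypothetical, are not the Rails class_attribute implementation, and leave out instance-level accessors and other features.

```ruby
module InheritableAttribute
  def inheritable_attribute(name, default: nil)
    # Reader on the declaring class; subclasses inherit it through their singleton classes.
    define_singleton_method(name) { default }

    # Writer redefines the reader on the receiver only, leaving ancestors untouched.
    define_singleton_method("#{name}=") do |value|
      define_singleton_method(name) { value }
    end
  end
end

class PlainBase
  extend InheritableAttribute
  inheritable_attribute :logger, default: :base_logger
end

class PlainSub < PlainBase
end

class PlainOther < PlainBase
end

PlainSub.logger = :sub_logger      # overrides for PlainSub only
PlainBase.logger = :new_base       # still flows down to classes without an override

results = {
  base: PlainBase.logger,
  sub: PlainSub.logger,
  other: PlainOther.logger,
}
```

Unlike a class variable, an assignment made on a subclass here does not change what the parent sees.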

Maybe we should really just deprecate class variables since their semantics are confusing

The problem is that in some cases there's not really any alternative, so I'm all for deprecating them, but IMHO a replacement with better semantics is needed.

Or maybe their behavior could be changed with some kind of switch:

class MyClass
  new_class_variable_semantic

  @@foo = 1
end

Which would open the door to changing their semantics over the course of a few releases.

Updated by ko1 (Koichi Sasada) about 2 months ago

We also benchmarked ActiveRecord::Base.logger since logger is a cvar and there are 63 modules in the inheritance chain. This is an example of a real-world improvement to Rails applications.
Calculating -------------------------------------

It seems the result is missing.

We also measured database queries in Rails and with the cvar cache they are about ~9% faster.

It seems w/o cache is faster. Could you check it? (opposite results?)


Do you have an application benchmark (requests per second, for example)?

I understand the performance benefit, but it can introduce complexity so I want to confirm the value of this proposal.

Thanks,
Koichi

Updated by byroot (Jean Boussier) about 2 months ago

It seems the result is missing.

It's in the first comment: https://bugs.ruby-lang.org/issues/17763#note-1

Ruby master / Rails 6.1:

Warming up --------------------------------------
              logger   155.251k i/100ms
Calculating -------------------------------------
              logger      1.502M (± 4.5%) i/s -      7.607M in   5.076869s

This branch / Rails 6.1:

Warming up --------------------------------------
              logger     1.546M i/100ms
Calculating -------------------------------------
              logger     14.857M (± 4.8%) i/s -     74.198M in   5.006202s

Updated by eileencodes (Eileen Uchitelle) about 2 months ago

I ran benchmarks using railsbench, and the branch with the CVAR cache is faster: 657 requests per second versus 615 requests per second on master, roughly a 7% improvement.

Master:

$ RAILS_ENV=production INTERVAL=100 WARMUP=1 BENCHMARK=10000 ruby bin/bench
ruby 3.1.0dev (2021-06-04T00:24:57Z master 91c542ad05) [x86_64-darwin19]
Warming up...
Warmup: 1 requests
Benchmark: 10000 requests

Request per second: 615.1 [#/s] (mean)

Percentage of the requests served within a certain time (ms)
  50%    1.57
  66%    1.68
  75%    1.74
  80%    1.78
  90%    1.91
  95%    2.06
  98%    2.36
  99%    2.67
 100%   35.15

CVAR Branch

$ RAILS_ENV=production INTERVAL=100 WARMUP=1 BENCHMARK=10000 ruby bin/bench
ruby 3.1.0dev (2021-06-04T17:40:20Z add-cache-for-clas.. 37c96af98b) [x86_64-darwin19]
Warming up...
Warmup: 1 requests
Benchmark: 10000 requests

Request per second: 657.1 [#/s] (mean)

Percentage of the requests served within a certain time (ms)
  50%    1.46
  66%    1.56
  75%    1.63
  80%    1.68
  90%    1.82
  95%    2.01
  98%    2.28
  99%    2.50
 100%   35.13

Updated by ko1 (Koichi Sasada) about 2 months ago

Thank you for the benchmark.

How about that?

We also measured database queries in Rails and with the cvar cache they are about ~9% faster.

It seems w/o cache is faster. Could you check it? (opposite results?)

Updated by eileencodes (Eileen Uchitelle) about 2 months ago

I re-ran the benchmarks. I accidentally switched them when I pasted them here.

Before: Rails 6.1 / Ruby master

ruby 3.1.0dev (2021-06-04T00:24:57Z master 91c542ad05) [x86_64-darwin19]
-- create_table(:posts, {:force=>true})
   -> 0.0067s
-- create_table(:comments, {:force=>true})
   -> 0.0003s
Run options: --seed 33069

# Running:

Warming up --------------------------------------
               query   708.000  i/100ms
Calculating -------------------------------------
               query      7.177k (± 5.6%) i/s -     36.108k in   5.047561s

After: Rails 6.1 / CVAR cache branch

ruby 3.1.0dev (2021-06-04T17:40:20Z add-cache-for-clas.. 37c96af98b) [x86_64-darwin19]
last_commit=Add a cache for class variables
-- create_table(:posts, {:force=>true})
   -> 0.0067s
-- create_table(:comments, {:force=>true})
   -> 0.0003s
Run options: --seed 45750

# Running:

Warming up --------------------------------------
               query   837.000  i/100ms
Calculating -------------------------------------
               query      8.343k (± 1.3%) i/s -     41.850k in   5.017218s
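
For reference, the relative speedups implied by the corrected numbers posted in this thread can be worked out as below. The figures are copied from the benchmark output above; the RESULTS constant and its labels are just names chosen for this sketch, and the ratios ignore the reported error margins.

```ruby
# Figures copied from the benchmark output posted in this thread.
RESULTS = {
  "ActiveRecord::Base.logger (i/s)" => { master: 1.502e6, branch: 14.857e6 },
  "railsbench (requests/s)"         => { master: 615.1,   branch: 657.1 },
  "Post.first query (i/s)"          => { master: 7.177e3, branch: 8.343e3 },
}

# Ratio of cvar-cache branch throughput to master throughput.
speedups = RESULTS.transform_values { |r| r[:branch] / r[:master] }

speedups.each do |name, ratio|
  puts format("%-34s %.2fx", name, ratio)
end
```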

Updated by matz (Yukihiro Matsumoto) about 1 month ago

It seems great. The code complexity is my concern. But I agree with trying it.

Matz.
