Project

General

Profile

Actions

Bug #17373

closed

Ruby 3.0 is slower at Discourse bench than Ruby 2.7

Added by sam.saffron (Sam Saffron) about 4 years ago. Updated over 1 year ago.

Status:
Closed
Assignee:
-
Target version:
-
[ruby-core:101276]

Description

We have a continuous effort to keep https://rubybench.org/ up to date. Unfortunately we have lapsed a bit on updates so I decided to run Discourse bench by hand.

On my local system I am noticing the timing are much slower on Discourse bench than they were on Ruby 2.7:

98c9120cc00ba691b4abcc13a49a30fa54638535

---
categories:
  50: 34
  75: 35
  90: 36
  99: 40
home:
  50: 47
  75: 50
  90: 56
  99: 92
topic:
  50: 53
  75: 55
  90: 58
  99: 92
categories_admin:
  50: 33
  75: 34
  90: 38
  99: 68
home_admin:
  50: 45
  75: 47
  90: 51
  99: 95
topic_admin:
  50: 51
  75: 52
  90: 54
  99: 59
timings:
  load_rails: 1449
ruby-version: 3.0.0-p-1
rss_kb: 293808
pss_kb: 284244

2.7.2


---
categories:
  50: 29
  75: 30
  90: 33
  99: 43
home:
  50: 42
  75: 43
  90: 47
  99: 73
topic:
  50: 47
  75: 47
  90: 52
  99: 54
categories_admin:
  50: 30
  75: 31
  90: 36
  99: 56
home_admin:
  50: 41
  75: 43
  90: 48
  99: 71
topic_admin:
  50: 47
  75: 48
  90: 52
  99: 55
timings:
  load_rails: 1339
ruby-version: 2.7.2-p137
rss_kb: 296836
pss_kb: 287310


2.6.6


---
categories:
  50: 30
  75: 30
  90: 31
  99: 51
home:
  50: 40
  75: 42
  90: 60
  99: 79
topic:
  50: 47
  75: 48
  90: 53
  99: 71
categories_admin:
  50: 30
  75: 31
  90: 35
  99: 50
home_admin:
  50: 41
  75: 43
  90: 62
  99: 100
topic_admin:
  50: 48
  75: 49
  90: 65
  99: 93
timings:
  load_rails: 1464
ruby-version: 2.6.6-p146
rss_kb: 332268
pss_kb: 322865

Concretely this means a typical "homepage" view is taking 47 milliseconds now, when in the past it was taking 42 or even 40 ms.

There may be some issues with the bench, you can see it here:

https://github.com/discourse/discourse/tree/ruby-3

The script to run the bench is ruby script/bench.rb

I re-tested against a cleanly built Ruby 2.7 direct from the Git branch to ensure compilation was not at fault.

This is pretty concerning, Discourse bench usually tracks Rails performance pretty accurately.


Files


Related issues 2 (0 open2 closed)

Related to Ruby master - Feature #16614: New method cache mechanism for GuildClosedko1 (Koichi Sasada)Actions
Related to Ruby master - Bug #17383: 3.0 recursion memory|speed issuesClosedActions

Updated by k0kubun (Takashi Kokubun) about 4 years ago

Could you, or anybody reading this ticket, use the benchmark to find revisions which seem to have contributed to the performance decrease? Detecting which change was bad has been the most time-consuming part even while we have a good benchmark. If we know what revision should be fixed, at the very least we'd have an option to revert it.

I'm aware of the performance trend and I've actually spent some time for looking at graphs like https://rubybench.org/ruby/ruby/commits?result_type=Optcarrot%20Lan_Master.nes&display_count=2000, but when we benchmark a suspicious revision and its previous one found from the graph locally at the same time, results are often flipped and we've failed to find problematic commits that way.

Updated by tenderlovemaking (Aaron Patterson) about 4 years ago

Here is what I've found so far:

I've been testing with RailsBench. I ran this command:

RAILS_ENV=production INTERVAL=100 WARMUP=1 BENCHMARK=10000 perf record -F 99 --call-graph dwarf (which ruby) bin/bench

Then I used this command to find where time is spent:

perf report -g graph --no-children

Here is the report on Ruby 2.7:

Ruby 2.7 report

Here is the report on Ruby 3.0 (master):

Ruby 3.0 report

rb_id_table_lookup is a new line in the Ruby 3.0 report. Looking at the callers, it looks like these are coming from the slow path method lookup:

Slow path lookup

b9007b6c54 touched these methods, so I ran the benchmarks on that commit:

aaron@whiteclaw ~/g/railsbench (ruby-3.0)> RAILS_ENV=production INTERVAL=100 WARMUP=1 BENCHMARK=10000 ruby -v bin/bench
ruby 2.8.0dev (2020-02-22T00:58:59Z pz-mvh-updated-mas.. b9007b6c54) [x86_64-linux]
/home/aaron/.gem/ruby/2.7.3/gems/mail-2.7.1/lib/mail/parsers/address_lists_parser.rb:32454: warning: statement not reached
/home/aaron/.gem/ruby/2.7.3/gems/mail-2.7.1/lib/mail/parsers/address_lists_parser.rb:32615: warning: statement not reached
/home/aaron/.gem/ruby/2.7.3/gems/mail-2.7.1/lib/mail/parsers/address_lists_parser.rb:32651: warning: statement not reached
/home/aaron/.gem/ruby/2.7.3/gems/mail-2.7.1/lib/mail/parsers/address_lists_parser.rb:32740: warning: statement not reached
/home/aaron/.gem/ruby/2.7.3/gems/mail-2.7.1/lib/mail/parsers/address_lists_parser.rb:32756: warning: statement not reached
/home/aaron/.gem/ruby/2.7.3/gems/mail-2.7.1/lib/mail/parsers/address_lists_parser.rb:32822: warning: statement not reached
/home/aaron/.gem/ruby/2.7.3/gems/mail-2.7.1/lib/mail/parsers/address_lists_parser.rb:32863: warning: statement not reached
/home/aaron/.gem/ruby/2.7.3/gems/mail-2.7.1/lib/mail/parsers/address_lists_parser.rb:32888: warning: statement not reached
/home/aaron/.gem/ruby/2.7.3/gems/mail-2.7.1/lib/mail/parsers/address_lists_parser.rb:31984: warning: assigned but unused variable - testEof
/home/aaron/.gem/ruby/2.7.3/gems/sassc-rails-2.1.2/lib/sassc/rails/functions.rb:7: warning: method redefined; discarding old asset_data_url
/home/aaron/.gem/ruby/2.7.3/gems/sprockets-4.0.1/lib/sprockets/sass_processor.rb:274: warning: previous definition of asset_data_url was here
/home/aaron/.gem/ruby/2.7.3/gems/sassc-rails-2.1.2/lib/sassc/rails/compressor.rb:7: warning: method redefined; discarding old initialize
/home/aaron/.gem/ruby/2.7.3/gems/sprockets-4.0.1/lib/sprockets/sass_compressor.rb:39: warning: previous definition of initialize was here
/home/aaron/.gem/ruby/2.7.3/gems/sassc-rails-2.1.2/lib/sassc/rails/compressor.rb:17: warning: method redefined; discarding old call
/home/aaron/.gem/ruby/2.7.3/gems/sprockets-4.0.1/lib/sprockets/sass_compressor.rb:49: warning: previous definition of call was here
Warming up...
Warmup: 1 requests
Benchmark: 10000 requests

Request per second: 993.5 [#/s] (mean)

Percentage of the requests served within a certain time (ms)
  50%    0.98
  66%    0.99
  75%    0.99
  80%    1.00
  90%    1.01
  95%    1.03
  98%    1.60
  99%    2.14
 100%   10.14

Mean request per second is 993.5.

If I run with the previous commit f2286925f0 here are the results:

aaron@whiteclaw ~/g/railsbench (ruby-3.0)> RAILS_ENV=production INTERVAL=100 WARMUP=1 BENCHMARK=10000 ruby -v bin/bench
ruby 2.8.0dev (2020-02-22T00:58:59Z pz-mvh-updated-mas.. f2286925f0) [x86_64-linux]
/home/aaron/.gem/ruby/2.7.3/gems/mail-2.7.1/lib/mail/parsers/address_lists_parser.rb:32454: warning: statement not reached
/home/aaron/.gem/ruby/2.7.3/gems/mail-2.7.1/lib/mail/parsers/address_lists_parser.rb:32615: warning: statement not reached
/home/aaron/.gem/ruby/2.7.3/gems/mail-2.7.1/lib/mail/parsers/address_lists_parser.rb:32651: warning: statement not reached
/home/aaron/.gem/ruby/2.7.3/gems/mail-2.7.1/lib/mail/parsers/address_lists_parser.rb:32740: warning: statement not reached
/home/aaron/.gem/ruby/2.7.3/gems/mail-2.7.1/lib/mail/parsers/address_lists_parser.rb:32756: warning: statement not reached
/home/aaron/.gem/ruby/2.7.3/gems/mail-2.7.1/lib/mail/parsers/address_lists_parser.rb:32822: warning: statement not reached
/home/aaron/.gem/ruby/2.7.3/gems/mail-2.7.1/lib/mail/parsers/address_lists_parser.rb:32863: warning: statement not reached
/home/aaron/.gem/ruby/2.7.3/gems/mail-2.7.1/lib/mail/parsers/address_lists_parser.rb:32888: warning: statement not reached
/home/aaron/.gem/ruby/2.7.3/gems/mail-2.7.1/lib/mail/parsers/address_lists_parser.rb:31984: warning: assigned but unused variable - testEof
/home/aaron/.gem/ruby/2.7.3/gems/sassc-rails-2.1.2/lib/sassc/rails/functions.rb:7: warning: method redefined; discarding old asset_data_url
/home/aaron/.gem/ruby/2.7.3/gems/sprockets-4.0.1/lib/sprockets/sass_processor.rb:274: warning: previous definition of asset_data_url was here
/home/aaron/.gem/ruby/2.7.3/gems/sassc-rails-2.1.2/lib/sassc/rails/compressor.rb:7: warning: method redefined; discarding old initialize
/home/aaron/.gem/ruby/2.7.3/gems/sprockets-4.0.1/lib/sprockets/sass_compressor.rb:39: warning: previous definition of initialize was here
/home/aaron/.gem/ruby/2.7.3/gems/sassc-rails-2.1.2/lib/sassc/rails/compressor.rb:17: warning: method redefined; discarding old call
/home/aaron/.gem/ruby/2.7.3/gems/sprockets-4.0.1/lib/sprockets/sass_compressor.rb:49: warning: previous definition of call was here
Warming up...
Warmup: 1 requests
Benchmark: 10000 requests

Request per second: 1164.9 [#/s] (mean)

Percentage of the requests served within a certain time (ms)
  50%    0.85
  66%    0.85
  75%    0.85
  80%    0.86
  90%    0.86
  95%    0.87
  98%    0.89
  99%    1.76
 100%    3.44

Mean RPS is 1164.

I think b9007b6c54 introduced the performance problem, but I'm not sure why. The caller lookup_method_table didn't change, it just got moved from here to here, and rb_id_table_lookup is exactly the same too.

My current guess is that we're getting cache misses more frequently for some reason. I will look in to that now, but I wanted to share what I've found so far.

Updated by tenderlovemaking (Aaron Patterson) about 4 years ago

I backported the mc_search debug counter. On Ruby master, search_method is called 2,480,052 times, but on f2286925f08406bc857f7b03ad6779a5d61443ae it's only called 501,903 times. I think this is the source of our performance degradation, and it seems related to cache misses.

Actions #4

Updated by Eregon (Benoit Daloze) about 4 years ago

  • Related to Feature #16614: New method cache mechanism for Guild added

Updated by tenderlovemaking (Aaron Patterson) about 4 years ago

It looks like the global method cache was removed.

On f2286925f0 the debug counters looked like this:

[RUBY_DEBUG_COUNTER]	mc_inline_hit                 	    54,445,333
[RUBY_DEBUG_COUNTER]	mc_inline_miss                	     1,800,036
[RUBY_DEBUG_COUNTER]	mc_global_hit                 	    14,489,677
[RUBY_DEBUG_COUNTER]	mc_global_miss                	       790,656

mc_global_hit is the number of times the global method cache was hit. I think for Rails, the problem is with respond_to. It looks like Object#respond_to? used to use the global method cache:

callers

Here is a benchmark that demonstrates the impact of removing the global method cache:

require "benchmark/ips"

class Base
  def bar; end
end

class Foo
  30.times do |i|
    class_eval "module M#{i}; end; include M#{i}"
  end
end

foo = Foo.new

Benchmark.ips do |x|
  x.report("respond to") { foo.respond_to?(:bar) }
end

On master:

aaron@whiteclaw ~/git> ruby/ruby -v test.rb
ruby 3.0.0dev (2020-12-07T17:10:03Z master 146b3695c0) [x86_64-linux]
Warming up --------------------------------------
          respond to   585.997k i/100ms
Calculating -------------------------------------
          respond to      5.869M (± 0.9%) i/s -     29.886M in   5.092708s

On f2286925f0:

aaron@whiteclaw ~/git> ruby/ruby -v test.rb
ruby 2.8.0dev (2020-02-22T00:58:59Z before-imc f2286925f0) [x86_64-linux]
Warming up --------------------------------------
          respond to     1.686M i/100ms
Calculating -------------------------------------
          respond to     16.888M (± 0.6%) i/s -     85.971M in   5.090695s

f2286925f0 is about 3x faster than master.

It seems that the global method cache was important for Rails benchmarks. I guess we can't have a global method cache because of Ractors? Would it be possible to make a per-Ractor "global" cache?

Updated by k0kubun (Takashi Kokubun) about 4 years ago

https://github.com/ruby/ruby/pull/2583 was an attempt to reduce method cache misses on Discourse, and IIRC https://github.com/ruby/ruby/pull/2888 removed it. It might be related at least for some extent.

Updated by sam.saffron (Sam Saffron) about 4 years ago

I made this tiny script, which you can use to test (in the branch in the first post)

#!/usr/bin/env ruby
# frozen_string_literal: true

ENV['RAILS_ENV'] = 'profile'

require File.expand_path("../../config/environment", __FILE__)

def make_request
  req = {
    "REQUEST_METHOD" => "GET",
    "SCRIPT_NAME" => "",
    "PATH_INFO" => "/",
    "QUERY_STRING" => "",
    "SERVER_NAME" => "127.0.0.1",
    "SERVER_PORT" => "80",
    "REQUEST_PATH" => "/",
    "REMOTE_ADDR" => "127.0.0.1",
    "HTTP_VERSION" => "HTTP/1.0",
    "rack.input" => StringIO.new(""),
    "rack.version" => [1, 2],
    "rack.url_scheme" => "http"
  }

  Rails.application.call(req)
end

10.times do
  make_request
end

3.times do
  GC.start(full_mark: true, immediate_sweep: true)
end

start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
RubyVM.reset_debug_counters
make_request
RubyVM.show_debug_counters
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start

puts "Duration was #{elapsed}"
gets

Results for 2.7 are:


[RUBY_DEBUG_COUNTER]	417420 method call
[RUBY_DEBUG_COUNTER]	mc_inline_hit                 	      407,308
[RUBY_DEBUG_COUNTER]	mc_inline_miss                	       17,502
[RUBY_DEBUG_COUNTER]	mc_global_hit                 	       86,554
[RUBY_DEBUG_COUNTER]	mc_global_miss                	          207
[RUBY_DEBUG_COUNTER]	mc_global_state_miss          	        2,762
[RUBY_DEBUG_COUNTER]	mc_class_serial_miss          	       14,740
[RUBY_DEBUG_COUNTER]	mc_cme_complement             	       10,191
[RUBY_DEBUG_COUNTER]	mc_cme_complement_hit         	       10,191
[RUBY_DEBUG_COUNTER]	mc_search_super               	        2,770
[RUBY_DEBUG_COUNTER]	mc_miss_by_nome               	            0
[RUBY_DEBUG_COUNTER]	mc_miss_by_distinct           	       10,348
[RUBY_DEBUG_COUNTER]	mc_miss_by_refine             	            0
[RUBY_DEBUG_COUNTER]	mc_miss_by_visi               	          218
[RUBY_DEBUG_COUNTER]	mc_miss_spurious              	        6,936
[RUBY_DEBUG_COUNTER]	mc_miss_reuse_call            	        2,245
[RUBY_DEBUG_COUNTER]	ccf_general                   	       16,704
[RUBY_DEBUG_COUNTER]	ccf_iseq_setup                	       43,129
[RUBY_DEBUG_COUNTER]	ccf_iseq_setup_0start         	        1,055
[RUBY_DEBUG_COUNTER]	ccf_iseq_setup_tailcall_0start	            0
[RUBY_DEBUG_COUNTER]	ccf_iseq_fix                  	      134,778
[RUBY_DEBUG_COUNTER]	ccf_iseq_opt                  	       14,039
[RUBY_DEBUG_COUNTER]	ccf_iseq_kw1                  	          124
[RUBY_DEBUG_COUNTER]	ccf_iseq_kw2                  	          195
[RUBY_DEBUG_COUNTER]	ccf_cfunc                     	      139,250
[RUBY_DEBUG_COUNTER]	ccf_ivar                      	       51,208
[RUBY_DEBUG_COUNTER]	ccf_attrset                   	          698
[RUBY_DEBUG_COUNTER]	ccf_method_missing            	          524
[RUBY_DEBUG_COUNTER]	ccf_zsuper                    	            0
[RUBY_DEBUG_COUNTER]	ccf_bmethod                   	       18,238
[RUBY_DEBUG_COUNTER]	ccf_opt_send                  	        8,025
[RUBY_DEBUG_COUNTER]	ccf_opt_call                  	          857
[RUBY_DEBUG_COUNTER]	ccf_opt_block_call            	            0
[RUBY_DEBUG_COUNTER]	ccf_super_method              	       13,155
[RUBY_DEBUG_COUNTER]	frame_push                    	      392,112
[RUBY_DEBUG_COUNTER]	frame_push_method             	      193,320
[RUBY_DEBUG_COUNTER]	frame_push_block              	       35,436
[RUBY_DEBUG_COUNTER]	frame_push_class              	            0
[RUBY_DEBUG_COUNTER]	frame_push_top                	            0
[RUBY_DEBUG_COUNTER]	frame_push_cfunc              	      159,590
[RUBY_DEBUG_COUNTER]	frame_push_ifunc              	        3,764
[RUBY_DEBUG_COUNTER]	frame_push_eval               	            0
[RUBY_DEBUG_COUNTER]	frame_push_rescue             	            2
[RUBY_DEBUG_COUNTER]	frame_push_dummy              	            0
[RUBY_DEBUG_COUNTER]	frame_R2R                     	      204,735
[RUBY_DEBUG_COUNTER]	frame_R2C                     	      148,739
[RUBY_DEBUG_COUNTER]	frame_C2C                     	       14,615
[RUBY_DEBUG_COUNTER]	frame_C2R                     	       24,023
[RUBY_DEBUG_COUNTER]	ivar_get_ic_hit               	      105,173
[RUBY_DEBUG_COUNTER]	ivar_get_ic_miss              	       20,240
[RUBY_DEBUG_COUNTER]	ivar_get_ic_miss_serial       	       28,221
[RUBY_DEBUG_COUNTER]	ivar_get_ic_miss_unset        	       11,963
[RUBY_DEBUG_COUNTER]	ivar_get_ic_miss_noobject     	            0
[RUBY_DEBUG_COUNTER]	ivar_set_ic_hit               	       10,802
[RUBY_DEBUG_COUNTER]	ivar_set_ic_miss              	       13,473
[RUBY_DEBUG_COUNTER]	ivar_set_ic_miss_serial       	        8,262
[RUBY_DEBUG_COUNTER]	ivar_set_ic_miss_unset        	            4
[RUBY_DEBUG_COUNTER]	ivar_set_ic_miss_oorange      	        3,924
[RUBY_DEBUG_COUNTER]	ivar_set_ic_miss_noobject     	        1,283
[RUBY_DEBUG_COUNTER]	ivar_get_base                 	       21,572
[RUBY_DEBUG_COUNTER]	ivar_set_base                 	       13,494
[RUBY_DEBUG_COUNTER]	lvar_get                      	      355,050
[RUBY_DEBUG_COUNTER]	lvar_get_dynamic              	       40,160
[RUBY_DEBUG_COUNTER]	lvar_set                      	       78,325
[RUBY_DEBUG_COUNTER]	lvar_set_dynamic              	        1,252
[RUBY_DEBUG_COUNTER]	lvar_set_slowpath             	          101
[RUBY_DEBUG_COUNTER]	gc_count                      	            0
[RUBY_DEBUG_COUNTER]	gc_minor_newobj               	            0
[RUBY_DEBUG_COUNTER]	gc_minor_malloc               	            0
[RUBY_DEBUG_COUNTER]	gc_minor_method               	            0
[RUBY_DEBUG_COUNTER]	gc_minor_capi                 	            0
[RUBY_DEBUG_COUNTER]	gc_minor_stress               	            0
[RUBY_DEBUG_COUNTER]	gc_major_nofree               	            0
[RUBY_DEBUG_COUNTER]	gc_major_oldgen               	            0
[RUBY_DEBUG_COUNTER]	gc_major_shady                	            0
[RUBY_DEBUG_COUNTER]	gc_major_force                	            0
[RUBY_DEBUG_COUNTER]	gc_major_oldmalloc            	            0
[RUBY_DEBUG_COUNTER]	gc_isptr_trial                	            0
[RUBY_DEBUG_COUNTER]	gc_isptr_range                	            0
[RUBY_DEBUG_COUNTER]	gc_isptr_align                	            0
[RUBY_DEBUG_COUNTER]	gc_isptr_maybe                	            0
[RUBY_DEBUG_COUNTER]	obj_newobj                    	       58,178
[RUBY_DEBUG_COUNTER]	obj_newobj_slowpath           	        1,517
[RUBY_DEBUG_COUNTER]	obj_newobj_wb_unprotected     	          877
[RUBY_DEBUG_COUNTER]	obj_free                      	            0
[RUBY_DEBUG_COUNTER]	obj_promote                   	            0
[RUBY_DEBUG_COUNTER]	obj_wb_unprotect              	            0
[RUBY_DEBUG_COUNTER]	obj_obj_embed                 	            0
[RUBY_DEBUG_COUNTER]	obj_obj_transient             	            0
[RUBY_DEBUG_COUNTER]	obj_obj_ptr                   	            0
[RUBY_DEBUG_COUNTER]	obj_str_ptr                   	            0
[RUBY_DEBUG_COUNTER]	obj_str_embed                 	            0
[RUBY_DEBUG_COUNTER]	obj_str_shared                	            0
[RUBY_DEBUG_COUNTER]	obj_str_nofree                	            0
[RUBY_DEBUG_COUNTER]	obj_str_fstr                  	            0
[RUBY_DEBUG_COUNTER]	obj_ary_embed                 	            0
[RUBY_DEBUG_COUNTER]	obj_ary_transient             	            0
[RUBY_DEBUG_COUNTER]	obj_ary_ptr                   	            1
[RUBY_DEBUG_COUNTER]	obj_ary_extracapa             	            0
[RUBY_DEBUG_COUNTER]	obj_ary_shared_create         	           81
[RUBY_DEBUG_COUNTER]	obj_ary_shared                	            0
[RUBY_DEBUG_COUNTER]	obj_ary_shared_root_occupied  	            0
[RUBY_DEBUG_COUNTER]	obj_hash_empty                	            0
[RUBY_DEBUG_COUNTER]	obj_hash_1                    	            0
[RUBY_DEBUG_COUNTER]	obj_hash_2                    	            0
[RUBY_DEBUG_COUNTER]	obj_hash_3                    	            0
[RUBY_DEBUG_COUNTER]	obj_hash_4                    	            0
[RUBY_DEBUG_COUNTER]	obj_hash_5_8                  	            0
[RUBY_DEBUG_COUNTER]	obj_hash_g8                   	            0
[RUBY_DEBUG_COUNTER]	obj_hash_null                 	            0
[RUBY_DEBUG_COUNTER]	obj_hash_ar                   	            0
[RUBY_DEBUG_COUNTER]	obj_hash_st                   	            0
[RUBY_DEBUG_COUNTER]	obj_hash_transient            	            0
[RUBY_DEBUG_COUNTER]	obj_hash_force_convert        	            0
[RUBY_DEBUG_COUNTER]	obj_struct_embed              	            0
[RUBY_DEBUG_COUNTER]	obj_struct_transient          	            0
[RUBY_DEBUG_COUNTER]	obj_struct_ptr                	            0
[RUBY_DEBUG_COUNTER]	obj_data_empty                	            0
[RUBY_DEBUG_COUNTER]	obj_data_xfree                	            0
[RUBY_DEBUG_COUNTER]	obj_data_imm_free             	            0
[RUBY_DEBUG_COUNTER]	obj_data_zombie               	            0
[RUBY_DEBUG_COUNTER]	obj_match_under4              	            0
[RUBY_DEBUG_COUNTER]	obj_match_ge4                 	            0
[RUBY_DEBUG_COUNTER]	obj_match_ge8                 	            0
[RUBY_DEBUG_COUNTER]	obj_match_ptr                 	            0
[RUBY_DEBUG_COUNTER]	obj_iclass_ptr                	            0
[RUBY_DEBUG_COUNTER]	obj_class_ptr                 	            0
[RUBY_DEBUG_COUNTER]	obj_module_ptr                	            0
[RUBY_DEBUG_COUNTER]	obj_bignum_ptr                	            0
[RUBY_DEBUG_COUNTER]	obj_bignum_embed              	            0
[RUBY_DEBUG_COUNTER]	obj_float                     	            0
[RUBY_DEBUG_COUNTER]	obj_complex                   	            0
[RUBY_DEBUG_COUNTER]	obj_rational                  	            0
[RUBY_DEBUG_COUNTER]	obj_regexp_ptr                	            0
[RUBY_DEBUG_COUNTER]	obj_file_ptr                  	            0
[RUBY_DEBUG_COUNTER]	obj_symbol                    	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_ment                	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_iseq                	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_env                 	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_tmpbuf              	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_ast                 	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_cref                	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_svar                	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_throw_data          	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_ifunc               	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_memo                	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_parser_strterm      	            0
[RUBY_DEBUG_COUNTER]	artable_hint_hit              	       23,024
[RUBY_DEBUG_COUNTER]	artable_hint_miss             	          174
[RUBY_DEBUG_COUNTER]	artable_hint_notfound         	       27,875
[RUBY_DEBUG_COUNTER]	heap_xmalloc                  	        7,642
[RUBY_DEBUG_COUNTER]	heap_xrealloc                 	          153
[RUBY_DEBUG_COUNTER]	heap_xfree                    	          539
[RUBY_DEBUG_COUNTER]	theap_alloc                   	        9,295
[RUBY_DEBUG_COUNTER]	theap_alloc_fail              	            0
[RUBY_DEBUG_COUNTER]	theap_evacuate                	            0
[RUBY_DEBUG_COUNTER]	mjit_exec                     	            0
[RUBY_DEBUG_COUNTER]	mjit_exec_not_added           	            0
[RUBY_DEBUG_COUNTER]	mjit_exec_not_added_add_iseq  	            0
[RUBY_DEBUG_COUNTER]	mjit_exec_not_ready           	            0
[RUBY_DEBUG_COUNTER]	mjit_exec_not_compiled        	            0
[RUBY_DEBUG_COUNTER]	mjit_exec_call_func           	            0
[RUBY_DEBUG_COUNTER]	mjit_frame_VM2VM              	            0
[RUBY_DEBUG_COUNTER]	mjit_frame_VM2JT              	            0
[RUBY_DEBUG_COUNTER]	mjit_frame_JT2JT              	            0
[RUBY_DEBUG_COUNTER]	mjit_frame_JT2VM              	            0
[RUBY_DEBUG_COUNTER]	mjit_cancel                   	            0
[RUBY_DEBUG_COUNTER]	mjit_cancel_ivar_inline       	            0
[RUBY_DEBUG_COUNTER]	mjit_cancel_send_inline       	            0
[RUBY_DEBUG_COUNTER]	mjit_cancel_opt_insn          	            0
[RUBY_DEBUG_COUNTER]	mjit_cancel_invalidate_all    	            0
[RUBY_DEBUG_COUNTER]	mjit_length_unit_queue        	            0
[RUBY_DEBUG_COUNTER]	mjit_length_active_units      	            0
[RUBY_DEBUG_COUNTER]	mjit_length_compact_units     	            0
[RUBY_DEBUG_COUNTER]	mjit_length_stale_units       	            0
[RUBY_DEBUG_COUNTER]	mjit_compile_failures         	            0

Results for 3.0 are:

[RUBY_DEBUG_COUNTER]	417140 show_debug_counters
[RUBY_DEBUG_COUNTER]	mc_inline_hit                 	      383,937
[RUBY_DEBUG_COUNTER]	mc_inline_miss_klass          	       37,470
[RUBY_DEBUG_COUNTER]	mc_inline_miss_invalidated    	            0
[RUBY_DEBUG_COUNTER]	mc_cme_complement             	          166
[RUBY_DEBUG_COUNTER]	mc_cme_complement_hit         	          166
[RUBY_DEBUG_COUNTER]	mc_search                     	        7,248
[RUBY_DEBUG_COUNTER]	mc_search_notfound            	        4,254
[RUBY_DEBUG_COUNTER]	mc_search_super               	      107,104
[RUBY_DEBUG_COUNTER]	obj_respond_to                	          613
[RUBY_DEBUG_COUNTER]	ci_packed                     	       13,043
[RUBY_DEBUG_COUNTER]	ci_kw                         	            0
[RUBY_DEBUG_COUNTER]	ci_nokw                       	            0
[RUBY_DEBUG_COUNTER]	ci_runtime                    	       13,043
[RUBY_DEBUG_COUNTER]	cc_new                        	          228
[RUBY_DEBUG_COUNTER]	cc_temp                       	            0
[RUBY_DEBUG_COUNTER]	cc_found_ccs                  	       36,718
[RUBY_DEBUG_COUNTER]	cc_ent_invalidate             	            0
[RUBY_DEBUG_COUNTER]	cc_cme_invalidate             	            0
[RUBY_DEBUG_COUNTER]	cc_invalidate_leaf            	            0
[RUBY_DEBUG_COUNTER]	cc_invalidate_leaf_ccs        	            0
[RUBY_DEBUG_COUNTER]	cc_invalidate_leaf_callable   	            0
[RUBY_DEBUG_COUNTER]	cc_invalidate_tree            	            0
[RUBY_DEBUG_COUNTER]	cc_invalidate_tree_cme        	            0
[RUBY_DEBUG_COUNTER]	cc_invalidate_tree_callable   	            0
[RUBY_DEBUG_COUNTER]	ccs_free                      	            0
[RUBY_DEBUG_COUNTER]	ccs_maxlen                    	            2
[RUBY_DEBUG_COUNTER]	ccs_found                     	       57,960
[RUBY_DEBUG_COUNTER]	iseq_num                      	            0
[RUBY_DEBUG_COUNTER]	iseq_cd_num                   	            0
[RUBY_DEBUG_COUNTER]	ccf_general                   	       11,291
[RUBY_DEBUG_COUNTER]	ccf_iseq_setup                	       36,053
[RUBY_DEBUG_COUNTER]	ccf_iseq_setup_0start         	        1,068
[RUBY_DEBUG_COUNTER]	ccf_iseq_setup_tailcall_0start	            0
[RUBY_DEBUG_COUNTER]	ccf_iseq_fix                  	      154,419
[RUBY_DEBUG_COUNTER]	ccf_iseq_opt                  	       16,076
[RUBY_DEBUG_COUNTER]	ccf_iseq_kw1                  	          124
[RUBY_DEBUG_COUNTER]	ccf_iseq_kw2                  	          231
[RUBY_DEBUG_COUNTER]	ccf_cfunc                     	        1,878
[RUBY_DEBUG_COUNTER]	ccf_cfunc_with_frame          	      123,495
[RUBY_DEBUG_COUNTER]	ccf_ivar                      	       51,189
[RUBY_DEBUG_COUNTER]	ccf_attrset                   	          698
[RUBY_DEBUG_COUNTER]	ccf_method_missing            	          524
[RUBY_DEBUG_COUNTER]	ccf_zsuper                    	            0
[RUBY_DEBUG_COUNTER]	ccf_bmethod                   	       18,238
[RUBY_DEBUG_COUNTER]	ccf_opt_send                  	        8,025
[RUBY_DEBUG_COUNTER]	ccf_opt_call                  	          857
[RUBY_DEBUG_COUNTER]	ccf_opt_block_call            	            0
[RUBY_DEBUG_COUNTER]	ccf_super_method              	           32
[RUBY_DEBUG_COUNTER]	frame_push                    	      390,819
[RUBY_DEBUG_COUNTER]	frame_push_method             	      207,971
[RUBY_DEBUG_COUNTER]	frame_push_block              	       35,437
[RUBY_DEBUG_COUNTER]	frame_push_class              	            0
[RUBY_DEBUG_COUNTER]	frame_push_top                	            0
[RUBY_DEBUG_COUNTER]	frame_push_cfunc              	      143,645
[RUBY_DEBUG_COUNTER]	frame_push_ifunc              	        3,764
[RUBY_DEBUG_COUNTER]	frame_push_eval               	            0
[RUBY_DEBUG_COUNTER]	frame_push_rescue             	            2
[RUBY_DEBUG_COUNTER]	frame_push_dummy              	            0
[RUBY_DEBUG_COUNTER]	frame_R2R                     	      219,554
[RUBY_DEBUG_COUNTER]	frame_R2C                     	      132,840
[RUBY_DEBUG_COUNTER]	frame_C2C                     	       14,569
[RUBY_DEBUG_COUNTER]	frame_C2R                     	       23,856
[RUBY_DEBUG_COUNTER]	ivar_get_ic_hit               	      106,326
[RUBY_DEBUG_COUNTER]	ivar_get_ic_miss              	       20,239
[RUBY_DEBUG_COUNTER]	ivar_get_ic_miss_serial       	       27,800
[RUBY_DEBUG_COUNTER]	ivar_get_ic_miss_unset        	       10,739
[RUBY_DEBUG_COUNTER]	ivar_get_ic_miss_noobject     	            0
[RUBY_DEBUG_COUNTER]	ivar_set_ic_hit               	       17,837
[RUBY_DEBUG_COUNTER]	ivar_set_ic_miss              	        1,283
[RUBY_DEBUG_COUNTER]	ivar_set_ic_miss_serial       	        5,148
[RUBY_DEBUG_COUNTER]	ivar_set_ic_miss_unset        	            7
[RUBY_DEBUG_COUNTER]	ivar_set_ic_miss_iv_hit       	        5,155
[RUBY_DEBUG_COUNTER]	ivar_set_ic_miss_noobject     	        1,283
[RUBY_DEBUG_COUNTER]	ivar_get_base                 	       21,570
[RUBY_DEBUG_COUNTER]	ivar_set_base                 	        1,304
[RUBY_DEBUG_COUNTER]	lvar_get                      	      354,236
[RUBY_DEBUG_COUNTER]	lvar_get_dynamic              	       40,161
[RUBY_DEBUG_COUNTER]	lvar_set                      	       78,173
[RUBY_DEBUG_COUNTER]	lvar_set_dynamic              	        1,252
[RUBY_DEBUG_COUNTER]	lvar_set_slowpath             	          101
[RUBY_DEBUG_COUNTER]	gc_count                      	            0
[RUBY_DEBUG_COUNTER]	gc_minor_newobj               	            0
[RUBY_DEBUG_COUNTER]	gc_minor_malloc               	            0
[RUBY_DEBUG_COUNTER]	gc_minor_method               	            0
[RUBY_DEBUG_COUNTER]	gc_minor_capi                 	            0
[RUBY_DEBUG_COUNTER]	gc_minor_stress               	            0
[RUBY_DEBUG_COUNTER]	gc_major_nofree               	            0
[RUBY_DEBUG_COUNTER]	gc_major_oldgen               	            0
[RUBY_DEBUG_COUNTER]	gc_major_shady                	            0
[RUBY_DEBUG_COUNTER]	gc_major_force                	            0
[RUBY_DEBUG_COUNTER]	gc_major_oldmalloc            	            0
[RUBY_DEBUG_COUNTER]	gc_isptr_trial                	            0
[RUBY_DEBUG_COUNTER]	gc_isptr_range                	            0
[RUBY_DEBUG_COUNTER]	gc_isptr_align                	            0
[RUBY_DEBUG_COUNTER]	gc_isptr_maybe                	            0
[RUBY_DEBUG_COUNTER]	obj_newobj                    	       57,746
[RUBY_DEBUG_COUNTER]	obj_newobj_slowpath           	          873
[RUBY_DEBUG_COUNTER]	obj_newobj_wb_unprotected     	          733
[RUBY_DEBUG_COUNTER]	obj_free                      	            0
[RUBY_DEBUG_COUNTER]	obj_promote                   	            0
[RUBY_DEBUG_COUNTER]	obj_wb_unprotect              	            0
[RUBY_DEBUG_COUNTER]	obj_obj_embed                 	            0
[RUBY_DEBUG_COUNTER]	obj_obj_transient             	            0
[RUBY_DEBUG_COUNTER]	obj_obj_ptr                   	            0
[RUBY_DEBUG_COUNTER]	obj_str_ptr                   	            0
[RUBY_DEBUG_COUNTER]	obj_str_embed                 	            0
[RUBY_DEBUG_COUNTER]	obj_str_shared                	            0
[RUBY_DEBUG_COUNTER]	obj_str_nofree                	            0
[RUBY_DEBUG_COUNTER]	obj_str_fstr                  	            0
[RUBY_DEBUG_COUNTER]	obj_ary_embed                 	            0
[RUBY_DEBUG_COUNTER]	obj_ary_transient             	            0
[RUBY_DEBUG_COUNTER]	obj_ary_ptr                   	            1
[RUBY_DEBUG_COUNTER]	obj_ary_extracapa             	            0
[RUBY_DEBUG_COUNTER]	obj_ary_shared_create         	           81
[RUBY_DEBUG_COUNTER]	obj_ary_shared                	            0
[RUBY_DEBUG_COUNTER]	obj_ary_shared_root_occupied  	            0
[RUBY_DEBUG_COUNTER]	obj_hash_empty                	            0
[RUBY_DEBUG_COUNTER]	obj_hash_1                    	            0
[RUBY_DEBUG_COUNTER]	obj_hash_2                    	            0
[RUBY_DEBUG_COUNTER]	obj_hash_3                    	            0
[RUBY_DEBUG_COUNTER]	obj_hash_4                    	            0
[RUBY_DEBUG_COUNTER]	obj_hash_5_8                  	            0
[RUBY_DEBUG_COUNTER]	obj_hash_g8                   	            0
[RUBY_DEBUG_COUNTER]	obj_hash_null                 	            0
[RUBY_DEBUG_COUNTER]	obj_hash_ar                   	            0
[RUBY_DEBUG_COUNTER]	obj_hash_st                   	            0
[RUBY_DEBUG_COUNTER]	obj_hash_transient            	            0
[RUBY_DEBUG_COUNTER]	obj_hash_force_convert        	            0
[RUBY_DEBUG_COUNTER]	obj_struct_embed              	            0
[RUBY_DEBUG_COUNTER]	obj_struct_transient          	            0
[RUBY_DEBUG_COUNTER]	obj_struct_ptr                	            0
[RUBY_DEBUG_COUNTER]	obj_data_empty                	            0
[RUBY_DEBUG_COUNTER]	obj_data_xfree                	            0
[RUBY_DEBUG_COUNTER]	obj_data_imm_free             	            0
[RUBY_DEBUG_COUNTER]	obj_data_zombie               	            0
[RUBY_DEBUG_COUNTER]	obj_match_under4              	            0
[RUBY_DEBUG_COUNTER]	obj_match_ge4                 	            0
[RUBY_DEBUG_COUNTER]	obj_match_ge8                 	            0
[RUBY_DEBUG_COUNTER]	obj_match_ptr                 	            0
[RUBY_DEBUG_COUNTER]	obj_iclass_ptr                	            0
[RUBY_DEBUG_COUNTER]	obj_class_ptr                 	            0
[RUBY_DEBUG_COUNTER]	obj_module_ptr                	            0
[RUBY_DEBUG_COUNTER]	obj_bignum_ptr                	            0
[RUBY_DEBUG_COUNTER]	obj_bignum_embed              	            0
[RUBY_DEBUG_COUNTER]	obj_float                     	            0
[RUBY_DEBUG_COUNTER]	obj_complex                   	            0
[RUBY_DEBUG_COUNTER]	obj_rational                  	            0
[RUBY_DEBUG_COUNTER]	obj_regexp_ptr                	            0
[RUBY_DEBUG_COUNTER]	obj_file_ptr                  	            0
[RUBY_DEBUG_COUNTER]	obj_symbol                    	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_ment                	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_iseq                	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_env                 	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_tmpbuf              	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_ast                 	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_cref                	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_svar                	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_throw_data          	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_ifunc               	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_memo                	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_parser_strterm      	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_callinfo            	            0
[RUBY_DEBUG_COUNTER]	obj_imemo_callcache           	            0
[RUBY_DEBUG_COUNTER]	artable_hint_hit              	       22,575
[RUBY_DEBUG_COUNTER]	artable_hint_miss             	           70
[RUBY_DEBUG_COUNTER]	artable_hint_notfound         	       27,631
[RUBY_DEBUG_COUNTER]	heap_xmalloc                  	        6,942
[RUBY_DEBUG_COUNTER]	heap_xrealloc                 	          987
[RUBY_DEBUG_COUNTER]	heap_xfree                    	          423
[RUBY_DEBUG_COUNTER]	theap_alloc                   	        7,715
[RUBY_DEBUG_COUNTER]	theap_alloc_fail              	            0
[RUBY_DEBUG_COUNTER]	theap_evacuate                	            0
[RUBY_DEBUG_COUNTER]	mjit_exec                     	            0
[RUBY_DEBUG_COUNTER]	mjit_exec_not_added           	            0
[RUBY_DEBUG_COUNTER]	mjit_exec_not_ready           	            0
[RUBY_DEBUG_COUNTER]	mjit_exec_not_compiled        	            0
[RUBY_DEBUG_COUNTER]	mjit_exec_call_func           	            0
[RUBY_DEBUG_COUNTER]	mjit_add_iseq_to_process      	            0
[RUBY_DEBUG_COUNTER]	mjit_unload_units             	            0
[RUBY_DEBUG_COUNTER]	mjit_frame_VM2VM              	            0
[RUBY_DEBUG_COUNTER]	mjit_frame_VM2JT              	            0
[RUBY_DEBUG_COUNTER]	mjit_frame_JT2JT              	            0
[RUBY_DEBUG_COUNTER]	mjit_frame_JT2VM              	            0
[RUBY_DEBUG_COUNTER]	mjit_cancel                   	            0
[RUBY_DEBUG_COUNTER]	mjit_cancel_ivar_inline       	            0
[RUBY_DEBUG_COUNTER]	mjit_cancel_exivar_inline     	            0
[RUBY_DEBUG_COUNTER]	mjit_cancel_send_inline       	            0
[RUBY_DEBUG_COUNTER]	mjit_cancel_opt_insn          	            0
[RUBY_DEBUG_COUNTER]	mjit_cancel_invalidate_all    	            0
[RUBY_DEBUG_COUNTER]	mjit_cancel_leave             	            0
[RUBY_DEBUG_COUNTER]	mjit_length_unit_queue        	            0
[RUBY_DEBUG_COUNTER]	mjit_length_active_units      	            0
[RUBY_DEBUG_COUNTER]	mjit_length_compact_units     	            0
[RUBY_DEBUG_COUNTER]	mjit_length_stale_units       	            0
[RUBY_DEBUG_COUNTER]	mjit_compile_failures         	            0

I patched Ruby to also count obj_respond_to calls.

This is our "profile mode" home page, it takes about 40ms or so and is very typical of Rails performance.

A single page will issue 613 obj_respond_to calls, I guess this is not properly cached.

You can run it by running Discourse bench on 2.7 (which will prep the DB) then execute the script.

Actions #8

Updated by marcandre (Marc-Andre Lafortune) about 4 years ago

  • Related to Bug #17383: 3.0 recursion memory|speed issues added

Updated by tenderlovemaking (Aaron Patterson) about 4 years ago

@xrxr and I made a patch that implements a negative lookup cache. We also changed respond_to? to use the per class method cache.

This PR has the "respond_to" patch: https://github.com/ruby/ruby/pull/3873

This PR has the negative cache in addition to the "respond_to" patch: https://github.com/ruby/ruby/pull/3872

Neither of these patches seemed to have a large impact on RailsBench. Maybe they will help Discourse, but I'm not sure.

I do want to talk about the problems with the negative cache patch though. First, I think some of the invalidation is broken (and that's why the CI is red), but that's not really the biggest issue.

The pCMC is effectively a weak reference cache. In our strategy, we just insert a NULL in the cache for that method name. The fact that there is a cache entry, but the entry is NULL allows us to avoid lookups for the negative case.

The problem though: If we cache negative values (in this case it caches a NULL), how long should those negative values live? Right now, they get GC'd extremely quickly. For example, here is a perf report before adding a negative cache:

perf 1

You can see rb_id_table_lookup is pretty high in the report. Now compare to a perf report after adding the negative cache:

perf 2

Now the number 2 entry is cc_table_free_i. Now we're spending time freeing entries in the pCMC because the negative cache entries aren't real objects. I guess the question is: how long should these negative entries live? They can't live forever.

Numbers for RailsBench were approximately the same with this patch presumably because of the extra GC time.

Anyway, I don't have good answers right now, but I wanted to share progress. Also, I'm sorry these screenshots look so big in the webui, I don't know how to make them smaller 😅

Updated by ioquatix (Samuel Williams) almost 4 years ago

In my early benchmarks:

Async

ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-darwin19]

Wrapper#wait_readable
                        114.385k (± 1.0%) i/s -    573.250k in   5.012051s
    Reactor#register    137.206k (± 1.0%) i/s -    693.906k in   5.057946s

ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [x86_64-darwin20]

Wrapper#wait_readable
                        110.108k (± 1.4%) i/s -    552.034k in   5.014519s
    Reactor#register    131.360k (± 1.7%) i/s -    665.193k in   5.065527s

Async::HTTP

ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-darwin19]

Running 2s test @ http://127.0.0.1:9294/
  8 threads and 8 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   228.46us   86.09us   3.82ms   83.40%
    Req/Sec     4.32k     1.07k    5.80k    50.60%
  72205 requests in 2.10s, 3.44MB read
Requests/sec:  34380.26
Transfer/sec:      1.64MB

ruby 3.0.0p0 (2020-12-25 revision 95aff21468) [x86_64-darwin20]

Running 2s test @ http://127.0.0.1:9294/
  8 threads and 8 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   252.91us  107.48us   1.15ms   69.00%
    Req/Sec     3.90k     1.46k    5.87k    39.88%
  65106 requests in 2.10s, 3.10MB read
Requests/sec:  31001.91
Transfer/sec:      1.48MB

Updated by sam.saffron (Sam Saffron) almost 4 years ago

Good news, @ko1 (Koichi Sasada) and @tenderlove managed to recover a fair bit of performance leading to release.

3.0 remains slower in our bench than 2.7 but we did improve over the first post.

That said, the picture as of release is:

Ruby 2.6 -> Ruby 2.7 -> Ruby 3.0

categories page / median: 26 -> 27 -> 28 (slow down 1ms from 2.7 ... 2ms from 2.6)
home page: 50 -> 51 -> 53
topic page: 44 -> 46 -> 49
cat admin: 26 -> 27 -> 28
home admin: 50 -> 53 -> 53
topic admin: 44 -> 45 -> 48

Interestingly when I test this all from the same process some of this slowdown is no longer observed, so some of the slowdown could be due to IO changes or GC.

Updated by jeremyevans0 (Jeremy Evans) over 1 year ago

  • Status changed from Open to Closed

It looks like the slowdown for refined methods was fixed by @ko1 (Koichi Sasada) in cfd7729ce7a31c8b6ec5dd0e99c67b2932de4732. Even without that, I'm guessing with yjit, Ruby 3.2 is now faster than Ruby 2.6/2.7. If that is not the case, please reopen with details.

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0