Bug #20659: Speed regression of `parse.y` parser after numeric nodes were introduced - Ruby - Ruby Issue Tracking System

Bug #20659

closed

Speed regression of `parse.y` parser after numeric nodes were introduced

Added by alanwu (Alan Wu) about 1 year ago. Updated 8 months ago.

Status: Closed
Assignee: yui-knk (Kaneko Yuichiro)
Target version:
ruby -v: ruby 3.4.0dev (2024-07-30T14:01:43Z master 1164b6a7ba) [x86_64-linux]
Backport: 3.1: UNKNOWN, 3.2: UNKNOWN, 3.3: UNKNOWN

[ruby-core:118750]

Description

The mail benchmark from yjit-bench is about 20% slower on master than on the baseline, 98eeadc9 ("Development of 3.4.0 started."). The comparison measures the running time of the Ruby process executing the benchmark for a single iteration. Much of this workload is Ruby parsing.

$ hyperfine -L ruby '3.3-equiv/bin/ruby,before-numeric-nodes/bin/ruby,master/bin/ruby' '~/.rubies/{ruby} -Iharness-once benchmarks/mail/benchmark.rb'
Benchmark 1: ~/.rubies/3.3-equiv/bin/ruby -Iharness-once benchmarks/mail/benchmark.rb
  Time (mean ± σ):     889.0 ms ±   2.0 ms    [User: 776.9 ms, System: 111.7 ms]
  Range (min … max):   885.7 ms … 892.1 ms    10 runs
 
Benchmark 2: ~/.rubies/before-numeric-nodes/bin/ruby -Iharness-once benchmarks/mail/benchmark.rb
  Time (mean ± σ):     891.7 ms ±   1.6 ms    [User: 794.5 ms, System: 96.9 ms]
  Range (min … max):   889.0 ms … 894.7 ms    10 runs
 
Benchmark 3: ~/.rubies/master/bin/ruby -Iharness-once benchmarks/mail/benchmark.rb
  Time (mean ± σ):      1.086 s ±  0.003 s    [User: 0.951 s, System: 0.134 s]
  Range (min … max):    1.080 s …  1.091 s    10 runs
 
Summary
  '~/.rubies/3.3-equiv/bin/ruby -Iharness-once benchmarks/mail/benchmark.rb' ran
    1.00 ± 0.00 times faster than '~/.rubies/before-numeric-nodes/bin/ruby -Iharness-once benchmarks/mail/benchmark.rb'
    1.22 ± 0.00 times faster than '~/.rubies/master/bin/ruby -Iharness-once benchmarks/mail/benchmark.rb'

$ for tag in 3.3-equiv master before-numeric-nodes; do echo $tag: $(~/.rubies/$tag/bin/ruby -v); done
3.3-equiv: ruby 3.4.0dev (2023-12-25T09:13:40Z master 98eeadc932) [x86_64-linux]
master: ruby 3.4.0dev (2024-07-30T14:01:43Z master 1164b6a7ba) [x86_64-linux]
before-numeric-nodes: ruby 3.4.0dev (2024-01-06T18:26:38Z master 76afbda5b5) [x86_64-linux]
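
As a quick check of the "about 20%" figure, the relative slowdown can be recomputed from the hyperfine means in the first run above. This is only an illustrative sketch: the `MEANS` table and the `slowdown_percent` helper are names made up here, and the numbers are copied by hand from the output.

```ruby
# Mean wall-clock times in seconds, copied from the first hyperfine run above.
MEANS = {
  "3.3-equiv" => 0.8890,
  "before-numeric-nodes" => 0.8917,
  "master" => 1.086,
}.freeze

# Hypothetical helper: how much slower `candidate` is than `baseline`, in percent.
def slowdown_percent(baseline, candidate)
  ((candidate / baseline) - 1.0) * 100.0
end

baseline = MEANS.fetch("3.3-equiv")
MEANS.each do |name, mean|
  printf("%-22s %+.1f%%\n", name, slowdown_percent(baseline, mean))
end
```

With these inputs master comes out a little over 22% slower than the baseline, which agrees with the 1.22 ratio in hyperfine's own summary, while before-numeric-nodes stays within a fraction of a percent.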

Profiling with Valgrind's DHAT reveals that 1b8d01136c3ff6c60325c7609d61e19ac42acd9f ("Introduce Numeric Node's") issues roughly 3 times more malloc(3) calls than the baseline, most of them coming from strdup() calls in set_number_literal().
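
To look at the parser cost in isolation from the mail gem, one option is to time compilation of a synthetic, literal-heavy source. The sketch below is not part of the original measurements: `literal_heavy_source` and `time_compile` are hypothetical helpers, the literal count is arbitrary, and `RubyVM::InstructionSequence.compile` times parsing plus bytecode compilation together with whichever parser the running ruby uses.

```ruby
require "benchmark"

# Hypothetical helper: Ruby source for an array of `count` integer literals.
def literal_heavy_source(count)
  "[" + (1..count).map(&:to_s).join(", ") + "]"
end

# Best-of-N wall-clock time (in seconds) to parse and compile `source`.
# Nothing is executed; only the front end and compiler run.
def time_compile(source, runs: 5)
  Array.new(runs) do
    Benchmark.realtime { RubyVM::InstructionSequence.compile(source) }
  end.min
end

if $PROGRAM_NAME == __FILE__
  source = literal_heavy_source(50_000) # arbitrary size, chosen for illustration
  printf("%d bytes compiled in %.1f ms (best of 5)\n",
         source.bytesize, time_compile(source) * 1000)
end
```

Running the same script under each of the builds compared above would give a rough per-build number, though it says nothing about allocation counts; DHAT remains the tool for that.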

Comparison with --parser=prism, for interest:

$ hyperfine -L ruby '3.3-equiv/bin/ruby,before-numeric-nodes/bin/ruby,master/bin/ruby,master/bin/ruby --parser=prism' \
       '~/.rubies/{ruby} -Iharness-once benchmarks/mail/benchmark.rb'
Benchmark 1: ~/.rubies/3.3-equiv/bin/ruby -Iharness-once benchmarks/mail/benchmark.rb
  Time (mean ± σ):     889.7 ms ±   2.6 ms    [User: 771.4 ms, System: 118.0 ms]
  Range (min … max):   885.8 ms … 894.6 ms    10 runs
 
Benchmark 2: ~/.rubies/before-numeric-nodes/bin/ruby -Iharness-once benchmarks/mail/benchmark.rb
  Time (mean ± σ):     890.4 ms ±   1.6 ms    [User: 776.8 ms, System: 113.3 ms]
  Range (min … max):   888.4 ms … 892.9 ms    10 runs
 
Benchmark 3: ~/.rubies/master/bin/ruby -Iharness-once benchmarks/mail/benchmark.rb
  Time (mean ± σ):      1.087 s ±  0.004 s    [User: 0.968 s, System: 0.119 s]
  Range (min … max):    1.083 s …  1.097 s    10 runs
 
Benchmark 4: ~/.rubies/master/bin/ruby --parser=prism -Iharness-once benchmarks/mail/benchmark.rb
  Time (mean ± σ):     826.9 ms ±   2.1 ms    [User: 725.8 ms, System: 100.7 ms]
  Range (min … max):   823.6 ms … 830.9 ms    10 runs
 
Summary
  '~/.rubies/master/bin/ruby --parser=prism -Iharness-once benchmarks/mail/benchmark.rb' ran
    1.08 ± 0.00 times faster than '~/.rubies/3.3-equiv/bin/ruby -Iharness-once benchmarks/mail/benchmark.rb'
    1.08 ± 0.00 times faster than '~/.rubies/before-numeric-nodes/bin/ruby -Iharness-once benchmarks/mail/benchmark.rb'
    1.31 ± 0.01 times faster than '~/.rubies/master/bin/ruby -Iharness-once benchmarks/mail/benchmark.rb'

Updated by alanwu (Alan Wu) about 1 year ago

Description updated (diff)

#2 [ruby-core:118754]

Updated by mame (Yusuke Endoh) about 1 year ago

Status changed from Open to Assigned
Assignee set to yui-knk (Kaneko Yuichiro)

#3 [ruby-core:118756]

Updated by mame (Yusuke Endoh) about 1 year ago

Just out of curiosity: I am not sure how the change could affect performance this much. Does the mail gem use eval a lot?

#4 [ruby-core:118759]

Updated by alanwu (Alan Wu) about 1 year ago

Some light profiling shows that this workload spends over half of its time in Kernel#require. I don't think the time comes from evals in the body of the benchmark, but rather from loading it. Running one iteration is important because it triggers all the autoloads.

The gem does contain a large number of integer literals in the generated parsers it bundles. For example: https://raw.githubusercontent.com/mikel/mail/master/lib/mail/parsers/address_lists_parser.rb
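
To get a feel for how literal-heavy a given file is, the numeric tokens can be counted with Ripper from the standard library. This is a rough sketch rather than anything used in the report: `count_numeric_literals` and `NUMERIC_EVENTS` are names invented here, and treating the token count as a proxy for the number of set_number_literal() calls is an assumption.

```ruby
require "ripper"

# Scanner events Ripper emits for numeric literal tokens.
NUMERIC_EVENTS = %i[on_int on_float on_rational on_imaginary].freeze

# Hypothetical helper: counts numeric literal tokens in a string of Ruby source.
def count_numeric_literals(source)
  Ripper.lex(source).count do |_pos, event, _token, _state|
    NUMERIC_EVENTS.include?(event)
  end
end

# Usage: ruby count_literals.rb path/to/file.rb [more files...]
if $PROGRAM_NAME == __FILE__ && ARGV.any?
  ARGV.each do |path|
    puts "#{path}: #{count_numeric_literals(File.read(path))} numeric literals"
  end
end
```

Pointing it at a local copy of address_lists_parser.rb should show how many numeric tokens the parser has to handle for that one file.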

#5 [ruby-core:120221]

Updated by alanwu (Alan Wu) 8 months ago

Status changed from Assigned to Closed

I believe this was fixed by c93d07ed7448f332379cf21b4b7b649b057e5671.
