Ruby master - Bug #19438: Ruby 2.7 -> 3.2 Performance Regression in so_k_nucleotide benchmark</h1> <article> <h1>Ruby master - Bug #19438: Ruby 2.7 -> 3.2 Performance Regression in so_k_nucleotide benchmark</h1> <p>2023-02-15T18:20:24Z</p> <ul><li><strong>Subject</strong> changed from <i>Ruby 2.7 -> 3.2 Performance Regression</i> to <i>Ruby 2.7 -> 3.2 Performance Regression in CSV</i></li></ul><p>CSV being a gem, it's very likely that the regression isn't in ruby itself, but in the successive versions of the csv gem being bundled with Ruby:</p> <pre><code>ruby 3.2 => csv 3.2.6 ruby 3.1 => csv 3.2.2 ruby 3.0 => csv 3.1.9 ruby 2.7 => csv 3.1.2 ruby 2.6 => csv 3.0.9 </code></pre> <p>So if you wish to investigate this further, the best start would be to benchmark <code>csv 3.0.9</code> on Ruby 3.2, and <code>csv 3.2.6</code> on Ruby 2.7 or 2.6.</p> <p>I'm not very familiar with the <code>CSV</code> gem, but CSV as a format is losely defined, so I really wouldn't be surprised if over the years various fixes were added to handle corner cases and that it would impact performance.</p> </article> <article> <h1>Ruby master - Bug #19438: Ruby 2.7 -> 3.2 Performance Regression in so_k_nucleotide benchmark</h1> <p>2023-02-15T18:22:07Z</p> <ul><li><strong>Subject</strong> changed from <i>Ruby 2.7 -> 3.2 Performance Regression in CSV</i> to <i>Ruby 2.7 -> 3.2 Performance Regression in so_k_nucleotide benchmark</i></li></ul><p>Oh, I'm sorry, I misread your benchmark and thought it was a CSV based benchmark, I now see it's radically different, please disregard my previous answer.</p> </article> <article> <h1>Ruby master - Bug #19438: Ruby 2.7 -> 3.2 Performance Regression in so_k_nucleotide benchmark</h1> <p>2023-03-04T00:40:17Z</p> <ul></ul><p>I took a look at the benchmark, and I was wondering what exactly it was trying to benchmark. It looks like its benchmarking the ability to sum up the word count of a given substring. If you agree that that is what is being benchmarked, one thing we can do is change the benchmark to use <code>tally</code> instead of manually incrementing the count. I ran the benchmark with this change and got the following result.</p> <pre><code>❯ benchmark-driver so_k_nucleotide.yml --rbenv '2.7.5;3.0.5;3.1.2;3.2.1' --repeat-count 5 Calculating ------------------------------------- 2.7.5 3.0.5 3.1.2 3.2.1 so_k_nucleotide 1.671 1.728 1.893 2.045 i/s - 1.000 times in 0.598269s 0.578549s 0.528261s 0.489104s Comparison: so_k_nucleotide 3.2.1: 2.0 i/s 3.1.2: 1.9 i/s - 1.08x slower 3.0.5: 1.7 i/s - 1.18x slower 2.7.5: 1.7 i/s - 1.22x slower </code></pre> <p>The method to count the frequency looks something like this</p> <pre><code> def word_tally(seq, length) words = [] n = seq.length - length + 1 (0 ... length).each do |f| (f ... n).step(length) do |i| words << seq[i,length] end end table = words.tally [n,table] end </code></pre> <p>In some of the iterations that I played around with, the slow down seems to be coming from the incrementing of the value of a particular key.</p> <ul> <li>taking the substring</li> <li>accessing the value of that key from the Hash</li> </ul> </article> <article> <h1>Ruby master - Bug #19438: Ruby 2.7 -> 3.2 Performance Regression in so_k_nucleotide benchmark</h1> <p>2023-03-04T09:44:55Z</p> <ul></ul><blockquote> <p>If you agree that that is what is being benchmarked, one thing we can do is change the benchmark to use tally instead of manually incrementing the count.</p> </blockquote> <p>So this is a benchmark that was popularized by the "benchmark-game", if the goal is to look better on that benchmark (I personally don't care for it) then yes that is a solution.</p> <p><a href="https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/knucleotide.html#knucleotide" class="external">https://benchmarksgame-team.pages.debian.net/benchmarksgame/description/knucleotide.html#knucleotide</a></p> <p>However I think the interesting part of this issue, would be to figure out why that specific implementation got slower over time.</p> <p>It would be interesting to break down the algorithm in smaller pieces and benchmark them idependently to see which operation exactly got slower.</p> </article> <article> <h1>Ruby master - Bug #19438: Ruby 2.7 -> 3.2 Performance Regression in so_k_nucleotide benchmark</h1> <p>2023-03-06T18:10:06Z</p> <ul></ul><blockquote> <p>However I think the interesting part of this issue, would be to figure out why that specific implementation got slower over time.</p> </blockquote> <p>I definitely agree!</p> <blockquote> <p>It would be interesting to break down the algorithm in smaller pieces and benchmark them idependently to see which operation exactly got slower.</p> </blockquote> <p>before I did the change to <code>tally</code> I also broke it down to smaller steps and ran the benchmark, apologies for the wall of text to follow; a direct quote from a discussion I had with others looking at this</p> <hr> <p>I may have found something interesting while playing around with the benchmark tool. I wanted to see if we can narrow it down to a specific thing that might be slow. so I got rid of the the <code>sort_by_freq</code> and <code>find_seq</code> methods for now and focused mostly on the <code>frecuency</code> method since stackprof identified that as the <em>slowest</em> one</p> <p>Just so we have a baseline here is the code and the benchmark of the <code>frecuency</code> method untouched</p> <pre><code> def frecuency(seq, length) frequencies = Hash.new(0) ns = seq.length + 1 - length for i in (0 ... ns) frequencies[seq[i, length]] += 1 end [ns, frequencies] end </code></pre> <pre><code>Calculating ------------------------------------- 2.7.5 3.0.5 3.1.2 3.2.1 so_k_nucleotide 7.530 6.524 6.056 6.294 i/s - 1.000 times in 0.132808s 0.153277s 0.165117s 0.158884s Comparison: so_k_nucleotide 2.7.5: 7.5 i/s 3.0.5: 6.5 i/s - 1.15x slower 3.2.1: 6.3 i/s - 1.20x slower 3.1.2: 6.1 i/s - 1.24x slower </code></pre> <p>Now lets look at <code>String#[]</code> as the only operation</p> <pre><code> def frecuency(seq, length) frequencies = Hash.new(0) ns = seq.length + 1 - length for i in (0 ... ns) seq[i, length] # frequencies[seq[i, length]] += 1 end [ns, frequencies] end </code></pre> <pre><code>❯ benchmark-driver so_k_nucleotide.yml --rbenv '2.7.5;3.0.5;3.1.2;3.2.1' --repeat-count 5 Calculating ------------------------------------- 2.7.5 3.0.5 3.1.2 3.2.1 so_k_nucleotide 14.000 14.225 13.797 14.096 i/s - 1.000 times in 0.071427s 0.070300s 0.072481s 0.070944s Comparison: so_k_nucleotide 3.0.5: 14.2 i/s 3.2.1: 14.1 i/s - 1.01x slower 2.7.5: 14.0 i/s - 1.02x slower 3.1.2: 13.8 i/s - 1.03x slower </code></pre> <p>Now lets look at <code>String#[]</code> and <code>Hash#[]=</code></p> <pre><code> def frecuency(seq, length) frequencies = Hash.new(0) ns = seq.length + 1 - length for i in (0 ... ns) frequencies[seq[i, length]] end [ns, frequencies] end </code></pre> <pre><code>Calculating ------------------------------------- 2.7.5 3.0.5 3.1.2 3.2.1 so_k_nucleotide 12.516 12.248 12.058 13.034 i/s - 1.000 times in 0.079900s 0.081645s 0.082931s 0.076725s Comparison: so_k_nucleotide 3.2.1: 13.0 i/s 2.7.5: 12.5 i/s - 1.04x slower 3.0.5: 12.2 i/s - 1.06x slower 3.1.2: 12.1 i/s - 1.08x slower </code></pre> <p>However the interesting bit is when we assign things to variables or when we increment the counter for a hash value which I think is <code>Hash#[]=</code></p> <pre><code> def frecuency(seq, length) frequencies = Hash.new(0) ns = seq.length + 1 - length for i in (0 ... ns) x = seq[i, length] end [ns, frequencies] end </code></pre> <pre><code>❯ benchmark-driver so_k_nucleotide.yml --rbenv '2.7.5;3.0.5;3.1.2;3.2.1' --repeat-count 5 Calculating ------------------------------------- 2.7.5 3.0.5 3.1.2 3.2.1 so_k_nucleotide 13.641 12.170 11.857 12.001 i/s - 1.000 times in 0.073306s 0.082168s 0.084335s 0.083325s Comparison: so_k_nucleotide 2.7.5: 13.6 i/s 3.0.5: 12.2 i/s - 1.12x slower 3.2.1: 12.0 i/s - 1.14x slower 3.1.2: 11.9 i/s - 1.15x slower </code></pre> <p>This is the same as the baseline in the start of the comment</p> <pre><code>❯ benchmark-driver so_k_nucleotide.yml --rbenv '2.7.5;3.0.5;3.1.2;3.2.1' --repeat-count 5 Calculating ------------------------------------- 2.7.5 3.0.5 3.1.2 3.2.1 so_k_nucleotide 7.626 6.474 6.135 6.239 i/s - 1.000 times in 0.131126s 0.154468s 0.163005s 0.160283s Comparison: so_k_nucleotide 2.7.5: 7.6 i/s 3.0.5: 6.5 i/s - 1.18x slower 3.2.1: 6.2 i/s - 1.22x slower 3.1.2: 6.1 i/s - 1.24x slower </code></pre> <p>I think we might need to look into the assignment methods instead of the access methods for string and hash 🤔</p> </article> </main></body></html>