https://bugs.ruby-lang.org/https://bugs.ruby-lang.org/favicon.ico?17113305112014-03-27T01:28:26ZRuby Issue Tracking SystemRuby master - Bug #9680: String#sub and siblings should not use regex when String pattern is passedhttps://bugs.ruby-lang.org/issues/9680?journal_id=459532014-03-27T01:28:26Znormalperson (Eric Wong)normalperson@yhbt.net
<ul></ul><p>I had an idea to replace the current reg_cache with memoization for<br>
converting string literals, but never got around to it. That would<br>
also reduce garbage while preserving $& compatibility.</p> Ruby master - Bug #9680: String#sub and siblings should not use regex when String pattern is passedhttps://bugs.ruby-lang.org/issues/9680?journal_id=459552014-03-27T02:02:48Zsrawlins (Sam Rawlins)sam.rawlins@gmail.com
<ul></ul><p>I think the speedup in this patch comes almost entirely from skipping the regex engine, not from the GC savings.</p>
<p>Preserving <code>$&</code> (and <code>$~</code> and friends) while still not firing up the regex engine might be possible (constructing the basic backref, with no subgroups, by hand), but the code would be very ugly (an <code>RMatch</code> has an <code>RRegexp</code> and an <code>rmatch</code> which has a <code>re_registers</code>, etc). This might only be a ~20-line function, but it feels so dirty...</p>
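<p>To make the compatibility concern concrete, here is a minimal sketch of the two behaviours in question: a String pattern is matched literally, and a Regexp-based call leaves the matched text in <code>$&</code>. The helper names <code>replace_literal</code> and <code>matched_text_after_sub</code> are hypothetical, chosen only for illustration.</p>

```ruby
# Hypothetical helper: gsub with whatever pattern is given.
# A String pattern is matched literally, so "." means a plain dot,
# whereas the Regexp /./ means "any character".
def replace_literal(str, pattern, replacement)
  str.gsub(pattern, replacement)
end

# Hypothetical helper: run sub and return the text of the last match.
# $& is frame-local, so it must be read inside the method that called sub.
def matched_text_after_sub(str, pattern, replacement)
  str.sub(pattern, replacement)
  $&
end

puts replace_literal("a.b.c", ".", "-")   # String pattern: only the dots change
puts replace_literal("a.b.c", /./, "-")   # Regexp pattern: every character changes
puts matched_text_after_sub("hello", /l+/, "L")
```

<p>Any fast path for String patterns would need <code>matched_text_after_sub</code> to keep returning the matched text when it is given a String instead of a Regexp.</p>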
<p>I think an improvement (or replacement) to <code>reg_cache</code> would also be welcome.</p> Ruby master - Bug #9680: String#sub and siblings should not use regex when String pattern is passedhttps://bugs.ruby-lang.org/issues/9680?journal_id=459652014-03-27T16:17:43ZEregon (Benoit Daloze)
<ul></ul><p>It would be interesting to run the benchmark on a more realistic example. One should use String#tr or String#tr! if it is only to replace a single character.</p> Ruby master - Bug #9680: String#sub and siblings should not use regex when String pattern is passedhttps://bugs.ruby-lang.org/issues/9680?journal_id=459682014-03-27T19:03:42Zsrawlins (Sam Rawlins)sam.rawlins@gmail.com
<ul></ul><p>Good point, Benoit! This is actually why I started looking into #gsub in the first place. I benchmarked ActiveSupport::Inflector [1], which does operations like <code>gsub!('/', '::')</code> and <code>gsub('::', '/')</code>. Here are the benchmarks, before and after Nobu's patch, based on my patch:</p>
<pre><code>BEFORE
                  user     system      total        real
#underscore  24.000000   0.160000  24.160000 ( 24.254921)
#camelize     3.040000   0.010000   3.050000 (  3.060907)
AFTER
                  user     system      total        real
#underscore  23.690000   0.160000  23.850000 ( 24.012497)
#camelize     2.680000   0.010000   2.690000 (  2.706418)
</code></pre>
<p>So #underscore is 1% faster; #camelize is 11% faster.</p>
<p>I also benchmarked Psych with <code>YAML.dump(["one string", "two string"])</code>:</p>
<pre><code>                       user     system      total        real
YAML.dump BEFORE  11.380000   0.250000  11.630000 ( 11.680545)
YAML.dump AFTER   11.030000   0.240000  11.270000 ( 11.313147)
</code></pre>
<p>The patch seems to shave 3% off here. Take all of these benchmarks with a grain of salt :)</p>
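<p>For anyone who wants to try a comparison of this kind, a rough sketch follows. It is not the script used for the numbers above; the helper names <code>time_it</code> and <code>compare_replacements</code>, the input string, and the iteration count are all assumptions made for illustration.</p>

```ruby
# Hypothetical helper: wall-clock seconds taken to run the block `iterations` times.
def time_it(iterations)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Hypothetical helper: time one replacement written three ways.
# Note that tr maps single characters to single characters, so it cannot
# produce "::" and is only comparable for one-character replacements.
def compare_replacements(input, iterations)
  {
    string_pattern: time_it(iterations) { input.gsub("/", "::") },
    regexp_pattern: time_it(iterations) { input.gsub(/\//, "::") },
    tr:             time_it(iterations) { input.tr("/", ".") }
  }
end

timings = compare_replacements("active_support/inflector/methods", 100_000)
timings.each { |label, seconds| puts format("%-16s %.6f s", label, seconds) }
```

<p>The absolute numbers will vary with the Ruby version and the machine, so only the relative differences are worth reading.</p>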
<p>[1] <a href="https://github.com/rails/rails/blob/master/activesupport/lib/active_support/inflector/methods.rb" class="external">https://github.com/rails/rails/blob/master/activesupport/lib/active_support/inflector/methods.rb</a></p> Ruby master - Bug #9680: String#sub and siblings should not use regex when String pattern is passedhttps://bugs.ruby-lang.org/issues/9680?journal_id=459732014-03-27T22:42:34Znobu (Nobuyoshi Nakada)nobu@ruby-lang.org
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Closed</i></li></ul><p>I forgot to include this ticket reference in the commit log.</p>
<p>Inflector seems to use other regex-based replacements as well, so this improvement might not be very significant there.</p> Ruby master - Bug #9680: String#sub and siblings should not use regex when String pattern is passedhttps://bugs.ruby-lang.org/issues/9680?journal_id=475342014-07-02T06:58:29Zusa (Usaku NAKAMURA)usa@garbagecollect.jp
<ul><li><strong>Backport</strong> changed from <i>2.0.0: UNKNOWN, 2.1: UNKNOWN</i> to <i>2.0.0: WONTFIX, 2.1: UNKNOWN</i></li></ul> Ruby master - Bug #9680: String#sub and siblings should not use regex when String pattern is passedhttps://bugs.ruby-lang.org/issues/9680?journal_id=482482014-08-08T03:25:30Znagachika (Tomoyuki Chikanaga)nagachika00@gmail.com
<ul><li><strong>Backport</strong> changed from <i>2.0.0: WONTFIX, 2.1: UNKNOWN</i> to <i>2.0.0: WONTFIX, 2.1: WONTFIX</i></li></ul>