<p>Ruby Issue Tracking System (https://bugs.ruby-lang.org/): Ruby master - Feature #12744: Add str.reverse_each_char and str.reverse_chars</p>
<p>bouk (Bouke van der Bijl), boukevanderbijl@gmail.com, 2016-09-09T16:58:15Z, https://bugs.ruby-lang.org/issues/12744?journal_id=60459</p>
<ul><li><strong>Subject</strong> changed from <i>Add str.reverse_each and str.reverse_chars</i> to <i>Add str.reverse_each_char and str.reverse_chars</i></li></ul>
<p>duerst (Martin Dürst), duerst@it.aoyama.ac.jp, 2016-09-10T02:58:35Z, https://bugs.ruby-lang.org/issues/12744?journal_id=60462</p>
<ul><li><strong>Status</strong> changed from <i>Open</i> to <i>Feedback</i></li></ul><p>What about using</p>
<p><code>str.reverse.chars</code> for <code>string.reverse_chars</code>?</p>
<p>It allocates some memory, but compared to all the memory allocated for the individual characters, the memory for the overall reversed string is not a big deal. Also, it's O(N).</p>
<p>What's the use case? It doesn't look like a very frequent operation needing a dedicated method if it can be done by method composition so easily.</p>
<p>Similar for<br>
<code>str.reverse_each</code>, which you probably meant to be <code>str.reverse_each_line</code>, which should be something like <code>string.lines.reverse.each</code></p>
<p>bouk (Bouke van der Bijl), boukevanderbijl@gmail.com, 2016-09-12T18:58:31Z, https://bugs.ruby-lang.org/issues/12744?journal_id=60487</p>
<p>Martin Dürst wrote:</p>
<blockquote>
<p>What about using</p>
<p><code>str.reverse.chars</code> for <code>string.reverse_chars</code>?</p>
<p>It allocates some memory, but compared to all the memory allocated for the individual characters, the memory for the overall reversed string is not a big deal. Also, it's O(N).</p>
<p>What's the use case? It doesn't look like a very frequent operation needing a dedicated method if it can be done by method composition so easily.</p>
<p>Similar for<br>
<code>str.reverse_each</code>, which you probably meant to be <code>str.reverse_each_line</code>, which should be something like <code>string.lines.reverse.each</code></p>
</blockquote>
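<p>For reference, the compositions suggested in the quote above can be sketched with existing <code>String</code> and <code>Array</code> methods. The helper names are made up for illustration and are not part of the attached patch:</p>

```ruby
# Hypothetical helper names; only the String/Array methods they call are real.

# Characters from last to first, via a reversed copy of the string.
def reverse_chars_via_reverse(str)
  str.reverse.chars
end

# Same result, via an intermediate array of characters.
def reverse_chars_via_array(str)
  str.chars.reverse
end

# Lines from last to first, as suggested for a reverse_each_line.
def reverse_lines(str)
  str.lines.reverse
end

text = "ab\ncd\n"
reverse_chars_via_reverse(text).each { |ch| print ch.inspect, " " }
puts
reverse_lines(text).each { |line| puts line.inspect }
```

<p>All three build an intermediate string or array before iterating, which is the allocation under discussion.</p>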
<p>I don't really have a use case for <code>reverse_chars</code>, but I added it for symmetry with the other methods. I meant <code>str.reverse_each_char</code>; I made a typo in the issue, but it's correct in the patch. The allocating equivalent would be <code>str.chars.reverse.each</code>. I could use <code>reverse_each_char</code> in Sprockets, where we need to iterate over the string backwards to check that it ends with certain characters (and know what it ends with). This needs to be done many times when compiling assets, so having a native way to iterate characters without allocation is a beneficial optimization.</p>
<p>shyouhei (Shyouhei Urabe), shyouhei@ruby-lang.org, 2016-09-13T00:21:29Z, https://bugs.ruby-lang.org/issues/12744?journal_id=60488</p>
<p>I doubt that we can make a <code>reverse_each_char</code> which is faster than <code>reverse.each_char</code>. It is not always clear where the boundary between one character and the next is, especially when scanning backwards. We might end up scanning the whole string from the beginning, splitting characters into separate substrings, and then iterating over them.</p>
<p>bouk (Bouke van der Bijl), boukevanderbijl@gmail.com, 2016-09-13T17:12:45Z, https://bugs.ruby-lang.org/issues/12744?journal_id=60495</p>
<p>Shyouhei Urabe wrote:</p>
<blockquote>
<p>I doubt that we can make a <code>reverse_each_char</code> which is faster than <code>reverse.each_char</code>. It is not always clear where the boundary between one character and the next is, especially when scanning backwards. We might end up scanning the whole string from the beginning, splitting characters into separate substrings, and then iterating over them.</p>
</blockquote>
<p>Not sure why you think we can't make it faster than <code>reverse.each_char</code>; I've already implemented it and attached the patch. It uses <code>rb_enc_left_char_head</code>, which is implemented by all the encodings to scan a string backwards.</p>
<p>For the most common encoding (UTF-8) it is always possible to scan a string backwards from any point, and looking at the other encodings implemented in Ruby it seems only gb18030 has a stateful way to back up to previous characters, so iterating backwards over that one could end up being O(N^2).</p>
<p>duerst (Martin Dürst), duerst@it.aoyama.ac.jp, 2016-09-16T10:41:27Z, https://bugs.ruby-lang.org/issues/12744?journal_id=60525</p>
<p>Bouke van der Bijl wrote:</p>
<blockquote>
<p>I don't really have a use case for reverse_chars, but I added it for symmetry with the other methods.</p>
</blockquote>
<p>Other languages may do that, but Ruby doesn't add something just for symmetry.</p>
<blockquote>
<p>I meant <code>str.reverse_each_char</code>; I made a typo in the issue, but it's correct in the patch. The allocating equivalent would be <code>str.chars.reverse.each</code>. I could use <code>reverse_each_char</code> in Sprockets, where we need to iterate over the string backwards to check that it ends with certain characters (and know what it ends with).</p>
</blockquote>
<p>Wouldn't this usually be done with a Regexp? If using a Regexp directly isn't efficient, what about just applying the reverse of the Regexp to the reverse of the string (so that it gets applied from the start)?</p>
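<p>A minimal sketch of both variants, assuming a made-up suffix list (the actual Sprockets check is not shown in this thread, and the helper names are hypothetical):</p>

```ruby
# Hypothetical suffixes for illustration only.
SUFFIX_RE          = /(?:\.min\.js|\.js|\.css)\z/   # anchored at the end of the string
REVERSED_SUFFIX_RE = /\A(?:sj\.nim\.|sj\.|ssc\.)/   # same suffixes, spelled backwards

# Direct approach: an end-anchored Regexp reports which suffix matched.
def trailing_suffix(path)
  path[SUFFIX_RE]
end

# Reversed approach: reverse the string so the match starts at index 0,
# then reverse the match back. Note that this allocates a reversed copy.
def trailing_suffix_reversed(path)
  match = path.reverse[REVERSED_SUFFIX_RE]
  match && match.reverse
end

p trailing_suffix("app/assets/application.min.js")
p trailing_suffix_reversed("app/assets/site.css")
p trailing_suffix("lib/tool.rb")
```

<p>Whether either variant is fast enough for the asset-compilation case would need to be measured.</p>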
<blockquote>
<p>Not sure why you think we can't make it faster than <code>reverse.each_char</code>; I've already implemented it and attached the patch. It uses <code>rb_enc_left_char_head</code>, which is implemented by all the encodings to scan a string backwards.</p>
</blockquote>
<p>Some of these implementations are not exactly trivial. Please look at enc/shift_jis.c or enc/gb18030.c. Please try your code on something like</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="s2">"</span><span class="se">\x95\x95</span><span class="s2">"</span><span class="p">.</span><span class="nf">force_encoding</span><span class="p">(</span><span class="s1">'Shift_JIS'</span><span class="p">)</span> <span class="o">*</span> <span class="n">x</span>
</code></pre>
<p>where you increase x and see whether the time increases linearly or not.</p>
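<p>Such a measurement could be set up roughly as follows. Since <code>reverse_each_char</code> only exists in the patch, this sketch times <code>reverse.each_char</code> as a stand-in; the method name <code>time_reverse_scan</code> is hypothetical, and the line marked below is where the patched method would go:</p>

```ruby
# Builds the Shift_JIS test string from the example above, repeated `repeats`
# times, and times one backward pass over its characters.
def time_reverse_scan(repeats)
  str = ("\x95\x95".b * repeats).force_encoding('Shift_JIS')
  count = 0
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  # Stand-in for the patched method; replace with str.reverse_each_char
  # when running against a Ruby built with the patch.
  str.reverse.each_char { count += 1 }
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [count, elapsed]
end

[1_000, 10_000, 100_000].each do |x|
  count, seconds = time_reverse_scan(x)
  puts format("x=%-8d chars=%-8d %.6fs", x, count, seconds)
end
```

<p>If the time per character stays roughly constant as x grows by a factor of ten, the scan is linear; if it grows with x, it is not.</p>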
<blockquote>
<p>For the most common encoding (UTF-8) it is always possible to scan a string backwards from any point, and looking at the other encodings implemented in Ruby it seems only gb18030 has a stateful way to back up to previous characters, so iterating backwards over that one could end up being O(N^2).</p>
</blockquote>
<p>Yes indeed.</p>
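<p>To illustrate why UTF-8 is the easy case: continuation bytes always have the bit pattern <code>10xxxxxx</code>, so the start of a character can be found by stepping left until a byte that is not a continuation byte. The following is a pure-Ruby sketch under that assumption, not the C implementation from the patch; the method name is hypothetical, and it still allocates one small string per character:</p>

```ruby
# Hypothetical helper: yields the characters of a valid UTF-8 string from last
# to first, without building a reversed copy or an array of all characters.
def reverse_each_char_utf8(str)
  unless str.encoding == Encoding::UTF_8 && str.valid_encoding?
    raise ArgumentError, "expected a valid UTF-8 string"
  end

  finish = str.bytesize
  while finish > 0
    start = finish - 1
    # Continuation bytes look like 0b10xxxxxx; step left until a lead byte.
    start -= 1 while start > 0 && (str.getbyte(start) & 0xC0) == 0x80
    yield str.byteslice(start, finish - start)
    finish = start
  end
  str
end

reverse_each_char_utf8("añ日🙂") { |ch| print ch, " " }
puts
```

<p>Each step looks at no more than four bytes, so the whole pass is O(N) in the byte length. Encodings such as Shift_JIS or gb18030 do not offer this guarantee, which is the concern raised above.</p>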