Ruby Issue Tracking System: Issues https://bugs.ruby-lang.org/ 2022-12-25T23:25:03Z
Ruby master - Bug #19258 (Closed): URI::Generic#host returns empty string instead of nil https://bugs.ruby-lang.org/issues/19258 2022-12-25T23:25:03Z janko (Janko Marohnić) janko@hey.com
<p>On Ruby 3.1, <code>URI::Generic#host</code> would return <code>nil</code> for <code>unix:///</code> URLs, but on Ruby 3.2 it now returns an empty string:</p>
<pre><code class="rb syntaxhl" data-language="rb"><span class="n">uri</span> <span class="o">=</span> <span class="no">URI</span><span class="p">.</span><span class="nf">parse</span><span class="p">(</span><span class="s2">"unix:///var/run/docker.sock"</span><span class="p">)</span>
<span class="n">uri</span><span class="p">.</span><span class="nf">host</span> <span class="c1">#=> </span>
<span class="c1"># Ruby 3.1: nil</span>
<span class="c1"># Ruby 3.2: ""</span>
</code></pre>
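<p>Code that has to run on both versions could normalize the value before checking it. A minimal sketch, assuming a hypothetical helper named <code>normalized_host</code> (it is not part of URI or Excon):</p>

```ruby
require "uri"

# Hypothetical helper (not part of URI or Excon): maps an empty host to nil,
# so callers see the same value on Ruby 3.1 and Ruby 3.2.
def normalized_host(uri)
  host = uri.host
  (host.nil? || host.empty?) ? nil : host
end

p normalized_host(URI.parse("unix:///var/run/docker.sock")) # nil on either version
p normalized_host(URI.parse("http://example.com/path"))     # "example.com"
```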
<p>This introduced a regression in the Excon gem, which currently doesn't handle these URLs on Ruby 3.2, because it <a href="https://github.com/excon/excon/blob/efd48747fe6c6fa959e787aa5949241cd762f8f3/lib/excon/connection.rb#L92" class="external">aborts</a> for UNIX URLs when <code>:host</code> is not <code>nil</code>.</p>
Ruby master - Bug #17481 (Closed): Keyword arguments change value after calling super without arg... https://bugs.ruby-lang.org/issues/17481 2020-12-27T12:40:13Z janko (Janko Marohnić) janko@hey.com
<p>There seems to be a bug in Ruby 3.0 regarding keyword arguments and calling super without arguments, where the splatted variable changes its value after super is called. The following self-contained example reproduces the issue:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="k">class</span> <span class="nc">BaseTest</span>
  <span class="k">def</span> <span class="nf">call</span><span class="p">(</span><span class="n">a</span><span class="p">:,</span> <span class="n">b</span><span class="p">:,</span> <span class="o">**</span><span class="p">)</span>
  <span class="k">end</span>
<span class="k">end</span>
<span class="k">class</span> <span class="nc">Test</span> <span class="o"><</span> <span class="no">BaseTest</span>
  <span class="k">def</span> <span class="nf">call</span><span class="p">(</span><span class="n">a</span><span class="p">:,</span> <span class="n">b</span><span class="p">:,</span> <span class="o">**</span><span class="n">options</span><span class="p">)</span>
    <span class="nb">p</span> <span class="n">options</span>
    <span class="k">super</span>
    <span class="nb">p</span> <span class="n">options</span>
  <span class="k">end</span>
<span class="k">end</span>
<span class="no">Test</span><span class="p">.</span><span class="nf">new</span><span class="p">.</span><span class="nf">call</span><span class="p">(</span><span class="ss">a: </span><span class="mi">1</span><span class="p">,</span> <span class="ss">b: </span><span class="mi">2</span><span class="p">,</span> <span class="ss">c: </span><span class="p">{})</span>
</code></pre>
<pre><code>{:c=>{}}
{:c=>{}, :a=>1, :b=>2}
</code></pre>
<p>We can see that after <code>super</code> was called, the <code>options</code> variable changed to contain all of the keyword arguments. This doesn't happen when explicitly passing arguments to <code>super</code>, i.e. <code>super(a: a, b: b, **options)</code>.</p>
Ruby master - Bug #14900 (Closed): Extra allocation in String#byteslice https://bugs.ruby-lang.org/issues/14900 2018-07-07T09:47:27Z janko (Janko Marohnić) janko@hey.com
<p>When executing <code>String#byteslice</code> with a range, I noticed that sometimes the original string is allocated again. When I run the following script:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="nb">require</span> <span class="s2">"objspace"</span>
<span class="n">string</span> <span class="o">=</span> <span class="s2">"a"</span> <span class="o">*</span> <span class="mi">100_000</span>
<span class="no">GC</span><span class="p">.</span><span class="nf">start</span>
<span class="no">GC</span><span class="p">.</span><span class="nf">disable</span>
<span class="n">generation</span> <span class="o">=</span> <span class="no">GC</span><span class="p">.</span><span class="nf">count</span>
<span class="no">ObjectSpace</span><span class="p">.</span><span class="nf">trace_object_allocations</span> <span class="k">do</span>
  <span class="n">string</span><span class="p">.</span><span class="nf">byteslice</span><span class="p">(</span><span class="mi">50_000</span><span class="o">..-</span><span class="mi">1</span><span class="p">)</span>
  <span class="no">ObjectSpace</span><span class="p">.</span><span class="nf">each_object</span><span class="p">(</span><span class="no">String</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">string</span><span class="o">|</span>
    <span class="nb">p</span> <span class="n">string</span><span class="p">.</span><span class="nf">bytesize</span> <span class="k">if</span> <span class="no">ObjectSpace</span><span class="p">.</span><span class="nf">allocation_generation</span><span class="p">(</span><span class="n">string</span><span class="p">)</span> <span class="o">==</span> <span class="n">generation</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre>
<p>it outputs</p>
<pre><code>50000
100000
6
5
</code></pre>
<p>The one with 50000 bytes is the result of <code>String#byteslice</code>, but the one with 100000 bytes is the duplicated original string. I expected only the result of <code>String#byteslice</code> to be amongst new allocations.</p>
<p>If instead of the last 50000 bytes I slice the <em>first</em> 50000 bytes, the extra duplication doesn't occur.</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="c1"># ...</span>
<span class="n">string</span><span class="p">.</span><span class="nf">byteslice</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">50_000</span><span class="p">)</span>
<span class="c1"># ...</span>
</code></pre>
<pre><code>50000
5
</code></pre>
<p>It's definitely ok if <code>String#byteslice</code> allocates extra strings internally, but it would be nice if they were deallocated before returning the result.</p>
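<p>To compare the two slices side by side, the measurement can be wrapped in a method. This is only a sketch that reuses the approach from the script above; <code>new_string_sizes</code> is a hypothetical helper name, and the exact sizes reported depend on the Ruby version:</p>

```ruby
require "objspace"

# Hypothetical helper: returns the byte sizes of all String objects that were
# allocated while the block ran. It follows the same approach as the script
# in the report; exact results depend on the Ruby version.
def new_string_sizes
  GC.start
  GC.disable
  generation = GC.count
  sizes = []
  ObjectSpace.trace_object_allocations do
    yield
    ObjectSpace.each_object(String) do |str|
      sizes << str.bytesize if ObjectSpace.allocation_generation(str) == generation
    end
  end
  sizes
ensure
  GC.enable
end

string = "a" * 100_000
p new_string_sizes { string.byteslice(50_000..-1) } # slice from the end
p new_string_sizes { string.byteslice(0, 50_000) }  # slice from the start
```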
<p>EDIT: It seems that <code>String#slice</code> has the same issue.</p>
Ruby master - Bug #14745 (Closed): High memory usage when using String#replace with IO.copy_stream https://bugs.ruby-lang.org/issues/14745 2018-05-09T12:50:09Z janko (Janko Marohnić) janko@hey.com
<p>I'm using custom IO-like objects that implement #read as the first argument to IO.copy_stream, and I noticed odd memory behaviour when using String#replace on the output buffer versus String#clear. Here is an example of a "fake IO" object where #read uses String#clear on the output buffer:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="no">GC</span><span class="p">.</span><span class="nf">disable</span>
<span class="nb">require</span> <span class="s2">"stringio"</span>
<span class="k">class</span> <span class="nc">FakeIO</span>
  <span class="k">def</span> <span class="nf">initialize</span><span class="p">(</span><span class="n">content</span><span class="p">)</span>
    <span class="vi">@io</span> <span class="o">=</span> <span class="no">StringIO</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="n">content</span><span class="p">)</span>
  <span class="k">end</span>
  <span class="k">def</span> <span class="nf">read</span><span class="p">(</span><span class="n">length</span><span class="p">,</span> <span class="n">outbuf</span><span class="p">)</span>
    <span class="n">chunk</span> <span class="o">=</span> <span class="vi">@io</span><span class="p">.</span><span class="nf">read</span><span class="p">(</span><span class="n">length</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">chunk</span>
      <span class="n">outbuf</span><span class="p">.</span><span class="nf">clear</span>
      <span class="n">outbuf</span> <span class="o"><<</span> <span class="n">chunk</span>
      <span class="n">chunk</span><span class="p">.</span><span class="nf">clear</span>
    <span class="k">else</span>
      <span class="n">outbuf</span><span class="p">.</span><span class="nf">clear</span>
    <span class="k">end</span>
    <span class="n">outbuf</span> <span class="k">unless</span> <span class="n">outbuf</span><span class="p">.</span><span class="nf">empty?</span>
  <span class="k">end</span>
<span class="k">end</span>
<span class="n">io</span> <span class="o">=</span> <span class="no">FakeIO</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="s2">"a"</span> <span class="o">*</span> <span class="mi">50</span><span class="o">*</span><span class="mi">1024</span><span class="o">*</span><span class="mi">1024</span><span class="p">)</span> <span class="c1"># 50MB</span>
<span class="no">IO</span><span class="p">.</span><span class="nf">copy_stream</span><span class="p">(</span><span class="n">io</span><span class="p">,</span> <span class="no">File</span><span class="o">::</span><span class="no">NULL</span><span class="p">)</span>
<span class="nb">system</span> <span class="s2">"top -pid </span><span class="si">#{</span><span class="no">Process</span><span class="p">.</span><span class="nf">pid</span><span class="si">}</span><span class="s2">"</span>
</code></pre>
<p>This program outputs memory usage of 50MB at the end, as expected – 50MB was loaded into memory at the beginning and any new strings are deallocated. However, if I modify the #read implementation to use String#replace instead of String#clear:</p>
<pre><code class="ruby syntaxhl" data-language="ruby">  <span class="k">def</span> <span class="nf">read</span><span class="p">(</span><span class="n">length</span><span class="p">,</span> <span class="n">outbuf</span><span class="p">)</span>
    <span class="n">chunk</span> <span class="o">=</span> <span class="vi">@io</span><span class="p">.</span><span class="nf">read</span><span class="p">(</span><span class="n">length</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">chunk</span>
      <span class="n">outbuf</span><span class="p">.</span><span class="nf">replace</span> <span class="n">chunk</span>
      <span class="n">chunk</span><span class="p">.</span><span class="nf">clear</span>
    <span class="k">else</span>
      <span class="n">outbuf</span><span class="p">.</span><span class="nf">clear</span>
    <span class="k">end</span>
    <span class="n">outbuf</span> <span class="k">unless</span> <span class="n">outbuf</span><span class="p">.</span><span class="nf">empty?</span>
  <span class="k">end</span>
</code></pre>
<p>the memory usage has now doubled to 100MB at the end of the program, indicating that some string bytes weren't successfully deallocated. So, it seems that String#replace has different behaviour compared to String#clear + String#<<.</p>
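<p>The two approaches can also be compared from inside the process with <code>ObjectSpace.memsize_of</code>, which reports an estimate of the memory held by a single object. This is only a sketch, using 1MB instead of 50MB to keep it quick; the numbers depend on the Ruby version, and <code>memsize_of</code> may not account for a buffer that is shared between strings:</p>

```ruby
require "objspace"

size = 1024 * 1024 # 1MB keeps the example quick; the report uses 50MB

chunk = "b" * size

# Approach 1: String#clear followed by String#<<.
cleared = "a" * size
cleared.clear
cleared << chunk

# Approach 2: String#replace.
replaced = "a" * size
replaced.replace(chunk)

# memsize_of is an estimate; treat the output as a hint, not a measurement.
puts "clear + <<: #{ObjectSpace.memsize_of(cleared)} bytes"
puts "replace:    #{ObjectSpace.memsize_of(replaced)} bytes"
```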
<p>I was <em>only</em> able to reproduce this with <code>IO.copy_stream</code>. The following program shows 50MB memory usage, regardless of whether the String#clear or String#replace approach is used:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="no">GC</span><span class="p">.</span><span class="nf">disable</span>
<span class="n">buffer</span> <span class="o">=</span> <span class="s2">"a"</span> <span class="o">*</span> <span class="mi">50</span><span class="o">*</span><span class="mi">1024</span><span class="o">*</span><span class="mi">1024</span>
<span class="n">chunk</span> <span class="o">=</span> <span class="s2">"b"</span> <span class="o">*</span> <span class="mi">50</span><span class="o">*</span><span class="mi">1024</span><span class="o">*</span><span class="mi">1024</span>
<span class="k">if</span> <span class="no">ARGV</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="s2">"clear"</span>
  <span class="n">buffer</span><span class="p">.</span><span class="nf">clear</span>
  <span class="n">buffer</span> <span class="o"><<</span> <span class="n">chunk</span>
<span class="k">else</span>
  <span class="n">buffer</span><span class="p">.</span><span class="nf">replace</span> <span class="n">chunk</span>
<span class="k">end</span>
<span class="n">chunk</span><span class="p">.</span><span class="nf">clear</span>
<span class="nb">system</span> <span class="s2">"top -pid </span><span class="si">#{</span><span class="no">Process</span><span class="p">.</span><span class="nf">pid</span><span class="si">}</span><span class="s2">"</span>
</code></pre>
<p>With this program I also noticed one interesting thing. If I remove <code>chunk.clear</code>, then the "clear" version uses 100MB as expected (because the <code>buffer</code> and <code>chunk</code> strings are 50MB each), but the "replace" version uses only 50MB. That makes it appear that the <code>buffer</code> string doesn't use any memory, when in fact it should use 50MB just like the <code>chunk</code> string. I found that odd, and I think it might be a clue to the memory bug with String#replace I experienced when using <code>IO.copy_stream</code>.</p>
Ruby master - Feature #14426 (Closed): [PATCH] openssl: reduce memory allocation in OpenSSL::Buff... https://bugs.ruby-lang.org/issues/14426 2018-01-31T12:36:40Z janko (Janko Marohnić) janko@hey.com
<p>When writing data to an SSLSocket, there are a lot of, in my opinion, unnecessary strings being allocated, concretely in OpenSSL::Buffering#do_write.</p>
<p>Whenever the buffer was written, it was first copied into a new string, regardless of whether the write was partial or not. For partial writes it's also not necessary to create copies of the remaining data; we can use the <code>String[from, length] = ""</code> trick immediately, which modifies the string in-place.</p>
<p>I also thought that splitting writes on newlines was adding unnecessary memory allocations, so I removed that.</p>
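<p>The in-place trimming can be sketched outside of OpenSSL. This is not the patch itself: <code>PartialWriter</code> and <code>flush_buffer</code> are hypothetical names, and the writer is a stand-in for a socket that accepts only part of the data on each call:</p>

```ruby
# Hypothetical stand-in for a socket that accepts at most `limit` bytes per call.
class PartialWriter
  attr_reader :written

  def initialize(limit)
    @limit = limit
    @written = String.new
  end

  # Returns the number of bytes accepted, like a partial socket write.
  def write_partial(data)
    accepted = data.byteslice(0, @limit)
    @written << accepted
    accepted.bytesize
  end
end

# Writes the whole buffer, trimming written bytes off the front in place
# instead of building a new "remaining data" string after every write.
# Assumes a binary (single-byte) buffer, so character and byte offsets agree.
def flush_buffer(buffer, io)
  until buffer.empty?
    written = io.write_partial(buffer)
    buffer[0, written] = ""
  end
end

buffer = ("a" * 10).b
writer = PartialWriter.new(4)
flush_buffer(buffer, writer)
p writer.written #=> "aaaaaaaaaa"
p buffer         #=> ""
```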
<p>I tested uploading a 5MB file using HTTP.rb, and memory allocation went from 7.7 MB to 0.2 MB with this change.</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="nb">require</span> <span class="s2">"http"</span>
<span class="nb">require</span> <span class="s2">"memory_profiler"</span>
<span class="nb">require</span> <span class="s2">"stringio"</span>
<span class="n">body</span> <span class="o">=</span> <span class="no">StringIO</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="s2">"a"</span> <span class="o">*</span> <span class="mi">5</span><span class="o">*</span><span class="mi">1024</span><span class="o">*</span><span class="mi">1024</span><span class="p">)</span>
<span class="no">MemoryProfiler</span><span class="p">.</span><span class="nf">report</span> <span class="k">do</span>
  <span class="no">HTTP</span><span class="p">.</span><span class="nf">post</span><span class="p">(</span><span class="s2">"https://example.com"</span><span class="p">,</span> <span class="ss">body: </span><span class="n">body</span><span class="p">)</span>
<span class="k">end</span><span class="p">.</span><span class="nf">pretty_print</span>
</code></pre>
Ruby master - Feature #14404 (Open): Adding writev support to IO#write_nonblock https://bugs.ruby-lang.org/issues/14404 2018-01-26T11:12:00Z janko (Janko Marohnić) janko@hey.com
<p>In Ruby 2.5 IO#write received writev support (<a href="https://github.com/ruby/ruby/commit/3efa7126e5e853f06cdd78d4d88837aeb72a9a3e" class="external">https://github.com/ruby/ruby/commit/3efa7126e5e853f06cdd78d4d88837aeb72a9a3e</a>), allowing it to accept multiple arguments and utilize writev when available.</p>
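<p>For reference, the multi-argument form of IO#write looks like this. The sketch uses a pipe in place of a socket; whether writev is actually used underneath depends on the platform:</p>

```ruby
reader, writer = IO.pipe

# Since Ruby 2.5, IO#write accepts several strings and returns the total
# number of bytes written.
parts = ["HTTP/1.1 200 OK\r\n", "\r\n", "body"]
bytes = writer.write(*parts)
writer.close

p bytes       #=> 23
p reader.read #=> "HTTP/1.1 200 OK\r\n\r\nbody"
```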
<p>Would it be possible to add this feature to IO#write_nonblock as well? IO#write_nonblock is used by the HTTP.rb and Socketry gems to implement their "write timeout" feature (the same way that IO#read_nonblock is used in Net::HTTP to implement "read timeout"). Since IO#write_nonblock doesn't yet support writev, at the moment it's not possible for HTTP.rb and Socketry to utilize writev when the "write timeout" is specified.</p>
Ruby master - Bug #13539 (Closed): uninitialized class variable @@accept_charset in #<Class:CGI> ... https://bugs.ruby-lang.org/issues/13539 2017-05-03T05:08:21Z janko (Janko Marohnić) janko@hey.com
<p>When I execute this script:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="nb">require</span> <span class="s2">"cgi/util"</span>
<span class="no">CGI</span><span class="p">.</span><span class="nf">unescape</span><span class="p">(</span><span class="s2">"foo"</span><span class="p">)</span>
</code></pre>
<p>On Ruby 2.3.0 this runs just fine, but on 2.4.1 it raises an error:</p>
<pre><code>`unescape': uninitialized class variable @@accept_charset in #<Class:CGI> (NameError)
</code></pre>
<p>This doesn't happen when I require the whole cgi.rb standard library, only when I require cgi/util.rb. I want to require only cgi/util.rb because I need only the URI escaping/unescaping behaviour.</p>
Ruby master - Feature #13527 (Closed): Accept IO object as stdin data in Open3.capture https://bugs.ruby-lang.org/issues/13527 2017-04-30T11:49:44Z janko (Janko Marohnić) janko@hey.com
<p>Currently Open3.capture3, Open3.capture2 and Open3.capture2e accept a :stdin_data option, which allows you to write a String into the subprocess' standard input. This patch adds the ability to also pass in an IO-like object (any object that responds to #read) as :stdin_data, which will then be streamed to standard input.</p>
<pre><code class="ruby syntaxhl" data-language="ruby">Open3.capture3("file", "--mime-type", "--brief", "-", stdin_data: File.open("image.jpg"))
Open3.capture3("ffprobe", "-print_format", "json", "-i", "pipe:0", stdin_data: File.open("video.mp4"))
</code></pre>
<p>This is convenient when you want to pass files (images, videos etc.) to standard input, because you don't have to load the whole file into memory; the file contents are streamed efficiently into the subprocess' standard input.</p>
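<p>Without the patch, similar streaming can be approximated with <code>Open3.popen3</code> and <code>IO.copy_stream</code>. The following is only a sketch of the idea, not the patch itself; <code>capture3_streaming</code> and its <code>input:</code> keyword are hypothetical names, and the example runs a small Ruby subprocess so that it doesn't depend on external tools:</p>

```ruby
require "open3"
require "rbconfig"
require "stringio"

# Hypothetical helper, not part of Open3: streams `input` (any object that
# responds to #read) into the subprocess and returns [stdout, stderr, status].
def capture3_streaming(*cmd, input:)
  Open3.popen3(*cmd) do |stdin, stdout, stderr, wait_thread|
    # Read both output pipes in threads so the subprocess never blocks on them.
    out_reader = Thread.new { stdout.read }
    err_reader = Thread.new { stderr.read }

    begin
      IO.copy_stream(input, stdin)
    rescue Errno::EPIPE
      # The subprocess stopped reading early, which is fine.
    end
    stdin.close

    [out_reader.value, err_reader.value, wait_thread.value]
  end
end

out, _err, status = capture3_streaming(
  RbConfig.ruby, "-e", "print STDIN.read.upcase",
  input: StringIO.new("hello")
)
p out             #=> "HELLO"
p status.success? #=> true
```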
<p>Another advantage is that many command line tools will stop reading the standard input once they get enough data. In both the examples above the subprocess will stop reading standard input as soon as it gets the information it needs (the image MIME type or video metadata), and in both examples it turns out to be about 1-2MB. This isn't that useful if the IO object represents a file on the filesystem (where reading is fast), but it becomes very useful when the IO object represents a file from the database or a remote file over HTTP. That way you don't need to guess how much data the subprocess needs, you can just give it the IO object and it will read as much as it needs, and then only that amount will be retrieved from the database or downloaded from the Internet.</p>
Ruby master - Bug #11014 (Closed): String#partition doesn't return correct result on zero-width m... https://bugs.ruby-lang.org/issues/11014 2015-03-29T14:51:56Z janko (Janko Marohnić) janko@hey.com
<p>First, to see how <code>String#match</code> works on my example:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="n">match</span> <span class="o">=</span> <span class="s2">"foo"</span><span class="p">.</span><span class="nf">match</span><span class="p">(</span><span class="sr">/^=*/</span><span class="p">)</span>
<span class="n">match</span><span class="p">.</span><span class="nf">pre_match</span> <span class="c1">#=> ""</span>
<span class="n">match</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="c1">#=> ""</span>
<span class="n">match</span><span class="p">.</span><span class="nf">post_match</span> <span class="c1">#=> "foo"</span>
</code></pre>
<p>Now, if I used <code>String#partition</code> instead of <code>match</code>, I'd expect to get <code>["", "", "foo"]</code> (pre_match, match, post_match). However:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="s2">"foo"</span><span class="p">.</span><span class="nf">partition</span><span class="p">(</span><span class="sr">/^=*/</span><span class="p">)</span> <span class="c1">#=> ["foo", "", ""]</span>
</code></pre>
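<p>The expected result can be spelled out in terms of <code>String#match</code>. A minimal sketch, assuming a hypothetical helper named <code>partition_via_match</code>:</p>

```ruby
# Hypothetical helper: builds the partition from the MatchData, which is the
# result the report expects String#partition to return.
def partition_via_match(string, pattern)
  match = string.match(pattern)
  return [string, "", ""] unless match # same fallback as String#partition

  [match.pre_match, match[0], match.post_match]
end

p partition_via_match("foo", /^=*/) #=> ["", "", "foo"]
p partition_via_match("a=b", /=/)   #=> ["a", "=", "b"]
```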
<p><code>String#rpartition</code> returns the correct result (with the same regex).</p>
Ruby master - Bug #10659 (Closed): can't dup Fixnum (TypeError) https://bugs.ruby-lang.org/issues/10659 2014-12-26T22:40:21Z janko (Janko Marohnić) janko@hey.com
<p>In Ruby 2.2 (older versions are good) there is a bug with unnamed keyword arguments when <code>super</code> is used.</p>
<pre><code class="rb syntaxhl" data-language="rb"><span class="k">module</span> <span class="nn">Foo</span>
  <span class="k">def</span> <span class="nf">foo</span><span class="p">(</span><span class="o">**</span><span class="p">)</span>
  <span class="k">end</span>
<span class="k">end</span>
<span class="k">class</span> <span class="nc">Bar</span>
  <span class="kp">include</span> <span class="no">Foo</span>
  <span class="k">def</span> <span class="nf">foo</span><span class="p">(</span><span class="ss">bar: </span><span class="s2">"bar"</span><span class="p">,</span> <span class="o">**</span><span class="p">)</span>
    <span class="k">super</span>
  <span class="k">end</span>
<span class="k">end</span>
<span class="no">Bar</span><span class="p">.</span><span class="nf">new</span><span class="p">.</span><span class="nf">foo</span> <span class="c1"># `dup': can't dup Fixnum (TypeError)</span>
</code></pre>
<p>It happens when <code>super</code> is called. If I give the keyword arguments a name (<code>**</code> => <code>**options</code>) or if I remove the default keyword argument (<code>bar: "bar"</code>), the error doesn't happen.</p>
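<p>The first workaround, naming the keyword splat, would look like the sketch below. Here <code>Foo#foo</code> returns the options it receives, which the original example does not do, purely so that the forwarded arguments are visible:</p>

```ruby
module Foo
  # Returns the received keyword arguments so the result can be inspected
  # (an addition for illustration; the original method body is empty).
  def foo(**options)
    options
  end
end

class Bar
  include Foo

  # Naming the splat (**options instead of a bare **) is the workaround
  # described in the report.
  def foo(bar: "bar", **options)
    super
  end
end

p Bar.new.foo         # {bar: "bar"}, printed as {:bar=>"bar"} on older Rubies
p Bar.new.foo(baz: 1) # also includes baz: 1
```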