https://bugs.ruby-lang.org/https://bugs.ruby-lang.org/favicon.ico?17113305112012-10-27T03:53:55ZRuby Issue Tracking SystemRuby master - Feature #7220: Separate IO#dup, StringIO#initialize_copy from dup(2)https://bugs.ruby-lang.org/issues/7220?journal_id=316202012-10-27T03:53:55Zbrixen (Brian Shirai)brixen@gmail.com
<ul></ul><p>I have found the test for #dup that actually asserts this behavior is correct. Please tell me that is a mistake.</p>
<p>Thanks,<br>
Brian</p> Ruby master - Feature #7220: Separate IO#dup, StringIO#initialize_copy from dup(2)https://bugs.ruby-lang.org/issues/7220?journal_id=316212012-10-27T04:23:42Zheadius (Charles Nutter)headius@headius.com
<ul></ul><p>I believe, as you supposed on IRC, that this is designed to mimic IO behavior. IO#dup will dup the underlying file descriptor, but they do not maintain their own notion of position into the file; advancing one advances the other. Surely StringIO, as a supposed drop-in for IO, needs to mimic this behavior?</p> Ruby master - Feature #7220: Separate IO#dup, StringIO#initialize_copy from dup(2)https://bugs.ruby-lang.org/issues/7220?journal_id=316272012-10-27T04:42:41Zbrixen (Brian Shirai)brixen@gmail.com
<ul></ul><p>If StringIO manifests this behavior as a copy of IO's behavior, both should be considered bugs.</p>
<p>RubySpec has a quarantined spec for this behavior in IO with a comment stating that there are platform incompatibilities with this IO "state aliasing": <a href="https://github.com/rubyspec/rubyspec/blob/master/core/io/dup_spec.rb#L34-54" class="external">https://github.com/rubyspec/rubyspec/blob/master/core/io/dup_spec.rb#L34-54</a></p>
<p>IO#dup creates an instance with a different underlying fd (#fileno): <a href="https://github.com/rubyspec/rubyspec/blob/master/core/io/dup_spec.rb#L30-32" class="external">https://github.com/rubyspec/rubyspec/blob/master/core/io/dup_spec.rb#L30-32</a></p>
<p>There is no other Ruby core class that causes aliasing when calling #dup. String#dup, for example, is called in code precisely to create a new String so the original would not be mutated.</p>
<p>That IO and StringIO do cause this aliasing is a deviation from the typical behavior of #dup, makes no sense, and is not at all required. It's perfectly possible to do the following:</p>
<p>sasha:rubinius brian$ cat foobar.txt<br>
123456<br>
sasha:rubinius brian$ irb<br>
1.9.3p286 :001 > a = File.open("foobar.txt", "r")<br>
=> #&lt;File:foobar.txt&gt;<br>
1.9.3p286 :002 > b = File.open("foobar.txt", "r")<br>
=> #&lt;File:foobar.txt&gt;<br>
1.9.3p286 :003 > a.getc<br>
=> "1"<br>
1.9.3p286 :004 > a.pos<br>
=> 1<br>
1.9.3p286 :005 > b.pos<br>
=> 0<br>
1.9.3p286 :006 > b.getc<br>
=> "1"<br>
1.9.3p286 :007 > a.fileno<br>
=> 5<br>
1.9.3p286 :008 > b.fileno<br>
=> 6<br>
1.9.3p286 :009 > c = b.dup<br>
=> #&lt;File:foobar.txt&gt;<br>
1.9.3p286 :010 > c.fileno<br>
=> 7</p>
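<p>For contrast with the two independent <code>File.open</code> calls above, here is a minimal sketch of what the thread calls aliasing, assuming a POSIX-like platform. The method name <code>shared_offset_demo</code> and the temporary file are illustrative only. It uses <code>sysread</code>/<code>sysseek</code> so that Ruby's userspace read buffer does not obscure the kernel-level offset.</p>

```ruby
require "tempfile"

# Hypothetical helper: shows that IO#dup yields a new fd that shares
# the file offset with the original (behavior of dup(2) on POSIX).
def shared_offset_demo
  Tempfile.create("foobar") do |tmp|
    tmp.write("123456")
    tmp.flush

    a = File.open(tmp.path, "r")
    b = a.dup                                  # new fd, same open file description

    first    = a.sysread(1)                    # unbuffered read through a
    b_offset = b.sysseek(0, IO::SEEK_CUR)      # ask b where it is, without moving
    second   = b.sysread(1)                    # b continues where a stopped
    distinct = a.fileno != b.fileno

    a.close
    b.close
    { first: first, b_offset: b_offset, second: second, distinct_fds: distinct }
  end
end
```

<p>With buffered methods such as <code>getc</code> the observed positions can differ, because the first read may pull more than one byte into Ruby's buffer.</p>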
<p>Thanks,<br>
Brian</p> Ruby master - Feature #7220: Separate IO#dup, StringIO#initialize_copy from dup(2)https://bugs.ruby-lang.org/issues/7220?journal_id=316662012-10-27T06:15:02Zdrbrain (Eric Hodel)drbrain@segment7.net
<ul></ul><p>IO#dup calls dup(2) and StringIO#dup matches this behavior.</p>
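<p>A minimal sketch of the StringIO side of this, as observed with MRI's <code>stringio</code> extension (other implementations may differ). The method name <code>stringio_dup_positions</code> is illustrative only.</p>

```ruby
require "stringio"

# Hypothetical helper: a StringIO and its #dup share one read position.
def stringio_dup_positions
  a = StringIO.new("123456")
  b = a.dup                # b aliases a's underlying string and position
  first = a.getc           # advances the shared position
  { first: first, b_pos: b.pos, next_from_b: b.getc }
end
```

<p>Reading one character through <code>a</code> moves the position that <code>b</code> reports, which is the aliasing the original report describes.</p>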
<p>Making IO#dup call open(2) would break backwards compatibility.</p>
<p>How do you propose we implement dup(2) for IO?</p> Ruby master - Feature #7220: Separate IO#dup, StringIO#initialize_copy from dup(2)https://bugs.ruby-lang.org/issues/7220?journal_id=316802012-10-27T06:46:10Zbrixen (Brian Shirai)brixen@gmail.com
<ul></ul><p>If dup(2) functionality is actually needed, create an IO.system_dup method that returns a dup(2) fd, and use that fd in IO.new/IO.for_fd.</p>
<p>The fact that there is a dup() system call should not make IO#dup inconsistent with the semantics of every other core #dup method.</p> Ruby master - Feature #7220: Separate IO#dup, StringIO#initialize_copy from dup(2)https://bugs.ruby-lang.org/issues/7220?journal_id=316862012-10-27T06:55:18Zbrixen (Brian Shirai)brixen@gmail.com
<ul></ul><p>There's also a dup2() system call. Why don't we provide that one as well?</p> Ruby master - Feature #7220: Separate IO#dup, StringIO#initialize_copy from dup(2)https://bugs.ruby-lang.org/issues/7220?journal_id=317402012-10-27T10:18:29Zdrbrain (Eric Hodel)drbrain@segment7.net
<ul><li><strong>Tracker</strong> changed from <i>Bug</i> to <i>Feature</i></li><li><strong>Subject</strong> changed from <i>StringIO#initialize_copy causes aliasing between the objects</i> to <i>Separate IO#dup, StringIO#initialize_copy from dup(2)</i></li><li><strong>Target version</strong> set to <i>3.0</i></li></ul><p>This is intentional behavior which has existed since 1998. It is not a bug.</p>
<p>When I am working with IOs I expect the ruby methods to follow POSIX conventions more than ruby conventions. This method is not the only one in the standard library that doesn't follow ruby conventions.</p>
<p>If you wish to change this behavior you must demonstrate the change will be harmless and easy for existing libraries to adapt to. I don't believe this is the case (due to the drastic change in behavior this would introduce), or that such a change is worthwhile after nearly 15 years.</p> Ruby master - Feature #7220: Separate IO#dup, StringIO#initialize_copy from dup(2)https://bugs.ruby-lang.org/issues/7220?journal_id=317472012-10-27T10:54:25Znormalperson (Eric Wong)normalperson@yhbt.net
<ul></ul><p>"brixen (Brian Ford)" <a href="mailto:brixen@gmail.com" class="email">brixen@gmail.com</a> wrote:</p>
<blockquote>
<p>There's also a dup2() system call. Why don't we provide that one as well?</p>
</blockquote>
<p>IO#reopen already provides dup2() (or dup3() if available)</p> Ruby master - Feature #7220: Separate IO#dup, StringIO#initialize_copy from dup(2)https://bugs.ruby-lang.org/issues/7220?journal_id=317542012-10-27T11:23:40Znormalperson (Eric Wong)normalperson@yhbt.net
<ul></ul><p>"brixen (Brian Ford)" <a href="mailto:brixen@gmail.com" class="email">brixen@gmail.com</a> wrote:</p>
<blockquote>
<p>There is no other Ruby core class that causes aliasing when calling<br>
#dup. String#dup, for example, is called in code precisely to create a<br>
new String so the original would not be mutated.</p>
</blockquote>
<p>No other Ruby core class is a delegate for OS-level objects, either.</p>
<blockquote>
<p>That IO and StringIO do cause this aliasing is a deviation from the<br>
typical behavior of #dup, makes no sense, and is not at all required.<br>
It's perfectly possible to do the following:</p>
</blockquote>
<p>It makes sense when I look at it this way:</p>
<pre><code>class Foo # this could be a numeric FD or IO object
  attr_reader :array
  def initialize
    @array = [] # the underlying file handle in the kernel
  end
end
a = Foo.new
b = a.dup
p(a.array.object_id == b.array.object_id) # => true
</code></pre>
<blockquote>
<p>sasha:rubinius brian$ cat foobar.txt<br>
123456<br>
sasha:rubinius brian$ irb<br>
1.9.3p286 :001 > a = File.open("foobar.txt", "r")<br>
=> #&lt;File:foobar.txt&gt;<br>
1.9.3p286 :002 > b = File.open("foobar.txt", "r")<br>
=> #&lt;File:foobar.txt&gt;</p>
</blockquote>
<p>It's impossible to implement File#dup using open() reliably. foobar.txt<br>
can be unlinked or replaced by an entirely different file in between the<br>
File.open calls.</p>
<blockquote>
<p>1.9.3p286 :003 > a.getc<br>
=> "1"<br>
1.9.3p286 :004 > a.pos<br>
=> 1<br>
1.9.3p286 :005 > b.pos<br>
=> 0<br>
1.9.3p286 :006 > b.getc<br>
=> "1"</p>
</blockquote>
<p>However, if you want separate offsets, you can still use dup() but<br>
always implement read/write using pread()/pwrite() (and manually<br>
maintain per-object offsets in userspace).</p> Ruby master - Feature #7220: Separate IO#dup, StringIO#initialize_copy from dup(2)https://bugs.ruby-lang.org/issues/7220?journal_id=317702012-10-27T12:28:01Zbrixen (Brian Shirai)brixen@gmail.com
<ul></ul><p>drbrain (Eric Hodel) wrote:</p>
<blockquote>
<p>This is intentional behavior which has existed since 1998. It is not a bug.</p>
<p>When I am working with IOs I expect the ruby methods to follow POSIX conventions more than ruby conventions. This method is not the only one in the standard library that doesn't follow ruby conventions.</p>
<p>If you wish to change this behavior you must demonstrate the change will be harmless and easy for existing libraries to adapt to. I don't believe this is the case (due to the drastic change in behavior this would introduce), or that such a change is worthwhile after nearly 15 years.</p>
</blockquote>
<p>The time it has been implemented is a ridiculous standard. So is requiring it to be easy to adapt to. Nothing that has changed in 1.9 has respected such a standard.</p>
<p>There are serious problems with aliasing the same fd as IO#dup does. It is not consistent across platforms. At the very least, the implementation leaves huge holes, like allowing one fd to be closed, and that IO's #closed? will return true, but the aliasing IO's #closed? will return false, and closing it will raise Errno::EBADF. That behavior shouldn't be behind a common Ruby method or concept.</p>
<p>Further, if StringIO is supposed to mimic all this, there are numerous bugs because StringIO instances that are aliases via #dup do report the same value for eg closed?.</p>
<p>I'm not asking to remove the ability to use dup(2), but that it not be hidden behind Ruby's concept of #dup. I am certain that most Ruby developers understand nothing of the complexities of aliasing a system resource like an fd between different Ruby objects because I've fixed tons of EBADF bugs in RubySpec due to IO#dup. MRI very likely has no tests for such problems.</p> Ruby master - Feature #7220: Separate IO#dup, StringIO#initialize_copy from dup(2)https://bugs.ruby-lang.org/issues/7220?journal_id=317722012-10-27T12:30:54Zbrixen (Brian Shirai)brixen@gmail.com
<ul></ul><p>drbrain (Eric Hodel) wrote:</p>
<blockquote>
<p>This is intentional behavior which has existed since 1998. It is not a bug.</p>
<p>When I am working with IOs I expect the ruby methods to follow POSIX conventions more than ruby conventions. This method is not the only one in the standard library that doesn't follow ruby conventions.</p>
</blockquote>
<p>Ruby's accidental POSIX semantics are a problem that eg JRuby and any attempt to make Ruby function well on Windows must constantly struggle with. They are not a reason to defend a poor design decision.</p> Ruby master - Feature #7220: Separate IO#dup, StringIO#initialize_copy from dup(2)https://bugs.ruby-lang.org/issues/7220?journal_id=317802012-10-27T14:23:18Znormalperson (Eric Wong)normalperson@yhbt.net
<ul></ul><p>"brixen (Brian Ford)" <a href="mailto:brixen@gmail.com" class="email">brixen@gmail.com</a> wrote:</p>
<blockquote>
<p>There are serious problems with aliasing the same fd as IO#dup does.<br>
It is not consistent across platforms. At the very least, the<br>
implementation leaves huge holes, like allowing one fd to be closed,<br>
and that IO's #closed? will return true, but the aliasing IO's<br>
#closed? will return false, and closing it will raise Errno::EBADF.<br>
That behavior shouldn't be behind a common Ruby method or concept.</p>
</blockquote>
<p>On *nix, IO#dup creates a <em>new</em> fd, but <em>not</em> a new file handle (inside<br>
the kernel). The close() syscall decrements a refcount and only<br>
releases the file handle when there are no other references to it.</p>
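<p>A minimal sketch of that refcounting as seen from Ruby, assuming a POSIX-like platform. The method name <code>close_one_of_two</code> and the temporary file are illustrative only. Closing the original leaves the duplicate fully usable.</p>

```ruby
require "tempfile"

# Hypothetical helper: close the original IO and keep reading via its dup.
def close_one_of_two
  Tempfile.create("dup-close") do |tmp|
    tmp.write("123456")
    tmp.flush

    a = File.open(tmp.path, "r")
    b = a.dup              # second fd referring to the same kernel file handle
    a.close                # drops one reference; the handle itself stays open

    result = { a_closed: a.closed?, b_closed: b.closed?, data: b.read }
    b.close                # last reference gone, handle released
    result
  end
end
```

<p>Each Ruby object tracks only its own fd, so <code>a.closed?</code> and <code>b.closed?</code> can legitimately disagree here.</p>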
<p>I don't know non-*nix platforms, but assuming those platforms don't<br>
support fork(), either, a non-*nix IO#close/#dup which emulates<br>
POSIX semantics could probably work like this:</p>
<pre><code>FD_REFCOUNT = [] # fileno => refcount
FD_REFCOUNT_LOCK = Mutex.new

def close
  raise IOError if @closed
  @closed = true
  FD_REFCOUNT_LOCK.synchronize do
    FD_REFCOUNT[fileno] -= 1
    refcount = FD_REFCOUNT[fileno]

    # call real close() only when refcount is zero
    sysclose(fileno) if refcount == 0
  end
end

def dup
  raise IOError if @closed
  FD_REFCOUNT_LOCK.synchronize do
    FD_REFCOUNT[fileno] += 1
  end
  super
end
</code></pre>
<p>This is basically what happens inside the kernel anyways (except the<br>
kernel is multi-process-aware). For platforms without fork(), you<br>
should be able to emulate POSIX dup()/close() semantics using the<br>
example above as a starting point.</p>
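<p>A self-contained variant of that idea, as a sketch only. The class <code>RefCountedHandle</code>, its <code>RELEASED</code> log, and the integer ids are hypothetical stand-ins: no real file descriptor is involved, and appending to <code>RELEASED</code> takes the place of the real close().</p>

```ruby
# Hypothetical emulation of dup()/close() reference counting in plain Ruby.
class RefCountedHandle
  REFCOUNT = Hash.new(0)   # resource id => number of live handles
  LOCK     = Mutex.new
  RELEASED = []            # ids whose "real close" has run (illustration only)

  attr_reader :id

  def initialize(id)
    @id = id
    @closed = false
    LOCK.synchronize { REFCOUNT[id] += 1 }
  end

  def closed?
    @closed
  end

  def close
    raise IOError, "closed handle" if @closed
    @closed = true
    LOCK.synchronize do
      REFCOUNT[@id] -= 1
      # Release the shared resource only when the last handle goes away.
      RELEASED << @id if REFCOUNT[@id].zero?
    end
    nil
  end

  # Called by #dup and #clone on the new copy.
  def initialize_copy(orig)
    raise IOError, "closed handle" if orig.closed?
    super
    @closed = false
    LOCK.synchronize { REFCOUNT[@id] += 1 }
  end
end
```

<p>Counting in <code>initialize</code> as well as in <code>initialize_copy</code> means the first handle starts at a refcount of one, so the final <code>close</code> is the one that releases the resource.</p>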
<blockquote>
<p>Further, if StringIO is supposed to mimic all this, there are numerous<br>
bugs because StringIO instances that are aliases via #dup do report<br>
the same value for eg closed?.</p>
</blockquote>
<p>Yeah, StringIO looks buggy here (but I don't expect StringIO to ever<br>
perfectly match IO).</p> Ruby master - Feature #7220: Separate IO#dup, StringIO#initialize_copy from dup(2)https://bugs.ruby-lang.org/issues/7220?journal_id=890872020-12-10T08:47:47Znaruse (Yui NARUSE)naruse@airemix.jp
<ul><li><strong>Target version</strong> deleted (<del><i>3.0</i></del>)</li></ul>