<p>Ruby Issue Tracking System<br>
Ruby master - Bug #19192: IO has third data mode, document is incomplete.<br>
https://bugs.ruby-lang.org/issues/19192?journal_id=100549<br>
2022-12-11T20:41:09Z, alanwu (Alan Wu)</p>
<p>Ugh, it's quite weird. The <code>:crlf_newline</code> option is an <a href="https://docs.ruby-lang.org/en/master/encodings_rdoc.html#label-Encoding+Options" class="external">encoding option</a>,<br>
but <code>File.open</code> uses it in a way different from <code>String#encode</code>.<br>
With <code>String#encode</code>, it replaces <code>\n</code> with <code>\r\n</code>:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="nb">p</span> <span class="s2">"</span><span class="se">\n</span><span class="s2">"</span><span class="p">.</span><span class="nf">encode</span><span class="p">(</span><span class="ss">crlf_newline: </span><span class="kp">true</span><span class="p">)</span> <span class="c1"># => "\r\n"</span>
</code></pre>
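<p>For comparison, here is a minimal sketch of how the three newline-related encoding options behave with <code>String#encode</code>. The sample string and variable names are only illustrative; note that <code>:crlf_newline</code> rewrites every <code>\n</code>, including one that is already part of a <code>\r\n</code> pair:</p>

```ruby
# Sample input containing each newline style once: LF, CR, and CRLF.
s = "\n \r \r\n"

# :universal_newline converts CRLF and lone CR to LF.
universal = s.encode(universal_newline: true)

# :cr_newline converts LF to CR.
cr = s.encode(cr_newline: true)

# :crlf_newline converts LF to CRLF, even when the LF already follows a CR.
crlf = s.encode(crlf_newline: true)

p universal # => "\n \n \n"
p cr        # => "\r \r \r\r"
p crlf      # => "\r\n \r \r\r\n"
```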
<p>With <code>File.open</code>, it controls DOS TEXT mode, which is Windows-specific and<br>
does the inverse conversion (<code>\r\n</code> to <code>\n</code>) when reading.<br>
Also, when combined with encoding conversion, it doesn't do any newline conversion on Windows:</p>
<pre><code class="ruby syntaxhl" data-language="ruby"><span class="nb">require</span> <span class="s1">'tempfile'</span>
<span class="n">content</span> <span class="o">=</span> <span class="s2">"</span><span class="se">\x1a</span><span class="s2"> </span><span class="se">\r</span><span class="s2"> </span><span class="se">\r\n</span><span class="s2"> </span><span class="se">\n</span><span class="s2">"</span><span class="p">.</span><span class="nf">freeze</span>
<span class="n">tmp</span> <span class="o">=</span> <span class="no">Tempfile</span><span class="p">.</span><span class="nf">new</span><span class="p">.</span><span class="nf">binmode</span>
<span class="n">tmp</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="n">content</span><span class="p">)</span>
<span class="n">tmp</span><span class="p">.</span><span class="nf">flush</span>
<span class="no">File</span><span class="p">.</span><span class="nf">open</span><span class="p">(</span><span class="n">tmp</span><span class="p">.</span><span class="nf">path</span><span class="p">,</span> <span class="s2">"r:US-ASCII:UTF-8"</span><span class="p">,</span> <span class="ss">crlf_newline: </span><span class="kp">true</span><span class="p">)</span> <span class="k">do</span>
<span class="n">read_content</span> <span class="o">=</span> <span class="n">_1</span><span class="p">.</span><span class="nf">read</span>
<span class="nb">p</span> <span class="n">content</span>
<span class="nb">p</span> <span class="n">read_content</span>
<span class="nb">p</span> <span class="n">content</span> <span class="o">==</span> <span class="n">read_content</span>
<span class="k">end</span>
<span class="cp">__END__
F:\> ruby.exe -v .\19192.rb
ruby 3.2.0dev (2022-12-09T14:34:17Z master 12b5268679) [x64-mswin64_140]
"\u001A \r \r\n \n"
"\u001A \r \r\n \n"
true
</span></code></pre>
<p>There seems to be no way to get just CRLF conversion without also getting<br>
special treatment for <code>0x1A</code> (Ctrl-Z, which Windows text mode treats as end-of-file).<br>
This is probably because Ruby has to rely on the Windows system library to do the conversion.<br>
The <code>:universal_newline</code> option, by contrast, is built into Ruby, so it works cross-platform.</p>
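<p>If only CRLF conversion is wanted, one possible workaround (my assumption, not something IO offers as an option) is to skip text mode entirely: read the raw bytes in binary mode and convert in Ruby. The helper name <code>read_crlf_as_lf</code> below is hypothetical. A rough sketch:</p>

```ruby
require 'tempfile'

# Hypothetical helper, not part of Ruby: read the file as raw bytes so that
# no platform text-mode handling applies (0x1A stays ordinary data), then
# replace only CRLF pairs with LF. Lone CR characters are left untouched.
def read_crlf_as_lf(path)
  raw = File.binread(path)
  raw.gsub("\r\n", "\n")
end

content = "\x1a \r \r\n \n"

Tempfile.create("newline-demo") do |tmp|
  tmp.binmode          # write the bytes exactly as given
  tmp.write(content)
  tmp.flush
  p read_crlf_as_lf(tmp.path)
end
```

<p>The trade-off is that the whole file is read into memory before conversion, and the result carries the binary encoding until it is re-tagged with <code>force_encoding</code>.</p>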