Bug #20101
closed
rb_file_open and rb_io_fdopen don't perform CRLF -> LF conversion when encoding is set
Description
When opening a file with File.open, as long as 'b' is not set in the mode, Ruby will perform CRLF -> LF conversion on Windows when reading text files, i.e. CRLF line endings on disk get converted to Ruby strings with only "\n" in them. If you explicitly set the encoding with IO#set_encoding, this still works properly.
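The behaviour described above can be sketched in Ruby. This is only an illustration, not part of the original report: the helper name read_with_mode is hypothetical, and the sketch uses the "t" mode flag to request newline conversion explicitly, so that it also converts on non-Windows platforms (on Windows, a plain "r" mode is expected to convert by default, as described above).

```ruby
require "tempfile"

# Hypothetical helper: read +path+ with the given mode string, optionally
# calling IO#set_encoding on the handle before reading.
def read_with_mode(path, mode, encoding: nil)
  File.open(path, mode) do |f|
    f.set_encoding(encoding) if encoding
    f.read
  end
end

Tempfile.create("crlf_demo") do |tmp|
  tmp.binmode                      # write the bytes exactly as given
  tmp.write("one\r\ntwo\r\n")
  tmp.flush

  p read_with_mode(tmp.path, "rb") # binary mode: CRLF is kept
  p read_with_mode(tmp.path, "rt") # text mode: CRLF becomes LF
  # Same read after IO#set_encoding; per the report, conversion should
  # still be applied for files opened through File.open.
  p read_with_mode(tmp.path, "rt", encoding: "UTF-8")
end
```

Comparing the second and third results on an affected build is one way to see whether IO#set_encoding changes the newline handling for a handle opened from Ruby.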
If you open the file in C with either the rb_io_fdopen or rb_file_open APIs in text mode, CRLF -> LF conversion also works. However, if you then call IO#set_encoding on this file, the CRLF -> LF conversion stops happening.
Concretely, this means that the conversion doesn't happen in the following circumstances:
- When loading Ruby files with require (which calls rb_io_fdopen)
- When parsing Ruby files with RubyVM::AbstractSyntaxTree (which calls rb_file_open)
This then causes the ErrorHighlight tests to fail on Windows if git has checked them out with CRLF line endings: the error messages being tested wind up with literal \r\n sequences in them, because the iseq text from the parser contains strings whose newlines were never converted.
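To check whether a given checkout is affected, one option is to look for CRLF pairs in the raw bytes of the source files. The following is a minimal sketch; the helper name crlf_line_endings? is hypothetical and not part of Ruby or the test suite.

```ruby
# Hypothetical helper: true if the file at +path+ contains at least one
# CRLF pair on disk. File.binread reads raw bytes, so no newline
# conversion can hide the "\r".
def crlf_line_endings?(path)
  File.binread(path).include?("\r\n")
end
```

For example, `Dir.glob("**/*.rb").select { |path| crlf_line_endings?(path) }` lists the Ruby files under the current directory that were checked out with CRLF line endings.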
This seems to happen because, in File.open, the file's encflags get the flag ECONV_DEFAULT_NEWLINE_DECORATOR in rb_io_extract_modeenc; however, this method isn't called for rb_io_fdopen or rb_file_open, so encflags doesn't get set to ECONV_DEFAULT_NEWLINE_DECORATOR. Without that flag, the underlying file descriptor's mode gets changed to binary mode by the NEED_NEWLINE_DECORATOR_ON_READ_CHECK macro.