Bug #18074
closedARGF.read(length) exhibits short rea
Description
If ruby is invoked with three file name arguments (which refer to extant non-empty regular files), and ARFG.read
is called with a length argument that exceeds the combined size of the first two files[1], then only the content of the first two files will be collected in the resultant string.
$ for f in a b c; do echo -n $f > $f; done
$ ruby -e 'p ARGF.read(3)' a b c
"ab"
[1]: This is actually just a necessary condition, the exact criteria will be provided in followup analysis.
Updated by csabahenk2 (Csaba Henk) about 3 years ago
argf_read
pseudocodized:
/* 1 */ argf_read(VALUE length)
/* 2 */ {
/* 3 */ long len = NUM2LONG(length);
/* 4 */ VALUE str = StringValue();
/* 5 */ retry:
/* 6 */ VALUE tmp = io_read(length, ARGF.current_file);
/* 7 */ if (tmp != NIL_P)
/* 8 */ rb_str_append(str, tmp);
/* 9 */ if (tmp == NIL_P) {
/* 10 */ ARGF.close();
/* 11 */ ARGF.next();
/* 12 */ goto retry;
/* 13 */ } else {
/* 14 */ slen = RSTRING_LEN(str);
/* 15 */ if (slen < len) {
/* 16 */ len -= slen;
/* 17 */ length = LONG2NUM(len);
/* 18 */ goto retry;
/* 19 */ }
/* 20 */ }
/* 21 */ return str;
/* 22 */ }
The bug resides in line 16, when len
is reduced.
The joint effect of lines 16-17 that both len
and length
are reduced by the length of the string being assembled (str
). Here len
is a C long that represents the required total read length (according to the role it fulfils in the conditional guard expression of line 15), and length
is a Ruby numeric value representing the remaining length (after the read attempts, that occur in line 6, that took place up to that point). So modifying len
is incorrect.
Why do we need three files for the bug to manifest?
- The bug is triggered in the conditional guard of line 15, with certain corrupt values of
len
. - That requires two passings through line 15, as initial
len
is not yet corrupt, it only becomes such in the next line (line 16).
Assuming ARGC
contains three files, and argf_read
is invoked with a length
that exceeds the combined length of the first two files, the control flow goes as follows:
- The first I/O activity is reading in the first file, upon hitting line 6. From this the code flow passes on to line 15, where the condition shall be fulfilled, as we read less than the required length, so we get to line 16, where the corruption occurs.
- We are still at the first file, I/O is attempted on it again, but as we reached its end already,
io_read()
returns nil (stored intmp
). - A nil
tmp
lands us in the block of lines 9-13, where we step to the second file. - This also means avoiding the critical line 15.
- We move on to do I/O on the second file, reading it in, too.
- As we still have data to read (residing in the third file), we should pass into the code block between lines 15-19 from where we would go to doing I/O again.
- However, the conditional guard of line 15, that could give us pass into this code block, can bogusly fail due to the corrupt, incorrectly reduced
len
.
The exact condition for length
to trigger the bug: if the three files are of size s1, s2, s3, then, as said before:
- s1, s2, s3 > 0
-
s1 + s2 <
length
and also:
- 2*s1 + s2 >
length
Checking these criteria is left to the reader.
Updated by csabahenk2 (Csaba Henk) about 3 years ago
It was commit 4f5976c that introduced this bug:
commit 4f5976cbb826df462c62a0a8e72ebe91c15a7ba4
Author: matz <matz@b2dd03c8-39d4-4d8f-98ff-823fe69b080e>
Date: Wed Dec 3 17:30:09 2003 +0000
* io.c (argf_read): should not terminate on empty string; wait
until real EOF. [ruby-dev:21969]
* io.c (argf_read): should adjust length to read, when length is
specified and read spans command line argument files.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@5096 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
4f5976c is v1_8_1~189, so practically all MRI versions are affected.
Updated by csabahenk2 (Csaba Henk) about 3 years ago
Proposed fix: https://github.com/ruby/ruby/pull/4727
Updated by Anonymous about 3 years ago
- Status changed from Open to Closed
Applied in changeset git|8df1ace64a7695c855bf0a774e3fd70edfab0bf3.
Fix ARGF.read(length) short read [Bug #18074]