Project

General

Profile

Actions

Bug #18074

closed

ARGF.read(length) exhibits short rea

Added by csabahenk2 (Csaba Henk) over 2 years ago. Updated over 2 years ago.

Status:
Closed
Assignee:
-
Target version:
-
ruby -v:
ruby 3.1.0dev (2021-08-04T09:29:42Z master 6e55facdb3) [x86_64-linux]
[ruby-core:104855]

Description

If ruby is invoked with three file name arguments (which refer to extant non-empty regular files), and ARFG.read is called with a length argument that exceeds the combined size of the first two files[1], then only the content of the first two files will be collected in the resultant string.

$ for f in a b c; do echo -n $f > $f; done
$ ruby -e 'p ARGF.read(3)' a b c
"ab"

[1]: This is actually just a necessary condition, the exact criteria will be provided in followup analysis.

Updated by csabahenk2 (Csaba Henk) over 2 years ago

argf_read pseudocodized:

/*  1 */ argf_read(VALUE length)
/*  2 */ {
/*  3 */ 	long len = NUM2LONG(length);
/*  4 */ 	VALUE str = StringValue();
/*  5 */ retry:
/*  6 */ 	VALUE tmp = io_read(length, ARGF.current_file);
/*  7 */ 	if (tmp != NIL_P)
/*  8 */ 		rb_str_append(str, tmp);
/*  9 */ 	if (tmp == NIL_P) {
/* 10 */ 		ARGF.close();
/* 11 */ 		ARGF.next();
/* 12 */ 		goto retry;
/* 13 */ 	} else {
/* 14 */ 		slen = RSTRING_LEN(str);
/* 15 */ 		if (slen < len) {
/* 16 */ 			len -= slen;
/* 17 */ 			length = LONG2NUM(len);
/* 18 */ 			goto retry;
/* 19 */ 		}
/* 20 */ 	}
/* 21 */ 	return str;
/* 22 */ }

The bug resides in line 16, when len is reduced.

The joint effect of lines 16-17 that both len and length are reduced by the length of the string being assembled (str). Here len is a C long that represents the required total read length (according to the role it fulfils in the conditional guard expression of line 15), and length is a Ruby numeric value representing the remaining length (after the read attempts, that occur in line 6, that took place up to that point). So modifying len is incorrect.

Why do we need three files for the bug to manifest?

  • The bug is triggered in the conditional guard of line 15, with certain corrupt values of len.
  • That requires two passings through line 15, as initial len is not yet corrupt, it only becomes such in the next line (line 16).

Assuming ARGC contains three files, and argf_read is invoked with a length that exceeds the combined length of the first two files, the control flow goes as follows:

  • The first I/O activity is reading in the first file, upon hitting line 6. From this the code flow passes on to line 15, where the condition shall be fulfilled, as we read less than the required length, so we get to line 16, where the corruption occurs.
  • We are still at the first file, I/O is attempted on it again, but as we reached its end already, io_read() returns nil (stored in tmp).
  • A nil tmp lands us in the block of lines 9-13, where we step to the second file.
  • This also means avoiding the critical line 15.
  • We move on to do I/O on the second file, reading it in, too.
  • As we still have data to read (residing in the third file), we should pass into the code block between lines 15-19 from where we would go to doing I/O again.
  • However, the conditional guard of line 15, that could give us pass into this code block, can bogusly fail due to the corrupt, incorrectly reduced len.

The exact condition for length to trigger the bug: if the three files are of size s1, s2, s3, then, as said before:

  • s1, s2, s3 > 0
  • s1 + s2 < length

and also:

  • 2*s1 + s2 > length

Checking these criteria is left to the reader.

Updated by csabahenk2 (Csaba Henk) over 2 years ago

It was commit 4f5976c that introduced this bug:

commit 4f5976cbb826df462c62a0a8e72ebe91c15a7ba4
Author: matz <matz@b2dd03c8-39d4-4d8f-98ff-823fe69b080e>
Date:   Wed Dec 3 17:30:09 2003 +0000

    * io.c (argf_read): should not terminate on empty string; wait
      until real EOF.  [ruby-dev:21969]
    
    * io.c (argf_read): should adjust length to read, when length is
      specified and read spans command line argument files.
    
    
    git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@5096 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

4f5976c is v1_8_1~189, so practically all MRI versions are affected.

Actions #4

Updated by Anonymous over 2 years ago

  • Status changed from Open to Closed

Applied in changeset git|8df1ace64a7695c855bf0a774e3fd70edfab0bf3.


Fix ARGF.read(length) short read [Bug #18074]

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0