Bug #10025

Incorrect wrapping of base64 output of Array.pack()

Added by thoger (Tomas Hoger) almost 6 years ago. Updated 12 months ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
ruby -v: ruby 2.0.0p353 (2013-11-22 revision 43784) [x86_64-linux]
[ruby-core:63639]

Description

String format directive m for Array pack() is documented as:

   m         | String  | base64 encoded string (see RFC 2045, count is width)
             |         | (if count is 0, no line feed are added, see RFC 4648)

http://www.ruby-doc.org/core-2.1.2/Array.html#method-i-pack

Although the description of the count argument is terse, it appears to mean the maximum length of an output line before a line break is added. However, that is not what actually happens:

$ ruby -e 'print ["a"*40].pack("m20")'
YWFhYWFhYWFhYWFhYWFhYWFh
YWFhYWFhYWFhYWFhYWFhYWFh
YWFhYQ==

In this example, the output lines are 24 characters long. To get 20-character output lines, m15 has to be specified:

$ ruby -e 'print ["a"*40].pack("m15")'
YWFhYWFhYWFhYWFhYWFh
YWFhYWFhYWFhYWFhYWFh
YWFhYWFhYWFhYQ==
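
The two runs above can be reproduced in one script that reports the length of each output line. This is only a minimal sketch; the helper name base64_line_lengths is made up for illustration and is not part of Ruby.

```ruby
# Hypothetical helper: pack the data with the "m" directive and the given
# count, then return the length of every output line (without the LF).
def base64_line_lengths(data, count)
  [data].pack("m#{count}").lines.map { |line| line.chomp.length }
end

data = "a" * 40
p base64_line_lengths(data, 20)  # => [24, 24, 8]
p base64_line_lengths(data, 15)  # => [20, 20, 16]
```

The count is rounded down to a multiple of 3 input bytes (20 becomes 18), and each group of 3 input bytes becomes 4 output characters, which gives the 24-character lines.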

This is caused by the following in pack_pack():

len = len / 3 * 3;

https://github.com/ruby/ruby/blob/dd5d029/pack.c#L832

This looks like a typo / thinko. Base64 encoding produces 4 bytes of output for every 3 bytes of input. Hence, to get an output line of length N, the encoder should process N / 4 * 3 input bytes before inserting a line break. The len argument passed to encodes() is the number of input bytes to process to generate one output line.
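
Until the behaviour changes, a caller can apply the N / 4 * 3 conversion itself. The following is a rough sketch of that workaround, not a proposed patch; pack_base64_width is a hypothetical name, and the lower bound of 4 is an assumption made here to avoid passing a count of 0, which means "no line feeds".

```ruby
# Hypothetical wrapper: take the desired OUTPUT line width and convert it
# to the INPUT byte count that the "m" directive currently expects.
def pack_base64_width(data, width)
  # A width below 4 would give a count of 0, which disables line feeds.
  raise ArgumentError, "width must be at least 4" if width < 4

  count = width / 4 * 3   # 4 output characters per 3 input bytes
  [data].pack("m#{count}")
end

print pack_base64_width("a" * 40, 20)
# Prints:
# YWFhYWFhYWFhYWFhYWFh
# YWFhYWFhYWFhYWFhYWFh
# YWFhYWFhYWFhYQ==
```

With a width of 20 the wrapper passes m15, which matches the second example in the description.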

The same applies to UU-encoding (the u format), with the difference that every line starts with an additional character specifying the line length. Hence, even with the above fixed, u20 would produce lines with 21 characters.
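
The extra length character can be seen by measuring the u output in the same way. Again this is only an illustrative sketch, and uu_line_lengths is a hypothetical helper name.

```ruby
# Hypothetical helper: pack the data with the "u" directive and the given
# count, then return the length of every output line (without the LF).
def uu_line_lengths(data, count)
  [data].pack("u#{count}").lines.map { |line| line.chomp.length }
end

data = "a" * 40
p uu_line_lengths(data, 20)  # => [25, 25, 9]
p uu_line_lengths(data, 15)  # => [21, 21, 17]

# The first character of each line encodes the number of input bytes on
# that line as (32 + byte count).
first_line = [data].pack("u15").lines.first
p first_line[0].ord - 32     # => 15
```

With the current behaviour u20 gives 25-character lines (1 length character plus 24 encoded characters), while u15 gives the 21-character lines mentioned above.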


Files

pack-doc.patch (1.18 KB), jeremyevans0 (Jeremy Evans), 07/10/2019 04:36 AM
pack-m-width-output.patch (1.4 KB), jeremyevans0 (Jeremy Evans), 07/10/2019 04:36 AM

Updated by jeremyevans0 (Jeremy Evans) 12 months ago

I agree this is a bug. I am not sure if it is a documentation bug or code bug. The existing documentation for Array#pack does suggest the count should specify output bytes (width of the resulting field), while the m count currently specifies input bytes between each LF.

Attached are two patches, one considering this a documentation bug (which tries to make the documentation clearer), and one considering this a code bug (which fixes the calculation to use output bytes instead of input bytes).

I'm leaning toward considering this a documentation bug, since that is a better choice for backwards compatibility.

#2

Updated by jeremyevans (Jeremy Evans) 12 months ago

  • Status changed from Feedback to Closed

Applied in changeset git|2f6cc00338826dbaa439a18e4b4f7a19c1f5987a.


Fix documentation for Array#pack m directive count specifier [ci skip]

Fixes [Bug #10025]
