Bug #9790

Zlib::GzipReader only decompressed the first of concatenated files

Added by Jake Quain 12 months ago. Updated 2 months ago.

[ruby-core:62257]
Status: Assigned
Priority: Normal
Assignee: Eric Hodel
ruby -v: 2.1.1
Backport: 2.0.0: UNKNOWN, 2.1: UNKNOWN

Description

I came across an old issue in Node that describes the same situation in Ruby:

https://github.com/joyent/node/issues/6032

In Ruby, given the following setup:

echo "1" > 1.txt
echo "2" > 2.txt
gzip 1.txt
gzip 2.txt
cat 1.txt.gz 2.txt.gz > 3.txt.gz

Calling:

Zlib::GzipReader.open("3.txt.gz") do |gz|
  print gz.read
end

would print only the contents of the first file, rather than both lines:

1
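
As a possible workaround, the members of the concatenated file can be read one after another. The following is only a minimal sketch, not a fix for Zlib::GzipReader itself: read_all_members is a hypothetical helper name, and the sketch assumes the underlying IO is seekable (a File or StringIO), since it rewinds by the number of bytes that GzipReader#unused reports as read past the end of the current member.

```ruby
require "zlib"

# Hypothetical helper: reads every gzip member from a seekable IO
# and returns the concatenated decompressed data.
def read_all_members(io)
  parts = []
  loop do
    gz = Zlib::GzipReader.new(io)
    parts << gz.read
    # Bytes that were read from io but belong to the next member,
    # or nil when nothing follows the current member.
    unused = gz.unused
    gz.finish # ends this reader without closing the underlying io
    break if unused.nil?
    io.pos -= unused.length # step back to the start of the next member
  end
  parts.join
end
```

With the 3.txt.gz from the setup above, opening the file in binary mode and passing it to this helper should return the contents of both files.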

History

#1 Updated by Eric Hodel 12 months ago

  • Category set to ext
  • Status changed from Open to Assigned
  • Assignee set to Eric Hodel
  • Target version set to current: 2.2.0

#2 Updated by Aleksandar Kostadinov 2 months ago

Because the gzip format allows multiple entries, each with a filename, I'd suggest supporting a method like Java's ZipInputStream getNextEntry() [1]. This way the programmer can choose to read everything as one chunk of data or as multiple chunks, each with its own name. This would allow storing multiple files in one gz file and retrieving them from it.

On the other hand, the command-line gzip utility only supports reading the whole thing as one stream, so a convenience method to read everything in one go would also be nice.

[1] http://docs.oracle.com/javase/7/docs/api/java/util/zip/ZipInputStream.html

#3 Updated by Martin Dürst 2 months ago

Aleksandar Kostadinov wrote:

Because the gzip format allows multiple entries, each with a filename, I'd suggest supporting a method like Java's ZipInputStream getNextEntry() [1]. This way the programmer can choose to read everything as one chunk of data or as multiple chunks, each with its own name. This would allow storing multiple files in one gz file and retrieving them from it.

Good idea, but it should be more Ruby-like, such as .each_file or so.
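
To make the suggestion concrete, here is a rough sketch of what such an iterator could look like when built on the existing GzipReader API. The name each_gzip_member is hypothetical and is not part of Zlib; the sketch assumes a seekable IO and yields the stored original filename (nil if the member has none) together with the decompressed data of each member.

```ruby
require "zlib"

# Hypothetical iterator: yields [orig_name, data] for each gzip member
# found in a seekable IO. Not part of the Zlib library.
def each_gzip_member(io)
  return enum_for(__method__, io) unless block_given?

  loop do
    gz = Zlib::GzipReader.new(io)
    # orig_name comes from the member's header; read returns its data.
    yield gz.orig_name, gz.read
    # Bytes already consumed from io that belong to the next member.
    unused = gz.unused
    gz.finish # ends this reader but leaves io open
    break if unused.nil?
    io.pos -= unused.length # rewind to the start of the next member
  end
end
```

A caller could then iterate with a block taking the name and the data, or join all the data to get the single-stream behaviour of the command-line tool.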
