Bug #9790

Zlib::GzipReader only decompressed the first of concatenated files

Added by Jake Quain about 1 year ago. Updated about 1 month ago.

[ruby-core:62257]
Status:Assigned
Priority:Normal
Assignee:Eric Hodel
ruby -v:2.1.1 Backport:2.0.0: UNKNOWN, 2.1: UNKNOWN

Description

I came across a similar old issue in Node that perfectly describes the situation in Ruby:

https://github.com/joyent/node/issues/6032

In Ruby, given the following setup:

echo "1" > 1.txt
echo "2" > 2.txt
gzip 1.txt
gzip 2.txt
cat 1.txt.gz 2.txt.gz > 3.txt.gz

Calling:

Zlib::GzipReader.open("3.txt.gz") do |gz|
  print gz.read
end

would just print:

1
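
A possible workaround, as a minimal sketch: after each member is read, GzipReader#unused returns whatever bytes were buffered past the end of that member, so the underlying IO can be rewound by that amount and a new reader started on the next member. The helper name read_all_gzip_members is hypothetical (it is not part of Zlib), and the sketch assumes the IO is seekable (supports pos, pos= and eof?).

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper, not part of Zlib: reads every concatenated gzip
# member from a seekable IO and returns the joined decompressed data.
def read_all_gzip_members(io)
  output = String.new
  loop do
    gz = Zlib::GzipReader.new(io)
    output << gz.read
    # Bytes the reader consumed beyond the end of the current member.
    unused = gz.unused
    # finish releases the reader without closing the underlying IO.
    gz.finish
    # Rewind so the next reader starts at the next member's header.
    io.pos -= unused.bytesize if unused
    break if io.eof?
  end
  output
end
```

With the setup above, File.open("3.txt.gz", "rb") { |f| print read_all_gzip_members(f) } would be expected to print both 1 and 2.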

Related issues

Duplicated by Ruby trunk - Bug #11180: Missing lines with Zlib::GzipReader Open

History

#1 Updated by Eric Hodel about 1 year ago

  • Target version set to current: 2.2.0
  • Category set to ext
  • Status changed from Open to Assigned
  • Assignee set to Eric Hodel

#2 Updated by Aleksandar Kostadinov 4 months ago

Because the gzip format allows multiple entries, each with its own filename, I'd suggest supporting a method like Java's ZipInputStream getNextEntry() [1]. This way the programmer can choose to read everything as one chunk of data or as multiple chunks, each with its own name. This would allow storing and then retrieving multiple files in/from one gz.

On the other hand, the command line gzip utility only supports reading the whole thing as one. So a convenience method to read everything in one go would also be nice.

[1] http://docs.oracle.com/javase/7/docs/api/java/util/zip/ZipInputStream.html

#3 Updated by Martin Dürst 4 months ago

Aleksandar Kostadinov wrote:

Because the gzip format allows multiple entries, each with its own filename, I'd suggest supporting a method like Java's ZipInputStream getNextEntry() [1]. This way the programmer can choose to read everything as one chunk of data or as multiple chunks, each with its own name. This would allow storing and then retrieving multiple files in/from one gz.

Good idea, but it should be more Ruby-like, such as .each_file or so.
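
To make the idea concrete, here is a rough sketch of what a per-entry iterator could look like. The name each_gzip_member is hypothetical and nothing like it exists in Zlib; the sketch only relies on GzipReader#orig_name (the filename stored in a member's header, or nil if none was stored) and GzipReader#unused, and assumes a seekable IO.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper, not part of Zlib: yields [orig_name, data] for
# each gzip member found in a seekable IO.
def each_gzip_member(io)
  return enum_for(:each_gzip_member, io) unless block_given?
  until io.eof?
    gz = Zlib::GzipReader.new(io)
    name = gz.orig_name          # nil when the header has no filename
    data = gz.read
    unused = gz.unused           # bytes buffered past this member
    gz.finish                    # leaves the underlying IO open
    io.pos -= unused.bytesize if unused
    yield name, data
  end
end
```

A caller could then write each_gzip_member(io) { |name, data| ... } to handle entries one at a time, or each_gzip_member(io).map(&:last).join to read everything in one go.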

#4 Updated by Evgeny Li about 1 month ago

Hey guys, are there any updates?

I created a small gem yesterday that can read multiple files: https://github.com/exAspArk/multiple_files_gzip_reader

> MultipleFilesGzipReader.open("3.txt.gz") do |gz|
>   puts gz.read
> end

# 1
# 2
# => nil

#5 Updated by Tomoyuki Chikanaga about 1 month ago

  • Duplicated by Bug #11180: Missing lines with Zlib::GzipReader added
