Feature #14315

zlib: reduce garbage on gzip writes (deflate)

Added by normalperson (Eric Wong) almost 2 years ago. Updated almost 2 years ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
[ruby-core:84638]

Description

zlib: reduce garbage on gzip writes (deflate)

Zlib::GzipWriter generated large amounts of garbage from
(struct zstream).input.  Reuse the .input field when it is
hidden, and recycle it when its lifetime is over.  This change
alone reduced memory usage of the writer from 90MB to 4.5MB.
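
The reuse idea can be sketched at the Ruby level. This is only an analogy: the actual change is in C, inside ext/zlib/zlib.c, and the helper names below (allocated_objects, fresh_buffers, reused_buffer) are hypothetical. It counts Ruby objects allocated when a new input buffer is created per write versus when one buffer is emptied and refilled.

```ruby
# Count Ruby objects allocated while the block runs.
def allocated_objects
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Garbage-heavy pattern: a brand new String per write.
def fresh_buffers(chunk, rounds)
  allocated_objects do
    rounds.times do
      input = String.new
      input << chunk
    end
  end
end

# Reuse pattern: one String object, emptied and refilled per write.
def reused_buffer(chunk, rounds)
  input = String.new
  allocated_objects do
    rounds.times do
      input.clear
      input << chunk
    end
  end
end

chunk = ("x" * 4096).freeze
puts "fresh:  #{fresh_buffers(chunk, 10_000)} objects"
puts "reused: #{reused_buffer(chunk, 10_000)} objects"
```

The counts only cover Ruby objects, not malloc traffic, so they show the shape of the saving rather than the RSS figures quoted here.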

For the detached buffer of compressed data used by
gzfile_write_raw, we can only clear the string (not recycle it)
since user code may hold references to it (but the data would be
clobbered anyway).  This reduced memory usage by a further 0.5MB
or so; the saving is small because the compressed data is smaller.
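
Why clearing is safe where recycling is not can be shown with a small Ruby-level illustration. RetainingSink and write_and_clear are hypothetical stand-ins for "user code" and for the writer, not part of zlib. After the buffer is cleared, anything still holding it sees a valid but empty String.

```ruby
# Hypothetical sink that keeps a reference to every chunk it is handed.
class RetainingSink
  attr_reader :chunks

  def initialize
    @chunks = []
  end

  def write(str)
    @chunks << str # stores the same String object, not a copy
    str.bytesize
  end
end

# Hypothetical producer: hand the buffer to the sink, then clear it.
def write_and_clear(sink, data)
  buf = String.new(data)
  sink.write(buf)
  buf.clear # drops the bytes; the object itself stays valid
  buf
end

sink = RetainingSink.new
buf = write_and_clear(sink, "compressed bytes")
p sink.chunks.first.equal?(buf) # => true
p sink.chunks.first             # => ""
```

Recycling would instead hand the object back for reuse while the sink still referenced it, which is why the description limits itself to clearing here.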

Combined, these changes reduce the anonymous RSS memory of a
dedicated writer process from over 90MB to under 4MB.

before:

    #      user     system      total        real

    writer   7.823332   0.053333   7.876665 (  7.881464)
    writer RssAnon:    92944 kB
    reader   6.969999   0.076666   7.046665 (  7.906377)
    reader RssAnon:   109820 kB

after:

    writer   7.359999   0.000000   7.359999 (  7.360639)
    writer RssAnon:     4040 kB
    reader   6.346667   0.070000   6.416667 (  7.387654)
    reader RssAnon:    98272 kB
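
For a quick sense of scale, the relative savings can be worked out from the figures above. The helper name reduction_pct is made up for this sketch; the inputs are copied from the before/after output.

```ruby
# Percentage reduction from `before` to `after`, rounded to one decimal.
def reduction_pct(before, after)
  ((before - after) * 100.0 / before).round(1)
end

puts "writer RssAnon: #{reduction_pct(92_944, 4_040)}% lower"     # 95.7
puts "reader RssAnon: #{reduction_pct(109_820, 98_272)}% lower"   # 10.5
puts "writer real:    #{reduction_pct(7.881464, 7.360639)}% lower" # 6.6
```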

Script used:
-------
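require 'zlib'
require 'benchmark'
nr = 16384 * 2 # number of writes; 32 KiB each, so 1 GiB uncompressed

# Print the timings plus this process's anonymous RSS, prefixing every
# line with pfx (Linux-only: reads /proc/PID/status)
def stats(pfx, bm)
  str = "#{bm}#{File.readlines("/proc/#$$/status").grep(/^RssAnon:/)[0]}"
  puts str.gsub!(/^/m, pfx)
end

rd, wr = IO.pipe
# Child process: dedicated writer, compresses into the pipe
pid = fork do
  buf = ((0..255).map(&:chr).join * 128).freeze
  rd.close
  gzip = Zlib::GzipWriter.new(wr)
  bm = Benchmark.measure do
    nr.times { gzip.write(buf) }
    gzip.close
    wr.close
  end
  stats('writer ', bm)
end

# Parent process: reader, decompresses from the pipe into one reused buffer
wr.close
buf = ''
gunzip = Zlib::GzipReader.new(rd)
n = 0
bm = Benchmark.measure do
  begin
    gunzip.readpartial(16384, buf)
    n += buf.size
  rescue EOFError
    break # end of the gzip stream: leave the read loop
  end while true
end
stats('reader ', bm)
Process.waitall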
-------
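
When trying a change like this, a smaller in-process round trip can confirm that the data still survives compression intact. The sketch below is not part of the original benchmark: it swaps the pipe and fork for StringIO, and the helper names gzip_repeat and gunzip_size are made up for illustration.

```ruby
require 'zlib'
require 'stringio'

# Compress `rounds` copies of `buf` into an in-memory gzip stream.
def gzip_repeat(buf, rounds)
  io = StringIO.new(String.new) # binary, unfrozen backing string
  gzip = Zlib::GzipWriter.new(io)
  rounds.times { gzip.write(buf) }
  gzip.finish # writes the gzip trailer but leaves `io` open
  io.string
end

# Count decompressed bytes, reusing one destination buffer as the script does.
def gunzip_size(data)
  gunzip = Zlib::GzipReader.new(StringIO.new(data))
  buf = String.new
  n = 0
  begin
    loop do
      gunzip.readpartial(16384, buf)
      n += buf.bytesize
    end
  rescue EOFError
    # end of the gzip stream
  end
  gunzip.close
  n
end

buf = ((0..255).map(&:chr).join * 128).freeze
rounds = 64
compressed = gzip_repeat(buf, rounds)
puts "compressed bytes:   #{compressed.bytesize}"
puts "decompressed bytes: #{gunzip_size(compressed)}"
puts "expected bytes:     #{buf.bytesize * rounds}"
```

This only checks correctness; memory figures such as RssAnon still need the forked writer, so that the writer's usage is measured on its own.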
* ext/zlib/zlib.c (zstream_discard_input): reuse or recycle hidden input
  (zstream_reset_input): clear hidden input
  (zstream_run): detach input and recycle after use
  (gzfile_write_raw): clear buffer after write

Associated revisions

Revision a55abcc0
Added by normal almost 2 years ago

zlib: reduce garbage on gzip writes (deflate)

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61631 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 61631
Added by normalperson (Eric Wong) almost 2 years ago

zlib: reduce garbage on gzip writes (deflate)

History

Updated by normalperson (Eric Wong) almost 2 years ago

normalperson@yhbt.net wrote:

https://bugs.ruby-lang.org/issues/14315

Complementary patch for the Zlib::GzipReader side:

https://80x24.org/spew/20180105134532.8946-1-e@80x24.org/raw
(sorry too tired to write proper commit message)

#2

Updated by Anonymous almost 2 years ago

  • Status changed from Open to Closed

Applied in changeset trunk|r61631.


