Bug #13085
closedio.c io_fwrite creates garbage
Description
Relying on rb_str_new_frozen for unconverted strings does not
save memory because copy-on-write is always triggered in
read-write I/O loops were subsequent IO#read calls will
clobber the given write buffer.
buf = ''.b
while input.read(16384, buf)
output.write(buf)
end
This generates a lot of garbage starting with Ruby 2.2 (r44471).
For my use case, even IO.copy_stream
generates garbage, since
I wrap "write" to do Digest calculation in a single pass.
I tried using rb_str_replace and reusing the string as a hidden
(klass == 0)
thread-local, but rb_str_replace
attempts CoW
optimization by creating new frozen objects, too:
https://80x24.org/spew/20161229004417.12304-1-e@80x24.org/raw
So, I'm not sure what to do, temporal locking seems wrong for
writing strings (I guess it's for reading?). I get
test_threaded_flush
failures with the following:
https://80x24.org/spew/20161229005701.9712-1-e@80x24.org/raw
IO#syswrite
has the same problem with garbage. I can use
IO#write_nonblock
on fast filesystems while holding GVL,
I guess...
Files
Updated by normalperson (Eric Wong) about 8 years ago
- File 0001-io.c-io_fwrite-temporarily-freeze-string-when-writin.patch 0001-io.c-io_fwrite-temporarily-freeze-string-when-writin.patch added
Proposed patch to temporarily freeze string while copying
io.c (io_fwrite): temporarily freeze string when writing
This avoids garbage from IO#write for [Bug #13085].
Memory usage from benchmark/bm_io_copy_stream_write.rb
is reduced greatly:
target 0: a (ruby 2.5.0dev (2016-12-30 trunk 57236) [x86_64-linux])
target 1: b (ruby 2.5.0dev (2016-12-30) [x86_64-linux])
Memory usage (last size) (B)
name a b
io_copy_stream_write 82235392.000 6651904.000
Memory consuming ratio (size) with the result of `a' (greater is better)
name b
io_copy_stream_write 12.363
There is also a speedup in execution time:
Execution time (sec)
name a b
io_copy_stream_write 0.380 0.143
Speedup ratio: compare with the result of `a' (greater is better)
name b
io_copy_stream_write 2.651
Caveat, there is one potential race condition:
If another thread calls String#freeze on the string we are
currently writing; we will blindly unfreeze it during
fwrite_unfreeze from ensure. However, I do not expect this to
be a real-world case.
Ideally, Ruby should have a way of detecting threads which
are not visible to other threads.
Updated by normalperson (Eric Wong) about 8 years ago
normalperson@yhbt.net wrote:
Issue #13085 has been updated by Eric Wong.
File 0001-io.c-io_fwrite-temporarily-freeze-string-when-writin.patch added
Caveat, there is one potential race condition: If another thread calls String#freeze on the string we are currently writing; we will blindly unfreeze it during fwrite_unfreeze from ensure. However, I do not expect this to be a real-world case. Ideally, Ruby should have a way of detecting threads which are not visible to other threads.
Any comment? I still have not been able to think of a better
solution. I'll commit in a month or five if no objections.
Updated by nobu (Nobuyoshi Nakada) about 8 years ago
No.
Your patch releases the GVL while unfreezing the string to be written.
If the write operation is blocked, other threads can clobber that string.
Updated by normalperson (Eric Wong) about 8 years ago
nobu@ruby-lang.org wrote:
No.
Your patch releases the GVL while unfreezing the string to be written.
If the write operation is blocked, other threads can clobber that string.
It does?
fwrite_unfreeze
happens in rb_ensure e_proc
, which has GVL.
From what I can tell, GVL release happens only inside
fwrite_freeze (rb_ensure b_proc)
:
fwrite_freeze (rb_ensure b_proc)
io_binwrite
rb_mutex_synchronize(fptr->write_lock...)
rb_mutex_lock = NOGVL
io_binwrite_string (identical as below:)
io_binwrite_string
rb_writev_internal
rb_thread_io_blocking_region = NOGVL
rb_write_internal
rb_thread_io_blocking_region = NOGVL
rb_io_wait_writable = NOGVL
By the time e_proc (fwrite_unfreeze)
is called by rb_ensure
we have GVL, again.
Updated by nobu (Nobuyoshi Nakada) about 8 years ago
- Description updated (diff)
Another thread may make a substring of the buffer string during writing.
Then the temporarily frozen string becomes a shared root of the new substring.
Later the root gets unfrozen and modified, now the substring has an invalid pointer.
If another thread calls String#freeze on the string we are
currently writing; we will blindly unfreeze it during
fwrite_unfreeze from ensure. However, I do not expect this to
be a real-world case.
That is test_threaded_flush
in test_io.rb,
freezing the same string simultaneously in io_fwrite
.
Updated by normalperson (Eric Wong) about 8 years ago
- File 0001-v2-io.c-io_fwrite-copy-to-hidden-buffer-when-writing.patch 0001-v2-io.c-io_fwrite-copy-to-hidden-buffer-when-writing.patch added
OK, different strategy; not as fast, but still better than what
we currently have.
[PATCH v2] io.c (io_fwrite): copy to hidden buffer when writing
This avoids garbage from IO#write for [Bug #13085] when
called in a read-write loop while protecting the VM
from race conditions forced by the user.
Memory usage from benchmark/bm_io_copy_stream_write.rb
is reduced greatly:
target 0: a (ruby 2.5.0dev (2017-01-05 trunk 57270) [x86_64-linux])
target 1: b (ruby 2.5.0dev (2017-01-05) [x86_64-linux])
Memory usage (last size) (B)
name a b
io_copy_stream_write 81899520.000 6561792.000
Memory consuming ratio (size) with the result of `a' (greater is better)
name b
io_copy_stream_write 12.481
Despite the extra deep data copy, there is a small speedup in
execution time due to GC avoidance:
Execution time (sec)
name a b
io_copy_stream_write 0.393 0.296
Speedup ratio: compare with the result of `a' (greater is better)
name b
io_copy_stream_write 1.328
This patch increases memory bandwidth use by pessimistically
copying the data into a temporary hidden buffer. Using a
lightweight frozen copy (as before this patch) is ineffective
in read-write loops, since the read operation will trigger
a heavy copy behind our back due to the CoW operation.
It is also impossible to safely release memory from the
lightweight CoW string, because we have no idea how many
lightweight duplicates exist by the time we reacquire GVL.
So, we now make a heavy copy up front which we recycle
immediately upon completion.
Ideally, Ruby should have a way of detecting Strings which are
not visible to other threads and be able to optimize away the
internal copy. Or, we give up on the idea of implicit data
sharing between threads since its dangerous anyways.
Updated by normalperson (Eric Wong) about 8 years ago
normalperson@yhbt.net wrote:
File 0001-v2-io.c-io_fwrite-copy-to-hidden-buffer-when-writing.patch added
OK, different strategy; not as fast, but still better than what
we currently have.
Any comments on my alternative patch?
Anyways, I don't like this v2 strategy, either, since it has an
extra memory copy, but it's still better than creating gigantic
amounts of garbage.
With the existence of ObjectSpace.each_object and ability for
threads to be spawned inside signal handlers, I'm not sure if
there's a performant way for us to check the thread-visibility
of a particular String to elide the copy.
Updated by normalperson (Eric Wong) about 8 years ago
- File 0001-basicsocket-Linux-workaround-for-excess-garbage-on-w.patch 0001-basicsocket-Linux-workaround-for-excess-garbage-on-w.patch added
At least on Linux, this is probably the best performance we can get
with sockets. Unfortunately, this does not affect writes to pipes
(write_nonblock is OK for me, at least), and regular files
(where we really need to release GVL)
--------8<-------
Subject: [PATCH] basicsocket: Linux workaround for excess garbage on write
Linux allows us to use the MSG_DONTWAIT flag on sockets
to do non-blocking send(2) without side effects of modifying
the underlying file flags. This allows us to replace the
generic IO#write and avoid garbage creation on GVL release:
https://bugs.ruby-lang.org/issues/13085
We now only release the GVL when IO cannot proceed.
While IO#write_nonblock may be used for pipes and sockets
portably, that has the side effect of changing the file flag via
fcntl. MSG_DONTWAIT under Linux has no side effects on the
socket.
I HATE IMPLICITLY SHARED MEMORY.
I HATE IMPLICITLY SHARED MEMORY.
I HATE IMPLICITLY SHARED MEMORY.
I HATE IMPLICITLY SHARED MEMORY.
I HATE IMPLICITLY SHARED MEMORY.
I HATE IMPLICITLY SHARED MEMORY.
I HATE IMPLICITLY SHARED MEMORY.
I HATE IMPLICITLY SHARED MEMORY.
I HATE IMPLICITLY SHARED MEMORY.
I HATE IMPLICITLY SHARED MEMORY.
Updated by normalperson (Eric Wong) almost 8 years ago
I think this can be a universal solution. Lightly tested and all tests pass,
but I have not checked coverage, yet.
I reuse one of the embed length bits for shared (noembed) string
to track when a string is shared multiple times. In the common
case, there is one share (the original source string), so we can
recycle immediately in ensure. If a second share is set, the
new bit is set and we do not recycle in ensure.
Updated by nobu (Nobuyoshi Nakada) almost 8 years ago
- Status changed from Open to Assigned
- Assignee set to normalperson (Eric Wong)
Seems nice, let's try it.
Updated by normalperson (Eric Wong) almost 8 years ago
nobu@ruby-lang.org wrote:
Seems nice, let's try it.
Thanks, r57469. I'll work on syswrite and send* (socket) next.
Updated by Anonymous almost 8 years ago
- Status changed from Assigned to Closed
Applied in changeset r57471.
string.c (rb_str_tmp_frozen_release): release embedded strings
Handle the embedded case first, since we may have an embedded
duplicate and non-embedded original string.
- string.c (rb_str_tmp_frozen_release): handled embedded strings
- test/ruby/test_io.rb (test_write_no_garbage): new test
[ruby-core:78898] [Bug #13085]
Updated by normalperson (Eric Wong) almost 8 years ago
Eric Wong normalperson@yhbt.net wrote:
nobu@ruby-lang.org wrote:
Seems nice, let's try it.
Thanks, r57469. I'll work on syswrite and send* (socket) next.
I guess send* in socket never froze the string. I guess it
will be necessary...
Anyways, I made sprintf and strftime avoid garbage
in a similar cases (r57473 and r57476), too.
Also, I've been thinking the rb_ensure usage for this in io.c
isn't necessary. Exceptions ought to be rare and I already
consider anything which repeatedly throws exceptions a lost
cause. So I think I'll apply the following, tomorrow:
https://80x24.org/spew/20170131024140.27859-1-e@80x24.org/raw
The diffstat alone is seems worth it:
io.c | 93 +++++++++++++++-----------------------------------------------------
1 file changed, 20 insertions(+), 73 deletions(-)
Updated by naruse (Yui NARUSE) almost 8 years ago
- Related to Bug #13299: backport r57469, r57472, r57508 (garbage reduction for IO#write/syswrite) added