Feature #13996
closed[PATCH] file.c: apply2files releases GVL
Description
This means File.chmod, File.lchmod, File.chown, File.lchown,
File.unlink, and File.utime operations on slow filesystems
no longer hold up other threads.
The platform-specific utime_failed changes is compile-tested
using a new UTIME_EINVAL macro
- file.c (UTIME_EINVAL): new macro to ease compile-testing
- file.c (struct apply_arg): new struct
- file.c (no_gvl_apply2files): new function
- file.c (apply2files): release GVL
- file.c (chmod_internal): adjust for apply2files changes
- file.c (lchmod_internal): ditto
- file.c (chown_internal): ditto
- file.c (lchown_internal): ditto
- file.c (utime_failed): ditto
- file.c (utime_internal): ditto
- file.c (unlink_internal): ditto
Files
Updated by normalperson (Eric Wong) over 7 years ago
normalperson@yhbt.net wrote:
Updated patch with benchmark, now:
https://80x24.org/spew/20171012191218.1624-1-e@80x24.org/raw
This hurts performance on fast filesystem, but these methods
are unlikely to be performance bottlenecks and (IMHO) avoiding
pathological slowdowns and stalls are more important.
benchmark results:
minimum results in each 3 measurements.
Execution time (sec)
name trunk built
file_chmod 0.591 0.801
Speedup ratio: compare with the result of `trunk' (greater is better)
name built
file_chmod 0.737
:<
Updated by normalperson (Eric Wong) over 7 years ago
I wrote:
Updated patch with benchmark, now:
https://80x24.org/spew/20171012191218.1624-1-e@80x24.org/raw
This hurts performance on fast filesystem, but these methods
are unlikely to be performance bottlenecks and (IMHO) avoiding
pathological slowdowns and stalls are more important.
Will commit in a few days...
At least on Linux, GVL performance may be slightly improved:
Updated by normalperson (Eric Wong) over 7 years ago
In the same vein as https://bugs.ruby-lang.org/issues/13996
There's 4 more patches to dir.c around opendir, readdir, openat,
and chdir.
I don't think releasing GVL around rewinddir is necessary, since
AFAIK; it's just an lseek and that only updates an offset in the
kernel.
Releasing GVL around closedir may not be needed, either, since
it's operations on that FD are read-only; I'll have to read
Linux kernel sources around various FSes to confirm; but also
modern versions of Linux defer a lot of the complicated close()
stuff, anyways, to reduce userspace syscall time.
However, releasing GVL around mkdir and rmdir will be necessary,
of course; as well as the remaining stat/lstat calls in dir.c
It also looks like the dir glob code could be doing multiple
syscalls and non-rb_* userspace stuff without GVL to reduce
overhead of releasing/acquiring the GVL too frequently; but
that's a more involved patch and UTF-8 normalizing FS platforms
won't benefit (but they're not my top priority performance-wise,
either).
The following changes since commit 593d9786464c94db9c776a5a968dcf65cbc1d9d4:
tempfile.rb: [DOC] all arguments [ci skip] (2017-10-17 12:40:00 +0000)
are available in the git repository at:
git://80x24.org/ruby dir-file-gvl-r60203
for you to fetch changes up to 859716bb3a0bc2057a2d0f05079e480a1955e36c:
dir: Dir.chdir releases GVL (2017-10-18 01:59:48 +0000)
Eric Wong (5):
file.c: apply2files releases GVL
dir.c: release GVL around remaining readdir calls
dir.c: release GVL on opendir
dir.c: openat calls release GVL, too
dir: Dir.chdir releases GVL
benchmark/bm_file_chmod.rb | 9 +++
dir.c | 195 ++++++++++++++++++++++++++++++++-------------
file.c | 159 +++++++++++++++++++++++++-----------
3 files changed, 261 insertions(+), 102 deletions(-)
create mode 100644 benchmark/bm_file_chmod.rb
Updated by Anonymous about 7 years ago
- Status changed from Open to Closed
Applied in changeset trunk|r60386.
file.c: apply2files releases GVL
This means File.chmod, File.lchmod, File.chown, File.lchown,
File.unlink, and File.utime operations on slow filesystems
no longer hold up other threads.
The platform-specific utime_failed changes is compile-tested
using a new UTIME_EINVAL macro
This hurts performance on fast filesystem, but these methods
are unlikely to be performance bottlenecks and (IMHO) avoiding
pathological slowdowns and stalls are more important.
benchmark results:
minimum results in each 3 measurements.
Execution time (sec)
name trunk built
file_chmod 0.591 0.801
Speedup ratio: compare with the result of `trunk' (greater is better)
name built
file_chmod 0.737
- file.c (UTIME_EINVAL): new macro to ease compile-testing
- file.c (struct apply_arg): new struct
- file.c (no_gvl_apply2files): new function
- file.c (apply2files): release GVL
- file.c (chmod_internal): adjust for apply2files changes
- file.c (lchmod_internal): ditto
- file.c (chown_internal): ditto
- file.c (lchown_internal): ditto
- file.c (utime_failed): ditto
- file.c (utime_internal): ditto
- file.c (unlink_internal): ditto
[ruby-core:83200] [Feature #13996]
Updated by nobu (Nobuyoshi Nakada) about 7 years ago
/*
* aa is on-stack for small argc, we must ensure paths are marked
* for large argv if aa is on the heap.
*/
As ALLOCV
uses imemo_alloc
(NODE_ALLOCA
till 2.4), the contents are marked.
size_t size = sizeof(const char *) + sizeof(VALUE);
const long len = (long)(sizeof(struct apply_arg) + (size * argc) - size);
I'm afraid that struct apply_arg
and fn
in it may not be guaranteed as never-padded.
Isn't it better to make it a separate struct, and calculate len
using offsetof
?
Updated by normalperson (Eric Wong) about 7 years ago
nobu@ruby-lang.org wrote:
As
ALLOCV
usesimemo_alloc
(NODE_ALLOCA
till 2.4), the contents are marked.
Ah, thanks. Good to know.
size_t size = sizeof(const char *) + sizeof(VALUE); const long len = (long)(sizeof(struct apply_arg) + (size * argc) - size);
I'm afraid that
struct apply_arg
andfn
in it may not be guaranteed as never-padded.
Isn't it better to make it a separate struct, and calculatelen
usingoffsetof
?
Right, both should be addressed in r60408. I'm not sure which
platform the padding can affect. Maybe it cause [Bug #14049]?