Feature #13697
[PATCH]: futex based thread primitives
Description
Assigning to kosaki since he wrote the current GVL.
I'm hoping single-core vm_thread_pass benchmark can be
improved, but I'm not sure...
Using bare, Linux-specific futexes instead of relying on
NPTL-provided primitives seems to offer some speedups
in the more realistic benchmarks which release GVL
for IO.
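
To illustrate what "bare futexes" means here, the following is a minimal sketch of a three-state lock built directly on the Linux futex(2) system call. It is not the code from the patch: the names futex_lock_t, futex_call, futex_lock and futex_unlock are hypothetical, and the state encoding (0 = unlocked, 1 = locked, 2 = locked with possible waiters) is the common textbook scheme, assumed here for illustration only.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical type for illustration; not taken from the patch.
 * state: 0 = unlocked, 1 = locked, 2 = locked with possible waiters. */
typedef struct {
    _Atomic uint32_t state;
} futex_lock_t;

/* Thin wrapper; glibc does not export a futex() function. */
static long futex_call(_Atomic uint32_t *addr, int op, uint32_t val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

static void futex_lock(futex_lock_t *l)
{
    uint32_t c = 0;

    /* Fast path: uncontended 0 -> 1 transition, no system call. */
    if (atomic_compare_exchange_strong(&l->state, &c, 1))
        return;

    /* Slow path: mark the lock contended, then sleep in the kernel
     * for as long as the word still reads 2. */
    if (c != 2)
        c = atomic_exchange(&l->state, 2);
    while (c != 0) {
        futex_call(&l->state, FUTEX_WAIT_PRIVATE, 2);
        c = atomic_exchange(&l->state, 2);
    }
}

static void futex_unlock(futex_lock_t *l)
{
    /* Only enter the kernel if a waiter may be sleeping. */
    if (atomic_exchange(&l->state, 0) == 2)
        futex_call(&l->state, FUTEX_WAKE_PRIVATE, 1);
}
```

In the uncontended case both lock and unlock are a single atomic operation on a 4-byte word, which is the property the description relies on.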
Performance seems stable between multi-core and single-core
benchmarks. However, there are still more regressions on
single-core systems, though I think they mainly affect esoteric
cases. The io_pipe_rw and vm_thread_pipe benchmarks
are improved across the board, so I am pretty happy
with that.
Some of the performance changes (good or bad) may also
be the result of the size reduction from the 40-byte NPTL
mutex to the 4-byte futex shifting data into different
cache lines.
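
A quick way to check the sizes involved on a given system is sketched below. The helper names are hypothetical, and the 64-byte cache line is an assumption (common on x86-64, but not stated in this issue). The 40-byte figure is the one quoted above; sizeof(pthread_mutex_t) can differ on other architectures or C libraries.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed cache line size; not taken from the issue. */
#define ASSUMED_CACHE_LINE 64

/* Print the sizes of the two lock representations being compared. */
static void print_lock_sizes(void)
{
    printf("sizeof(pthread_mutex_t) = %zu\n", sizeof(pthread_mutex_t));
    printf("sizeof(uint32_t)        = %zu\n", sizeof(uint32_t));
}

/* How many whole locks of a given size fit in one assumed cache line. */
static size_t locks_per_cache_line(size_t lock_size)
{
    return ASSUMED_CACHE_LINE / lock_size;
}
```

Under these assumptions a 40-byte mutex takes up most of a cache line by itself, while a 4-byte futex word leaves room for neighbouring fields, so replacing one with the other changes which fields share a line.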
io and thread '-p (_io_|thread)' benchmark results on an
AMD FX-8320 @ 3.5GHz:
io_copy_stream_write 1.040
io_copy_stream_write_socket 1.027
io_file_create 1.016
io_file_read 1.057
io_file_write 1.001
io_nonblock_noex 1.047
io_nonblock_noex2 1.037
io_pipe_rw 1.077
io_select 1.024
io_select2 1.003
io_select3 0.991
require_thread 8.379
vm_thread_alive_check1 1.171
vm_thread_close 1.015
vm_thread_condvar1 0.979
vm_thread_condvar2 1.192
vm_thread_create_join 1.043
vm_thread_mutex1 0.985
vm_thread_mutex2 1.005
vm_thread_mutex3 0.991
vm_thread_pass 4.563
vm_thread_pass_flood 0.991
vm_thread_pipe 1.867
vm_thread_queue 0.995
vm_thread_sized_queue 1.050
vm_thread_sized_queue2 1.079
vm_thread_sized_queue3 1.073
vm_thread_sized_queue4 1.087
single core (schedtool -a 0x1 -e ...):
io_copy_stream_write 1.039
io_copy_stream_write_socket 1.012
io_file_create 1.010
io_file_read 1.066
io_file_write 0.999
io_nonblock_noex 1.061
io_nonblock_noex2 1.020
io_pipe_rw 1.101
io_select 1.008
io_select2 1.001
io_select3 0.992
require_thread 1.005
vm_thread_alive_check1 0.938
vm_thread_close 1.135
vm_thread_condvar1 1.145
vm_thread_condvar2 1.134
vm_thread_create_join 1.146
vm_thread_mutex1 0.999
vm_thread_mutex2 0.999
vm_thread_mutex3 1.001
vm_thread_pass 0.887
vm_thread_pass_flood 0.973
vm_thread_pipe 1.100
vm_thread_queue 1.013
vm_thread_sized_queue 1.125
vm_thread_sized_queue2 1.172
vm_thread_sized_queue3 1.184
vm_thread_sized_queue4 1.081
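
For reference, the single-core runs above restrict the benchmark to CPU 0 with schedtool's affinity mask 0x1. A rough in-process equivalent on Linux is sketched below using sched_setaffinity(2). The function name pin_to_cpu0 is hypothetical, and this is only an illustration of the technique, not how the benchmarks were actually run.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Restrict the calling thread to CPU 0 (affinity mask 0x1).
 * Returns 0 on success, -1 on failure with errno set. */
static int pin_to_cpu0(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(0, &set);
    /* pid 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Threads created after this call inherit the mask, so it would need to run before any worker threads are started to mimic launching the whole process under schedtool.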
Updated by normalperson (Eric Wong) over 7 years ago
normalperson@yhbt.net wrote:
> https://bugs.ruby-lang.org/issues/13697
> Assigning to kosaki since he wrote the current GVL.
> I'm hoping single-core vm_thread_pass benchmark can be
> improved, but I'm not sure...
Can anybody else review? I guess kosaki is busy. Thanks.
Updated by normalperson (Eric Wong) almost 7 years ago
Updated by hsbt (Hiroshi SHIBATA) 10 months ago
- Status changed from Open to Assigned