Feature #8788

closed

use eventfd on newer Linux instead of pipe for timer thread

Added by normalperson (Eric Wong) over 10 years ago. Updated about 6 years ago.

Status: Feedback
Assignee: -
Target version: -
[ruby-core:56634]

Description

eventfd is a cheaper alternative to pipe for self-notification (signals) on Linux.

I will submit patches in the next few days/weeks unless there are objections
(or somebody else wants to do it sooner). I'd also like to clean up some of the existing #ifdefs in that area while I'm at it.


Files

0001-thread_pthread-use-eventfd-under-Linux-for-timer-thr.patch (9.08 KB), use eventfd on Linux, normalperson (Eric Wong), 08/17/2013 06:40 AM
0001-thread_pthread-use-eventfd-under-Linux-for-timer-thr.patch (9.07 KB), PATCH v2 - eventfd_compat should return void, normalperson (Eric Wong), 08/17/2013 06:53 AM
tt_efd_v2.patch (5.91 KB), normalperson (Eric Wong), 07/08/2014 10:38 PM

Updated by normalperson (Eric Wong) over 10 years ago

[PATCH] thread_pthread: use eventfd under Linux for timer thread

The timer thread is an ideal use case for eventfd because it is
only used to signal wakeups and not transfer data.

From eventfd(2) manpage:

    Applications can use an eventfd file descriptor instead of a pipe (see
    pipe(2)) in all cases where a pipe is used simply to signal events.
    The kernel overhead of an eventfd file descriptor is much lower than
    that of a pipe, and only one file descriptor is required (versus the
    two required for a pipe).
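
As a rough illustration of the manpage excerpt (not code from the attached patch), a single eventfd can stand in for both ends of a notification pipe. The sketch below is Linux-only and assumes a kernel and glibc new enough for EFD_NONBLOCK and EFD_CLOEXEC; the wakeup_fd_* names are hypothetical and do not appear in thread_pthread.c.

```c
#include <sys/eventfd.h>
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: create one descriptor used for both notify and wait. */
static int wakeup_fd_create(void)
{
    return eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
}

/* Request a wakeup. Returns 0 on success, -1 on an unexpected error. */
static int wakeup_fd_notify(int fd)
{
    const uint64_t one = 1;
    ssize_t r;

    do {
        r = write(fd, &one, sizeof(one));   /* adds 1 to the kernel counter */
    } while (r < 0 && errno == EINTR);

    /* EAGAIN means the counter is saturated, so a wakeup is already pending. */
    if (r < 0 && errno != EAGAIN)
        return -1;
    return 0;
}

/* Drain pending wakeups. Returns 1 if any were pending, 0 if none, -1 on error. */
static int wakeup_fd_consume(int fd)
{
    uint64_t count;
    ssize_t r;

    do {
        r = read(fd, &count, sizeof(count)); /* returns the counter and resets it */
    } while (r < 0 && errno == EINTR);

    if (r == (ssize_t)sizeof(count))
        return 1;
    if (r < 0 && errno == EAGAIN)
        return 0;
    return -1;
}
```

Several notifications issued before the reader runs collapse into one read, which fits the "signal wakeups, not transfer data" use described in the commit message.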

Updated by normalperson (Eric Wong) over 10 years ago

Ignore my first patch; the eventfd_compat function should return void, not int.
Sorry for the confusion.

Updated by ko1 (Koichi Sasada) over 10 years ago

(2013/08/16 10:47), normalperson (Eric Wong) wrote:

> eventfd is a cheaper alternative to pipe for self-notification (signals) on Linux.
>
> I will submit patches in the next few days/weeks unless there are objections
> (or somebody else wants to do it sooner). I'd also like to clean up some of the existing #ifdefs in that area while I'm at it.

Can we see a performance comparison?
If we can see a clear difference, it can be acceptable.

(If we can't see any difference, it only increases the source code
complexity.)

--
// SASADA Koichi at atdot dot net

Updated by normalperson (Eric Wong) over 10 years ago

SASADA Koichi wrote:

> (2013/08/16 10:47), normalperson (Eric Wong) wrote:
>
>> eventfd is a cheaper alternative to pipe for self-notification (signals) on Linux.
>>
>> I will submit patches in the next few days/weeks unless there are objections
>> (or somebody else wants to do it sooner). I'd also like to clean up some of the existing #ifdefs in that area while I'm at it.
>
> Can we see a performance comparison?
> If we can see a clear difference, it can be acceptable.

It's not for speed (signal handling performance should not be a
bottleneck), but to halve FD use in userspace and reduce memory use inside
the kernel.

AFAIK, writing to an empty pipe still allocates a 4K page; eventfd avoids
that allocation/deallocation. Since Ruby is CoW/fork-friendly, this
should allow running more Ruby processes on a system.

I also thought my own code had an FD leak when timer_thread_pipe_low was
introduced. Maybe this will reduce confusion for users who lsof Ruby
processes, since there are more pipe users than eventfd users.

> (If we can't see any difference, it only increases the source code
> complexity.)

I've tried to minimize the impact of my patch and keep the eventfd/pipe
difference minimal.

Updated by kosaki (Motohiro KOSAKI) over 10 years ago

Hi

On Sat, Aug 17, 2013 at 3:37 PM, Eric Wong wrote:

> SASADA Koichi wrote:
>
>> (2013/08/16 10:47), normalperson (Eric Wong) wrote:
>>
>>> eventfd is a cheaper alternative to pipe for self-notification (signals) on Linux.
>>>
>>> I will submit patches in the next few days/weeks unless there are objections
>>> (or somebody else wants to do it sooner). I'd also like to clean up some of the existing #ifdefs in that area while I'm at it.
>>
>> Can we see a performance comparison?
>> If we can see a clear difference, it can be acceptable.
>
> It's not for speed (signal handling performance should not be a
> bottleneck), but to halve FD use in userspace and reduce memory use inside
> the kernel.

How much does the maximum number of Ruby processes increase? Can you measure it?
I bet the difference is very small.

> AFAIK, writing to an empty pipe still allocates a 4K page; eventfd avoids
> that allocation/deallocation. Since Ruby is CoW/fork-friendly, this
> should allow running more Ruby processes on a system.
>
> I also thought my own code had an FD leak when timer_thread_pipe_low was
> introduced. Maybe this will reduce confusion for users who lsof Ruby
> processes, since there are more pipe users than eventfd users.

Well, that's not a good reason. You said your patch decreases your confusion,
but it increases the confusion of other eventfd users.

>> (If we can't see any difference, it only increases the source code
>> complexity.)
>
> I've tried to minimize the impact of my patch and keep the eventfd/pipe
> difference minimal.

Anyway, I haven't seen any bugs in your patch. I would like to see a measurement
result.

Updated by normalperson (Eric Wong) over 10 years ago

KOSAKI Motohiro wrote:

> Hi
>
> On Sat, Aug 17, 2013 at 3:37 PM, Eric Wong wrote:
>
>> SASADA Koichi wrote:
>>
>>> (2013/08/16 10:47), normalperson (Eric Wong) wrote:
>>>
>>>> eventfd is a cheaper alternative to pipe for self-notification (signals) on Linux.
>>>>
>>>> I will submit patches in the next few days/weeks unless there are objections
>>>> (or somebody else wants to do it sooner). I'd also like to clean up some of the existing #ifdefs in that area while I'm at it.
>>>
>>> Can we see a performance comparison?
>>> If we can see a clear difference, it can be acceptable.
>>
>> It's not for speed (signal handling performance should not be a
>> bottleneck), but to halve FD use in userspace and reduce memory use inside
>> the kernel.
>
> How much does the maximum number of Ruby processes increase? Can you measure it?
> I bet the difference is very small.

On Linux 3.10 on x86_64, 64-byte L1 cache line size:

file->private_data:
    sizeof(struct eventfd_ctx) == 48 bytes
    sizeof(struct pipe_inode_info) == 136 bytes

So 176 bytes and 2 FDs are saved for every Ruby process. FWIW, I often have
hundreds of (mostly idle) Ruby processes on my systems running random
scripts/daemons.
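
The 176-byte figure can be reproduced from the two sizes above if one assumes, as the mention of timer_thread_pipe_low elsewhere in this thread suggests, that the timer thread keeps two notification channels. That channel count is an assumption of this sketch, as are the identifier names; the struct sizes are the ones quoted for Linux 3.10 on x86_64 and will differ on other kernels.

```c
#include <assert.h>
#include <stddef.h>

/* Sizes quoted in this thread for Linux 3.10 on x86_64. */
enum {
    EVENTFD_CTX_BYTES     = 48,   /* sizeof(struct eventfd_ctx) */
    PIPE_INODE_INFO_BYTES = 136,  /* sizeof(struct pipe_inode_info) */
    NOTIFY_CHANNELS       = 2     /* assumption: normal + low-priority channel */
};

/* Per-process kernel bytes saved by replacing each pipe with an eventfd. */
static size_t kernel_bytes_saved(void)
{
    return (size_t)NOTIFY_CHANNELS * (PIPE_INODE_INFO_BYTES - EVENTFD_CTX_BYTES);
}

/* A pipe needs two descriptors per channel, an eventfd only one. */
static int fds_saved(void)
{
    return NOTIFY_CHANNELS * 2 - NOTIFY_CHANNELS * 1;
}
```

Under these assumptions kernel_bytes_saved() evaluates to 2 * (136 - 48) = 176 and fds_saved() to 4 - 2 = 2, matching the numbers in the post. The 4K pipe buffer page mentioned earlier in the thread is not counted here.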

I doubt most users will notice the difference. But maybe it will make a
tiny difference somewhere (fewer cache lines touched, smaller select()
footprint).

I don't have a machine to forkbomb with Ruby, but the overall size of Ruby
is probably the limiting factor anyway.

>> AFAIK, writing to an empty pipe still allocates a 4K page; eventfd avoids
>> that allocation/deallocation. Since Ruby is CoW/fork-friendly, this
>> should allow running more Ruby processes on a system.
>>
>> I also thought my own code had an FD leak when timer_thread_pipe_low was
>> introduced. Maybe this will reduce confusion for users who lsof Ruby
>> processes, since there are more pipe users than eventfd users.
>
> Well, that's not a good reason. You said your patch decreases your confusion,
> but it increases the confusion of other eventfd users.

I suppose it depends on the user. I don't know of anybody using
eventfd with Ruby right now (but I'll be updating some of my projects to
do so).

>>> (If we can't see any difference, it only increases the source code
>>> complexity.)
>>
>> I've tried to minimize the impact of my patch and keep the eventfd/pipe
>> difference minimal.
>
> Anyway, I haven't seen any bugs in your patch. I would like to see a measurement
> result.

Thanks for looking. Sorry I cannot provide a real-world measurement/use
case.

Ideally, we wouldn't even need a timer thread and we could just use
ppoll/pselect. But that would be a very intrusive change (and maybe too
incompatible with C extensions).

Updated by kosaki (Motohiro KOSAKI) over 10 years ago

> Ideally, we wouldn't even need a timer thread and we could just use
> ppoll/pselect. But that would be a very intrusive change (and maybe too
> incompatible with C extensions).

Ideally?
A syscall is much slower than the current flag check in the VM loop. That's why
the VM event loop doesn't handle thread runtime expiry directly.

Updated by normalperson (Eric Wong) over 10 years ago

KOSAKI Motohiro wrote:

>> Ideally, we wouldn't even need a timer thread and we could just use
>> ppoll/pselect. But that would be a very intrusive change (and maybe too
>> incompatible with C extensions).
>
> Ideally?
> A syscall is much slower than the current flag check in the VM loop. That's why
> the VM event loop doesn't handle thread runtime expiry directly.

Ah, I forgot about thread runtime expiry. Maybe that's why I gave up
on the idea originally; I had this idea a long while (years?) back.

Updated by naruse (Yui NARUSE) over 10 years ago

  • Status changed from Open to Feedback
  • Target version changed from 2.1.0 to 2.6

I'm negative because it makes the code more complex unless it brings a performance improvement.

Updated by normalperson (Eric Wong) over 9 years ago

Updated patch (from testing for #10009).

Uploading for archival purposes. This version is probably less intrusive and
falls back to pipe in case of ENOSYS (in case glibc supports eventfd and the
kernel has eventfd disabled).
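
The fallback described above might look roughly like the following. This is a sketch, not the contents of tt_efd_v2.patch: the struct and function names are hypothetical, and it assumes Linux with EFD_NONBLOCK and EFD_CLOEXEC available.

```c
#include <sys/eventfd.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical wrapper: with eventfd both ends are the same descriptor. */
struct notify_channel {
    int rfd;         /* end polled by the timer thread */
    int wfd;         /* end written to request a wakeup */
    int is_eventfd;  /* 1 if rfd == wfd is an eventfd, 0 for a pipe */
};

/* Give a pipe end the same flags the eventfd is created with. */
static int set_nonblock_cloexec(int fd)
{
    int fl = fcntl(fd, F_GETFL);
    if (fl < 0 || fcntl(fd, F_SETFL, fl | O_NONBLOCK) < 0)
        return -1;
    int fdfl = fcntl(fd, F_GETFD);
    if (fdfl < 0 || fcntl(fd, F_SETFD, fdfl | FD_CLOEXEC) < 0)
        return -1;
    return 0;
}

/* Try eventfd first; fall back to a pipe only when the kernel lacks it. */
static int notify_channel_open(struct notify_channel *ch)
{
    int fds[2];
    int efd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);

    if (efd >= 0) {
        ch->rfd = ch->wfd = efd;
        ch->is_eventfd = 1;
        return 0;
    }
    if (errno != ENOSYS)
        return -1;  /* e.g. EMFILE: a pipe would not help either */

    if (pipe(fds) < 0)
        return -1;
    if (set_nonblock_cloexec(fds[0]) < 0 || set_nonblock_cloexec(fds[1]) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    ch->rfd = fds[0];
    ch->wfd = fds[1];
    ch->is_eventfd = 0;
    return 0;
}

static void notify_channel_close(struct notify_channel *ch)
{
    close(ch->rfd);
    if (ch->wfd != ch->rfd)
        close(ch->wfd);
    ch->rfd = ch->wfd = -1;
}
```

Readers and writers would still differ slightly between the two modes, since an eventfd read or write needs an 8-byte buffer while a pipe can carry a single byte.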

This has no measurable performance improvement for me, but saves two FDs
and a few bytes of kernel memory for every process.

Updated by naruse (Yui NARUSE) about 6 years ago

  • Target version deleted (2.6)