Feature #8788

use eventfd on newer Linux instead of pipe for timer thread

Added by Eric Wong 8 months ago. Updated 7 months ago.

[ruby-core:56634]
Status:Feedback
Priority:Low
Assignee:-
Category:core
Target version:next minor

Description

eventfd is a cheaper alternative to pipe for self-notification (signals) on Linux

I will submit patches in the next few days/weeks unless there are objections
(or somebody else wants to do it sooner). I'd also like to cleanup some of the existing #ifdefs in that area while I'm at it.

0001-thread_pthread-use-eventfd-under-Linux-for-timer-thr.patch - use eventfd on Linux (9.08 KB) Eric Wong, 08/17/2013 06:40 AM

0001-thread_pthread-use-eventfd-under-Linux-for-timer-thr.patch - PATCH v2 - eventfd_compat should return void (9.07 KB) Eric Wong, 08/17/2013 06:53 AM

History

#1 Updated by Eric Wong 8 months ago

[PATCH] thread_pthread: use eventfd under Linux for timer thread

The timer thread is an ideal use case for eventfd because it is
only used to signal wakeups and not transfer data.

From eventfd(2) manpage:

Applications can use an eventfd file descriptor instead of a pipe (see
pipe(2)) in all cases where a pipe is used simply to signal events.
The kernel overhead of an eventfd file descriptor is much lower than
that of a pipe, and only one file descriptor is required (versus the
two required for a pipe).
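The self-notification pattern the manpage describes can be sketched roughly as follows. This is an illustration only, not code from the attached patch: the helper names (`wakeup_fd_open`, `wakeup_fd_notify`, `wakeup_fd_wait`) are hypothetical, and it assumes Linux 2.6.27 or later for the `EFD_NONBLOCK` and `EFD_CLOEXEC` flags.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper names; they do not come from the patch. */

/* Create a single non-blocking, close-on-exec eventfd (one FD, not two). */
static int wakeup_fd_open(void)
{
    return eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
}

/* Signal a wakeup by adding 1 to the kernel-side 64-bit counter. */
static int wakeup_fd_notify(int fd)
{
    uint64_t one = 1;

    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Wait up to timeout_ms for a wakeup, then drain the counter.
 * Returns the number of coalesced notifications, 0 on timeout, -1 on error.
 */
static int64_t wakeup_fd_wait(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    uint64_t count = 0;
    int n = poll(&pfd, 1, timeout_ms);

    if (n < 0)
        return -1;
    if (n == 0)
        return 0;
    /* Without EFD_SEMAPHORE, read() returns the counter and resets it to 0. */
    if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return -1;
    return (int64_t)count;
}
```

Several notifications sent before the waiter runs collapse into one readable event, which is all a wakeup channel needs since no payload is transferred.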

#2 Updated by Eric Wong 8 months ago

Ignore my first patch; the eventfd_compat function should return void, not int.
Sorry for the confusion.

#3 Updated by Koichi Sasada 8 months ago

(2013/08/16 10:47), normalperson (Eric Wong) wrote:

eventfd is a cheaper alternative to pipe for self-notification (signals) on Linux

I will submit patches in the next few days/weeks unless there are objections
(or somebody else wants to do it sooner). I'd also like to cleanup some of the existing #ifdefs in that area while I'm at it.

Can we see the performance comparison?
If we can see the clear difference, it can be acceptable.

(If we can't see any difference, it only increases the source code
complexity).

--
// SASADA Koichi at atdot dot net

#4 Updated by Eric Wong 8 months ago

SASADA Koichi ko1@atdot.net wrote:

(2013/08/16 10:47), normalperson (Eric Wong) wrote:

eventfd is a cheaper alternative to pipe for self-notification (signals) on Linux

I will submit patches in the next few days/weeks unless there are objections
(or somebody else wants to do it sooner). I'd also like to cleanup some of the existing #ifdefs in that area while I'm at it.

Can we see the performance comparison?
If we can see the clear difference, it can be acceptable.

It's not for speed (signal handling performance should not be a
bottleneck), but halve FD use in userspace and reduce memory use inside
the kernel.

AFAIK, writing to an empty pipe still allocates a 4K page; eventfd avoids
that allocation/deallocation. Since Ruby is CoW/fork-friendly, this
should allow running more Ruby processes on a system.

I also thought my own code had an FD leak when timer_thread_pipe_low was
introduced. Maybe this will reduce confusion for users who lsof Ruby
processes, since there are more pipe users than eventfd users.

(If we can't see any difference, it only increases the source code
complexity).

I've tried to minimize the impact of my patch and keep the eventfd/pipe
difference minimal.

#5 Updated by Motohiro KOSAKI 8 months ago

Hi

On Sat, Aug 17, 2013 at 3:37 PM, Eric Wong normalperson@yhbt.net wrote:

SASADA Koichi ko1@atdot.net wrote:

(2013/08/16 10:47), normalperson (Eric Wong) wrote:

eventfd is a cheaper alternative to pipe for self-notification (signals) on Linux

I will submit patches in the next few days/weeks unless there are objections
(or somebody else wants to do it sooner). I'd also like to cleanup some of the existing #ifdefs in that area while I'm at it.

Can we see the performance comparison?
If we can see the clear difference, it can be acceptable.

It's not for speed (signal handling performance should not be a
bottleneck), but halve FD use in userspace and reduce memory use inside
the kernel.

How much does this increase the maximum number of Ruby processes? Can you measure it?
I bet the difference is very small.

AFAIK, writing to an empty pipe still allocates a 4K page; eventfd avoids
that allocation/deallocation. Since Ruby is CoW/fork-friendly, this
should allow running more Ruby processes on a system.

I also thought my own code had an FD leak when timer_thread_pipe_low was
introduced. Maybe this will reduce confusion for users who lsof Ruby
processes, since there are more pipe users than eventfd users.

Well, that's not a good reason. You said your patch decreases your confusion
but increases the confusion of other eventfd users.

(If we can't see any difference, it only increases the source code
complexity).

I've tried to minimize the impact of my patch and keep the eventfd/pipe
difference minimal.

Anyway, I haven't seen any bugs in your patch. I would like to see a
measurement result.

#6 Updated by Eric Wong 8 months ago

KOSAKI Motohiro kosaki.motohiro@gmail.com wrote:

Hi

On Sat, Aug 17, 2013 at 3:37 PM, Eric Wong normalperson@yhbt.net wrote:

SASADA Koichi ko1@atdot.net wrote:

(2013/08/16 10:47), normalperson (Eric Wong) wrote:

eventfd is a cheaper alternative to pipe for self-notification (signals) on Linux

I will submit patches in the next few days/weeks unless there are objections
(or somebody else wants to do it sooner). I'd also like to cleanup some of the existing #ifdefs in that area while I'm at it.

Can we see the performance comparison?
If we can see the clear difference, it can be acceptable.

It's not for speed (signal handling performance should not be a
bottleneck), but halve FD use in userspace and reduce memory use inside
the kernel.

How much does this increase the maximum number of Ruby processes? Can you measure it?
I bet the difference is very small.

On Linux 3.10 on x86_64, 64-byte L1 cache line size

file->private_data:
sizeof(struct eventfd_ctx) == 48 bytes
sizeof(struct pipe_inode_info) == 136 bytes

So 176 bytes and 2 FDs saved for every Ruby process. Fwiw, I often have
hundreds of (mostly idle) Ruby processes on my systems running random
scripts/daemons.
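The arithmetic behind those figures can be spelled out as below. This is only a sketch: it assumes the process holds two notification channels (the regular timer-thread pipe plus the "low" one mentioned elsewhere in this thread), and the constant and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Sizes quoted above for Linux 3.10 on x86_64. */
enum { EVENTFD_CTX_BYTES = 48, PIPE_INODE_INFO_BYTES = 136 };

/* Assumption: two notification channels per Ruby process. */
enum { CHANNELS = 2 };

/* Each channel swaps one pipe_inode_info for one eventfd_ctx. */
static size_t bytes_saved(void)
{
    return (size_t)CHANNELS * (PIPE_INODE_INFO_BYTES - EVENTFD_CTX_BYTES);
}

/* Each channel drops from two FDs (pipe) to one FD (eventfd). */
static int fds_saved(void)
{
    return CHANNELS * (2 - 1);
}
```

Under that assumption the result is 2 * (136 - 48) = 176 bytes and 2 * (2 - 1) = 2 FDs, matching the numbers in the comment. The 4K page a pipe allocates on write is not counted here.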

I doubt most users will notice the difference. But maybe it will make a
tiny difference somewhere (fewer cache lines touched, smaller select()
footprint).

I don't have a machine to forkbomb with Ruby, but overall size of Ruby
is probably the limiting factor anyways.

AFAIK, writing to an empty pipe still allocates a 4K page; eventfd avoids
that allocation/deallocation. Since Ruby is CoW/fork-friendly, this
should allow running more Ruby processes on a system.

I also thought my own code had an FD leak when timer_thread_pipe_low was
introduced. Maybe this will reduce confusion for users who lsof Ruby
processes, since there are more pipe users than eventfd users.

Well, that's not a good reason. You said your patch decreases your confusion
but increases the confusion of other eventfd users.

I suppose it depends on the user. I don't know of anybody using
eventfd with Ruby right now (but I'll be updating some of my projects to
do so).

(If we can't see any difference, it only increases the source code
complexity).

I've tried to minimize the impact of my patch and keep the eventfd/pipe
difference minimal.

Anyway, I haven't seen any bugs in your patch. I would like to see a
measurement result.

Thanks for looking. Sorry I cannot provide real-world measurement/use
case.

Ideally, we wouldn't even need a timer thread and we could just use
ppoll/pselect. But that would be a very intrusive change (and maybe too
incompatible with C extensions).

#7 Updated by Motohiro KOSAKI 8 months ago

Ideally, we wouldn't even need a timer thread and we could just use
ppoll/pselect. But that would be a very intrusive change (and maybe too
incompatible with C extensions).

Ideally?
A syscall is much slower than the current flag check in the VM loop. That's why
the VM event loop doesn't handle thread runtime expiry directly.

#8 Updated by Eric Wong 8 months ago

KOSAKI Motohiro kosaki.motohiro@gmail.com wrote:

Ideally, we wouldn't even need a timer thread and we could just use
ppoll/pselect. But that would be a very intrusive change (and maybe too
incompatible with C extensions).

Ideally?
A syscall is much slower than the current flag check in the VM loop. That's why
the VM event loop doesn't handle thread runtime expiry directly.

Ah, I forgot about thread runtime expiry. Maybe that's why I gave up
on the idea originally; I had this idea a long while (years?) back.

#9 Updated by Yui NARUSE 7 months ago

  • Status changed from Open to Feedback
  • Target version changed from 2.1.0 to next minor

I'm negative on this because it makes the code more complex unless it brings a performance improvement.
