Updated by normalperson (Eric Wong) over 8 years ago
#1 [ruby-core:81493]

Pull request below for non-"git am" users...

I tried my best to add many comments throughout the code.

I realize this is a lot of new code; and not a typical or common
usage of kqueue or epoll. The kqueue code ended up being very
complicated to support corner cases (see comments in iom_kqueue);
so perhaps the epoll implementation should be easiest-to-understand.

I suggest understanding data structures, first; everything else
will be easier. Please do not hesitate to ask here if you come
across questions or bugs or any comments.

Finally; libkqueue is broken (using epoll on Linux) for corner
cases. I am using a FreeBSD 11.0 VM for kqueue development; but
I may resume working with libkqueue upstream if I have time.
I expect real-world Linux users to be using native epoll,
of course; so no big problems, there.

The following changes since commit d0015e4ac6b812ea1681b1f5fa86fbab52a58960:

Improve performance of implicit type conversion (2017-05-31 12:30:57 +0000)

are available in the git repository at:

git://80x24.org/ruby iom

for you to fetch changes up to 8d6b09d46fcdf6362d6f875347c4790d5cf86401:

auto fiber schedule for rb_wait_for_single_fd and rb_waitpid (2017-06-01 00:07:18 +0000)

Eric Wong (1):
auto fiber schedule for rb_wait_for_single_fd and rb_waitpid

Updated by ko1 (Koichi Sasada) over 8 years ago
#2 [ruby-core:81495]

Thank you for your great work.

summary of this comment

Recently I have been thinking about this feature's "safety" or "dependability".
Because of this issue, I think it is difficult to adopt this feature right now.

Non-auto-fibers

Without this feature, Fiber switching is explicit (Fiber.yield), and in most cases it is easy to keep several operations atomic.

A typical atomic operation is increment. Let's think about it with an example: t = n; ...; n = t+1.

def some_method
  Fiber.yield
end

n = 0
f1 = Fiber.new{
  t = n
  some_method
  n = t + 1
}

f1.resume
n += 1
f1.resume

p n #=> 1 (although two increments are tried)

In this case, the main fiber and fiber f1 both try to increment n, and some_method breaks atomicity because of Fiber.yield. Of course, nobody writes such silly code, and it is easy to check because Fiber.yield is strongly coupled with Fiber operations written by users (basically, libraries don't call Fiber.yield).

auto-fibers

However, auto-fiber switching introduces this kind of danger.

# assume all fibers are auto-scheduling fibers

n = 0
f1 = Fiber.new{
  t = n
  log(t)
  n = t + 1
}

f1.resume # auto-fibers should not call resume
          # but please allow me, this is pseudo-code to describe an issue.
n += 1
f1.resume

p n

If the log() method tries to send a log message over the network, the fiber will switch to other fibers.

Problems are:

- It is difficult to know which operations should be atomic (users write code without checking atomicity).
- It is difficult to find out which methods can switch.
  - Not only the user's own code, but also all library code can switch fibers.
  - This means that we need to check all library code to know that it does not violate atomicity assumptions.
- It introduces non-deterministic behavior (with Fiber.yield, behavior is deterministic and it is easy to reproduce the problem).

These difficulties are the same as with threading. The impact can be smaller than with threading, because threads can switch anywhere and their behavior is very hard to predict, while auto-fibers switch only at blocking operations, especially IO operations.

Consideration

To solve this, we have several choices.

(1) Introduce synchronization mechanisms for auto-fibers

Like Mutex, Queue and so on.
In the Ruby 1.8 era, we had Thread.exclusive to prohibit thread switching.

I don't want to choose this option because it is what I want to avoid in Ruby.

(2) Introduce limitations

The problem "It is difficult to find out which methods can switch" exists because we need to check the whole code. If we can restrict auto-fiber switching, this problem becomes smaller.

(2-1) Introduce Fiber switching methods

Instead of implicit blocking (IO) operations, introduce explicit blocking operations that can switch. We can then check all of the source code with grep.

(2-2) Check context

Permit fiber switching only at permitted places, by block, pragma, and so on.

# auto-fiber: true # <- this file can switch fibers automatically
Fiber.new(auto: true){
  ...
  io.read # can switch
  ...
  something_defined_in_gem # can't switch
  ...
}

I think other languages like Python and JavaScript employ this idea. I need to survey such languages more.

(3) Something else clever

Introducing a debugger is one choice (maybe it is easier than for threading issues).
But we can't avoid trouble (and the trouble will probably be infrequent and non-reproducible).

Another option is to introduce hooks for implementing auto-fibers and provide auto-fibers through gems, so that advanced users who know the above risks can use this feature. But this is not a good idea because we can't provide a good way to write code for many people.

Thoughts?

Updated by ko1 (Koichi Sasada) over 8 years ago
#3 [ruby-core:81498]

Another idea is to change the name of this feature from Fiber to a thread-related name, while the implementation stays based on Fiber. It means resurrecting the Ruby 1.8 green threads without time-based preemption (actually, the implementation is similar).

Personally I want to avoid threading problems, but I show this idea as an option.

off topic: it is similar to hardware multi-threading in CPU architecture: simultaneous multi-threading (SMT, HT for x86) vs. vertical threading (SPARC, switching on cache miss).

Updated by normalperson (Eric Wong) over 8 years ago
#4 [ruby-core:81500]

ko1@atdot.net wrote:

Issue #13618 has been updated by ko1 (Koichi Sasada).

Thank you for your great work.

You're welcome :)

summary of this comment

Recently I have been thinking about this feature's "safety" or "dependability".
Because of this issue, I think it is difficult to adopt this feature right now.

I disagree. I do not recall Ruby 1.8 Threads being a big problem
for Rubyists. Modern Rubyists seem OK using native Threads
("OK", not "great" :)

We can improve APIs (maybe more Queue/SizedQueue, less Mutex).

What auto-Fiber provides is an option to reduce memory usage and
improve scalability without rewriting existing synchronous
codebases (e.g. Rack + middlewares).

In my experience, I think Ruby gained more users during the 1.8 era
when its memory usage was low thanks to green threads; and lost users
as 1.9/2.x memory usage increased (and I guess 3rd-party libs
grew, too).

The safety difference between auto-Fiber and Thread is a minor
point. Lowering memory usage while retaining compatibility with
existing synchronous code is my reason for working on this.

- It is difficult to know which operations should be atomic (users write code without checking atomicity).
- It is difficult to find out which methods can switch.
  - Not only the user's own code, but also all library code can switch fibers.
  - This means that we need to check all library code to know that it does not violate atomicity assumptions.
- It introduces non-deterministic behavior (with Fiber.yield, behavior is deterministic and it is easy to reproduce the problem).

Yes; we will document all switch points in RDoc and NEWS,
of course (maybe write a separate doc/auto-fiber.rdoc)

These difficulties are the same as with threading. The impact
can be smaller than with threading, because threads can switch
anywhere and their behavior is very hard to predict, while
auto-fibers switch only at blocking operations, especially
IO operations.

Right, I think auto-fiber will have some of the same (probably
minor) difficulties as threading. However, I do not believe it
is a big problem since Rubyists should already be used to
threading.

Consideration

To solve this, we have several choices.

(1) Introduce synchronization mechanisms for auto-fibers

Like Mutex, Queue and so on.

Yes, I think Queue/SizedQueue should be able to respect Fiber
scheduling boundaries. Queue/SizedQueue are especially useful
and I plan to implement auto-fiber support for that.

I am not sure about Mutex... (can we defer to Matz for decisions?)

In the Ruby 1.8 era, we had Thread.exclusive to prohibit thread switching.

I don't want to choose this option because it is what I want to avoid in Ruby.

Right.

Maybe Mutex#synchronize can prohibit auto-switch (or, it
will show a warning or raise at auto-switch points).

(2) Introduce limitations

The problem "It is difficult to find out which methods can switch" exists because we need to check the whole code. If we can restrict auto-fiber switching, this problem becomes smaller.

Right now for IO, it is double opt-in:

It requires both Fiber#start and IO#nonblock=true.
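
The IO half of this double opt-in can be sketched with the io/nonblock extension that ships with Ruby. This is only an illustration on a pipe; Fiber#start is the API proposed in this ticket, so it is left out here, and the variable names are made up for the example.

```ruby
require 'io/nonblock'

r, w = IO.pipe

# Opt this descriptor in to non-blocking mode (sets O_NONBLOCK).
r.nonblock = true
reader_opted_in = r.nonblock?

# A descriptor can be opted out again to keep real blocking syscalls.
w.nonblock = false
writer_opted_in = w.nonblock?

p reader_opted_in
p writer_opted_in
```

Under the proposal, only descriptors in the first state would be candidates for auto-switching; the second state keeps ordinary blocking behavior.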

Sidenote:

As a Rubyist who studies the Linux kernel; I consider it
imperative to give Rubyists the choice to make real blocking
syscalls (not the "fake blocking" with auto-fiber/green
threads).

This is because Linux can optimize "wake-one" situations to:
a) give round-robin load distribution across independent processes
b) avoid thundering herd with multiple threads/processes
c) (I forget...)

(sorry I forgot to note this in my original ticket, but it will
be in the final docs)

(2-1) Introduce Fiber switching methods

Instead of implicit blocking (IO) operations, introduce explicit blocking operations that can switch. We can then check all of the source code with grep.

I am against this. Instead, I want it to be easy to port
existing Thread-aware codebases over.

Notice my example test script used net/http from stdlib.

I would like to use existing stdlib (net/*, webrick, drb, ...)
as much as possible without modifications. That means many
existing Ruby libraries can work transparently.

(2-2) Check context

Permit fiber switching only at permitted places, by block, pragma, and so on.

# auto-fiber: true # <- this file can switch fibers automatically
Fiber.new(auto: true){
  ...
  io.read # can switch
  ...
  something_defined_in_gem # can't switch
  ...
}

I think other languages like Python and JavaScript employ this idea. I need to survey such languages more.

I do not like this, either. I admit I am not familiar with
those languages. I think we should strive to make existing
Thread-aware Ruby code work well, and as transparently as possible...

(3) Something else clever

Introducing a debugger is one choice (maybe it is easier than for threading issues).
But we can't avoid trouble (and the trouble will probably be infrequent and non-reproducible).

Adding TracePoint support to help track auto-switches should be done
(honestly I have never used this feature in ruby :x).

And yes, I think native threading bugs are trickier to track down
than auto-Fiber switching. Just remember, today we have native
threading and things are OK. And I think there were more happy
Rubyists in 1.8 days.

Another option is to introduce hooks for implementing auto-fibers and provide auto-fibers through gems, so that advanced users who know the above risks can use this feature. But this is not a good idea because we can't provide a good way to write code for many people.

Thoughts?

Again, no. I am really in favor of making it easy to port
existing Thread-aware code to auto-Fiber.

Again; from my experience; I do not believe many Ruby
programmers had safety problems with 1.8 green threads.

Today we have Rubyists who are already used to 1.9/2.x native
Threads.

The safety improvement is a minor point.

Updated by ko1 (Koichi Sasada) over 8 years ago
#5 [ruby-core:81507]

normalperson (Eric Wong) wrote:

I disagree. I do not recall Ruby 1.8 Threads being a big problem
for Rubyists. Modern Rubyists seem OK using native Threads
("OK", not "great" :)
...
However, I do not believe it
is a big problem since Rubyists should already be used to
threading.
...
And yes, I think native threading bugs are trickier to track down
than auto-Fiber switching. Just remember, today we have native
threading and things are OK. And I think there were more happy
Rubyists in 1.8 days.
...
Again; from my experience; I do not believe many Ruby
programmers had safety problems with 1.8 green threads.

Today we have Rubyists who are already used to 1.9/2.x native
Threads.

The safety improvement is a minor point.

My opinion is the opposite. I think "threading is too hard for human beings to use correctly" or "Rubyists shouldn't have to care about threading difficulties". I agree my opinion is extreme, and many "advanced" programmers like Eric can write correct threaded programs. But I believe most (many? some? a few?) Ruby programmers (including me) cannot write correct code.

(In addition: I have heard some advanced programmers say "people can write it". I doubt that, because it is a kind of survivorship bias.)

(Recently I fixed a threading problem in RubyGems; it was difficult to reproduce.)

I often use this metaphor: it is like GC strategy. If people can manage object lifetimes themselves, it is faster than using GC (in some cases; in other cases GC is faster than manual memory management). However, we choose GC because we want to concentrate on writing application code.

I agree auto-fibers are safer than threads. In my mind:

danger <-> safe (this is my opinion)

   parallel threads (JRuby, ...) > concurrent threads (MRI) >>
   auto-fibers (full-auto)       > auto-fiber (restricted) >>
   Guild                         > single thread

But auto-fibers can introduce accidents; they should not be frequent, and they are difficult to reproduce. This means they are difficult to debug.

Ruby has many pitfalls for shooting ourselves in the foot (meta-programming features, open classes and so on), but they are deterministic (in most cases).

I think this is how to evaluate the risk of such danger.

C/C++/Java/... (and many imperative languages) choose performance (people should write correct code).

Some languages try to avoid this kind of difficulty. Rust chooses threading but introduces a harness through its type system. Clojure chooses STM to prevent atomicity violations.

I agree threading and auto-fibers are easy to use. Maybe in most cases there is no problem (especially with auto-fibers). But they can cause accidents in a few cases, and those will be difficult to find.

I hope Ruby is a safe language because I don't want to be bothered by such difficulties. This is my wish. I agree there are other wishes like Eric's, and I respect them.

Other than this point, I agree with all of your opinions. If I could believe "all Rubyists can write correct threaded programs", your points would make sense to me.

(other points)

Yes; we will document all switch points in RDoc and NEWS,
of course (maybe write a separate doc/auto-fiber.rdoc)

My point is, if method "foo" is a switching point, then any method that can call "foo" (bar, and baz, the caller of bar, ...) should be noted too. Maybe it is impossible to complete because of Ruby's dynamic nature.

I would like to use existing stdlib (net/*, webrick, drb, ...)
as much as possible without modifications. That means many
existing Ruby libraries can work transparently.

I understand your point.

Updated by normalperson (Eric Wong) over 8 years ago
#6 [ruby-core:81514]

ko1@atdot.net wrote:

My opinion is the opposite. I think "threading is too hard for human beings to use correctly" or "Rubyists shouldn't have to care about threading difficulties". I agree my opinion is extreme, and many "advanced" programmers like Eric can write correct threaded programs. But I believe most (many? some? a few?) Ruby programmers (including me) cannot write correct code.

I do not believe I can write correct code of any type, actually.
Everything I write, even trivial single-threaded scripts, has bugs.

On the other hand, my likelihood of introducing bugs seems
nearly identical across any environment and programming model.
However, having less/simpler code (and fewer dependencies) seems
to result in fewer bugs, in my experience.

(In addition: I have heard some advanced programmers say "people can write it". I doubt that, because it is a kind of survivorship bias.)

Yes.

(Recently I fixed a threading problem in RubyGems; it was difficult to reproduce.)

I often use this metaphor: it is like GC strategy. If people can manage object lifetimes themselves, it is faster than using GC (in some cases; in other cases GC is faster than manual memory management). However, we choose GC because we want to concentrate on writing application code.

Right. However, it seems choosing "easier" strategies means
less focus on overall design, leading to more problems down the line.

Since around 2010; I believe unicorn caused major, irreparable
damage to Rack ecosystem by promoting single-threaded design and
having a SIGKILL timeout feature. unicorn made Rubyists stop
caring to fix concurrency bugs and do proper timeouts.

Nowadays Rack apps are both too buggy AND use too much memory :<

I know some people disagree with my assessment of unicorn;
but I prefer to hate everything I've done: it's easier to
find improvements that way :)

I agree auto-fibers are safer than threads. In my mind:

danger <-> safe (this is my opinion)

   parallel threads (JRuby, ...) > concurrent threads (MRI) >>
   auto-fibers (full-auto)       > auto-fiber (restricted) >>
   Guild                         > single thread

Agree. So maybe we can design API for "auto-fiber (restricted)"?

But auto-fibers can introduce accidents; they should not be frequent, and they are difficult to reproduce. This means they are difficult to debug.

Ruby has many pitfalls for shooting ourselves in the foot (meta-programming features, open classes and so on), but they are deterministic (in most cases).

Yes. I think these (along with too much code + dependencies) cause
more problems than concurrency bugs.

normalperson (Eric Wong) wrote:

Yes; we will document all switch points in RDoc and NEWS,
of course (maybe write a separate doc/auto-fiber.rdoc)

My point is, if method "foo" is a switching point, then any method that can call "foo" (bar, and baz, the caller of bar, ...) should be noted too. Maybe it is impossible to complete because of Ruby's dynamic nature.

Right. Maybe that is a lot of documentation...

What if the API were the opposite of Thread.exclusive/Mutex#synchronize?
Perhaps:

Fiber.new do
  Fiber.auto do
    # enable auto-fiber inside this block
  end
  # disable auto-fiber again
end

Maybe Fiber.exclusive can disable Fiber.auto temporarily:

Fiber.new do
  Fiber.auto do
    # enable auto-fiber
    Fiber.exclusive do
      # temporarily disable auto-fiber
    end
    # enable auto-fiber again
    ...
  end
end

Fiber.auto/Fiber.exclusive would be no-ops unless inside
a Fiber.new block...

But maybe that is too much code and nesting levels;
so I still like Fiber.start more.

I would like to use existing stdlib (net/*, webrick, drb, ...)
as much as possible without modifications. That means many
existing Ruby libraries can work transparently.

I understand your point.

Thanks; that is my biggest wish for this feature.

Anyways, I will let matz, you and others deal with final API
decisions.

Updated by Eregon (Benoit Daloze) over 8 years ago
#7 [ruby-core:81537]

This is interesting work, I am curious to see how it will work out.

This looks similar to what Crystal has [1].

Does Kernel#puts potentially yield to another auto-Fiber?
I think that would be very counter-intuitive, but it would be tempting if $stdout is a pipe or socket.

Will a read from a socket always yield to the next fiber,
or can it proceed immediately if the socket is ready?
If not, then scheduling is non-deterministic,
even when communicating with a deterministic server.

It seems that the Crystal approach has some issues for terminating correctly.
However, if I understand correctly, in your model there is an implicit wait for all auto-fibers until termination at the program end?
This makes more sense to me for cooperative threading.

The description from Crystal mentions:
"Crystal uses green threads, called fibers, to achieve concurrency.
Fibers communicate with each other using channels, as in Go or Clojure, without having to turn to shared memory or locks."
The part about shared memory and locks is a lie though, these fibers do share memory and
atomicity is broken at every possible call that could invoke some IO-like operation.

This is also true for auto-fibers, which is a form of shared-memory concurrency,
and every yielding point will effectively need to assume
any other auto-fiber could have run in between and modified some global state
(unless the yielding order is very clear such as in a small program,
but in larger programs it becomes extremely difficult to know the fiber schedule).

[1] https://crystal-lang.org/docs/guides/concurrency.html

Updated by normalperson (Eric Wong) over 8 years ago
#8 [ruby-core:81543]

eregontp@gmail.com wrote:

This is interesting work, I am curious to see how it will work out.

Thanks for the interest.

This looks similar to what Crystal has [1].

Right. But actually I would use MRI 1.8 green threads as
a reference point. The key difference between this and 1.8
is this is tickless (or timer-less); so more predictable.

To me, there are only two types of threads available to userland:

- OS kernel knows about them (native thread)
- OS kernel has no idea about them (fiber/green thread/goroutine)

Does Kernel#puts potentially yield to another auto-Fiber?
I think that would be very counter-intuitive, but it would be tempting if $stdout is a pipe or socket.

Yes, potentially. However, it requires setting IO#nonblock=true
on $stdout (or whatever $> points to), which is rare...

Non-blocking stdout is rare since it likely causes headaches if using
system() to run other programs or having 3rd-party libs which
write to stdout.

Will a read from a socket always yield to the next fiber,
or can it proceed immediately if the socket is ready?

It only yields on EAGAIN/EWOULDBLOCK when rb_wait_for_single_fd
is called. It will never yield if there is always data.
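
The condition described here can be observed from plain Ruby with IO#read_nonblock. The sketch below is not the patch's C code; it only shows, on a pipe, the two cases a read can hit: no data yet (where EAGAIN/EWOULDBLOCK would occur, reported as :wait_readable) and data already available (where nothing needs to wait).

```ruby
r, w = IO.pipe

# Nothing has been written yet, so a non-blocking read cannot proceed.
# With exception: false this is reported as :wait_readable instead of
# raising an EAGAIN/EWOULDBLOCK error. This is the only case in which
# an auto-Fiber would yield.
first = r.read_nonblock(16, exception: false)

# Once data is available the same call returns immediately with the data,
# so there is no reason to wait or switch.
w.write("hi")
second = r.read_nonblock(16, exception: false)

p first   # => :wait_readable
p second  # => "hi"
```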

AFAIK, Ruby io.c+ext/socket/* does not use rb_wait_for_single_fd
until it encounters EAGAIN/EWOULDBLOCK. (I would consider it a
performance bug if it did)

If not, then scheduling is non-deterministic,
even when communicating with a deterministic server.

(sorry, double negatives are confusing for me to parse and use).

If a socket can always read/write without encountering
EAGAIN/EWOULDBLOCK, the Fiber may run forever. This will starve
other Fibers, so it is up to the programmer to yield explicitly.

We should add Fiber.pass (like Thread.pass) to aid users with
this. This will protect HTTP/1.1 servers from DoS via request
pipelining.
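
Fiber.pass does not exist; it is only suggested above. As a rough sketch of the idea, the example below uses today's explicit Fiber.yield as a stand-in and a hand-written round-robin loop in place of the proposed scheduler. The fiber names and the "yield every second item" policy are assumptions made for illustration.

```ruby
order = []
finished = false

# A fiber whose input is always ready: without an explicit yield it would
# run all six iterations before anything else gets a turn.
busy = Fiber.new do
  6.times do |i|
    order << "busy #{i}"
    Fiber.yield if i.odd?   # stand-in for the proposed Fiber.pass
  end
  finished = true
end

other = Fiber.new do
  loop do
    order << "other"
    Fiber.yield
  end
end

# Hand-written round-robin loop standing in for the auto-Fiber scheduler.
until finished
  busy.resume
  other.resume
end

p order
```

Because the busy fiber gives up control after every second item, the other fiber gets a turn between batches instead of being starved.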

So I guess scheduling is non-deterministic; but actual use
can be deterministic since the programmer should know when
to yield/pass explicitly?

It seems that the Crystal approach has some issues for terminating correctly.
However, if I understand correctly, in your model there is an implicit wait for all auto-fibers until termination at the program end?

This makes more sense to me for cooperative threading.

No implicit waiting for termination. Fibers can be forgotten
and dropped at program end; just like threads. I think this is
a necessary condition for supporting fork or exec.

Users must use Fiber#join or Fiber#value to ensure termination;
(same as Thread#join / Thread#value)
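
Fiber#join and Fiber#value are proposed names in this thread, so the sketch below shows only the existing Thread methods they are modeled on. The assumption is that the Fiber versions would behave the same way.

```ruby
worker = Thread.new do
  sleep 0.01   # pretend to do some work
  21 * 2
end

# join waits for termination and returns the Thread itself.
joined = worker.join

# value also waits (a no-op here) and returns the block's result.
result = worker.value

p result  # => 42
```

Without one of these calls, the program may exit while the worker is still running, which matches the "forgotten and dropped" behavior described above.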

The description from Crystal mentions:
"Crystal uses green threads, called fibers, to achieve concurrency.
Fibers communicate with each other using channels, as in Go or Clojure, without having to turn to shared memory or locks."
The part about shared memory and locks is a lie though, these fibers do share memory and
atomicity is broken at every possible call that could invoke some IO-like operation.

This is also true for auto-fibers, which is a form of shared-memory concurrency,
and every yielding point will effectively need to assume
any other auto-fiber could have run in between and modified some global state
(unless the yielding order is very clear such as in a small program,
but in larger programs it becomes extremely difficult to know the fiber schedule).

[1] https://crystal-lang.org/docs/guides/concurrency.html

Yes. Programmers must be careful about shared memory; but
ruby-core can promote+improve APIs like Queue/SizedQueue to use
as communications channels. This should reduce the use of (and
dangers associated with) shared memory.
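
As a small illustration of the channel style mentioned above, the sketch below has workers send messages through a Queue instead of updating a shared counter. It uses native Threads, since auto-Fiber support for Queue is only planned in this thread; the worker count and message values are made up for the example.

```ruby
queue = Queue.new

# Each worker sends one increment as a message; nobody touches a shared
# counter, so there is no read-modify-write sequence to interrupt.
workers = 2.times.map do
  Thread.new { queue << 1 }
end
workers.each(&:join)
queue.close   # no more messages; pop returns nil once the queue is drained

# A single consumer owns the total.
total = 0
while (delta = queue.pop)
  total += delta
end

p total  # => 2
```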

Updated by ioquatix (Samuel Williams) over 8 years ago
#9 [ruby-core:81631]

To a certain extent, things discussed here are already implemented in

https://github.com/socketry/async

and

https://github.com/socketry/async-io

What are the benefits of having this implemented in core Ruby as opposed to a gem which can be versioned independently and works with all Rubies 2.x, including JRuby and (in theory) Rubinius?

Why not focus on making core part of Ruby fast, and providing the appropriate hooks, rather than expanding her scope and complexity, in a way which has a proven track record for frustration (poorly designed stdlib which can't be fixed or improved due to breaking backwards compatibility).

Updated by normalperson (Eric Wong) over 8 years ago
#10 [ruby-core:81643]

samuel@oriontransfer.org wrote:

To a certain extent, things discussed here are already implemented in

https://github.com/socketry/async

and

https://github.com/socketry/async-io

What are the benefits of having this implemented in core Ruby as opposed to a gem which can be versioned independently and works with all Rubies 2.x, including JRuby and (in theory) Rubinius?

Neverblock basically tried the same thing with EM and never took
off. I don't know much about getting software adopted or
popularized, but maybe being in core has a better chance of
gaining adoption and being sustainable.

Being in core provides greater compatibility with external
libraries which are not aware of existing event loops. So
3rd-party DB adapters (e.g. mysql2) will be able to take advantage
of these changes transparently if they use rb_wait_for_single_fd
(and I will add a hook for rb_thread_fd_select, too).

It will also be easily possible to get existing primitives like
Queue/SizedQueue to work with Fibers out-of-the-box. Maybe even
Mutex+ConditionVariable, if approved.

One current example is being able to hook rb_waitpid: any
existing code using trap(:CHLD) continues to work transparently
even if using auto-Fiber for I/O; but auto-Fiber users can also
rely on "blocking" Process.waitpid if they desire.
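
For reference, this is the ordinary "blocking" Process.waitpid usage referred to here, written against current Ruby. Whether the call lets other auto-Fibers run while waiting depends on the patch under discussion and is not shown; the child command is just a placeholder.

```ruby
require 'rbconfig'

# Spawn a child Ruby process that exits with status 3.
pid = Process.spawn(RbConfig.ruby, '-e', 'exit 3')

# Blocks the caller until the child exits and returns the reaped pid.
reaped = Process.waitpid(pid)
status = $?.exitstatus

p status  # => 3
```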

Anyways, accepting any of this into core is not my decision to
make. I will only provide implementation and advice/hints.

A small rant about existing event loops:

Most existing event loop implementations (libev, libevent, EM)
seem stuck in single-thread mentality from legacy select/poll
APIs. They handle MT by having one event loop per-thread;
instead of taking advantage of the fact that modern primitives
like kqueue and epoll are both MT-friendly queues which are
populated by threads running inside the kernel.

In a world where memory and CPU are your only constraints,
you can run one (native thread|process) per-core and thus one
event loop per-core. This is perfectly fine for things like
memcached which are only memory+CPU bound.

That falls down once you have other constraints, such as
physical disks to deal with. I maintain software which reads
and writes simultaneously to dozens, if not hundreds of
rotational disks (JBOD) in a single process. With current APIs
on GNU/Linux and FreeBSD, the only way I've found(*) to deal
with this effectively is to use >=1 pthread per disk.

(*) Various AIO implementations are lacking, too. They
pessimize the hot cache case, lack open/unlink/rename/stat
equivalents, and userland implementations tend to not be
mountpoint/device-aware. Native AIO requires O_DIRECT in
Linux, so no page cache at all :<

Why not focus on making core part of Ruby fast, and providing the appropriate hooks, rather than expanding her scope and complexity, in a way which has a proven track record for frustration (poorly designed stdlib which can't be fixed or improved due to breaking backwards compatibility).

I think core and stdlib can evolve best if done together.

Fiber has been in production Ruby for nearly a decade now, with
only minor improvements, and seems largely ignored in the wider
scheme of things. I guess they're not that useful in practice.

And just because we're adding new features does not mean we're
not also finding places to optimize our code.
Mutex/Queue/SizedQueue/ConditionVariable are already faster in
trunk because of preparation work to make them auto-Fiber aware:

https://bugs.ruby-lang.org/issues/13517
https://bugs.ruby-lang.org/issues/13552

Why can't stdlib be fixed? Just because we need to support old
behaviors and APIs does not mean we cannot improve things.

Having a solid stdlib is a great way to improve core and
vice-versa, and helps us bridge the gap for end user code.

Finally, keep in mind there are Rubyists who are not
enthusiastic users willing to explore, they're the
"distro users". It'll be easier for them to pick up Ruby
and use Ruby apps if stdlib were better.

Despite using Perl more than Ruby, I'm a conservative "distro
user" myself with Perl. So I'm hesitant to use or depend on
stuff which isn't packaged by distros, especially when it comes
to end user convenience (some who do not even know or care about
what a programming language is).

So yes, I still write Perl 5.8-compatible code, and still
support legacy CentOS 5.x and 6.x systems.

Updated by ioquatix (Samuel Williams) over 8 years ago
#11 [ruby-core:81672]

I appreciate your detailed response; it was interesting.

Do Ruby's File.read and File.stat (and others) release the GVL? Otherwise, the performance benefit of multiple threads in this specific case is irrelevant. While I agree with you when writing high performance servers in C/C++, it might not be directly relevant to Ruby as it currently stands.

Updated by normalperson (Eric Wong) over 8 years ago
#12 [ruby-core:81674]

samuel@oriontransfer.org wrote:

Do Ruby's File.read and File.stat (and others) release the GVL? Otherwise, the performance benefit of multiple threads in this specific case is irrelevant. While I agree with you when writing high performance servers in C/C++, it might not be directly relevant to Ruby as it currently stands.

File.read does. File.stat does not, at the moment. I tried
it a while back but the GVL is expensive to release for hot
cache situations(*).

File.open, IO.copy_stream, IO#write, IO#read, readpartial, sysread,
syswrite all release GVL, too.

In particular, IO.copy_stream is great for large, parallel
transfers to/from high-latency storage.
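
A minimal sketch of that pattern, assuming small temporary files in place of real high-latency storage: each copy runs in its own Thread, and since IO.copy_stream releases the GVL (as stated above) the transfers can overlap. File names and sizes are arbitrary.

```ruby
require 'tempfile'

# Two small source files standing in for large files on slow storage.
sources = 2.times.map do |i|
  f = Tempfile.new("src#{i}")
  f.write("x" * 4096)
  f.flush
  f
end
dests = 2.times.map { |i| Tempfile.new("dst#{i}") }

# One thread per transfer; IO.copy_stream returns the number of bytes copied.
threads = sources.zip(dests).map do |src, dst|
  Thread.new { IO.copy_stream(src.path, dst.path) }
end
copied = threads.map(&:value)

p copied  # => [4096, 4096]
```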

(*) the cost of GVL for quick ops is a big reason I want to get rid of it

But yeah, maybe the small regression from releasing GVL is
acceptable for now with File.stat. It's better than getting
stuck on NFS or slow disks.

File.rename, File.unlink, most Dir methods all have the same
problem with slow storage, too. We already pay the price
for small regressions when releasing GVL in current cases,
so maybe those can be GVL release points.

Updated by ioquatix (Samuel Williams) over 8 years ago
#13 [ruby-core:81687]

Thanks for your detailed reply. It's impressive and useful that you have such a good knowledge of these issues.

I spent some time just thinking about this issue, and how this feature tries to solve the problem in Ruby.

On the one hand, I'm fundamentally opposed to increasing the surface area of Ruby when it could be done by writing a gem. This has a massive upstream cost, affecting both JRuby and Rubinius. While I appreciate what you are saying w.r.t. maximising usage, I feel like building this into Ruby will cause stagnation of progress long term - one solution for all problems isn't always ideal. Seeing initiatives like stdgems.org only reinforces how I feel about this.

Generally speaking - I really appreciate the work that's been done here. I also feel like you've reinvented nio4r, async and a bunch of other stuff, at a very low level, without as much testing, compatibility, etc.

Ideally, we could move all socket related code into a gem - perhaps that's already on the cards e.g. stdgems. Once that's done, fixing issues like exceptions: false would be easier since it can be versioned.

I was thinking about how we could expose this to Ruby - and ideally, I think we should add two functions:

IO.wait_for_single_fd and IO.wait_for_pid. The C functions rb_wait_for_single_fd and rb_waitpid would invoke these functions, and these functions would implement the current logic of the current C functions. It probably makes sense to think in more detail how these functions should work - e.g. wait_for_multiple_fds (or select), or something more elaborate.

Then, we could allow things like async and auto-fibers to extend Ruby's IO system to provide a policy for blocking IO. auto-fibers could be implemented as a gem with a C extension.

What do you think?
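A rough sketch of the hook idea above, under heavy assumptions: the IOPolicy module, its handler accessor, and the Ruby-level wait_for_single_fd shown here are all hypothetical and do not exist in Ruby. The default handler simply blocks in IO.select; a library such as async could install its own handler.

```ruby
# Hypothetical: IOPolicy and everything in it is invented for illustration.
module IOPolicy
  # Default policy: block the calling thread in IO.select.
  DEFAULT = lambda do |io, mode, timeout|
    readers = mode == :read ? [io] : nil
    writers = mode == :write ? [io] : nil
    !IO.select(readers, writers, nil, timeout).nil?
  end

  class << self
    attr_writer :handler

    def handler
      @handler || DEFAULT
    end

    # Stand-in for a Ruby-visible counterpart of rb_wait_for_single_fd.
    def wait_for_single_fd(io, mode, timeout = nil)
      handler.call(io, mode, timeout)
    end
  end
end

# A library installs its own policy. This one only records the call and
# delegates; a real reactor would park the current fiber here instead.
calls = []
IOPolicy.handler = lambda do |io, mode, timeout|
  calls << mode
  IOPolicy::DEFAULT.call(io, mode, timeout)
end

r, w = IO.pipe
w.write("x")
ready = IOPolicy.wait_for_single_fd(r, :read, 1)
```

The point of the design is that the blocking policy becomes replaceable without touching the callers.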

Updated by normalperson (Eric Wong) over 8 years ago
#14 [ruby-core:81695]

samuel@oriontransfer.org wrote:

Thanks for your detailed reply. It's impressive and useful
that you have such a good knowledge of these issues.

No problem.

I spent some time just thinking about this issue, and how this
feature tries to solve the problem in Ruby.

On the one hand, I'm fundamentally opposed to increasing the
surface area of Ruby when it could be done by writing a gem.
This has a massive upstream cost, affecting both JRuby and
Rubinius. While I appreciate what you are saying w.r.t.
maximising usage, I feel like building this into Ruby will
cause stagnation of progress long term - one solution for all
problems isn't always ideal. Seeing initiatives like
stdgems.org only reinforces how I feel about this.

I understood it was already decided by matz and ko1 to
do something along the lines of auto-Fiber. Though I can't find
ko1's original message in the archives, it's mostly quoted in
my reply to him:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/80531

I should note some languages like Go, Erlang, Haskell, and the
afore-mentioned Crystal all have lightweight threading along
these lines in the core language.

In their current state, Fibers are much less useful than the
equivalents in those languages; while native Threads are too
expensive. Something in between Fibers and Threads seems
desirable; maybe we can give auto-Fiber another (short) name;
but I'm not sure it's necessary.

I was also influenced to explore lightweight threading in a
rack-devel thread and the responses James Tucker wrote to me:

Subject: big responses to slow clients: Rack vs PSGI

It's somewhere in https://groups.google.com/group/rack-devel but
that requires JS; so I can't view or link to it using w3m :<

Generally speaking - I really appreciate the work that's been
done here. I also feel like you've reinvented nio4r, async and
a bunch of other stuff, at a very low level, without as much
testing, compatibility, etc.

That's a fair point about less testing and compatibility.
But, I think there is more code using normal Ruby stdlib
that can automatically take advantage of these changes
so we'll be able to nail down any problems quickly.

On a technical level, I consider the design of libev (used by
nio4r and async) too limited in that it does not take advantage
of thread-safety baked into kqueue and epoll. Thinking in terms
of "events vs. threads" is too limiting. As I've said before;
combining them is advantageous because both have their uses.
kqueue is a thread-friendly queue, so is epoll.

This feels like the microkernel vs monolithic kernel debate,
too. On one level, isolation and compartmentalization provided
by micro-kernels is appealing; but the ease-of-development of a
monolith allowed Linux to become the kernel for nearly
everything, from tiny IoT devices to giant supercomputers.

And that doesn't preclude things like loadable modules and FUSE
for userspace filesystems from being useful, despite core
filesystem drivers being bundled with Linux. So I think async
can still be supported as an alternative for Ruby; but the
bundled implementation can benefit more from tighter integration
into the core.

A more recent example might be git; which included high-level
non-essential "porcelain" tools early on in addition to the core
"plumbing". Initially, it was intended that separately
maintained wrappers such as "cogito", would implement the
porcelain UI bits and git would remain low-level plumbing. That
ended up making both development and usage more complicated.
Eventually git swallowed up most of the cogito functionality and
cogito was abandoned.

git also ended up with bundled functionality that would've been
separately packaged in other VCSes, including import/export
tools for email, CVS, SVN, etc.

The most relevant example from git might be the bundling of
libxdiff in git, allowing optimizations and tweaks not possible
with an external diff. However, GIT_EXTERNAL_DIFF still remains
supported for less-common use cases.

On a non-technical level:

Finally, this (ruby-core) is one of the few places I can still
contribute to in the Ruby world. All other relevant Ruby
projects require running non-Free software (including JS) and
having to accept Terms-of-Service set by a corporation.

Fwiw, I agree with Rubinius philosophy of implementing more of
Ruby in Ruby and would rather contribute to that; but the above
is a huge factor in why I went on to work on C Ruby, instead.
(the other major factor is I strongly prefer C to C++).

Ideally, we could move all socket related code into a gem -
perhaps that's already on the cards e.g. stdgems. Once that's
done, fixing issues like exceptions: false would be easier
since it can be versioned.

Maybe that'll be done, too, but not my call.
But what about IO.pipe, backtick, and IO.popen?

I was thinking about how we could expose this to Ruby - and
ideally, I think we should add two functions:

IO.wait_for_single_fd and IO.wait_for_pid. The C functions
rb_wait_for_single_fd and rb_waitpid would invoke these
functions, and these functions would implement the current
logic of the current C functions. It probably makes sense to
think in more detail how these functions should work - e.g.
wait_for_multiple_fds (or select), or something more
elaborate.

Maybe.. I guess we already have IO#wait_*able in io/wait; and
Process.wait/IO.select is already possible to override and that
would have the same effect. We'd also have to expose the
optional read/write buffering + encoding conversion and make
that accessible to pure Ruby.
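For reference, a small sketch of the existing io/wait API mentioned above, assuming a pipe as the descriptor to wait on. IO#wait_readable returns a falsy value on timeout and a truthy value once data is available:

```ruby
require 'io/wait'

r, w = IO.pipe

# Nothing has been written yet, so this times out.
before = r.wait_readable(0.01)

w.write("x")

# Now the pipe has data, so the wait succeeds.
after = r.wait_readable(1)
```

Since this is an ordinary Ruby method, a library can already redefine it; the buffering and encoding layers are the parts that are not reachable from pure Ruby.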

It would make C Ruby feel closer to Rubinius and that would be
nice :) I'm not sure how feasible it would be to introduce
more Ruby-visible APIs to implement this.

And I think exposing more APIs to handle FDs directly is a
mistake in the presence of native threads. My proposed C API
prefers "int *fd" and "rb_io_t" to deal with close notification
handling. Multithreaded programs recycle FDs frequently and
internal APIs need to be prepared to deal with that.

The implementation I proposed also takes advantage of some
C-only optimizations such as reading/writing to memory across
Fiber stack boundaries: something which cannot be done with
higher-level APIs. Similar optimizations already landed for
thread_sync.c (Mutex/Queue) as well as IO#close in trunk.

Again, designing user-visible APIs is the most difficult part,
and ruby-core has to think hardest about long-term support and
consequences.

So the difficulty of changing/adding APIs is:

1) internal C API (easiest)
2) public C API (difficult)
3) Ruby API (most difficult)

So, I've mainly done 1) and made minimal additions to 3).
Only changes to 2) are to internal behavior, so use from C
extensions remains the same.

Then, we could allow things like async and auto-fibers to
extend Ruby's IO system to provide a policy for blocking IO.
auto-fibers could be implemented as a gem with a C
extension.

What do you think?

I guess this is meant for matz and ko1.

We could actually have that today; and I guess you already have
that with async. All the IO methods are well-documented and
you can even ignore/override the existing IO buffering if you
override all the methods by monkey patching core classes.
Heck, you may even go as far as to never allocate rb_io_t
if you override IO.open/IO.pipe/*Socket.new/... and replace
them with your own class.

What I think is (or at least ought to be) irrelevant.

I only give matz and ko1 another option to choose from. We can wait
for matz and ko1 to decide what to do, maybe they'll discuss
this at: https://bugs.ruby-lang.org/projects/ruby/wiki/DevelopersMeeting20170616Japan
I certainly won't attend meetings or try to influence anybody
using anything besides plain-text messages, here.

Updated by ioquatix (Samuel Williams) over 8 years ago
#15 [ruby-core:81721]

Ruby Fibers as they currently stand are perfect and making them more complex is a mistake IMHO.

Let's be clear on this: auto-fibers are really just Fibers that yield when you call a blocking operation. It's as if you are rewriting the blocking function to call Fiber.yield.. and as you have implemented by overriding rb_wait_for_single_fd which invokes something to resume the fiber when the blocking function is done. This is exactly what async does, but it does it the only way currently possible - by wrapping around _nonblock methods. It's the reverse of what your proposed method does - by handling rb_wait_for_single_fd. Because I can't access that method from async without writing C, my choice is limited. But, if it was available, async could use it successfully.
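A minimal sketch of the wrapping approach described above, assuming a toy reactor built on read_nonblock, Fiber.yield and IO.select. ToyReactor and its methods are hypothetical; this is neither the async API nor the proposed patch, only the general shape of the technique:

```ruby
# Hypothetical toy reactor, for illustration only.
class ToyReactor
  def initialize
    @waiting = {}   # IO => Fiber parked until that IO is readable
    @current = nil  # fiber currently being run by the reactor
  end

  # Start a new fiber and run it until it finishes or parks itself.
  def spawn(&block)
    switch_to(Fiber.new(&block))
  end

  # Looks sequential to the caller, but yields instead of blocking.
  def read(io, maxlen = 4096)
    loop do
      result = io.read_nonblock(maxlen, exception: false)
      return result unless result == :wait_readable
      @waiting[io] = @current
      Fiber.yield
    end
  end

  # Resume parked fibers as their IOs become readable.
  def run
    until @waiting.empty?
      ready, = IO.select(@waiting.keys)
      ready.each { |io| switch_to(@waiting.delete(io)) }
    end
  end

  private

  def switch_to(fiber)
    @current = fiber
    fiber.resume
  end
end

r, w = IO.pipe
reactor = ToyReactor.new
log = []

reactor.spawn { log << reactor.read(r) }  # parks: the pipe is empty
log << :spawned
w.write("hello")
reactor.run                               # resumes the parked fiber
```

The fiber body calls reactor.read as if it were a blocking read; the yield and resume happen inside the wrapper, which is the part a hook on rb_wait_for_single_fd would replace.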

I appreciate what you said about multi-thread multi-fiber execution using your proposed reactor design. I think it's good and it's probably better than libev. It's excellent that you have thought about how to solve these problems and I admire it. However, in my experience, libev is fast enough and n-m concurrency model is fast enough for Ruby. Until Ruby is several orders of magnitude faster, it won't make much difference, except perhaps a tiny bit of latency, but there are benefits to keeping a single request on a single thread or process - you can avoid having to deal with locking and other synchronisation primitives in some cases, e.g. caches. So, there are tangible benefits to using, say, m-process n-fibers vs n-fibers/m-threads model. Ruby has never really suited multi-threaded model unfortunately.

Just to be clear: I'm more interested in semantics than implementation. Get the semantics right and the correct implementation will follow. I see a lot of work done here on an implementation (which is awesome and it looks good), but I'm not completely clear that the semantics are really sound.

In contrast, Async is all about getting the right semantics and finding the implementation that suits.

Updated by normalperson (Eric Wong) over 8 years ago
#16 [ruby-core:81732]

samuel@oriontransfer.org wrote:

I appreciate what you said about multi-thread multi-fiber
execution using your proposed reactor design. I think it's
good and it's probably better than libev. It's excellent that
you have thought about how to solve these problems and I
admire it. However, in my experience, libev is fast enough and
n-m concurrency model is fast enough for Ruby. Until Ruby is
several orders of magnitude faster, it won't make much
difference, except perhaps a tiny bit of latency, but there
are benefits to keeping a single request on a single thread or
process - you can avoid having to deal with locking and other
synchronisation primitives in some cases, e.g. caches. So,
there are tangible benefits to using, say, m-process n-fibers
vs n-fibers/m-threads model. Ruby has never really suited
multi-threaded model unfortunately.

Just one correction; auto-Fiber does not migrate fibers or
migrate userspace(*) I/O operations across native threads at the
moment. You might be confusing this with my other
non-Fiber-using server designs which do migrate I/O operations
across threads.

For auto-fiber, there's minimal locking requirements even if we
get rid of GVL. It relies on locking already done by the
kernel; kqueue will require extra locking in the corner case
where read and write filters are both installed for an FD.

(*) Of course, Linux kernel soft IRQ handlers can migrate work
across cores in the background.

Just to be clear: I'm more interested in semantics than
implementation. Get the semantics right and the correct
implementation will follow. I see a lot of work done here on
an implementation (which is awesome and it looks good), but
I'm not completely clear that the semantics are really sound.

Anyways, it looks like matz is inclined to accept it; but ko1
wants some semantic tweaks with the API (but I'm not sure
what/how, exactly).

https://docs.google.com/document/d/1z19pKt8jlpiEUR3RnWWBCfs3OR_hbiAZMwpQ6ZTllP0/pub

(I've only viewed it with w3m, no idea if I'm missing anything
due to lack of JS)

Updated by normalperson (Eric Wong) over 8 years ago
#17 [ruby-core:81826]

Updated patch against r59201:
https://80x24.org/spew/20170629043509.14939-1-e@80x24.org/raw

matz/ko1: any idea on what changes to the Ruby API you guys want?

Anyways, I will make IO.select / rb_thread_fd_select auto-fiber aware sometime soonish...

Updated by ko1 (Koichi Sasada) over 8 years ago
#18 [ruby-core:82028]

Sorry for my long absence from this topic. It is a hard task to summarize (hard to start writing up because of the difficulty of the problem and my English skill ;p ).

I try to write step by step.

Discussion at last developers meeting¶

Thread/Fiber switch safety¶

Koichi: (repeat my opinion about difficulty of thread/fiber safety)

akr: providing better synchronize mechanism (such as go-lang has) and encouraging safe parallel computation seems better.

Koichi: It is one possible solution but my position is "if people can shoot their foot, people will shoot".

Matz: I don't like to force people to use lock and so on.

(the point is Matz doesn't reject "-safe" approach)

Introduce restriction¶

(The following idea is not available at last meeting (only part of idea I showed))

Koichi:
The problem with this feature is the gap in assumptions between the auto-fiber user and the script writer. This is the same as thread safety. Person A considers the code auto-fiber safe, and another person B (who can be the same as A) writes code without auto-fiber safety in mind; then it becomes a problem.

In general, most existing libraries are not written with auto-fiber safety in mind (this doesn't mean most libraries are unsafe; much code is auto-fiber safe without any special care).

If we can know that a piece of code (and the code it calls) is auto-fiber safe, we can use auto-fiber safely.

There are three types of code.

(1) don't care about auto-fiber
(2) auto-fiber aware code (assume switching is not allowed at the beginning)
(3) auto-fiber aware code (don't care it is allowed or not allowed to switch)

There are three types of status.

(a) can't switch
(b) can enable to switch, but don't switch
(c) can switch

in matrix

    can switch / can enable switch
(a) can't      / can't
(b) can't      / can
(c) can        / ??

matrix with (1-3) and (a-c)

     (a)     (b)     (c)
(1)   OK      NG      NG
(2)   OK      OK      NG
(3)   OK(*1)  OK(*1)  OK

(1)-(b) and (1)-(c) is not accepted because other method called from this code can switch the context.
(2)-(c) is also unacceptable because the beginning of code is not auto-fiber aware.

*1) Possible problem: (3) can introduce dead-lock problem because it can stop forever.
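
The second matrix above can be written down as a lookup table. This is only a restatement of the table, assuming the rows are the code types (1) to (3) and the columns are the states (a) to (c); the constant and method names are made up for illustration:

```ruby
# Rows: code types (1)-(3). Columns: fiber states (a)-(c).
# true = OK, false = NG in the matrix above.
ACCEPTABLE = {
  1 => { a: true, b: false, c: false },
  2 => { a: true, b: true,  c: false },
  3 => { a: true, b: true,  c: true  },
}.freeze

# Hypothetical helper: may code of this type run in this state?
def acceptable?(code_type, state)
  ACCEPTABLE.fetch(code_type).fetch(state)
end

unaware_in_switchable = acceptable?(1, :c)  # code type (1) in state (c)
aware_in_enableable   = acceptable?(2, :b)  # code type (2) in state (b)
```

The table does not capture the (*1) caveat: type (3) code is marked OK everywhere, yet it can still block forever.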

Normal threads start from (a).
Auto-fibers start from (b). They are written in (1), (2) and (3). Maybe (2) is written for auto fiber top-level. This code will call some async methods which can change context.

My proposal is, to write down explicitly of (1) to (3) and (a) to (c) in program.

At the meeting, I proposed non-matured keywords(-like) to control them.
(and just now I don't have good syntax for it yet)

akira: If we introduce such keywords, we need to rewrite all of code if we want to use auto-fiber web application request handler (for example, we need to rewrite Rails to run on auto-fiber based rack server).

Matz: it is unacceptable to introduce huge rewriting for existing code.

(IMO (not appeared in last meeting) we need to rewrite all of code even if we don't introduce keywords to make sure the auto-fiber safety)

after this discussion¶

Matz and I discussed this issue, and we concluded that it is too early to introduce this feature in Ruby 2.5.

I want to consider this issue further. auto-fiber based guild is one possibility, this mean we can introduce object isolation and context switching each other.

Updated by normalperson (Eric Wong) over 8 years ago
#19 [ruby-core:82040]

ko1@atdot.net wrote:

Sorry for my long absence from this topic. It is a hard task to
summarize (hard to start writing up because of the difficulty
of the problem and my English skill ;p ).

No problem, thank you for summarizing.

I try to write step by step.

Discussion at last developers meeting¶

Thread/Fiber switch safety¶

Koichi: (repeat my opinion about difficulty of thread/fiber safety)

akr: providing better synchronize mechanism (such as go-lang
has) and encouraging safe parallel computation seems better.

Koichi: It is one possible solution but my position is "if
people can shoot their foot, people will shoot".

I think your approach is too cautious.

We already have many dangerous things in Ruby, even in
single-threaded code. For example: File.read, IO#read, IO#gets
are all dangerous with no size limit: they can cause
out-of-memory or swapping on gigantic inputs, leading to DoS.

Fork and inadvertent sharing of open files/sockets can also
cause problems. And there are also pathological Regexps which
can cause unbounded CPU usage.
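
To make the unbounded-read hazard above concrete, here is a small sketch of a size-limited read. The read_bounded helper is hypothetical; it only relies on the optional length argument of File.read:

```ruby
require 'tempfile'

# Hypothetical helper: read at most `limit` bytes from `path` and raise
# instead of slurping a larger file into memory.
def read_bounded(path, limit)
  # Ask for one extra byte so an oversized file can be detected.
  data = File.read(path, limit + 1) || ""
  raise ArgumentError, "#{path} exceeds #{limit} bytes" if data.bytesize > limit
  data
end

file = Tempfile.new("bounded")
file.write("0123456789")
file.flush

small = read_bounded(file.path, 100)

too_big = begin
  read_bounded(file.path, 5)
rescue ArgumentError
  :rejected
end

file.close!
```

Plain File.read(path) has no such guard, which is the existing danger referred to above.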

Matz: I don't like to force people to use lock and so on.

(the point is Matz doesn't reject "-safe" approach)

Introduce restriction¶

(The following idea is not available at last meeting (only
part of idea I showed))

Koichi:

The problem with this feature is the gap in assumptions between
the auto-fiber user and the script writer. This is the same as
thread safety. Person A considers the code auto-fiber safe, and
another person B (who can be the same as A) writes code without
auto-fiber safety in mind; then it becomes a problem.

In general, most existing libraries are not written with
auto-fiber safety in mind (this doesn't mean most libraries are
unsafe; much code is auto-fiber safe without any special care).

Right; most code does not have to care; and all these dangers
already exist with native Threads.

If we can know that a piece of code (and the code it calls) is
auto-fiber safe, we can use auto-fiber safely.

There are three types of code.

(1) don't care about auto-fiber

(2) auto-fiber aware code (assume switching is not allowed at the beginning)

(3) auto-fiber aware code (don't care it is allowed or not allowed to switch)

There are three types of status.

(a) can't switch

(b) can enable to switch, but don't switch

(c) can switch

in matrix

    can switch / can enable switch
(a) can't      / can't
(b) can't      / can
(c) can        / ??

matrix with (1-3) and (a-c)

     (a)     (b)     (c)
(1)   OK      NG      NG
(2)   OK      OK      NG
(3)   OK(*1)  OK(*1)  OK

(1)-(b) and (1)-(c) is not accepted because other method called from this code can switch the context.
(2)-(c) is also unacceptable because the beginning of code is not auto-fiber aware.

*1) Possible problem: (3) can introduce dead-lock problem because it can stop forever.

Perhaps holding Mutex lock should disable auto-fiber switching.
This should prevent deadlocks, I think.

Existing code has Mutexes, so I'm not sure how they should
interact with auto-Fiber. I agree with Matz that we should
discourage locking, so I guess disabling auto-Fiber switch
while Mutex is held is the most straightforward solution.

Normal threads start from (a). Auto-fibers start from (b).
They are written in (1), (2) and (3). Maybe (2) is written for
auto fiber top-level. This code will call some async methods
which can change context.

My proposal is, to write down explicitly of (1) to (3) and (a)
to (c) in program.

At the meeting, I proposed non-matured keywords(-like) to control them.
(and just now I don't have good syntax for it yet)

akira: If we introduce such keywords, we need to rewrite all
of code if we want to use auto-fiber web application request
handler (for example, we need to rewrite Rails to run on
auto-fiber based rack server).

Matz: it is unacceptable to introduce huge rewriting for existing code.

I agree completely with akira's observation and Matz's opinion
of this.

(IMO (not appeared in last meeting) we need to rewrite all of
code even if we don't introduce keywords to make sure the
auto-fiber safety)

I don't agree with this. A lot of code is already auto-fiber
safe because they are written with GVL+Threads in mind.
(see my original Net::HTTP example); and we also have a lot
of code (webrick, net/*) which worked fine with green Threads
in 1.8

Worst case is we release GVL in a native Thread and forget to
yield to other Fibers in the same Thread. However, that is
already a problem with existing code when run inside Fibers
(e.g. getaddrinfo, IO operations on NFS/slow-disk, ...)

I am working on making rb_thread_fd_select auto-fiber aware,
too. (done for iom_select/iom_epoll, working on iom_kqueue)

Matz and I discussed this issue, and we concluded that it
is too early to introduce this feature in Ruby 2.5.

OK, I will continue to work on implementation improvements
and keep patches rebased to trunk.

I want to consider this issue further. auto-fiber based guild
is one possibility, this mean we can introduce object
isolation and context switching each other.

Do you think this is in the 2.5 timeline?

Thank you.

Updated by ioquatix (Samuel Williams) about 8 years ago
#20 [ruby-core:82214]

I am following this thread and I find it really fascinating.

Thanks everyone for thinking about these issues and Eric for your insightful work and ideas. Just as an aside, I feel like something is being lost in translation w.r.t. the response from Matz and other core Ruby developers. Perhaps we need to have a hangout to discuss these ideas.

I've just released async, async-io and async-dns 1.0.0, along with rubydns 2.0.0 - in addition to this there is also async-http (client and server library) and falcon, a rack compatible server, built on top of async. The http library lacks support for SSL so it's not 1.x yet - still working on that part.

It works on Ruby 2.0+, and most of it also works on JRuby, excepting JRuby's missing support for UDP sockets (https://github.com/jruby/jruby/pull/4684).

I would like to think async is a proof of concept of what is possible with Ruby, in terms of performance. I think it's a solid platform for making network clients and servers, and I've implemented both DNS client/server and HTTP client/server which provide useful test cases for both performance and design.

In terms of design, it's a very simple concept to use with an API that works as if it's sequential, but yields if the operation would block. The user almost cannot make any mistakes, and implementing complex network logic becomes trivial.

In terms of performance, there are a few comparisons I can make. If you'd like more details, let me know. I'm going to be matter-of-fact; you can draw your own conclusions.

RubyDNS is about as fast as Bind for a trivial benchmark resolving a fixed set of IP addresses.
Falcon is as fast as Puma but scales significantly better especially if non-blocking IO is leveraged.
Falcon and Puma both process requests significantly faster than typical Rack middleware can cope with them. An example would be, Falcon can easily handle 30,000 conn/s on my 8-core workstation, but as soon as I put any non-trivial rack application behind it, it would drop to < 3000 conn/s. Falcon can handle up to 100,000 req/s on the same hardware (e.g. using keep alive).
I implemented a complete stack in C++ of the same concept, and it achieved roughly on 1 core what Ruby required 8 cores for. That is, a single process/thread could handle 25,000 conn/s on 1 core, and about 90,000 req/s. So, Ruby is about 10x slower than similar C++ code.
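
As a rough cross-check of the figures above, the per-core ratios can be worked out directly. This assumes the 8-core Ruby numbers divide evenly across cores, which is an assumption made here and not something measured in the post:

```ruby
ruby_cores = 8

# Ruby (Falcon) figures quoted above, spread evenly over 8 cores.
ruby_conn_per_core = 30_000.0 / ruby_cores    # conn/s per core
ruby_req_per_core  = 100_000.0 / ruby_cores   # req/s per core

# C++ figures quoted above, measured on a single core.
cpp_conn_per_core = 25_000
cpp_req_per_core  = 90_000

conn_ratio = (cpp_conn_per_core / ruby_conn_per_core).round(1)
req_ratio  = (cpp_req_per_core / ruby_req_per_core).round(1)
```

Under that assumption the C++ stack comes out roughly 7x faster per core for both connections and requests, which is the same order of magnitude as the "about 10x" estimate.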

Eric, my opinion at this point is that the work you've done here is awesome.

What I would personally like to see, is a backend, perhaps an alternative to nio4r, which, as an example, async could use to implement its reactor. I think that when your selector is running for the current fiber, operations like wait_for_pid and wait_one_fd should be hijacked and go via the reactor. I think it should be possible for nio4r to tap into this too somehow. This would make things completely transparent for the user.

I still believe this should be a gem - even if it's an official one distributed with Ruby, and that Ruby should expose the relevant hooks. Otherwise, it's going to make a lot of trouble for other implementations e.g. JRuby, MRI, etc. Ideally they can just expose the same low-level hooks at the VM level.

I would like to say at this point, with the release of async & (-*) 1.0, I believe that this concept has proven itself - e.g. that the implementation works, that it has good performance, and that it can be used to implement good composable libraries. Whatever form the final library takes, I hope that it is (a) modular (b) fast and (c) composable.

One final opinion that I've formed while working on this project, is that Ruby IO primitives are overly complex and fail to expose the right abstraction. *_nonblock methods never should have existed. If there is one thing I'd wish for, it's that once a decent asynchronous library is adopted, these methods are not made part of its public API. async does forward these methods, but it's only to make wrapping existing Net::HTTP work better, and essentially the x_nonblock variant is identical to the x method in async.

Updated by ioquatix (Samuel Williams) about 8 years ago
#21 [ruby-core:82215]

Just to add, Puma has a HTTP parser (and perhaps other bits) written in C, while Falcon is pure Ruby, yet Falcon has better/similar performance in my (hopefully unbiased) tests. Additionally, Falcon had significantly lower latency, and the C++ implementation even more so.

Updated by mame (Yusuke Endoh) about 8 years ago
#22 [ruby-core:82518]

I comment in compliance with hsbt's request.

Basically I agree with ko1; Thread is considered harmful. Casual Rubyists (including me) had better not use it.

However, I'm not against introducing the feature in question as a professional feature for mature Rubyists.

One issue that I'm concerned about is, the name. (Sorry, but this is an important point to me!) Fiber is fiber because the programmer manages its control flow completely. "Auto-fiber" looks self-contradictory to me. For example, MSDN says:

A fiber is a unit of execution that must be manually scheduled by the application.
https://msdn.microsoft.com/ja-jp/library/windows/desktop/ms682661(v=vs.85).aspx

I believe that this feature should be introduced with another name. I have no counterproposal, though. Sorry.

Updated by normalperson (Eric Wong) about 8 years ago
#23 [ruby-core:82552]

mame@ruby-lang.org wrote:

I believe that this feature should be introduced with another
name. I have no counterproposal, though. Sorry.

What about Thriber? Or Fred?

"Fread" might be confused with fread(3) function, and I don't
know anybody named "Fred", so it is a safe name to choose :)

Updated by normalperson (Eric Wong) about 8 years ago
#24 [ruby-core:82756]

Eric Wrong normalperson@yhbt.net wrote:

mame@ruby-lang.org wrote:

I believe that this feature should be introduced with another
name. I have no counterproposal, though. Sorry.

What about Thriber? Or Fred?

"Fread" might be confused with fread(3) function, and I don't
know anybody named "Fred", so it is a safe name to choose :)

OK, "class Fred" occurs in object.c documentation already,
so maybe it is confusing. So I choose Thriber as a name:

https://80x24.org/spew/20170912053032.13622-1-e@80x24.org/raw

That patch contains the latest version of this feature rebased
against ko1's recent execution context changes in trunk (up to
r59844) along with some bugfixes (infinite wait fix).

It also adds rb_thread_fd_select as a scheduling point
(in addition to rb_wait_for_single_fd and rb_waitpid from
previously published patches). Only lightly tested,
more tests will need to be written...

Naming is hard :<

Pull request available below for git users:

The following changes since commit 65b11a04f10a2438f0d6ba263a78d16367c3aac0:

console.c: set winsize on Windows (2017-09-11 20:10:34 +0000)

are available in the git repository at:

git://80x24.org/ruby thriber

for you to fetch changes up to d9c0095537c3c01d2187e783910cdc92d6c545fc:

thriber: green threads implemented using fibers (2017-09-12 05:29:31 +0000)

Eric Wrong (1):
thriber: green threads implemented using fibers

 common.mk                                         |   7 +
 configure.in                                      |  32 +
 cont.c                                            | 123 ++-
 include/ruby/io.h                                 |   2 +
 iom.h                                             |  95 +++
 iom_common.h                                      | 204 +++++
 iom_epoll.h                                       | 697 ++++++++++++++++
 iom_internal.h                                    | 280 +++++++
 iom_kqueue.h                                      | 899 +++++++++++++++++++++
 iom_pingable_common.h                             |  54 ++
 iom_select.h                                      | 448 ++++++++++
 prelude.rb                                        |  12 +
 process.c                                         |  15 +-
 signal.c                                          |  39 +-
 .../wait_for_single_fd/test_wait_for_single_fd.rb |  62 ++
 test/lib/leakchecker.rb                           |   9 +
 test/ruby/test_thriber.rb                         | 274 +++++++
 thread.c                                          |  76 +-
 thread_pthread.c                                  |   5 +
 vm.c                                              |   9 +
 vm_core.h                                         |   4 +
 21 files changed, 3324 insertions(+), 22 deletions(-)
 create mode 100644 iom.h
 create mode 100644 iom_common.h
 create mode 100644 iom_epoll.h
 create mode 100644 iom_internal.h
 create mode 100644 iom_kqueue.h
 create mode 100644 iom_pingable_common.h
 create mode 100644 iom_select.h
 create mode 100644 test/ruby/test_thriber.rb

Mr. Wrong

Updated by normalperson (Eric Wong) about 8 years ago
#25 [ruby-core:83034]

I've updated the series to support FIBER_USE_NATIVE=0 (along
with the proposed fix for [Bug #13887]).

The primary change for FIBER_USE_NATIVE=0 platforms is to move
away from cross stack linked-list manipulation and use the
heap for allocations, instead. This involved some structure
modifications to make rb_thread_fd_select work on select(2)-based
implementations. Of course, this increases the dependency on
rb_ensure to release heap memory.

FIBER_USE_NATIVE=1 platforms are still more important and faster,
of course.

I've tested on Debian 8.x and FreeBSD 11.0. Test reports from
other platforms appreciated, thank you.

Patch mbox (gzipped):

https://80x24.org/spew/20170928004228.4538-1-e@80x24.org/t.mbox.gz

...or "git request-pull"-generated pull request:

The following changes since commit d21aab2d3e007372973f2b803d7d8d7f9547f0cc:

2017-09-28 (2017-09-27 21:55:33 +0000)

are available in the git repository at:

git://80x24.org/ruby thriber-copy

for you to fetch changes up to 20ea4d710d3d75d946f74346e6a6f3616dac682d:

thriber: non-native fiber support (2017-09-28 00:41:34 +0000)

Eric Wrong (3):
thriber: green threads implemented using fibers
thread_pthread: do not corrupt stack
thriber: non-native fiber support

Updated by normalperson (Eric Wong) almost 8 years ago
#26 [ruby-core:84118]

Too late for 2.5, but I'll maintain and periodically rebase this
in hope it can be accepted for 2.6. I've updated patches for
Thriber support against latest trunk (r61067)

https://80x24.org/spew/20171207041831.29005-2-e@80x24.org/raw
https://80x24.org/spew/20171207041831.29005-3-e@80x24.org/raw

Also available at the "thriber-r61067" branch on git://80x24.org/ruby

Updated by ioquatix (Samuel Williams) almost 8 years ago
#27 [ruby-core:84149]

I think that the work being done here is great. However I feel that this PR requires far more scrutiny than it's receiving.

It's worth considering that nio4r and friends took several years to stabilise and there is a huge amount of hard earned knowledge embedded in those gems, e.g.

I am using "double" for timeout since it is more convenient for arithmetic like parts of thread.c. Most platforms have good FP, I think.

e.g. https://github.com/socketry/nio4r/issues/140

I think it's a great idea to have non-blocking evented IO. However, it's not as simple as making read/write non-blocking. How about DNS lookups? Filesystem access? The benefit of a library based approach as I proposed is that these limitations can be clearly part of the contract of a specific library, and people can make different libraries to suit their needs, but making it part of core Ruby is a slippery slope. If anything, it would be better to depend on an established solution for this, so that cases like using the system DNS resolver are handled correctly (e.g. libuv). Otherwise, this is a HUGE addition to the surface area of the ruby interpreter.

Updated by normalperson (Eric Wong) almost 8 years ago
#28 [ruby-core:84153]

samuel@oriontransfer.org wrote:

I think that the work being done here is great. However I feel that this PR requires far more scrutiny than it's receiving.

Of course, which is why you don't see me pushing for its
inclusion in 2.5. I only present and update it so people can
test it if they're bored. And I only started working on it
because ko1 seemed interested in it at the time.

I'd be surprised if this gets into 2.6 or any release in the
future. Nobody besides you and I seems interested in discussing
this anymore; so likely it'll sit here quietly for a few more
years.

Again, I don't make API decisions, I only present options.

I am using "double" for timeout since it is more convenient for arithmetic like parts of thread.c. Most platforms have good FP, I think.

e.g. https://github.com/socketry/nio4r/issues/140

Right. We already have plenty of threading internals using FP
for timing, as well as the public Ruby APIs for IO.select and
IO#wait_*able. Internally, at least it's a minor thing to
change all the internal APIs to use "struct timespec" all around
for maximum precision.
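
To make the trade-off concrete, here is a minimal sketch (not taken from the patch series). IO.select is the real public API and accepts a Float timeout in seconds; float_to_timespec is a hypothetical helper that only illustrates the seconds/nanoseconds split a "struct timespec"-based internal API would need.

```ruby
# Hypothetical helper (not part of Ruby or the patches): split a timeout
# given in seconds into [seconds, nanoseconds], like struct timespec.
def float_to_timespec(timeout)
  sec = timeout.floor
  nsec = ((timeout - sec) * 1_000_000_000).round
  # rounding may carry into a full second
  sec, nsec = sec + 1, 0 if nsec >= 1_000_000_000
  [sec, nsec]
end

r, w = IO.pipe
# The public API takes the timeout as a Float (or Integer) in seconds.
# Nothing has been written to the pipe, so this call times out.
ready = IO.select([r], nil, nil, 0.05)
p ready                   # nil on timeout
p float_to_timespec(1.5)  # [1, 500000000]
r.close
w.close
```

The Float form is easier to do arithmetic on; the pair form avoids rounding surprises for timeouts that are not exactly representable in binary floating point.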

I think it's a great idea to have non-blocking evented IO.
However, it's not as simple as making read/write non-blocking.
How about DNS lookups? Filesystem access? The benefit of a
library based approach as I proposed is that these limitations
can be clearly part of the contract of a specific library, and
people can make different libraries to suit their needs, but
making it part of core Ruby is a slippery slope. If anything,
it would be better to depend on an established solution for
this, so that cases like using the system DNS resolver are
handled correctly (e.g. libuv). Otherwise, this is a HUGE
addition to the surface area of the ruby interpreter.

We have resolv.rb in stdlib; which was at least popular in 1.8
days. It's implemented entirely in Ruby, so it automatically
takes advantage of these Thriber changes, and has seen a fair
amount of use back in the 1.8 days (not that DNS has changed
drastically).

So really, the network I/O part is not a big, or even complex
change, it's 1.8 Threads being made an option again for
Ruby 2.x. I miss 1.8 Threads, but I also like native threads in
1.9/2.x; they each have their place. And the lightweight
threading for network I/O is what people seem to care about in
other languages (Go, Erlang). nio4r/libuv and async can still
be an option and I have no intention of breaking compatibility.

Filesystem access: out-of-scope for this...

I definitely do NOT want to try and make this use callbacks and
threadpools behind users' backs, even internally. It pessimizes
the common hot cache case (which doesn't require waiting); and
more importantly, I do not want Ruby or any library to
interfere with mountpoint-aware code.

Mountpoint awareness is 100% necessary for me so there's no
queue blocking when one native thread is doing IO on a fast FS
while another native thread is doing IO on a slow FS. I end up
with dozens or hundreds of threads, because I have dozens or
hundreds of mountpoints of different speeds. This is an
uncommon use case, I know, but some people need it and the
VM must not get in the way.
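
As a rough illustration of the idea (a sketch under assumptions, not the code used in practice): one native Thread plus one Queue per filesystem, so jobs for a slow mountpoint never sit in front of jobs for a fast one. MountpointPool is a hypothetical name, and treating File::Stat#dev as the mountpoint identity is an assumption made for this example.

```ruby
# Sketch only: MountpointPool is hypothetical, not an existing Ruby class.
# Each distinct device number gets its own Queue and worker Thread.
class MountpointPool
  def initialize
    @lock = Mutex.new
    @queues = {}   # device number (Integer) => Queue of jobs
    @threads = []
  end

  # Run the block with +path+ on the worker dedicated to path's device.
  # Returns a Queue from which the result can be popped.
  def submit(path, &job)
    result = Queue.new
    queue_for(File.stat(path).dev).push([job, path, result])
    result
  end

  # Tell every worker to stop after draining its queue, then wait.
  def shutdown
    @lock.synchronize { @queues.each_value { |q| q.push(nil) } }
    @threads.each(&:join)
  end

  private

  # Lazily create the Queue and worker Thread for a device.
  def queue_for(dev)
    @lock.synchronize do
      @queues[dev] ||= begin
        q = Queue.new
        @threads << Thread.new do
          while (item = q.pop) # nil means shutdown
            job, path, result = item
            result.push(job.call(path))
          end
        end
        q
      end
    end
  end
end

pool = MountpointPool.new
answer = pool.submit(".") { |path| File.directory?(path) }
p answer.pop
pool.shutdown
```

A real implementation would also need error handling inside the worker; an exception raised by a job here would terminate that device's thread.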

So I think anything to deal with FS access specifically is
out-of-scope for this issue. We already have native Thread
support, and I use it to implement mountpoint-awareness. Some
of the GVL-release changes to File and Dir for 2.5 will help
with this (which reminds me, I still need to document some of
that in NEWS :x).

Updated by hsbt (Hiroshi SHIBATA) almost 8 years ago
#29 [ruby-core:84980]

Status changed from Open to Assigned
Assignee set to normalperson (Eric Wong)

Hi, Eric.

We discussed your proposal at last developer meeting (Dec 26, 2017)

Name this "Thread", or something Thread-ish word than Fiber-ish
Matz doesn't have a strong opinion on the name but prefers 2 words (auto-fiber) than a coined word "Thriber."

Next actions:

Give a thread-ish name
Lock and queue should work with auto-fiber?
Is explicit context switching onto auto-fiber possible?

Updated by normalperson (Eric Wong) almost 8 years ago
#30 [ruby-core:85012]

hsbt@ruby-lang.org wrote:

We discussed your proposal at last developer meeting (Dec 26, 2017)

Awesome news.

Name this "Thread", or something Thread-ish word than Fiber-ish

So if we just use "Thread", then existing Thread becomes M:N?
I will think about that... I have many use cases for native
threads, too; but maybe they can be satisfied transparently.

Matz doesn't have a strong opinion on the name but prefers 2 words (auto-fiber) than a coined word "Thriber."

Next actions:

Give a thread-ish name

OK, naming is hard :<

LightThread? Maybe too long...

Threadlet?

Not Thread-ish, but "Task"(*) or "Tasklet" may be a candidate.

This might take a while....

Lock and queue should work with auto-fiber?

I can definitely make Queues work. I think ko1 was mildly
against increasing use of Mutex.

One safety feature I was thinking about was disabling
auto-switching of Fibers while a Mutex is locked, even.

Is explicit context switching onto auto-fiber possible?

Yes, right now it's a subclass of Fiber so inherits
transfer/resume/yield
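
For readers less familiar with explicit switching, this is what the inherited interface looks like. The sketch uses plain Fiber only, since the proposed subclass exists only in the patch series; the variable names are illustrative.

```ruby
log = []

f = Fiber.new do
  log << :step1
  x = Fiber.yield(:paused) # hand control back to whoever called resume
  log << x                 # x is the argument given to the second resume
  :done                    # value returned by the final resume
end

log << f.resume          # runs the block up to Fiber.yield
log << f.resume(:step2)  # continues after Fiber.yield until the block ends
p log                    # [:step1, :paused, :step2, :done]
```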

(*) Linux kernel uses "task" as generic term for threads, processes,
and everything in-between (different flags describe level of
sharing for clone(2))

Updated by dsferreira (Daniel Ferreira) almost 8 years ago
#31 [ruby-core:85088]

Hi Eric,

I've been reading this issue and I'm finding it fascinating.
Let me play here the role of the ruby developer that is seeking to
understand better the asynchronous ruby capabilities.
Every time I read threads(conversations) like this one about the pros
and cons of Fibers vs Threads I tend to think: stay away from it.

When people like Koichi write comments like this:

"But most (many? some? a few?) of ruby programmer (including me) can not write correct code I believe."

or Yusuke Endoh:

"Thread is considered harmful. Casual Rubyists (including I) had better not use it."

what do these comments make us mere mortals feel?

I will speak about me. When I read such a line I tend to step away.
So yes, this situation makes me develop single threaded code as much
as possible.
I rely on libraries to handle asynchronous behaviour for me and
specially I rely extensively on the actor model.

I doubt I will change my mind unless I start to read that Thread is
good to be used or Fiber is good to be used.

When I read all this conversation and you mention corner cases that
still have problems that is a NO GO for me.

IMHO to add yet another Thread like feature it should be "The Killer Feature".

The one that we can say to the all community: Hey people use this
thing because async is a paradise in ruby land at last.
If we don't have this it will be just another Thread, Fiber nightmare
for the very few who accept the overhead of dealing with all the
"buts".

And for the record, I use async libraries but I don't feel confident
about them either knowing that ruby core is not reliable in itself.
Production code in the enterprise world is not something to mess around with.
For me ruby core needs desperately to change this situation so I
really hope your work will be the answer for all of this I'm talking
about.
So yes, if it is it fits in ruby core like a glove IMO. If it is not
then we will be much worse because instead of 2 walking deads we will
have 3.
A 50% increase is a lot in this domain. Turns things into a joke.

So, can you please explain us what peace of mind will we gain with
this new "light thread" in our everyday work?

Thank you very much and keep up the excellent work.
I appreciate specially the care you have in passing across your
knowledge on the subject.
Really helpful and insightful.

Note:

Your last two messages are not part of the issue in redmine. I hope my
message will be there!
It seems mine did not come in either, so I'm copy-pasting it.

Updated by ioquatix (Samuel Williams) almost 8 years ago
#32 [ruby-core:85128]

In async, I called it Async::Task. I think task is a good name for this kind of thing. In your case, you might want to consider Thread::Task, since the lexicographic nesting is similar to the logical nesting.

Regarding kqueue bugs. macOS kqueue implementation is horrendous. So, nio4r doesn't use it AFAIK.

Do you have explicit reactor, or is it implicit per-thread or per-process?

Updated by normalperson (Eric Wong) almost 8 years ago
#33 [ruby-core:85081]

Eric Wong normalperson@yhbt.net wrote:

hsbt@ruby-lang.org wrote:

Name this "Thread", or something Thread-ish word than Fiber-ish

So if we just use "Thread", then existing Thread becomes M:N?
I will think about that... I have many use cases for native
threads, too; but maybe they can be satisfied transparently.

Thinking about this even more; I don't think it's possible to
preserve round-robin recv_io/accept behavior I want from
blocking on native threads when sharing descriptors between
multiple processes.

So a new class it is...

Matz doesn't have a strong opinion on the name but prefers 2 words (auto-fiber) than a coined word "Thriber."

Next actions:

Give a thread-ish name

OK, naming is hard :<

LightThread? Maybe too long...

Threadlet?

OK, I am liking "threadlet", and it looks like a real word:

https://www.merriam-webster.com/dictionary/threadlet
": a small thread : a delicate filament"

Lock and queue should work with auto-fiber?

I can definitely make Queues work. I think ko1 was mildly
against increasing use of Mutex.

How about we use Threadlet to discourage things we don't like
about normal Threads (such as Mutex, ConditionVariable, ...).

One safety feature I was thinking about was disabling
auto-switching of Fibers while a Mutex is locked, even.

s/Fibers/Threadlets/; but yes, I think it should be possible
to have something like Threadlet.exclusive { ... } to prevent
auto-switch surprises (like Thread.exclusive in 1.8)
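
A toy model of what such a section could mean, using plain Fibers. Threadlet.exclusive does not exist; `exclusive` and `switch_point` below are hypothetical helpers, and `switch_point` merely stands in for an I/O wait at which an auto-switch might happen. The body is the "t = n; ...; n = t+1" increment discussed earlier in this thread.

```ruby
$exclusive = false

# Hypothetical: a place where the scheduler may switch to another fiber.
def switch_point
  Fiber.yield unless $exclusive
end

# Hypothetical: suppress switching for the duration of the block.
def exclusive
  prev, $exclusive = $exclusive, true
  yield
ensure
  $exclusive = prev
end

def run(protect)
  n = 0
  body = lambda do
    t = n
    switch_point # another fiber may run here
    n = t + 1
  end
  pending = 2.times.map do
    Fiber.new do
      protect ? exclusive(&body) : body.call
      :done
    end
  end
  # round-robin: keep resuming fibers until each has returned :done
  pending = pending.reject { |f| f.resume == :done } until pending.empty?
  n
end

p run(false) # 1: both fibers read n == 0, so one increment is lost
p run(true)  # 2: no switch inside the section, both increments land
```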

Updated by normalperson (Eric Wong) almost 8 years ago
#34 [ruby-core:85082]

Thinking about this even more; I don't think it's possible to
preserve round-robin recv_io/accept behavior I want from
blocking on native threads when sharing descriptors between
multiple processes.

The following example hopefully clarifies why I care about
maintaining blocking I/O behavior in some places despite relying
on non-blocking I/O for light-weight threading.

# With non-blocking accept; PIDs do not share fairly:
$ NONBLOCK=1 ruby fairness_test.rb
PID	accept count
5240	55
5220	42
5216	36
5242	109
5230	57
5208	26
5227	53
5212	26
5223	46
5236	43
total: 493

# With blocking accept on Linux; each process gets a fair share:
$ NONBLOCK=0 ruby fairness_test.rb
PID	accept count
5271	50
5278	50
5275	50
5282	49
5286	49
5290	49
5295	49
5298	49
5303	49
5306	49
total: 493

For servers which only handle one client-per-process (e.g.
Apache prefork), unfairness is preferable because the busiest
process will be hottest in CPU cache.

For everything else that serves multiple clients in a single
process, fair sharing is preferable.  This will apply to Guilds
in the future, too.

More information about this behavior I rely on is here:
http://www.citi.umich.edu/projects/linux-scalability/reports/accept.html


require 'socket'
require 'thread'
require 'io/nonblock'
Thread.abort_on_exception = STDOUT.sync = true
host = '127.0.0.1'
srv = TCPServer.new(host, 0)
srv.nonblock = true if ENV['NONBLOCK'].to_i != 0
port = srv.addr[1]
pipe = IO.pipe
nr = 10
running = true
trap(:INT) { running = false }
pids = nr.times.map do
  fork do
    pipe[0].close
    q = Queue.new # per-process Queue
    Thread.new do # dedicated accept thread
      q.push(srv.accept) while running
      q.push(nil)
    end
    while accepted = q.pop
      # n.b. a real server would do processing, here, maybe spawning
      # a new Thread/Fiber/Threadlet
      pipe[1].write("#$$ #{accepted.fileno}\n")
      accepted.close
    end
  end
end
pipe[1].close

sleep(1) # wait for children to start
cleanup = SizedQueue.new(1024)
Thread.new do
  cleanup.pop.close while true
end

Thread.new do
  loop do
    cleanup.push(TCPSocket.new(host, port))
    sleep(0.01)
  rescue => e
    break
  end
end
Thread.new { sleep(5); running = false }

counts = Hash.new(0)
at_exit do
  tot = 0
  puts "PID\taccept count"
  counts.each { |pid, n| puts "#{pid}\t#{n}"; tot += n }
  puts "total: #{tot}"
end
case line = pipe[0].gets
when /\A(\d+) /
  counts[$1] += 1
else
  running = false
  Process.waitall
end while running

Updated by subtileos (Daniel Ferreira) almost 8 years ago
#35 [ruby-core:85087]