Feature #13618

[PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid

Added by normalperson (Eric Wong) 9 months ago. Updated 4 days ago.

Status: Assigned
Priority: Normal
Target version: -
[ruby-core:81492]

Description

auto fiber schedule for rb_wait_for_single_fd and rb_waitpid

Implement automatic Fiber yield and resume when running
rb_wait_for_single_fd and rb_waitpid.

The Ruby API changes for Fiber are named after existing Thread
methods.

main Ruby API:

    Fiber#start -> enable auto-scheduling and run Fiber until it
           automatically yields (due to EAGAIN/EWOULDBLOCK)

The following behave like their Thread counterparts:

    Fiber.start - Fiber.new + Fiber#start (prelude.rb)
    Fiber#join - run internal scheduler until Fiber is terminated
    Fiber#value - ditto
    Fiber#run - like Fiber#start (prelude.rb)
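
Because the proposed methods are named after Thread methods, the existing Thread API gives a rough picture of the intended semantics. The sketch below uses only Thread methods that exist today. It does not call the proposed Fiber.start, Fiber#join, or Fiber#value, and it makes no claim about how the patch behaves beyond the naming.

```ruby
# Thread.start creates a new thread running the block
# (the proposal describes Fiber.start as Fiber.new + Fiber#start).
t = Thread.start { :done }

# Thread#join(timeout) returns the thread once it has finished,
# or nil if the timeout expires first.
joined = t.join(4)

# Thread#value joins the thread and returns the block's result.
result = t.value

# A thread that never finishes on its own shows the timeout case.
slow = Thread.start { sleep }
timed_out = slow.join(0.01)
slow.kill
```

The test script later in this description relies on the same shape: `fibs[-1].join(4)` waits with a timeout and `all.map(&:value)` collects each block's result.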

Right now, it takes over the rb_wait_for_single_fd() and
rb_waitpid() functions if the running Fiber is auto-enabled
(cont.c::rb_fiber_auto_sched_p).

Changes to existing functions are minimal.

New files (all new structs and relations should be documented):

    iom.h - internal API for the rest of RubyVM (incomplete?)
    iom_internal.h - internal header for iom_(select|epoll|kqueue).h
    iom_epoll.h - epoll-specific pieces
    iom_kqueue.h - kqueue-specific pieces
    iom_select.h - select-specific pieces
    iom_pingable_common.h - common code for iom_(epoll|kqueue).h
    iom_common.h - common footer for iom_(select|epoll|kqueue).h

Changes to existing data structures:

    rb_thread_t.afrunq   - list of fibers to auto-resume
    rb_vm_t.iom          - Ruby I/O Manager (rb_iom_t) :)

Besides rb_iom_t, all the new structs are stack-only and rely
extensively on ccan/list for branch-less, O(1) insert/delete.

As usual, understanding the data structures first should help
you understand the code.

Right now, I reuse some static functions in thread.c,
so thread.c includes iom_(select|epoll|kqueue).h

TODO:

    Hijack other blocking functions (IO.select, ...)

I am using "double" for timeout since it is more convenient for
arithmetic like parts of thread.c.   Most platforms have good FP,
I think.  Also, all "blocking" functions (rb_iom_wait*) will
have timeout support.

./configure gains a new --with-iom=(select|epoll|kqueue) switch

libkqueue:

  libkqueue support is incomplete; corner cases are not handled well:

    1) multiple fibers waiting on the same FD
    2) waiting for both read and write events on the same FD

  Bugfixes to libkqueue may be necessary to support all corner cases.
  Supporting these corner cases for native kqueue was challenging,
  even.  See comments on iom_kqueue.h and iom_epoll.h for
  nuances.

Limitations

Test script I used to download a file from my server:
----8<---
require 'net/http'
require 'uri'
require 'digest/sha1'
require 'fiber'

url = 'http://80x24.org/git-i-forgot-to-pack/objects/pack/pack-97b25a76c03b489d4cbbd85b12d0e1ad28717e55.idx'

uri = URI(url)
use_ssl = "https" == uri.scheme
fibs = 10.times.map do
  Fiber.start do
    cur = Fiber.current.object_id
    # XXX getaddrinfo() and connect() are blocking
    # XXX resolv/replace + connect_nonblock
    Net::HTTP.start(uri.host, uri.port, use_ssl: use_ssl) do |http|
      req = Net::HTTP::Get.new(uri)
      http.request(req) do |res|
        dig = Digest::SHA1.new
        res.read_body do |buf|
          dig.update(buf)
          #warn "#{cur} #{buf.bytesize}\n"
        end
        warn "#{cur} #{dig.hexdigest}\n"
      end
    end
    warn "done\n"
    :done
  end
end

warn "joining #{Time.now}\n"
fibs[-1].join(4)
warn "joined #{Time.now}\n"
all = fibs.dup

warn "1 joined, wait for the rest\n"
until fibs.empty?
  fibs.each(&:join)
  fibs.keep_if(&:alive?)
  warn fibs.inspect
end

p all.map(&:value)

Fiber.new do
  puts 'HI'
end.run.join

History

#1 [ruby-core:81493] Updated by normalperson (Eric Wong) 9 months ago

Pull request below for non-"git am" users...

I tried my best to add many comments throughout the code.

I realize this is a lot of new code; and not a typical or common
usage of kqueue or epoll. The kqueue code ended up being very
complicated to support corner cases (see comments in iom_kqueue);
so perhaps the epoll implementation should be easiest-to-understand.

I suggest understanding data structures, first; everything else
will be easier. Please do not hesitate to ask here if you come
across questions or bugs or any comments.

Finally; libkqueue is broken (using epoll on Linux) for corner
cases. I am using a FreeBSD 11.0 VM for kqueue development; but
I may resume working with libkqueue upstream if I have time.
I expect real-world Linux users to be using native epoll,
of course; so no big problems, there.

The following changes since commit d0015e4ac6b812ea1681b1f5fa86fbab52a58960:

Improve performance of implicit type conversion (2017-05-31 12:30:57 +0000)

are available in the git repository at:

git://80x24.org/ruby iom

for you to fetch changes up to 8d6b09d46fcdf6362d6f875347c4790d5cf86401:

auto fiber schedule for rb_wait_for_single_fd and rb_waitpid (2017-06-01 00:07:18 +0000)


Eric Wong (1):
auto fiber schedule for rb_wait_for_single_fd and rb_waitpid

common.mk | 7 +
configure.in | 32 ++
cont.c | 119 +++-
include/ruby/io.h | 2 +
iom.h | 92 ++++
iom_common.h | 198 +++++++
iom_epoll.h | 423 +++++++++++++++
iom_internal.h | 251 +++++++++
iom_kqueue.h | 600 +++++++++++++++++++++
iom_pingable_common.h | 46 ++
iom_select.h | 306 +++++++++++
prelude.rb | 12 +
process.c | 14 +-
signal.c | 40 +-
.../wait_for_single_fd/test_wait_for_single_fd.rb | 44 ++
test/lib/leakchecker.rb | 9 +
test/ruby/test_fiber_auto.rb | 238 ++++++++
thread.c | 42 ++
thread_pthread.c | 5 +
vm.c | 9 +
vm_core.h | 4 +
21 files changed, 2479 insertions(+), 14 deletions(-)
create mode 100644 iom.h
create mode 100644 iom_common.h
create mode 100644 iom_epoll.h
create mode 100644 iom_internal.h
create mode 100644 iom_kqueue.h
create mode 100644 iom_pingable_common.h
create mode 100644 iom_select.h
create mode 100644 test/ruby/test_fiber_auto.rb

#2 [ruby-core:81495] Updated by ko1 (Koichi Sasada) 9 months ago

Thank you for your great work.

summary of this comment

Recently I have been thinking about this feature's "safety" or "dependability".
Because of this issue, I think it is difficult to adopt this feature right now.

Non-auto-fibers

Without this feature, Fiber switching is explicit (Fiber.yield), and in most cases it is easy to perform several operations atomically.

A typical atomic operation is an increment. Consider this example: t = n; ...; n = t+1.

def some_method
  Fiber.yield
end

n = 0
f1 = Fiber.new{
  t = n
  some_method
  n = t + 1
}

f1.resume
n += 1
f1.resume

p n #=> 1 (although two increments are tried)

In this case, the main fiber and fiber f1 both try to increment n, and some_method breaks atomicity because of Fiber.yield. Of course, nobody writes such silly code, and it is easy to check because Fiber.yield is strongly coupled with Fiber operations written by users (basically, libraries don't call Fiber.yield).

auto-fibers

However, auto-fiber switching introduces this kind of danger.

# assume all fibers are auto-scheduling fibers

n = 0
f1 = Fiber.new{
  t = n
  log(t)    # may block on network I/O and switch to another fiber
  n = t + 1
}

f1.resume # auto-fibers should not call resume
          # but please allow me, this is pseudo-code to describe an issue.
n += 1
f1.resume

p n

If the log() method tries to send a log message over the network, the Fiber will switch to other fibers.

Problems are:

  • It is difficult to know which operations should be run atomically (users write code without checking atomicity).
  • It is difficult to find out which methods can switch.
    • Not only user-written code, but also all library code can switch fibers.
    • This means that we need to check all library code to know that it doesn't violate atomicity assumptions.
  • It introduces non-deterministic behavior (with Fiber.yield the behavior is deterministic and the problem is easy to reproduce).

This kind of difficulty is the same as with threading. The impact can be smaller than with threading (because threads can switch anywhere, and it is very hard to predict the behavior; auto-fibers switch only at blocking operations, especially IO operations).

Consideration

To solve this problem, we have several choices.

(1) Introduce synchronization mechanisms for auto-fibers

Like Mutex, Queue and so on.
In the Ruby 1.8 era, we had Thread.exclusive to prohibit thread-switching.

I don't want to choose this option because it is what I want to avoid in Ruby.

(2) Introduce limitations

The problem "It is difficult to find out which method can switch" arises because we need to check the whole of the code. If we can restrict auto-fiber switching, this problem becomes smaller.

(2-1) Introduce Fiber switching methods

Instead of implicit blocking (IO) operations, introduce explicit blocking operations that can switch. We can then find all of them in the source code with grep.

(2-2) Check context

Permit fiber switching only at permitted places, by block, pragma, and so on.

# auto-fiber: true # <- this file can switch fibers automatically
Fiber.new(auto: true){
  ...
  io.read # can switch
  ...
  something_defined_in_gem # can't switch
  ...
}

I think other languages like Python and JavaScript employ this idea. I need to survey such languages more.

(3) Something else clever

Introducing a debugger is one choice (maybe it is easier than for threading issues).
But we can't avoid trouble (and the trouble would probably be infrequent and non-reproducible).

Another option is to introduce hooks for implementing auto-fibers and provide auto-fibers through gems, so that advanced users who know the above risks can use this feature. But this is not a good idea because we can't provide a good way to write such code for many people.

Thoughts?

#3 [ruby-core:81498] Updated by ko1 (Koichi Sasada) 9 months ago

Another idea is to change the name from Fiber to a thread-related name, while keeping the implementation based on Fiber. This would mean the resurrection of Ruby 1.8 green threads without time-based preemption (actually, the implementation is similar).

Personally I want to avoid threading problems, but I offer this idea as an option.

Off topic: it is similar to hardware multi-threading in CPU architecture: simultaneous multi-threading (SMT, HT for x86) vs. vertical threading (Sparc, switching on cache miss).

#4 [ruby-core:81500] Updated by normalperson (Eric Wong) 9 months ago

ko1@atdot.net wrote:

Issue #13618 has been updated by ko1 (Koichi Sasada).

Thank you for your great work.

You're welcome :)

summary of this comment

Recently I have been thinking about this feature's "safety" or "dependability".
Because of this issue, I think it is difficult to adopt this feature right now.

I disagree. I do not recall Ruby 1.8 Threads being a big problem
for Rubyists. Modern Rubyists seem OK using native Threads
("OK", not "great" :)

We can improve APIs (maybe more Queue/SizedQueue, less Mutex).

What auto-Fiber provides is an option to reduce memory usage and
improve scalability without rewriting existing synchronous
codebases (e.g. Rack + middlewares).

In my experience, I think Ruby gained more users during the 1.8 era
when its memory usage was low thanks to green threads; and lost users
as 1.9/2.x memory usage increased (and I guess 3rd-party libs
grew, too).

The safety difference between auto-Fiber and Thread is a minor
point. Lowering memory usage while retaining compatibility with
existing synchronous code is my reason for working on this.

  • It is difficult to know which operations should be run atomically (users write code without checking atomicity).
  • It is difficult to find out which methods can switch.
    • Not only user-written code, but also all library code can switch fibers.
    • This means that we need to check all library code to know that it doesn't violate atomicity assumptions.
  • It introduces non-deterministic behavior (with Fiber.yield the behavior is deterministic and the problem is easy to reproduce).

Yes; we will document all switch points in RDoc and NEWS,
of course (maybe write a separate doc/auto-fiber.rdoc)

This kind of difficulty is the same as with threading. The impact
can be smaller than with threading (because threads can switch
anywhere, and it is very hard to predict the behavior;
auto-fibers switch only at blocking operations, especially
IO operations).

Right, I think auto-fiber will have some of the same (probably
minor) difficulties as threading. However, I do not believe it
is a big problem since Rubyists should already be used to
threading.

Consideration

To solve this problem, we have several choices.

(1) Introduce synchronization mechanisms for auto-fibers

Like Mutex, Queue and so on.

Yes, I think Queue/SizedQueue should be able to respect Fiber
scheduling boundaries. Queue/SizedQueue are especially useful
and I plan to implement auto-fiber support for that.

I am not sure about Mutex... (can we defer to Matz for decisions?)

In the Ruby 1.8 era, we had Thread.exclusive to prohibit thread-switching.

I don't want to choose this option because it is what I want to avoid in Ruby.

Right.

Maybe Mutex#synchronize can prohibit auto-switch (or, it
will show a warning or raise at auto-switch points).

(2) Introduce limitations

The problem "It is difficult to find out which method can switch" arises because we need to check the whole of the code. If we can restrict auto-fiber switching, this problem becomes smaller.

Right now for IO, it is double opt-in:

It requires both Fiber#start and IO#nonblock=true.

Sidenote:

As a Rubyist who studies the Linux kernel; I consider it
imperative to give Rubyists the choice to make real blocking
syscalls (not the "fake blocking" with auto-fiber/green
threads).

This is because Linux can optimize "wake-one" situations to:
a) give round-robin load distribution across independent processes
b) avoid thundering herd with multiple threads/processes
c) (I forget...)

(sorry I forgot to note this in my original ticket, but it will
be in the final docs)

(2-1) Introduce Fiber switching methods

Instead of implicit blocking (IO) operations, introduce explicit blocking operations that can switch. We can then find all of them in the source code with grep.

I am against this. Instead, I want it to be easy to port
existing Thread-aware codebases over.

Notice my example test script used net/http from stdlib.

I would like to use existing stdlib (net/*, webrick, drb, ...)
as much as possible without modifications. That means many
existing Ruby libraries can work transparently.

(2-2) Check context

Permit fiber switching only at permitted places, by block, pragma, and so on.

# auto-fiber: true # <- this file can switch fibers automatically
Fiber.new(auto: true){
  ...
  io.read # can switch
  ...
  something_defined_in_gem # can't switch
  ...
}

I think other languages like Python and JavaScript employ this idea. I need to survey such languages more.

I do not like this, either. I admit I am not familiar with
those languages. I think we should strive to make existing
Thread-aware Ruby code work well, and as transparently as possible...

(3) Something else clever

Introducing a debugger is one choice (maybe it is easier than for threading issues).
But we can't avoid trouble (and the trouble would probably be infrequent and non-reproducible).

Adding Tracepoint to help track auto-switch should be done
(honestly I have never used this feature in ruby :x).

And yes, I think native threading bugs are trickier to track down
than auto-Fiber switching. Just remember, today we have native
threading and things are OK. And I think there were more happy
Rubyists in 1.8 days.

Another option is to introduce hooks for implementing auto-fibers and provide auto-fibers through gems, so that advanced users who know the above risks can use this feature. But this is not a good idea because we can't provide a good way to write such code for many people.

Thoughts?

Again, no. I am really in favor of making it easy to port
existing Thread-aware code to auto-Fiber.

Again; from my experience; I do not believe many Ruby
programmers had safety problems with 1.8 green threads.

Today we have Rubyists who are already used to 1.9/2.x native
Thread already.

The safety improvement is a minor point.

#5 [ruby-core:81507] Updated by ko1 (Koichi Sasada) 9 months ago

normalperson (Eric Wong) wrote:

I disagree. I do not recall Ruby 1.8 Threads being a big problem
for Rubyists. Modern Rubyists seem OK using native Threads
("OK", not "great" :)
...
However, I do not believe it
is a big problem since Rubyists should already be used to
threading.
...
And yes, I think native threading bugs are trickier to track down
than auto-Fiber switching. Just remember, today we have native
threading and things are OK. And I think there were more happy
Rubyists in 1.8 days.
...
Again; from my experience; I do not believe many Ruby
programmers had safety problems with 1.8 green threads.

Today we have Rubyists who are already used to 1.9/2.x native
Thread already.

The safety improvement is a minor point.

My opinion is the opposite. I think "threading is too hard for human beings to use correctly" or "Rubyists shouldn't have to care about threading difficulties". I agree my opinion is extreme, and many "advanced" programmers like Eric can write correct threaded programs. But I believe most (many? some? a few?) Ruby programmers (including me) cannot write correct code.

(In addition: I have heard some advanced programmers say "people can write it". I doubt that, because it is a kind of survivorship bias.)

(Recently I fixed a threading problem in rubygems that was difficult to reproduce.)

I often use this metaphor: it is like GC strategy. If people can manage object lifetimes manually, that is faster than using GC (in some cases; in other cases GC is faster than manual memory management). However, we choose GC because we want to concentrate on writing application code.

I agree auto-fibers is safer than threads. In my mind:

danger <-> safe (this is my opinion)

   parallel threads (JRuby, ...) > concurrent threads (MRI) >>
   auto-fibers (full-auto)       > auto-fiber (restricted) >>
   Guild                         > single thread

But auto-fibers can introduce accidents that are infrequent and difficult to reproduce, which makes them difficult to debug.

Ruby has many pitfalls that let us shoot ourselves in the foot (meta-programming features, open classes and so on), but they are deterministic (in most cases).

I think this is a question of how to evaluate the risk of such danger.

C/C++/Java/... (and many imperative languages) choose performance (people should write correct code).

Some languages try to avoid this kind of difficulty. Rust chooses threading but adds a safety harness through its type system. Clojure chooses STM to prevent atomicity violations.

I agree threading and auto-fibers are easy to use. In most cases there is probably no problem (especially with auto-fibers). But accidents can happen in a few cases, and they will be difficult to track down.

I hope Ruby is a safe language because I don't want to be bothered by such difficulties. This is my wish. I agree there are other wishes like Eric's, and I respect them.


Other than this point, I agree with all of your opinions. If I could believe "all Rubyists can write correct threaded programs", your points would make sense to me.

(other points)

Yes; we will document all switch points in RDoc and NEWS,
of course (maybe write a separate doc/auto-fiber.rdoc)

My point is, if method "foo" is a switching point, then any method that can call "foo" (bar, and baz, the caller of bar, ...) should be noted too. Maybe this is impossible to complete because of Ruby's dynamic nature.

I would like to use existing stdlib (net/*, webrick, drb, ...)
as much as possible without modifications. That means many
existing Ruby libraries can work transparently.

I understand your point.

#6 [ruby-core:81514] Updated by normalperson (Eric Wong) 9 months ago

ko1@atdot.net wrote:

My opinion is the opposite. I think "threading is too hard for human beings to use correctly" or "Rubyists shouldn't have to care about threading difficulties". I agree my opinion is extreme, and many "advanced" programmers like Eric can write correct threaded programs. But I believe most (many? some? a few?) Ruby programmers (including me) cannot write correct code.

I do not believe I can write correct code of any type, actually.
Everything I write; even trivial single-threaded scripts has bugs.

On the other hand, my likelihood of introducing bugs seems
nearly identical across any environment and programming model.
However, having less/simpler code (and less dependencies) seems
to result in fewer bugs, in my experience.

(In addition: I have heard some advanced programmers say "people can write it". I doubt that, because it is a kind of survivorship bias.)

Yes.

(Recently I fixed a threading problem in rubygems that was difficult to reproduce.)

I often use this metaphor: it is like GC strategy. If people can manage object lifetimes manually, that is faster than using GC (in some cases; in other cases GC is faster than manual memory management). However, we choose GC because we want to concentrate on writing application code.

Right. However, it seems choosing "easier" strategies means
less focus on overall design, leading to more problems down the line.

Since around 2010; I believe unicorn caused major, irreparable
damage to Rack ecosystem by promoting single-threaded design and
having a SIGKILL timeout feature. unicorn made Rubyists stop
caring to fix concurrency bugs and do proper timeouts.

Nowadays Rack apps are both too buggy AND use too much memory :<

I know some people disagree with my assessment of unicorn;
but I prefer to hate everything I've done: it's easier to
find improvements that way :)

I agree auto-fibers is safer than threads. In my mind:

danger <-> safe (this is my opinion)

   parallel threads (JRuby, ...) > concurrent threads (MRI) >>
   auto-fibers (full-auto)       > auto-fiber (restricted) >>
   Guild                         > single thread

Agree. So maybe we can design API for "auto-fiber (restricted)"?

But auto-fibers can introduce accidents that are infrequent and difficult to reproduce, which makes them difficult to debug.

Ruby has many pitfalls that let us shoot ourselves in the foot (meta-programming features, open classes and so on), but they are deterministic (in most cases).

Yes. I think these (along with too much code + dependencies) cause
more problems than concurrency bugs.

normalperson (Eric Wong) wrote:

Yes; we will document all switch points in RDoc and NEWS,
of course (maybe write a separate doc/auto-fiber.rdoc)

My point is, if method "foo" is a switching point, then any method that can call "foo" (bar, and baz, the caller of bar, ...) should be noted too. Maybe this is impossible to complete because of Ruby's dynamic nature.

Right. Maybe that is a lot of documentation...

What if the API were the opposite of Thread.exclusive/Mutex#synchronize?
Perhaps:

Fiber.new do
  Fiber.auto do
    # enable auto-fiber inside this block
  end
  # disable auto-fiber again
end

Maybe Fiber.exclusive can disable Fiber.auto temporarily:

Fiber.new do
  Fiber.auto do
    # enable auto-fiber

    Fiber.exclusive do
      # temporarily disable auto-fiber
    end
    # enable auto-fiber again
    ...
  end
end

Fiber.auto/Fiber.exclusive would be no-ops unless inside
a Fiber.new block...

But maybe that is too much code and nesting levels;
so I still like Fiber.start more.

I would like to use existing stdlib (net/*, webrick, drb, ...)
as much as possible without modifications. That means many
existing Ruby libraries can work transparently.

I understand your point.

Thanks; that is my biggest wish for this feature.

Anyways, I will let matz, you and others deal with final API
decisions.

#7 [ruby-core:81537] Updated by Eregon (Benoit Daloze) 9 months ago

This is interesting work, I am curious to see how it will work out.

This looks similar to what Crystal has [1].

Does Kernel#puts potentially yield to another auto-Fiber?
I think that would be very counter-intuitive, but it would be tempting if $stdout is a pipe or socket.

Will a read from a socket always yield to the next fiber,
or can it proceed immediately if the socket is ready?
If not, then scheduling is non-deterministic,
even when communicating with a deterministic server.

It seems that the Crystal approach has some issues for terminating correctly.
However, if I understand correctly, in your model there is an implicit wait for all auto-fibers until termination at the program end?
This makes more sense to me for cooperative threading.

The description from Crystal mentions:
"Crystal uses green threads, called fibers, to achieve concurrency.
Fibers communicate with each other using channels, as in Go or Clojure, without having to turn to shared memory or locks."
The part about shared memory and locks is a lie though, these fibers do share memory and
atomicity is broken at every possible call that could invoke some IO-like operation.

This is also true for auto-fibers, which is a form of shared-memory concurrency,
and every yielding point will effectively need to assume
any other auto-fiber could have run in between and modified some global state
(unless the yielding order is very clear such as in a small program,
but in larger programs it becomes extremely difficult to know the fiber schedule).

[1] https://crystal-lang.org/docs/guides/concurrency.html

#8 [ruby-core:81543] Updated by normalperson (Eric Wong) 9 months ago

eregontp@gmail.com wrote:

This is interesting work, I am curious to see how it will work out.

Thanks for the interest.

This looks similar to what Crystal has [1].

Right. But actually I would use MRI 1.8 green threads as
a reference point. The key difference between this and 1.8
is this is tickless (or timer-less); so more predictable.

To me, there are only two types of threads available to userland:

1) OS kernel knows about them (native thread)
2) OS kernel has no idea about them (fiber/green thread/goroutine)

Does Kernel#puts potentially yield to another auto-Fiber?
I think that would be very counter-intuitive, but it would be tempting if $stdout is a pipe or socket.

Yes, potentially. However, it requires setting IO#nonblock=true
on $stdout (or whatever $> points to), which is rare...

Non-blocking stdout is rare since it likely causes headaches when using
system() to run other programs or when 3rd-party libs
write to stdout.

Will a read from a socket always yield to the next fiber,
or can it proceed immediately if the socket is ready?

It only yields on EAGAIN/EWOULDBLOCK when rb_wait_for_single_fd
is called. It will never yield if there is always data.

AFAIK, Ruby io.c+ext/socket/* does not use rb_wait_for_single_fd
until it encounters EAGAIN/EWOULDBLOCK. (I would consider it a
performance bug if it did)
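
The EAGAIN/EWOULDBLOCK condition described above can be observed from plain Ruby today. This is only an illustration of when a non-blocking read would or would not have to wait; it uses `IO#read_nonblock` on a pipe and does not involve the patch's scheduler or rb_wait_for_single_fd itself.

```ruby
r, w = IO.pipe

# Nothing has been written yet, so a non-blocking read would hit
# EAGAIN/EWOULDBLOCK; with exception: false Ruby reports that as a symbol.
empty_result = r.read_nonblock(16, exception: false)

# Once data is buffered in the pipe, the same call returns immediately,
# so there is nothing to wait for and (per the reply above) no yield.
w.syswrite("hello")
ready_result = r.read_nonblock(16, exception: false)

w.close
r.close
```

Only the first call corresponds to a point where an auto-enabled Fiber could switch. The second call finds data ready and never needs to wait.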

If not, then scheduling is non-deterministic,
even when communicating with a deterministic server.

(sorry, double negatives are confusing to me to parse and use).

If a socket can always read/write without encountering
EAGAIN/EWOULDBLOCK, the Fiber may run forever. This will starve
other Fibers, so it is up to the programmer to yield explicitly.

We should add Fiber.pass (like Thread.pass) to aid users with
this. This will protect HTTP/1.1 servers from DoS via request
pipelining.

So I guess scheduling is non-deterministic; but actual use
can be deterministic since the programmer should know when
to yield/pass explicitly?
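
A minimal sketch of explicit passing with fibers as they exist today. Fiber.pass is only a suggestion in this thread, so the sketch uses Fiber.yield plus a hand-written round-robin loop as a stand-in; the `workers` and `log` names are made up for illustration.

```ruby
log = []

# Each worker does three units of work and voluntarily yields after each
# one, standing in for the suggested (not yet existing) Fiber.pass.
workers = %w[a b].map do |name|
  Fiber.new do
    3.times do |i|
      log << "#{name}#{i}"
      Fiber.yield            # give the other fiber a turn
    end
    :done                    # block result, returned by the final resume
  end
end

# A trivial round-robin run queue: resume each fiber in turn and drop it
# once its block has returned :done.
until workers.empty?
  workers.reject! { |f| f.resume == :done }
end
```

Because every worker yields after each unit, neither can starve the other, and the interleaving in `log` is fully determined by the order of the run queue.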

It seems that the Crystal approach has some issues for terminating correctly.
However, if I understand correctly, in your model there is an implicit wait for all auto-fibers until termination at the program end?

This makes more sense to me for cooperative threading.

No implicit waiting for termination. Fibers can be forgotten
and dropped at program end; just like threads. I think this is
a necessary condition for supporting fork or exec.

Users must use Fiber#join or Fiber#value to ensure termination;
(same as Thread#join / Thread#value)

The description from Crystal mentions:
"Crystal uses green threads, called fibers, to achieve concurrency.
Fibers communicate with each other using channels, as in Go or Clojure, without having to turn to shared memory or locks."
The part about shared memory and locks is a lie though, these fibers do share memory and
atomicity is broken at every possible call that could invoke some IO-like operation.

This is also true for auto-fibers, which is a form of shared-memory concurrency,
and every yielding point will effectively need to assume
any other auto-fiber could have run in between and modified some global state
(unless the yielding order is very clear such as in a small program,
but in larger programs it becomes extremely difficult to know the fiber schedule).

[1] https://crystal-lang.org/docs/guides/concurrency.html

Yes. Programmers must be careful about shared memory; but
ruby-core can promote+improve APIs like Queue/SizedQueue to use
as communications channels. This should reduce the use of (and
dangers associated with) shared memory.
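
As a rough illustration of using a queue as a communications channel, here is a sketch with native Threads and SizedQueue, which work today. Auto-fiber support for Queue/SizedQueue is only planned in this thread, so nothing here shows how the patch would behave.

```ruby
queue = SizedQueue.new(2)   # bounded channel: push blocks when full

producer = Thread.new do
  5.times { |i| queue.push(i) }
  queue.push(:stop)         # sentinel instead of a shared "finished" flag
end

consumer = Thread.new do
  sum = 0
  while (item = queue.pop) != :stop
    sum += item             # sum is local to the consumer, never shared
  end
  sum
end

producer.join
total = consumer.value      # 0 + 1 + 2 + 3 + 4
```

The two sides share only the queue itself. All mutable state (`sum`) stays private to one thread, which is the property that reduces the atomicity concerns discussed earlier.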

#9 [ruby-core:81631] Updated by ioquatix (Samuel Williams) 9 months ago

To a certain extent, things discussed here are already implemented in

https://github.com/socketry/async

and

https://github.com/socketry/async-io

What are the benefits of having this implemented in core Ruby as opposed to a gem which can be versioned independently and works with all Rubies 2.x, including JRuby and (in theory) Rubinius?

Why not focus on making core part of Ruby fast, and providing the appropriate hooks, rather than expanding her scope and complexity, in a way which has a proven track record for frustration (poorly designed stdlib which can't be fixed or improved due to breaking backwards compatibility).

#10 [ruby-core:81643] Updated by normalperson (Eric Wong) 8 months ago

samuel@oriontransfer.org wrote:

To a certain extent, things discussed here are already implemented in

https://github.com/socketry/async

and

https://github.com/socketry/async-io

What are the benefits of having this implemented in core Ruby as opposed to a gem which can be versioned independently and works with all Rubies 2.x, including JRuby and (in theory) Rubinius?

Neverblock basically tried the same thing with EM and never took
off. I don't know much about getting software adopted or
popularized, but maybe being in core has a better chance of
gaining adoption and being sustainable.

Being in core provides greater compatibility with external
libraries which are not aware of existing event loops. So
3rd-party DB adapters (e.g. mysql2) will be able to take advantage
of these changes transparently if they use rb_wait_for_single_fd
(and I will add a hook for rb_thread_fd_select, too).

It will also be easily possible to get existing primitives like
Queue/SizedQueue to work with Fibers out-of-the-box. Maybe even
Mutex+ConditionVariable, if approved.

One current example is being able to hook rb_waitpid: any
existing code using trap(:CHLD) continues to work transparently
even if using auto-Fiber for I/O; but auto-Fiber users can also
rely on "blocking" Process.waitpid if they desire.

Anyways, accepting any of this into core is not my decision to
make. I will only provide implementation and advice/hints.

A small rant about existing event loops:

Most existing event loop implementations (libev, libevent, EM)
seem stuck in single-thread mentality from legacy select/poll
APIs. They handle MT by having one event loop per-thread;
instead of taking advantage of the fact that modern primitives
like kqueue and epoll are both MT-friendly queues which are
populated by threads running inside the kernel.

In a world where memory and CPU are your only constraints,
you can run one (native thread|process) per-core and thus one
event loop per-core. This is perfectly fine for things like
memcached which are only memory+CPU bound.

That falls down once you have other constraints, such as
physical disks to deal with. I maintain software which reads
and writes simultaneously to dozens, if not hundreds of
rotational disks (JBOD) in a single process. With current APIs
on GNU/Linux and FreeBSD, the only way I've found(*) to deal
with this effectively is to use >=1 pthread per disk.

(*) Various AIO implementations are lacking, too. They
pessimize the hot cache case, lack open/unlink/rename/stat
equivalents, and userland implementations tend to not be
mountpoint/device-aware. Native AIO requires O_DIRECT in
Linux, so no page cache at all :<

Why not focus on making core part of Ruby fast, and providing the appropriate hooks, rather than expanding her scope and complexity, in a way which has a proven track record for frustration (poorly designed stdlib which can't be fixed or improved due to breaking backwards compatibility).

I think core and stdlib can evolve best if done together.

Fiber has been in production Ruby for nearly a decade now, with
only minor improvements, and seems largely ignored in the wider
scheme of things. I guess they're not that useful in practice.

And just because we're adding new features does not mean we're
not also finding places to optimize our code.
Mutex/Queue/SizedQueue/ConditionVariable are already faster in
trunk because of preparation work to make them auto-Fiber aware:

https://bugs.ruby-lang.org/issues/13517
https://bugs.ruby-lang.org/issues/13552

Why can't stdlib be fixed? Just because we need to support old
behaviors and APIs does not mean we cannot improve things.

Having a solid stdlib is a great way to improve core and
vice-versa, and helps us bridge the gap for end user code.

Finally, keep in mind there are Rubyists who are not
enthusiastic users willing to explore, they're the
"distro users". It'll be easier for them to pick up Ruby
and use Ruby apps if stdlib were better.

Despite using Perl more than Ruby, I'm a conservative "distro
user" myself with Perl. So I'm hesitant to use or depend on
stuff which isn't packaged by distros, especially when it comes
to end user convenience (some who do not even know or care about
what a programming language is).

So yes, I still write Perl 5.8-compatible code, and still
support legacy CentOS 5.x and 6.x systems.

#11 [ruby-core:81672] Updated by ioquatix (Samuel Williams) 8 months ago

I appreciate your detailed response; it was interesting.

Does Ruby File.read and File.stat (and others) release the GVL? Otherwise, the performance benefit of multiple threads in this specific case is irrelevant. While I agree with you when writing high performance servers in C/C++, it might not be directly relevant to Ruby as it currently stands.

#12 [ruby-core:81674] Updated by normalperson (Eric Wong) 8 months ago

samuel@oriontransfer.org wrote:

Does Ruby File.read and File.stat (and others) release the GVL? Otherwise, the performance benefit of multiple threads in this specific case is irrelevant. While I agree with you when writing high performance servers in C/C++, it might not be directly relevant to Ruby as it currently stands.

File.read does. File.stat does not, at the moment. I tried
it a while back but the GVL is expensive to release for hot
cache situations(*).

File.open, IO.copy_stream, IO#write, IO#read, readpartial, sysread,
syswrite all release GVL, too.

In particular, IO.copy_stream is great for large, parallel
transfers to/from high-latency storage.
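
As a rough sketch of that pattern with native Threads (parallel_copy is a hypothetical helper name, and the temporary files stand in for real high-latency storage):

```ruby
require "tmpdir"

# Copy every source file into dest_dir, one Thread per file.
# IO.copy_stream releases the GVL while waiting on the kernel, so the
# transfers can overlap instead of running one after another.
def parallel_copy(sources, dest_dir)
  threads = sources.map do |src|
    Thread.new do
      dst = File.join(dest_dir, File.basename(src))
      IO.copy_stream(src, dst) # returns the number of bytes copied
    end
  end
  threads.map(&:value) # join each thread and collect its byte count
end

Dir.mktmpdir do |dir|
  src_dir = File.join(dir, "src")
  dst_dir = File.join(dir, "dst")
  Dir.mkdir(src_dir)
  Dir.mkdir(dst_dir)

  sources = 3.times.map do |i|
    path = File.join(src_dir, "file#{i}")
    File.write(path, "x" * (1024 * (i + 1)))
    path
  end

  p parallel_copy(sources, dst_dir)
end
```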

(*) the cost of GVL for quick ops is a big reason I want to get rid of it

But yeah, maybe the small regression from releasing GVL is
acceptable for now with File.stat. It's better than getting
stuck on NFS or slow disks.

File.rename, File.unlink, most Dir methods all have the same
problem with slow storage, too. We already pay the price
for small regressions when releasing GVL in current cases,
so maybe those can be GVL release points.

#13 [ruby-core:81687] Updated by ioquatix (Samuel Williams) 8 months ago

Thanks for your detailed reply. It's impressive and useful that you have such a good knowledge of these issues.

I spent some time just thinking about this issue, and how this feature tries to solve the problem in Ruby.

On the one hand, I'm fundamentally opposed to increasing the surface area of Ruby when it could be done by writing a gem. This has a massive upstream cost, affecting both JRuby and Rubinius. While I appreciate what you are saying w.r.t. maximising usage, I feel like building this into Ruby will cause stagnation of progress long term - one solution for all problems isn't always ideal. Seeing initiatives like stdgems.org only reinforces how I feel about this.

Generally speaking - I really appreciate the work that's been done here. I also feel like you've reinvented nio4r, async and a bunch of other stuff, at a very low level, without as much testing, compatibility, etc.

Ideally, we could move all socket related code into a gem - perhaps that's already on the cards e.g. stdgems. Once that's done, fixing issues like exceptions: false would be easier since it can be versioned.

I was thinking about how we could expose this to Ruby - and ideally, I think we should add two functions:

IO.wait_for_single_fd and IO.wait_for_pid. The C functions rb_wait_for_single_fd and rb_waitpid would invoke these functions, and these functions would implement the current logic of the current C functions. It probably makes sense to think in more detail how these functions should work - e.g. wait_for_multiple_fds (or select), or something more elaborate.

Then, we could allow things like async and auto-fibers to extend Ruby's IO system to provide a policy for blocking IO. auto-fibers could be implemented as a gem with a C extension.

What do you think?

#14 [ruby-core:81695] Updated by normalperson (Eric Wong) 8 months ago

samuel@oriontransfer.org wrote:

Thanks for your detailed reply. It's impressive and useful
that you have such a good knowledge of these issues.

No problem.

I spent some time just thinking about this issue, and how this
feature tries to solve the problem in Ruby.

On the one hand, I'm fundamentally opposed to increasing the
surface area of Ruby when it could be done by writing a gem.
This has a massive upstream cost, affecting both JRuby and
Rubinius. While I appreciate what you are saying w.r.t.
maximising usage, I feel like building this into Ruby will
cause stagnation of progress long term - one solution for all
problems isn't always ideal. Seeing initiatives like
stdgems.org only reinforces how I feel about this.

I understood it was already decided by matz and ko1 to do
something along the lines of auto-Fiber. Though I can't find
ko1's original message in the archives, it's mostly quoted in
my reply to him:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/80531

I should note some languages like Go, Erlang, Haskell, and the
afore-mentioned Crystal all have lightweight threading along
these lines in the core language.

In their current state, Fibers are much less useful than the
equivalents in those languages; while native Threads are too
expensive. Something in between Fibers and Threads seems
desirable; maybe we can give auto-Fiber another (short) name;
but I'm not sure it's necessary.

I was also influenced to explore lightweight threading in a
rack-devel thread and the responses James Tucker wrote to me:

Subject: big responses to slow clients: Rack vs PSGI

It's somewhere in https://groups.google.com/group/rack-devel but
that requires JS; so I can't view or link to it using w3m :<

Generally speaking - I really appreciate the work that's been
done here. I also feel like you've reinvented nio4r, async and
a bunch of other stuff, at a very low level, without as much
testing, compatibility, etc.

That's a fair point about less testing and compatibility.
But, I think there is more code using normal Ruby stdlib
that can automatically take advantage of these changes
so we'll be able to nail down any problems quickly.

On a technical level, I consider the design of libev (used by
nio4r and async) too limited in that it does not take advantage
of thread-safety baked into kqueue and epoll. Thinking in terms
of "events vs. threads" is too limiting. As I've said before;
combining them is advantageous because both have their uses.
kqueue is a thread-friendly queue, so is epoll.

This feels like the microkernel vs monolithic kernel debate,
too. On one level, isolation and compartmentalization provided
by micro-kernels is appealing; but the ease-of-development of a
monolith allowed Linux to become the kernel for nearly
everything, from tiny IoT devices to giant supercomputers.

And that doesn't preclude things like loadable modules and FUSE
for userspace filesystems from being useful, despite core
filesystem drivers being bundled with Linux. So I think async
can still be supported as an alternative for Ruby; but the
bundled implementation can benefit more from tighter integration
into the core.

A more recent example might be git; which included high-level
non-essential "porcelain" tools early on in addition to the core
"plumbing". Initially, it was intended that separately
maintained wrappers such as "cogito", would implement the
porcelain UI bits and git would remain low-level plumbing. That
ended up making both development and usage more complicated.
Eventually git swallowed up most of the cogito functionality and
cogito was abandoned.

git also ended up with bundled functionality that would've been
separately packaged in other VCSes, including import/export
tools for email, CVS, SVN, etc.

The most relevant example from git might be the bundling of
libxdiff in git, allowing optimizations and tweaks not possible
with an external diff. However, GIT_EXTERNAL_DIFF still remains
supported for less-common use cases.

On a non-technical level:

Finally, this (ruby-core) is one of the few places I can still
contribute to in the Ruby world. All other relevant Ruby
projects require running non-Free software (including JS) and
having to accept Terms-of-Service set by a corporation.

Fwiw, I agree with Rubinius philosophy of implementing more of
Ruby in Ruby and would rather contribute to that; but the above
is a huge factor in why I went on to work on C Ruby, instead.
(the other major factor is I strongly prefer C to C++).

Ideally, we could move all socket related code into a gem -
perhaps that's already on the cards e.g. stdgems. Once that's
done, fixing issues like exceptions: false would be easier
since it can be versioned.

Maybe that'll be done, too, but not my call.
But what about IO.pipe, backtick, and IO.popen?

I was thinking about how we could expose this to Ruby - and
ideally, I think we should add two functions:

IO.wait_for_single_fd and IO.wait_for_pid. The C functions
rb_wait_for_single_fd and rb_waitpid would invoke these
functions, and these functions would implement the current
logic of the current C functions. It probably makes sense to
think in more detail how these functions should work - e.g.
wait_for_multiple_fds (or select), or something more
elaborate.

Maybe.. I guess we already have IO#wait_able in io/wait; and
Process.wait/IO.select is already possible to override and that
would have the same effect. We'd also have to expose the
optional read/write buffering + encoding conversion and make
that accessible to pure Ruby.

It would make C Ruby feel closer to Rubinius and that would be
nice :) I'm not sure how feasible it would be to introduce
more Ruby-visible APIs to implement this.

And I think exposing more APIs to handle FDs directly is a
mistake in the presence of native threads. My proposed C API
prefers "int *fd" and "rb_io_t" to deal with close notification
handling. Multithreaded programs recycle FDs frequently and
internal APIs need to be prepared to deal with that.

The implementation I proposed also takes advantage of some
C-only optimizations such as reading/writing to memory across
Fiber stack boundaries: something which cannot be done with
higher-level APIs. Similar optimizations already landed for
thread_sync.c (Mutex/Queue) as well as IO#close in trunk.

Again, designing user-visible APIs is most difficult and
ruby-core have to think most about long-term support and
consequences.

So the difficulty of changing/adding APIs is:

1) internal C API (easiest)
2) public C API (difficult)
3) Ruby API (most difficult)

So, I've mainly done 1) and made minimal additions to 3).
Only changes to 2) are to internal behavior, so use from C
extensions remains the same.

Then, we could allow things like async and auto-fibers to
extend Ruby's IO system to provide a policy for blocking IO.
auto-fibers could be implemented as a gem with a C
extension.

What do you think?

I guess this is meant for matz and ko1.

We could actually have that today; and I guess you already have
that with async. All the IO methods are well-documented and
you can even ignore/override the existing IO buffering if you
override all the methods by monkey patching core classes.
Heck, you may even go as far as to never allocate rb_io_t
if you override IO.open/IO.pipe/*Socket.new/... and replace
them with your own class.

What I think is (or at least ought to be) irrelevant.

I only give matz and ko1 another option to choose from. We can wait
for matz and ko1 to decide what to do, maybe they'll discuss
this at: https://bugs.ruby-lang.org/projects/ruby/wiki/DevelopersMeeting20170616Japan
I certainly won't attend meetings or try to influence anybody
using anything besides plain-text messages, here.

#15 [ruby-core:81721] Updated by ioquatix (Samuel Williams) 8 months ago

Ruby Fibers as they currently stand are perfect and making them more complex is a mistake IMHO.

Let's be clear on this: auto-fibers are really just Fibers that yield when you call a blocking operation. It's as if you are rewriting the blocking function to call Fiber.yield.. and as you have implemented by overriding rb_wait_for_single_fd which invokes something to resume the fiber when the blocking function is done. This is exactly what async does, but it does it the only way currently possible - by wrapping around _nonblock methods. It's the reverse of what your proposed method does - by handling rb_wait_for_single_fd. Because I can't access that method from async without writing C, my choice is limited. But, if it was available, async could use it successfully.

I appreciate what you said about multi-thread multi-fiber execution using your proposed reactor design. I think it's good and it's probably better than libev. It's excellent that you have thought about how to solve these problems and I admire it. However, in my experience, libev is fast enough and n-m concurrency model is fast enough for Ruby. Until Ruby is several orders of magnitude faster, it won't make much difference, except perhaps a tiny bit of latency, but there are benefits to keeping a single request on a single thread or process - you can avoid having to deal with locking and other synchronisation primitives in some cases, e.g. caches. So, there are tangible benefits to using, say, m-process n-fibers vs n-fibers/m-threads model. Ruby has never really suited multi-threaded model unfortunately.

Just to be clear: I'm more interested in semantics than implementation. Get the semantics right and the correct implementation will follow. I see a lot of work done here on an implementation (which is awesome and it looks good), but I'm not completely clear that the semantics are really sound.

In contrast, Async is all about getting the right semantics and finding the implementation that suits.
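
For readers unfamiliar with the technique, here is a minimal sketch of "wrapping around _nonblock methods" with plain Fibers. It is not async's actual implementation; fiber_gets and run_fibers are hypothetical names, and the scheduler only handles readability and assumes each IO is awaited by at most one Fiber and that no Fiber returns an IO as its final value:

```ruby
# Read one line from io without blocking the thread: when the read
# would block, yield the IO so the scheduler can resume us later.
def fiber_gets(io)
  buf = +""
  until buf.end_with?("\n")
    begin
      buf << io.read_nonblock(1)
    rescue IO::WaitReadable
      Fiber.yield(io) # park this Fiber until io is readable
    rescue EOFError
      break
    end
  end
  buf
end

# Tiny scheduler: resume each Fiber until it parks on an IO, then use
# IO.select to decide which parked Fibers can run again.
def run_fibers(fibers)
  results = []
  waiting = {} # IO => Fiber parked on that IO

  step = lambda do |fiber|
    out = fiber.resume
    if out.is_a?(IO)
      waiting[out] = fiber
    else
      results << out # the Fiber finished; out is its return value
    end
  end

  fibers.each { |fiber| step.call(fiber) }
  until waiting.empty?
    readable, = IO.select(waiting.keys)
    readable.each { |io| step.call(waiting.delete(io)) }
  end
  results
end

r1, w1 = IO.pipe
r2, w2 = IO.pipe
fibers = [
  Fiber.new { [:first,  fiber_gets(r1)] },
  Fiber.new { [:second, fiber_gets(r2)] },
]

# Write from another thread after a short delay so both Fibers park first.
writer = Thread.new do
  sleep 0.05
  w2.puts "world"
  w1.puts "hello"
end

p run_fibers(fibers).to_h
writer.join
```

The proposal in this ticket moves the yield/resume step from the Ruby-level wrapper (fiber_gets here) down into rb_wait_for_single_fd, so unmodified blocking calls would park the Fiber instead.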

#16 [ruby-core:81732] Updated by normalperson (Eric Wong) 8 months ago

samuel@oriontransfer.org wrote:

I appreciate what you said about multi-thread multi-fiber
execution using your proposed reactor design. I think it's
good and it's probably better than libev. It's excellent that
you have thought about how to solve these problems and I
admire it. However, in my experience, libev is fast enough and
n-m concurrency model is fast enough for Ruby. Until Ruby is
several orders of magnitude faster, it won't make much
difference, except perhaps a tiny bit of latency, but there
are benefits to keeping a single request on a single thread or
process - you can avoid having to deal with locking and other
synchronisation primitives in some cases, e.g. caches. So,
there are tangible benefits to using, say, m-process n-fibers
vs n-fibers/m-threads model. Ruby has never really suited
multi-threaded model unfortunately.

Just one correction; auto-Fiber does not migrate fibers or
migrate userspace(*) I/O operations across native threads at the
moment. You might be confusing this with my other
non-Fiber-using server designs which do migrate I/O operations
across threads.

For auto-fiber, there's minimal locking requirements even if we
get rid of GVL. It relies on locking already done by the
kernel; kqueue will require extra locking in the corner case
where read and write filters are both installed for an FD.

(*) Of course, Linux kernel soft IRQ handlers can migrate work
across cores in the background.

Just to be clear: I'm more interested in semantics than
implementation. Get the semantics right and the correct
implementation will follow. I see a lot of work done here on
an implementation (which is awesome and it looks good), but
I'm not completely clear that the semantics are really sound.

Anyways, it looks like matz is inclined to accept it; but ko1
wants some semantic tweaks with the API (but I'm not sure
what/how, exactly).

https://docs.google.com/document/d/1z19pKt8jlpiEUR3RnWWBCfs3OR_hbiAZMwpQ6ZTllP0/pub

(I've only viewed it with w3m, no idea if I'm missing anything
due to lack of JS)

#17 [ruby-core:81826] Updated by normalperson (Eric Wong) 8 months ago

Updated patch against r59201:
https://80x24.org/spew/20170629043509.14939-1-e@80x24.org/raw

matz/ko1: any idea on what changes to the Ruby API you guys want?

Anyways, I will make IO.select / rb_thread_fd_select auto-Fiber aware sometime soonish...

#18 [ruby-core:82028] Updated by ko1 (Koichi Sasada) 7 months ago

sorry for long absent about this topic. it is hard task (hard to start writing up because of problem difficulties and my English skill ;p ) to summarize about this topic.

I try to write step by step.


Discussion at last developers meeting

Thread/Fiber switch safety

Koichi: (repeat my opinion about difficulty of thread/fiber safety)

akr: providing better synchronize mechanism (such as go-lang has) and encouraging safe parallel computation seems better.

Koichi: It is one possible solution but my position is "if people can shoot their foot, people will shoot".

Matz: I don't like to force people to use lock and so on.

(the point is Matz doesn't reject "-safe" approach)

Introduce restriction

(The following idea is not available at last meeting (only part of idea I showed))

Koichi:
The problem of this feature is mind gap using auto-fiber user and script writer. This is same as thread-safety. Person A consider the code is auto-fiber safe, and other person B (can be same as A) write a code without auto-fiber safety, then it will be problem.

In general, most of existing libraries are not auto-fiber safe code (it doesn't mean most of libraries are not auto-fiber safe. Many code are auto-fiber safe without any care).

If we can know a code (and code called by this code) is auto-fiber safe, we can use auto-fiber in safe.

There are three type of code.

  • (1) don't care about auto-fiber
  • (2) auto-fiber aware code (assume switching is not allowed at the beginning)
  • (3) auto-fiber aware code (don't care it is allowed or not allowed to switch)

There are three types of status.

  • (a) can't switch
  • (b) can enable to switch, but don't switch
  • (c) can switch

in matrix

    can switch / can enable switch
(a) can't      / can't
(b) can't      / can
(c) can        / ??

matrix with (1-3) and (a-c)

     (a)     (b)     (c)
(1)   OK      NG      NG
(2)   OK      OK      NG
(3)   OK(*1)  OK(*1)  OK

(1)-(b) and (1)-(c) is not accepted because other method called from this code can switch the context.
(2)-(c) is also unacceptable because the beginning of code is not auto-fiber aware.

*1) Possible problem: (3) can introduce dead-lock problem because it can stop forever.
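
One way to read the second matrix is as a lookup table. The sketch below only encodes the OK/NG cells and the (*1) caveat as written above; the constant and method names are hypothetical and are not part of any proposed syntax:

```ruby
# Rows: code types (1)-(3). Columns: switch states (a)-(c).
# true = OK, false = NG, exactly as in the matrix above.
SWITCH_MATRIX = {
  1 => { a: true, b: false, c: false }, # code that doesn't care about auto-fiber
  2 => { a: true, b: true,  c: false }, # aware, assumes no switching at the start
  3 => { a: true, b: true,  c: true  }, # aware, fine either way
}.freeze

# Cells marked (*1): accepted, but with a possible dead-lock problem.
DEADLOCK_CAVEAT = [[3, :a], [3, :b]].freeze

# Is running code of `code_type` acceptable in switch state `state`?
def combination_ok?(code_type, state)
  SWITCH_MATRIX.fetch(code_type).fetch(state)
end

# Does the accepted combination carry the (*1) caveat?
def deadlock_caveat?(code_type, state)
  DEADLOCK_CAVEAT.include?([code_type, state])
end

SWITCH_MATRIX.each_key do |type|
  row = %i[a b c].map { |s| combination_ok?(type, s) ? "OK" : "NG" }
  puts "(#{type}) #{row.join(' ')}"
end
```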

Normal threads start from (a).
Auto-fibers start from (b). They are written in (1), (2) and (3). Maybe (2) is written for auto fiber top-level. This code will call some async methods which can change context.

My proposal is, to write down explicitly of (1) to (3) and (a) to (c) in program.

At the meeting, I proposed non-matured keywords(-like) to control them.
(and just now I don't have good syntax for it yet)

akira: If we introduce such keywords, we need to rewrite all of code if we want to use auto-fiber web application request handler (for example, we need to rewrite Rails to run on auto-fiber based rack server).

Matz: it is unacceptable to introduce huge rewriting for existing code.

(IMO (not appeared in last meeting) we need to rewrite all of code even if we don't introduce keywords to make sure the auto-fiber safety)

after this discussion

Matz and I discussed about this issue, and we conclude that it is too early to introduce this feature on Ruby 2.5.


I want to consider this issue further. auto-fiber based guild is one possibility, this mean we can introduce object isolation and context switching each other.

#19 [ruby-core:82040] Updated by normalperson (Eric Wong) 7 months ago

ko1@atdot.net wrote:

sorry for long absent about this topic. it is hard task (hard
to start writing up because of problem difficulties and my
English skill ;p ) to summarize about this topic.

No problem, thank you for summarizing.

I try to write step by step.


Discussion at last developers meeting

Thread/Fiber switch safety

Koichi: (repeat my opinion about difficulty of thread/fiber safety)

akr: providing better synchronize mechanism (such as go-lang
has) and encouraging safe parallel computation seems better.

Koichi: It is one possible solution but my position is "if
people can shoot their foot, people will shoot".

I think your approach is too cautious.

We already have many dangerous things in Ruby, even in
single-threaded code. For example: File.read, IO#read, IO#gets
are all dangerous with no size limit: they can cause
out-of-memory or swapping on gigantic inputs, leading to DoS.

Fork and inadvertent sharing of open files/sockets can also
cause problems. And there are also pathological Regexp which
can cause unbound CPU usage.
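
To illustrate the unbounded-read hazard mentioned above, here is a small sketch of the usual mitigation: pass an explicit length to IO#read. The helper name read_bounded and the 64-byte limit are arbitrary choices for the example:

```ruby
require "stringio"

# Read at most `limit` bytes from io. One extra byte is requested so
# oversized input is detected without reading all of it into memory.
def read_bounded(io, limit)
  data = io.read(limit + 1) || +"" # IO#read(n) returns nil at EOF
  raise ArgumentError, "input exceeds #{limit} bytes" if data.bytesize > limit
  data
end

puts read_bounded(StringIO.new("small input"), 64)

begin
  read_bounded(StringIO.new("x" * 1000), 64)
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

IO#gets also accepts a limit argument for the same reason.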

Matz: I don't like to force people to use lock and so on.

(the point is Matz doesn't reject "-safe" approach)

Introduce restriction

(The following idea is not available at last meeting (only
part of idea I showed))

Koichi:

The problem of this feature is mind gap using auto-fiber user
and script writer. This is same as thread-safety. Person A
consider the code is auto-fiber safe, and other person B (can
be same as A) write a code without auto-fiber safety, then it
will be problem.

In general, most of existing libraries are not auto-fiber safe
code (it doesn't mean most of libraries are not auto-fiber
safe. Many code are auto-fiber safe without any care).

Right; most code does not have to care; and all these dangers
already exist with native Threads.

If we can know a code (and code called by this code) is
auto-fiber safe, we can use auto-fiber in safe.

There are three type of code.

  • (1) don't care about auto-fiber
  • (2) auto-fiber aware code (assume switching is not allowed at the beginning)
  • (3) auto-fiber aware code (don't care it is allowed or not allowed to switch)

There are three types of status.

  • (a) can't switch
  • (b) can enable to switch, but don't switch
  • (c) can switch

in matrix

    can switch / can enable switch
(a) can't      / can't
(b) can't      / can
(c) can        / ??

matrix with (1-3) and (a-c)

     (a)     (b)     (c)
(1)   OK      NG      NG
(2)   OK      OK      NG
(3)   OK(*1)  OK(*1)  OK

(1)-(b) and (1)-(c) is not accepted because other method called from this code can switch the context.
(2)-(c) is also unacceptable because the beginning of code is not auto-fiber aware.

*1) Possible problem: (3) can introduce dead-lock problem because it can stop forever.

Perhaps holding Mutex lock should disable auto-fiber switching.
This should prevent deadlocks, I think.

Existing code has Mutexes, so I'm not sure how they should
interact with auto-Fiber. I agree with Matz that we should
discourage locking, so I guess disabling auto-Fiber switch
while Mutex is held is the most straightforward solution.

Normal threads start from (a). Auto-fibers start from (b).
They are written in (1), (2) and (3). Maybe (2) is written for
auto fiber top-level. This code will call some async methods
which can change context.

My proposal is, to write down explicitly of (1) to (3) and (a)
to (c) in program.

At the meeting, I proposed non-matured keywords(-like) to control them.
(and just now I don't have good syntax for it yet)

akira: If we introduce such keywords, we need to rewrite all
of code if we want to use auto-fiber web application request
handler (for example, we need to rewrite Rails to run on
auto-fiber based rack server).

Matz: it is unacceptable to introduce huge rewriting for existing code.

I agree completely with akira's observation and Matz's opinion
of this.

(IMO (not appeared in last meeting) we need to rewrite all of
code even if we don't introduce keywords to make sure the
auto-fiber safety)

I don't agree with this. A lot of code is already auto-fiber
safe because they are written with GVL+Threads in mind.
(see my original Net::HTTP example); and we also have a lot
of code (webrick, net/*) which worked fine with green Threads
in 1.8

Worst case is we release GVL in a native Thread and forget to
yield to other Fibers in the same Thread. However, that is
already a problem with existing code when run inside Fibers
(e.g. getaddrinfo, IO operations on NFS/slow-disk, ...)

I am working on making rb_thread_fd_select auto-fiber aware,
too. (done for iom_select/iom_epoll, working on iom_kqueue)

Matz and I discussed about this issue, and we conclude that it
is too early to introduce this feature on Ruby 2.5.

OK, I will continue to work on implementation improvements
and keep patches rebased to trunk.

I want to consider this issue further. auto-fiber based guild
is one possibility, this mean we can introduce object
isolation and context switching each other.

Do you think this is in the 2.5 timeline?

Thank you.

#20 [ruby-core:82214] Updated by ioquatix (Samuel Williams) 7 months ago

I am following this thread and I find it really fascinating.

Thanks everyone for thinking about these issues and Eric for your insightful work and ideas. Just as an aside, I feel like something is being lost in translation w.r.t. the response from Matz and other core Ruby developers. Perhaps we need to have a hangout to discuss these ideas.

I've just released async, async-io and async-dns 1.0.0, along with rubydns 2.0.0 - in addition to this there is also async-http (client and server library) and falcon, a rack compatible server, built on top of async. The http library lacks support for SSL so it's not 1.x yet - still working on that part.

It works on Ruby 2.0+, and most of it also works on JRuby, excepting JRuby's missing support for UDP sockets (https://github.com/jruby/jruby/pull/4684).

I would like to think async is a proof of concept of what is possible with Ruby, in terms of performance. I think it's a solid platform for making network clients and servers, and I've implemented both DNS client/server and HTTP client/server which provide useful test cases for both performance and design.

In terms of design, it's a very simple concept to use with an API that works as if it's sequential, but yields if the operation would block. The user almost cannot make any mistakes, and implementing complex network logic becomes trivial.

In terms of performance, there are few comparisons I can make. If you like more details, let me know. I'm going to be matter of fact, you can draw your own conclusions.

  • RubyDNS is about as fast as Bind for a trivial benchmark resolving a fixed set of IP addresses.
  • Falcon is as fast as Puma but scales significantly better especially if non-blocking IO is leveraged.
  • Falcon and Puma both process requests significantly faster than typical Rack middleware can cope with them. An example would be, Falcon can easily handle 30,000 conn/s on my 8-core workstation, but as soon as I put any non-trivial rack application behind it, it would drop to < 3000 conn/s. Falcon can handle up to 100,000 req/s on the same hardware (e.g. using keep alive).
  • I implemented a complete stack in C++ of the same concept, and it achieved roughly on 1 core what Ruby required 8 cores. That is, a single process/thread could handle 25,000 conn/s on 1 core, and about 90,000 req/s. So, Ruby is about 10x slower than similar C++ code.

Eric, my opinion at this point is that the work you've done here is awesome.

What I would personally like to see, is a backend, perhaps an alternative to nio4r, which, as an example, async could use to implement its reactor. I think that when your selector is running for the current fiber, operations like wait_for_pid and wait_one_fd should be hijacked and go via the reactor. I think it should be possible for nio4r to tap into this too somehow. This would make things completely transparent for the user.

I still believe this should be a gem - even if it's an official one distributed with Ruby, and that Ruby should expose the relevant hooks. Otherwise, it's going to make a lot of trouble for other implementations e.g. JRuby, Rubinius, etc. Ideally they can just expose the same low-level hooks at the VM level.

I would like to say at this point, with the release of async & (-*) 1.0, I believe that this concept has proven itself - e.g. that the implementation works, that it has good performance, and that it can be used to implement good composable libraries. Whatever form the final library takes, I hope that it is (a) modular (b) fast and (c) composable.

One final opinion that I've formed while working on this project, is that Ruby IO primitives are overly complex and fail to expose the right abstraction. *_nonblock methods never should have existed. If there is one thing I'd wish for, it's that once a decent asynchronous library is adopted, that these methods are not made part of its public API. async does forward these methods, but it's only to make wrapping existing Net::HTTP work better, and essentially the x_nonblock variant is identical to the x method in async.

#21 [ruby-core:82215] Updated by ioquatix (Samuel Williams) 7 months ago

Just to add, Puma has a HTTP parser (and perhaps other bits) written in C, while Falcon is pure Ruby, yet Falcon has better/similar performance in my (hopefully unbiased) tests. Additionally, Falcon had significantly lower latency, and the C++ implementation even more so.

#22 [ruby-core:82518] Updated by mame (Yusuke Endoh) 6 months ago

I comment in compliance with hsbt's request.

Basically I agree with ko1; Thread is considered harmful. Casual Rubyists (including I) had better not use it.

However, I'm not against introducing the feature in question as a professional feature for mature Rubyists.

One issue that I'm concerned about is, the name. (Sorry, but this is an important point to me!) Fiber is fiber because the programmer manages its control flow completely. "Auto-fiber" looks self-contradictory to me. For example, MSDN says:

A fiber is a unit of execution that must be manually scheduled by the application.
https://msdn.microsoft.com/ja-jp/library/windows/desktop/ms682661(v=vs.85).aspx

I believe that this feature should be introduced with another name. I have no counterproposal, though. Sorry.

#23 [ruby-core:82552] Updated by normalperson (Eric Wong) 6 months ago

mame@ruby-lang.org wrote:

I believe that this feature should be introduced with another
name. I have no counterproposal, though. Sorry.

What about Thriber? Or Fred?

"Fread" might be confused with fread(3) function, and I don't
know anybody named "Fred", so it is a safe name to choose :)

#24 [ruby-core:82756] Updated by normalperson (Eric Wong) 5 months ago

Eric Wrong normalperson@yhbt.net wrote:

mame@ruby-lang.org wrote:

I believe that this feature should be introduced with another
name. I have no counterproposal, though. Sorry.

What about Thriber? Or Fred?

"Fread" might be confused with fread(3) function, and I don't
know anybody named "Fred", so it is a safe name to choose :)

OK, "class Fred" occurs in object.c documentation already,
so maybe it is confusing. So I choose Thriber as a name:

https://80x24.org/spew/20170912053032.13622-1-e@80x24.org/raw

That patch contains the latest version of this feature rebased
against ko1's recent execution context changes in trunk (up to
r59844) along with some bugfixes (infinite wait fix).

It also adds rb_thread_fd_select as a scheduling point
(in addition to rb_wait_for_single_fd and rb_waitpid from
previously published patches). Only lightly tested,
more tests will need to be written...

Naming is hard :<

Pull request available below for git users:

The following changes since commit 65b11a04f10a2438f0d6ba263a78d16367c3aac0:

console.c: set winsize on Windows (2017-09-11 20:10:34 +0000)

are available in the git repository at:

git://80x24.org/ruby thriber

for you to fetch changes up to d9c0095537c3c01d2187e783910cdc92d6c545fc:

thriber: green threads implemented using fibers (2017-09-12 05:29:31 +0000)


Eric Wrong (1):
thriber: green threads implemented using fibers

common.mk | 7 +
configure.in | 32 +
cont.c | 123 ++-
include/ruby/io.h | 2 +
iom.h | 95 +++
iom_common.h | 204 +++++
iom_epoll.h | 697 ++++++++++++++++
iom_internal.h | 280 +++++++
iom_kqueue.h | 899 +++++++++++++++++++++
iom_pingable_common.h | 54 ++
iom_select.h | 448 ++++++++++
prelude.rb | 12 +
process.c | 15 +-
signal.c | 39 +-
.../wait_for_single_fd/test_wait_for_single_fd.rb | 62 ++
test/lib/leakchecker.rb | 9 +
test/ruby/test_thriber.rb | 274 +++++++
thread.c | 76 +-
thread_pthread.c | 5 +
vm.c | 9 +
vm_core.h | 4 +
21 files changed, 3324 insertions(+), 22 deletions(-)
create mode 100644 iom.h
create mode 100644 iom_common.h
create mode 100644 iom_epoll.h
create mode 100644 iom_internal.h
create mode 100644 iom_kqueue.h
create mode 100644 iom_pingable_common.h
create mode 100644 iom_select.h
create mode 100644 test/ruby/test_thriber.rb
--
Mr. Wrong

#25 [ruby-core:83034] Updated by normalperson (Eric Wong) 5 months ago

I've updated the series to support FIBER_USE_NATIVE=0 (along
with the proposed fix for [Bug #13887]).

The primary change for FIBER_USE_NATIVE=0 platforms is to move
away from cross stack linked-list manipulation and use the
heap for allocations, instead. This involved some structure
modifications to make rb_thread_fd_select work on select(2)-based
implementations. Of course, this increases the dependency on
rb_ensure to release heap memory.

FIBER_USE_NATIVE=1 platforms are still more important and faster,
of course.

I've tested on Debian 8.x and FreeBSD 11.0. Test reports from
other platforms appreciated, thank you.

Patch mbox (gzipped):

https://80x24.org/spew/20170928004228.4538-1-e@80x24.org/t.mbox.gz

...or "git request-pull"-generated pull request:

The following changes since commit d21aab2d3e007372973f2b803d7d8d7f9547f0cc:

  • 2017-09-28 (2017-09-27 21:55:33 +0000)

are available in the git repository at:

git://80x24.org/ruby thriber-copy

for you to fetch changes up to 20ea4d710d3d75d946f74346e6a6f3616dac682d:

thriber: non-native fiber support (2017-09-28 00:41:34 +0000)


Eric Wrong (3):
thriber: green threads implemented using fibers
thread_pthread: do not corrupt stack
thriber: non-native fiber support

common.mk | 9 +
configure.in | 32 +
cont.c | 173 ++--
fiber.h | 54 ++
include/ruby/io.h | 2 +
iom.h | 95 +++
iom_common.h | 228 ++++++
iom_epoll.h | 710 ++++++++++++++++
iom_internal.h | 372 +++++++++
iom_kqueue.h | 907 +++++++++++++++++++++
iom_pingable_common.h | 49 ++
iom_select.h | 473 +++++++++++
prelude.rb | 12 +
process.c | 15 +-
signal.c | 39 +-
.../wait_for_single_fd/test_wait_for_single_fd.rb | 62 ++
test/lib/leakchecker.rb | 9 +
test/ruby/test_thriber.rb | 274 +++++++
thread.c | 76 +-
thread_pthread.c | 10 +-
vm.c | 9 +
vm_core.h | 4 +
22 files changed, 3541 insertions(+), 73 deletions(-)
create mode 100644 fiber.h
create mode 100644 iom.h
create mode 100644 iom_common.h
create mode 100644 iom_epoll.h
create mode 100644 iom_internal.h
create mode 100644 iom_kqueue.h
create mode 100644 iom_pingable_common.h
create mode 100644 iom_select.h
create mode 100644 test/ruby/test_thriber.rb

#26 [ruby-core:84118] Updated by normalperson (Eric Wong) 2 months ago

Too late for 2.5, but I'll maintain and periodically rebase this
in hope it can be accepted for 2.6. I've updated patches for
Thriber support against latest trunk (r61067)

https://80x24.org/spew/20171207041831.29005-2-e@80x24.org/raw
https://80x24.org/spew/20171207041831.29005-3-e@80x24.org/raw

Also available at the "thriber-r61067" branch on git://80x24.org/ruby

#27 [ruby-core:84149] Updated by ioquatix (Samuel Williams) 2 months ago

I think that the work being done here is great. However I feel that this PR requires far more scrutiny than it's receiving.

It's worth considering that nio4r and friends took several years to stabilise and there is a huge amount of hard earned knowledge embedded in those gems, e.g.

I am using "double" for timeout since it is more convenient for arithmetic like parts of thread.c. Most platforms have good FP, I think.

e.g. https://github.com/socketry/nio4r/issues/140

I think it's a great idea to have non-blocking evented IO. However, it's not as simple as making read/write non-blocking. How about DNS lookups? Filesystem access? The benefit of a library based approach as I proposed is that these limitations can be clearly part of the contract of a specific library, and people can make different libraries to suit their needs, but making it part of core Ruby is a slippery slope. If anything, it would be better to depend on an established solution for this, so that cases like using the system DNS resolver are handled correctly (e.g. libuv). Otherwise, this is a HUGE addition to the surface area of the ruby interpreter.

#28 [ruby-core:84153] Updated by normalperson (Eric Wong) 2 months ago

samuel@oriontransfer.org wrote:

I think that the work being done here is great. However I feel that this PR requires far more scrutiny than it's receiving.

Of course, which is why you don't see me pushing for its
inclusion in 2.5. I only present and update it so people can
test it if they're bored. And I only started working on it
because ko1 seemed interested in it at the time.

I'd be surprised if this gets into 2.6 or any release in the
future. Nobody besides you and I seems interested in discussing
this anymore; so likely it'll sit here quietly for a few more
years.

Again, I don't make API decisions, I only present options.

I am using "double" for timeout since it is more convenient for arithmetic like parts of thread.c. Most platforms have good FP, I think.

e.g. https://github.com/socketry/nio4r/issues/140

Right. We already have plenty of threading internals using FP
for timing, as well as the public Ruby APIs for IO.select and
IO#wait_*able. Internally, at least it's a minor thing to
change all the internal APIs to use "struct timespec" all around
for maximum precision.
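A rough sketch of the two representations, from the Ruby side: IO.select takes its timeout as a number of seconds (commonly a Float), while a C "struct timespec" holds whole seconds plus nanoseconds. The helper to_timespec is hypothetical and only shows the conversion; it is not taken from thread.c.

```ruby
# Split a timeout in seconds into [seconds, nanoseconds], mirroring the
# two fields of a C "struct timespec". (Hypothetical helper.)
def to_timespec(timeout)
  sec = timeout.floor
  nsec = ((timeout - sec) * 1_000_000_000).round
  # Rounding can carry into a full second.
  sec, nsec = sec + 1, 0 if nsec == 1_000_000_000
  [sec, nsec]
end

ts = to_timespec(1.5)

# The public API accepts the Float directly; nil means the wait timed out.
r, w = IO.pipe
ready = IO.select([r], nil, nil, 0.05)
```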

I think it's a great idea to have non-blocking evented IO.
However, it's not as simple as making read/write non-blocking.
How about DNS lookups? Filesystem access? The benefit of a
library based approach as I proposed is that these limitations
can be clearly part of the contract of a specific library, and
people can make different libraries to suit their needs, but
making it part of core Ruby is a slippery slope. If anything,
it would be better to depend on an established solution for
this, so that cases like using the system DNS resolver are
handled correctly (e.g. libuv). Otherwise, this is a HUGE
addition to the surface area of the ruby interpreter.

We have resolv.rb in stdlib; which was at least popular in 1.8
days. It's implemented entirely in Ruby, so it automatically
takes advantage of these Thriber changes, and has seen a fair
amount of use back in the 1.8 days (not that DNS has changed
drastically).

So really, the network I/O part is not a big, or even complex
change, it's 1.8 Threads being made an option again for
Ruby 2.x. I miss 1.8 Threads, but I also like native threads in
1.9/2.x; they each have their place. And the lightweight
threading for network I/O is what people seem to care about in
other languages (Go, Erlang). nio4r/libuv and async can still
be an option and I have no intention of breaking compatibility.

Filesystem access: out-of-scope for this...

I definitely do NOT want to try and make this use callbacks and
threadpools behind users' backs, even internally. It pessimizes
the common hot cache case (which doesn't require waiting); and
more importantly, I do not want Ruby or any library to
interfere with mountpoint-aware code.

Mountpoint awareness is 100% necessary for me so there's no
queue blocking when one native thread is doing IO on a fast FS
while another native thread is doing IO on a slow FS. I end up
with dozens or hundreds of threads, because I have dozens or
hundreds of mountpoints of different speeds. This is an
uncommon use case, I know, but some people need it and the
VM must not get in the way.
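One way to picture the mountpoint-aware approach is a dedicated queue and native thread per filesystem. The sketch below is an assumption-laden illustration, not the author's code: the class PerDeviceWorkers and its methods are hypothetical, and it assumes the device number from File.stat is a good enough key for "mountpoint".

```ruby
# One queue and one dedicated native thread per filesystem, keyed by the
# device number from File.stat. A slow device only delays its own queue.
# (PerDeviceWorkers is a hypothetical name used for illustration.)
class PerDeviceWorkers
  def initialize
    @lock = Mutex.new
    @queues = {}  # device number => Queue of jobs
    @threads = []
  end

  # Run the block on the worker thread that owns the filesystem of path.
  def submit(path, &job)
    dev = File.stat(path).dev
    queue = @lock.synchronize do
      @queues[dev] ||= begin
        q = Queue.new
        @threads << Thread.new { while (j = q.pop); j.call; end }
        q
      end
    end
    queue.push(job)
  end

  # Ask every worker to finish its queued jobs and exit.
  def shutdown
    @lock.synchronize { @queues.each_value { |q| q.push(nil) } }
    @threads.each(&:join)
  end

  def device_count
    @lock.synchronize { @queues.size }
  end
end

workers = PerDeviceWorkers.new
results = Queue.new
workers.submit(".") { results.push(File.directory?(".")) }
workers.submit(".") { results.push(Dir.entries(".").size) }
workers.shutdown
```

Because each device has its own native thread, a blocking call on one filesystem cannot stall jobs queued for another.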

So I think anything to deal with FS access specifically is
out-of-scope for this issue. We already have native Thread
support, and I use it to implement mountpoint-awareness. Some
of the GVL-release changes to File and Dir for 2.5 will help
with this (which reminds me, I still need to document some of
that in NEWS :x).

#29 [ruby-core:84980] Updated by hsbt (Hiroshi SHIBATA) 27 days ago

  • Assignee set to normalperson (Eric Wong)
  • Status changed from Open to Assigned

Hi, Eric.

We discussed your proposal at last developer meeting (Dec 26, 2017)

  • Name this "Thread", or something Thread-ish rather than Fiber-ish
  • Matz doesn't have a strong opinion on the name but prefers 2 words (auto-fiber) to a coined word like "Thriber."

Next actions:

  • Give a thread-ish name
  • Lock and queue should work with auto-fiber?
  • Is explicit context switching onto auto-fiber possible?

#30 [ruby-core:85012] Updated by normalperson (Eric Wong) 27 days ago

hsbt@ruby-lang.org wrote:

We discussed your proposal at last developer meeting (Dec 26, 2017)

Awesome news.

  • Name this "Thread", or something Thread-ish rather than Fiber-ish

So if we just use "Thread", then existing Thread becomes M:N?
I will think about that... I have many use cases for native
threads, too; but maybe they can be satisfied transparently.

  • Matz doesn't have a strong opinion on the name but prefers 2 words (auto-fiber) to a coined word like "Thriber."

Next actions:

  • Give a thread-ish name

OK, naming is hard :<

LightThread? Maybe too long...

Threadlet?

Not Thread-ish, but "Task"(*) or "Tasklet" may be a candidate.

This might take a while....

  • Lock and queue should work with auto-fiber?

I can definitely make Queues work. I think ko1 was mildly
against increasing use of Mutex.

One safety feature I was thinking about was disabling
auto-switching of Fibers while a Mutex is locked, even.

  • Is explicit context switching onto auto-fiber possible?

Yes, right now it's a subclass of Fiber so inherits
transfer/resume/yield

(*) Linux kernel uses "task" as generic term for threads, processes,
and everything in-between (different flags describe level of
sharing for clone(2))
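To illustrate the point about explicit context switching, here is a sketch of a Fiber subclass that keeps the inherited resume/yield behaviour. AutoFiberSketch is a hypothetical stand-in; the class in the patch adds scheduling on top, which is not reproduced here.

```ruby
# Hypothetical stand-in for the proposed class: it adds nothing, so it
# only demonstrates that a Fiber subclass inherits explicit switching.
class AutoFiberSketch < Fiber
end

f = AutoFiberSketch.new do |x|
  y = Fiber.yield(x + 1) # hand control back to the caller explicitly
  y * 2
end

a = f.resume(1)  # runs until Fiber.yield
b = f.resume(10) # continues after the yield; the block then returns
```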

#31 [ruby-core:85088] Updated by dsferreira (Daniel Ferreira) 25 days ago

Hi Eric,

I've been reading this issue and I'm finding it fascinating.
Let me play here the role of the ruby developer that is seeking to
understand better the asynchronous ruby capabilities.
Every time I read threads (conversations) like this one about the pros
and cons of Fibers vs Threads I tend to think: stay away from it.

When people like Koichi write comments like this:

"But most (many? some? a few?) of ruby programmer (including me) can not write correct code I believe."

or Yusuke Endoh:

"Thread is considered harmful. Casual Rubyists (including I) had better not use it."

what do these comments make us mere mortals feel?

I will speak about me. When I read such a line I tend to step away.
So yes, this situation makes me develop single threaded code as much
as possible.
I rely on libraries to handle asynchronous behaviour for me and
specially I rely extensively on the actor model.

I doubt I will change my mind unless I start to read that Thread is
good to be used or Fiber is good to be used.

When I read all this conversation and you mention corner cases that
still have problems that is a NO GO for me.

IMHO to add yet another Thread like feature it should be "The Killer Feature".

The one that we can say to the whole community: Hey people use this
thing because async is a paradise in ruby land at last.
If we don't have this it will be just another Thread, Fiber nightmare
for the very few who accept the overhead of dealing with all the
"buts".

And for the record, I use async libraries but I don't feel confident
about them either knowing that ruby core is not reliable in itself.
Production code in the enterprise world is not something to mess around with.
For me ruby core needs desperately to change this situation so I
really hope your work will be the answer for all of this I'm talking
about.
So yes, if it is, it fits in ruby core like a glove IMO. If it is not,
then we will be much worse off, because instead of 2 walking dead we will
have 3.
A 50% increase is a lot in this domain. Turns things into a joke.

So, can you please explain to us what peace of mind we will gain with
this new "light thread" in our everyday work?

Thank you very much and keep up the excellent work.
I appreciate specially the care you have in passing across your
knowledge on the subject.
Really helpful and insightful.

Note:

Your last two messages are not part of the issue in redmine. I hope my
message will be there!
It seems mine did not come in either. I'm copy-pasting it.

#32 [ruby-core:85128] Updated by ioquatix (Samuel Williams) 24 days ago

In async, I called it Async::Task. I think task is a good name for this kind of thing. In your case, you might want to consider Thread::Task, since the lexicographic nesting is similar to the logical nesting.

Regarding kqueue bugs. macOS kqueue implementation is horrendous. So, nio4r doesn't use it AFAIK.

Do you have explicit reactor, or is it implicit per-thread or per-process?

#33 [ruby-core:85081] Updated by normalperson (Eric Wong) 23 days ago

Eric Wong normalperson@yhbt.net wrote:

hsbt@ruby-lang.org wrote:

  • Name this "Thread", or something Thread-ish rather than Fiber-ish

So if we just use "Thread", then existing Thread becomes M:N?
I will think about that... I have many use cases for native
threads, too; but maybe they can be satisfied transparently.

Thinking about this even more; I don't think it's possible to
preserve round-robin recv_io/accept behavior I want from
blocking on native threads when sharing descriptors between
multiple processes.

So a new class it is...

  • Matz doesn't have a strong opinion on the name but prefers 2 words (auto-fiber) to a coined word like "Thriber."

Next actions:

  • Give a thread-ish name

OK, naming is hard :<

LightThread? Maybe too long...

Threadlet?

OK, I am liking "threadlet", and it looks like a real word:

https://www.merriam-webster.com/dictionary/threadlet
": a small thread : a delicate filament"

  • Lock and queue should work with auto-fiber?

I can definitely make Queues work. I think ko1 was mildly
against increasing use of Mutex.

How about we use Threadlet to discourage things we don't like
about normal Threads (such as Mutex, ConditionVariable, ...).

One safety feature I was thinking about was disabling
auto-switching of Fibers while a Mutex is locked, even.

s/Fibers/Threadlets/; but yes, I think it should be possible
to have something like Threadlet.exclusive { ... } to prevent
auto-switch surprises (like Thread.exclusive in 1.8)

#34 [ruby-core:85082] Updated by normalperson (Eric Wong) 23 days ago

Thinking about this even more; I don't think it's possible to
preserve round-robin recv_io/accept behavior I want from
blocking on native threads when sharing descriptors between
multiple processes.

 The following example hopefully clarifies why I care about
 maintaining blocking I/O behavior in some places despite relying
 on non-blocking I/O for light-weight threading.

 # With non-blocking accept; PIDs do not share fairly:
 $ NONBLOCK=1 ruby fairness_test.rb
 PID    accept count
 5240   55
 5220   42
 5216   36
 5242   109
 5230   57
 5208   26
 5227   53
 5212   26
 5223   46
 5236   43
 total: 493

 # With blocking accept on Linux; each process gets a fair share:
 $ NONBLOCK=0 ruby fairness_test.rb
 PID    accept count
 5271   50
 5278   50
 5275   50
 5282   49
 5286   49
 5290   49
 5295   49
 5298   49
 5303   49
 5306   49
 total: 493

 For servers which only handle one client-per-process (e.g.
 Apache prefork), unfairness is preferable because the busiest
 process will be hottest in CPU cache.

 For everything else that serves multiple clients in a single
 process, fair sharing is preferable.  This will apply to Guilds
 in the future, too.

 More information about this behavior I rely on is here:
 http://www.citi.umich.edu/projects/linux-scalability/reports/accept.html


 require 'socket'
 require 'thread'
 require 'io/nonblock'
 Thread.abort_on_exception = STDOUT.sync = true
 host = '127.0.0.1'
 srv = TCPServer.new(host, 0)
 srv.nonblock = true if ENV['NONBLOCK'].to_i != 0
 port = srv.addr[1]
 pipe = IO.pipe
 nr = 10
 running = true
 trap(:INT) { running = false }
 pids = nr.times.map do
   fork do
     pipe[0].close
     q = Queue.new # per-process Queue
     Thread.new do # dedicated accept thread
       q.push(srv.accept) while running
       q.push(nil)
     end
     while accepted = q.pop
       # n.b. a real server would do processing, here, maybe spawning
       # a new Thread/Fiber/Threadlet
       pipe[1].write("#$$ #{accepted.fileno}\n")
       accepted.close
     end
   end
 end
 pipe[1].close

 sleep(1) # wait for children to start
 cleanup = SizedQueue.new(1024)
 Thread.new do
   cleanup.pop.close while true
 end

 Thread.new do
   loop do
     cleanup.push(TCPSocket.new(host, port))
     sleep(0.01)
   rescue => e
     break
   end
 end
 Thread.new { sleep(5); running = false }

 counts = Hash.new(0)
 at_exit do
   tot = 0
   puts "PID\taccept count"
   counts.each { |pid, n| puts "#{pid}\t#{n}"; tot += n }
   puts "total: #{tot}"
 end
 case line = pipe[0].gets
 when /\A(\d+) /
   counts[$1] += 1
 else
   running = false
   Process.waitall
 end while running

#35 [ruby-core:85087] Updated by subtileos (Daniel Ferreira) 23 days ago

Hi Eric,

I've been reading this issue and I'm finding it fascinating.
Let me play here the role of the ruby developer that is seeking to
understand better the asynchronous ruby capabilities.
Every time I read threads (conversations) like this one about the pros
and cons of Fibers vs Threads I tend to think: stay away from it.

When people like Koichi write comments like this:

"But most (many? some? a few?) of ruby programmer (including me) can not write correct code I believe."

or Yusuke Endoh:

"Thread is considered harmful. Casual Rubyists (including I) had better not use it."

what do these comments make us mere mortals feel?

I will speak about me. When I read such a line I tend to step away.
So yes, this situation makes me develop single threaded code as much
as possible.
I rely on libraries to handle asynchronous behaviour for me and
specially I rely extensively on the actor model.

I doubt I will change my mind unless I start to read that Thread is
good to be used or Fiber is good to be used.

When I read all this conversation and you mention corner cases that
still have problems that is a NO GO for me.

IMHO to add yet another Thread like feature it should be "The Killer Feature".

The one that we can say to the whole community: Hey people use this
thing because async is a paradise in ruby land at last.
If we don't have this it will be just another Thread, Fiber nightmare
for the very few who accept the overhead of dealing with all the
"buts".

And for the record, I use async libraries but I don't feel confident
about them either knowing that ruby core is not reliable in itself.
Production code in the enterprise world is not something to mess around with.
For me ruby core needs desperately to change this situation so I
really hope your work will be the answer for all of this I'm talking
about.
So yes, if it is, it fits in ruby core like a glove IMO. If it is not,
then we will be much worse off, because instead of 2 walking dead we will
have 3.
A 50% increase is a lot in this domain. Turns things into a joke.

So, can you please explain to us what peace of mind we will gain with
this new "light thread" in our everyday work?

Thank you very much and keep up the excellent work.
I appreciate specially the care you have in passing across your
knowledge on the subject.
Really helpful and insightful.

Note:

Your last two messages are not part of the issue in redmine. I hope my
message will be there!

On Wed, Jan 24, 2018 at 10:01 PM, Eric Wong normalperson@yhbt.net wrote:

Thinking about this even more; I don't think it's possible to
preserve round-robin recv_io/accept behavior I want from
blocking on native threads when sharing descriptors between
multiple processes.

The following example hopefully clarifies why I care about
maintaining blocking I/O behavior in some places despite relying
on non-blocking I/O for light-weight threading.

# With non-blocking accept; PIDs do not share fairly:
$ NONBLOCK=1 ruby fairness_test.rb
PID     accept count
5240    55
5220    42
5216    36
5242    109
5230    57
5208    26
5227    53
5212    26
5223    46
5236    43
total: 493

# With blocking accept on Linux; each process gets a fair share:
$ NONBLOCK=0 ruby fairness_test.rb
PID     accept count
5271    50
5278    50
5275    50
5282    49
5286    49
5290    49
5295    49
5298    49
5303    49
5306    49
total: 493

For servers which only handle one client-per-process (e.g.
Apache prefork), unfairness is preferable because the busiest
process will be hottest in CPU cache.

For everything else that serves multiple clients in a single
process, fair sharing is preferable.  This will apply to Guilds
in the future, too.

More information about this behavior I rely on is here:
http://www.citi.umich.edu/projects/linux-scalability/reports/accept.html


require 'socket'
require 'thread'
require 'io/nonblock'
Thread.abort_on_exception = STDOUT.sync = true
host = '127.0.0.1'
srv = TCPServer.new(host, 0)
srv.nonblock = true if ENV['NONBLOCK'].to_i != 0
port = srv.addr[1]
pipe = IO.pipe
nr = 10
running = true
trap(:INT) { running = false }
pids = nr.times.map do
  fork do
    pipe[0].close
    q = Queue.new # per-process Queue
    Thread.new do # dedicated accept thread
      q.push(srv.accept) while running
      q.push(nil)
    end
    while accepted = q.pop
      # n.b. a real server would do processing, here, maybe spawning
      # a new Thread/Fiber/Threadlet
      pipe[1].write("#$$ #{accepted.fileno}\n")
      accepted.close
    end
  end
end
pipe[1].close

sleep(1) # wait for children to start
cleanup = SizedQueue.new(1024)
Thread.new do
  cleanup.pop.close while true
end

Thread.new do
  loop do
    cleanup.push(TCPSocket.new(host, port))
    sleep(0.01)
  rescue => e
    break
  end
end
Thread.new { sleep(5); running = false }

counts = Hash.new(0)
at_exit do
  tot = 0
  puts "PID\taccept count"
  counts.each { |pid, n| puts "#{pid}\t#{n}"; tot += n }
  puts "total: #{tot}"
end
case line = pipe[0].gets
when /\A(\d+) /
  counts[$1] += 1
else
  running = false
  Process.waitall
end while running

#36 [ruby-core:85094] Updated by normalperson (Eric Wong) 23 days ago

danieldasilvaferreira@gmail.com wrote:

Hi Eric,

I've been reading this issue and I'm finding it fascinating.
Let me play here the role of the ruby developer that is seeking to
understand better the asynchronous ruby capabilities.
Every time I read threads (conversations) like this one about the pros
and cons of Fibers vs Threads I tend to think: stay away from it.

When people like Koichi write comments like this:

"But most (many? some? a few?) of ruby programmer (including me) can not write correct code I believe."

or Yusuke Endoh:

"Thread is considered harmful. Casual Rubyists (including I) had better not use it."

what do these comments make us mere mortals feel?

Often, you will not have to think about things like Threads or
Fibers; and you may use them every day without knowing it.
Fwiw, every project screws up threading (and many other things)
sometimes; even scanning LKML from the past month I see several
subjects with "race condition" in them.

I will speak about me. When I read such a line I tend to step away.
So yes, this situation makes me develop single threaded code as much
as possible.
I rely on libraries to handle asynchronous behaviour for me and
specially I rely extensively on the actor model.

Threadlets/Fibers/Threads can all support the actor model. This
is why I lean towards supporting Queue/SizedQueue but am not as
enthusiastic about increasing scope of Mutex/ConditionVariable.

Threadlet can easily become Actors if matz or ko1 decides to
make such an API. The implementation details which exist today
would barely change.
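As a rough illustration of why Queue support is enough for an actor-style API, here is a minimal actor built on today's Thread and Queue. CounterActor and its message names are hypothetical; the assumption is that the same shape would work if the execution context were a Threadlet instead of a Thread.

```ruby
# A minimal actor: private state, one mailbox, one execution context.
# (CounterActor is a hypothetical example, not an API from the patch.)
class CounterActor
  def initialize
    @mailbox = Queue.new
    @count = 0
    @thread = Thread.new { run }
  end

  # Send a message; reply_to is an optional Queue for the answer.
  def tell(msg, reply_to = nil)
    @mailbox.push([msg, reply_to])
    self
  end

  def stop
    tell(:stop)
    @thread.join
  end

  private

  # Only this loop touches @count, so no Mutex is needed.
  def run
    loop do
      msg, reply_to = @mailbox.pop
      case msg
      when :incr then @count += 1
      when :get  then reply_to.push(@count)
      when :stop then break
      end
    end
  end
end

actor = CounterActor.new
3.times { actor.tell(:incr) }
reply = Queue.new
actor.tell(:get, reply)
total = reply.pop
actor.stop
```

All shared state stays behind the mailbox, which is why Mutex and ConditionVariable never appear in the caller's code.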

I'm not a computer language person; to me it's all just bytes in
memory. The key difference is "native thread" has support from
an external layer, the kernel, whereas "userspace" Fiber/Threadlet
are invisible to the kernel.

Any actor API can be either "native" or not, or hybrid (M:N threading).
I believe M:N is too unpredictable/uncontrollable for the
programmer (but I could be wrong).

I doubt I will change my mind unless I start to read that Thread is
good to be used or Fiber is good to be used.

When I read all this conversation and you mention corner cases that
still have problems that is a NO GO for me.

I think the only corner case I mentioned was for libkqueue;
which only affects Linux developers who want to support
some *BSD-specific code without installing FreeBSD.

Normal users won't be expected to use libkqueue.

IMHO to add yet another Thread like feature it should be "The Killer Feature".

No, what I work towards are incremental improvements and
regression fixes. So I consider Threadlet a regression fix for
the lightweight Thread we lost in the MRI 1.8 -> YARV (1.9) change.
It is also an opportunity to improve on what 1.8 had with better
scalability and more predictable (safer) behavior.

The one that we can say to the whole community: Hey people use this
thing because async is a paradise in ruby land at last.

I would never say anything that optimistic :P

If we don't have this it will be just another Thread, Fiber nightmare
for the very few who accept the overhead of dealing with all the
"buts".

Huh? If you don't like something, you can ignore it and let
others use/try it. There's plenty of things I don't care for
in Ruby, too. Sometimes we can deprioritize/deprecate them,
make them less intrusive and move on (see 'callcc', $SAFE, taint).

And for the record, I use async libraries but I don't feel confident
about them either knowing that ruby core is not reliable in itself.

I'm not sure what you're talking about. I suppose nothing is
reliable :P For example, see how often "stable" Linux kernel
releases come out with GregKH saying "all users must upgrade".
Yet Linux is trusted with countless mission critical systems.

Best we can do is fix bugs and learn lessons from them to avoid
repeating history.

And life goes on...

Production code in the enterprise world is not something to mess around with.
For me ruby core needs desperately to change this situation so I
really hope your work will be the answer for all of this I'm talking
about.
So yes, if it is, it fits in ruby core like a glove IMO. If it is not,
then we will be much worse off, because instead of 2 walking dead we will
have 3.
A 50% increase is a lot in this domain. Turns things into a joke.

Did you see my other post about blocking accept? I have every
intent to continue using Thread as-is; and I also use Fiber
as-is in places where it is the perfect tool for the job.

They each have their uses.

And I also look forward to Guilds, too; which I expect to be
implemented using native threads but with less sharing visible
to the Ruby layer.

So, can you please explain to us what peace of mind we will gain with
this new "light thread" in our everyday work?

But often I want something in-between what Thread and Fiber are,
and that's where Threadlet comes in.

Thank you very much and keep up the excellent work.
I appreciate specially the care you have in passing across your
knowledge on the subject.
Really helpful and insightful.

You're welcome.

Note:

Your last two messages are not part of the issue in redmine. I hope my
message will be there!

These two?
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/85081
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/85082

Maybe it's a bug in Redmine mailing list integration plugin,
Will try to get with hsbt (Hiroshi SHIBATA) to track it down...

#37 [ruby-core:85095] Updated by subtileos (Daniel Ferreira) 23 days ago

Eric Wong normalperson@yhbt.net wrote:

These two?

Yes Eric. And the last one as well. And I guess the same will happen
to this one that I am sending.
I believe it will be better not to reply this way while this is broken.
Which is a pity since I have some things to say, but I believe it will
be better to wait. :-|

#38 [ruby-core:85136] Updated by normalperson (Eric Wong) 23 days ago

samuel@oriontransfer.org wrote:

In async, I called it Async::Task. I think task is a good
name for this kind of thing. In your case, you might want to
consider Thread::Task, since the lexicographic nesting is
similar to the logical nesting.

I prefer shorter names; and I'm not sure if Thread::Task makes
sense since it's an alternative to Thread (in some situations);
not a helper to Thread (unlike Mutex/Queue/etc).

Regarding kqueue bugs. macOS kqueue implementation is
horrendous. So, nio4r doesn't use it AFAIK.

Yes, there's also a select() implementation which should be a
safe fallback for everybody (not scalable, of course). I'm not
sure if OpenBSD/NetBSD/Dragonfly have acceptable kqueue
implementations, nowadays, either (FreeBSD seems fine).

I will add notes to guide porters into disabling kqueue support,
either broadly or fine-grained (per-type), or better,
eventually fixing their native kqueue implementations.

I also intend to try aio-poll support in future Linux versions
(currently under development).

Do you have explicit reactor, or is it implicit per-thread or
per-process?

Implicit per-process, and lazily created. kqueue and epoll
persistent data structures in the kernel are completely
safe to use across multiple threads. select needs no persistent
structure in the kernel. Userspace structures are of
course done in a thread-safe way and will be adjusted for
guilds or GVL removal.

If guilds end up being what I expect them to be (implemented via
native threads), reactor will likely remain per-process since
FDs are still per-process. Some structures and locking will be
adjusted for guilds, of course.
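The "implicit per-process, lazily created" design can be sketched in plain Ruby using IO.select, which needs no persistent kernel structure. This is only an illustration under stated assumptions: the Reactor class, its method names, and the single-threaded bookkeeping are hypothetical and much simpler than the iom_*.h implementations in the patch.

```ruby
require 'fiber' # needed for Fiber.current on older Rubies

# Process-wide reactor built on IO.select, created on first use.
# (Hypothetical sketch; the waiter table is not made thread-safe here.)
class Reactor
  @instance = nil
  @lock = Mutex.new

  def self.instance
    @lock.synchronize { @instance ||= new }
  end

  def initialize
    @readers = {} # IO => Fiber waiting for readability
  end

  # Called from inside a fiber: park it until io is readable.
  def wait_readable(io)
    @readers[io] = Fiber.current
    Fiber.yield
  end

  # Run until no fiber is waiting.
  def run
    until @readers.empty?
      ready, = IO.select(@readers.keys)
      ready.each { |io| @readers.delete(io).resume }
    end
  end
end

r, w = IO.pipe
got = nil
reader = Fiber.new do
  Reactor.instance.wait_readable(r)
  got = r.read_nonblock(16)
end
reader.resume        # runs until it parks in wait_readable
w.write("ping")
Reactor.instance.run # select reports the pipe readable; the fiber resumes
```

In the patch the parking step happens inside rb_wait_for_single_fd, so user code would not call anything like wait_readable itself.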

#39 [ruby-core:85138] Updated by subtileos (Daniel Ferreira) 23 days ago

Hi Eric,

It is really a shame that your replies in this thread are not being
added to the issue tracker.
Samuel's reply is there but your reply once again didn't get in.

Please try to do something about it because the conversation will be
lost in the future if nothing is done on that respect.

On my side, I will wait for the problem to be corrected
before continuing to contribute to the discussion.

Many Thanks,

On Fri, Jan 26, 2018 at 7:13 PM, Eric Wong normalperson@yhbt.net wrote:

samuel@oriontransfer.org wrote:

#40 [ruby-core:85139] Updated by normalperson (Eric Wong) 23 days ago

Daniel Ferreira subtileos@gmail.com wrote:

Please try to do something about it because the conversation will be
lost in the future if nothing is done in that respect.

I've contacted hsbt (Hiroshi SHIBATA) about it, be patient as he is busy.

#41 [ruby-core:85140] Updated by hsbt (Hiroshi SHIBATA) 23 days ago

normalperson (Eric Wong) wrote:

Daniel Ferreira subtileos@gmail.com wrote:

Please try to do something about it because the conversation will be
lost in the future if nothing is done in that respect.

I've contacted hsbt (Hiroshi SHIBATA) about it, be patient as he is busy.

Hi, I've restored missing comments on redmine from our mailing list.
It's affected by server maintenance and has some issues with server configuration.

#42 [ruby-core:85141] Updated by dsferreira (Daniel Ferreira) 23 days ago

hsbt (Hiroshi SHIBATA) wrote:

I've restored missing comments on redmine

Thank you very much Hiroshi.
Feels much better now.

#43 [ruby-core:85144] Updated by dsferreira (Daniel Ferreira) 23 days ago

normalperson (Eric Wong) wrote:

I'm not sure what you're talking about. I suppose nothing is reliable

Let me try to explain what I think about the async subject in ruby land using a different story:

For me there is ruby core and there is ruby.
I'm a ruby kind of guy like most of ruby developers.
I like to use ruby and I like to use it as it is given to us by ruby core.
I prefer to build my own tools on top of ruby core rather than using external libraries/gems.
That is with the assumption that ruby core will not break backwards compatibility.
If there is something I really dislike, it is fixing broken code due to dependency issues.

Ruby core is the rock solid foundation I rely upon for the developments I design and implement.

I started in ruby land with rails like most of us.
As years passed by I went more and more to other territories.
So I believe my story is a very common one:
The ruby developer that starts at the very high level with rails.
With time becomes progressively more and more familiar with the low level concepts of programming.
Gets to understand the underlying concepts behind the frameworks and starts to grasp at last the ruby essence.
And here I am now speaking with you guys.
Ruby core. The lower level par excellence.
It is fascinating to go through discussions like this one.
The craft of ruby landscape for the future.
Technically speaking I'm learning a lot but I'm not prepared yet to give my contribution at that level.
The contribution I believe I can give is this view I'm speaking about.
The daily user who sometimes struggles to find the right path for the problem at hand.

Ruby developers like myself (I imagine there are more who feel this way) are very much impacted by the opinions of ruby core team members.
Especially top team members like Koichi.
We can call it the teacher - student dichotomy.

When Koichi, referring to thread functionality in ruby land, writes:

"But most (many? some? a few?) of ruby programmer (including me) can not write correct code I believe."

I do listen. People listen.

(Koichi's sentence here is just a handy reference example (sorry Koichi) from the many I have read over the years; many of those comments are here, embedded in redmine issues.)

These sentences have a very big impact.

I as a programmer aim to write and develop correct code.
If there is an area that I do not feel comfortable with then I study it, play with it but that is it.
I will not put my job and my company in jeopardy just to show some cool stuff to the team.
Ruby programmer, not ruby core hacker, remember?

How many ruby developers develop an HTTP server or know the internals of at least one?
(Just as an example of different levels of developer seniority.)
Unicorn or passenger or thin or puma... are black boxes for most of us.
And yes there are bugs and our applications are impacted by them.
That is the ecosystem and it is good like that. It will not change.

Somehow there are people that feel happy playing in dangerous zones like threads and fibers
(See previous Koichi reference. We know you have a "slightly" different opinion).
Us, mere mortals, just would like to be able to do our daily work at least without compromising.
Although I would like to use libraries to play with my actors without worrying too much, I can't.
Knowing that they are dangerous zones tells me I must worry still.

So, why not use Akka and live happy ever after?
In Akka land everyone happily uses actors.
I never heard any reference telling people to be careful about a given issue.
Maybe the issues exist but what you read is that Akka is the solution for all your problems in async world.

I don't want to use Akka but I know that ruby is losing developers every day because of situations like this one I'm referring here.
Ruby desperately needs to resolve once and for all this situation.

The key word for me here is a clear message that could say with confidence:

"Ruby is rock solid for async because..."

If we don't succeed to pass this message to the world of programming ruby will slowly be replaced by other languages.
Parallelism and concurrency and async will be everywhere in the future.

I took the decision to express these thoughts in this conversation because I love ruby and I want to help ruby become better.

In my opinion:

We need to create the foundations for a post ruby 3 future in ruby land where async is the standard for the many and not the exception for the few.

That is my vision.

Many Thanks,

Daniel

#44 [ruby-core:85162] Updated by jeremyevans0 (Jeremy Evans) 23 days ago

dsferreira (Daniel Ferreira) wrote:

We need to create the foundations for a post ruby 3 future in ruby land where async is the standard for the many and not the exception for the few.

That is my vision.

According to the tagline on the homepage, ruby is "A dynamic, open source programming language with a focus on simplicity and productivity. It has an elegant syntax that is natural to read and easy to write." Asynchronous code tends to negatively affect simplicity and productivity in order to gain performance, and in general is more difficult to read, more difficult to write, and more difficult to test than synchronous code. To the extent that ruby can improve its support for asynchronous code without compromising its other values, I would probably support it (this feature is one of those cases). However, we should be careful to never sacrifice ruby's core values just to improve support for whatever programming paradigm is currently popular in some other programming subcultures.

In the future, for general philosophical discussion of ruby's goals and future direction, it is probably best to email ruby-core@ruby-lang.org directly. If you have a specific new feature or change in mind, then add it as a new feature request. I think we should try to avoid adding tangentially-related philosophical discussion posts as notes on existing features/bugs.

#45 [ruby-core:85163] Updated by dsferreira (Daniel Ferreira) 22 days ago

jeremyevans0 (Jeremy Evans) wrote:

we should be careful to never sacrifice ruby's core values

I couldn't agree more.

If you have a specific new feature or change in mind, then add it as a new feature request.

Yes I do Jeremy.
Eric's light thread was a good starting point for this discussion, but I will present something more concrete in a new issue and will link it to the different issues I believe are related to it.

#46 [ruby-core:85164] Updated by dsferreira (Daniel Ferreira) 22 days ago

normalperson (Eric Wong) wrote:

How about we use Threadlet

IMO the name we choose will be more important than the functionality itself.
It needs to stand out and create a clear picture in our mind.
Thread, Fiber, Guild? (not so sure about this name either), ?
We will have four entities in our async family.
Each name should sound clear.

Light Thread maps well in my mind.
Matz said he prefers two words. I would prefer a single word that could draw the same picture as "LightThread".
No mixtures between Thread and Fiber. That would be saying that the feature is linked to them.
IMO the message should be that this feature can be used by its own independently.
A clear distinction that will put aside any links to Threads and Fibers dos and don'ts.
It would be a good first step towards a smoother async ecosystem.

For all these reasons I would like to propose for the "Light Thread" feature the name:

"Strand"

  • Strand definition: a thin thread of something, often one of a few, twisted around each other to make a string or rope.
  • Strand gem is not used (only 0.1.0) so we can claim the name. See: https://rubygems.org/gems/strand.

#47 [ruby-core:85168] Updated by sam.saffron (Sam Saffron) 22 days ago

Hmmm, what about just bringing in the IO Manager APIs including Ruby helpers prior to re-introducing the green threads?

As it stands kqueue/epoll abstractions always require another fat dependency and there is no official API to consume them.

Even just solving this problem is enough of a hornets nest prior to introduction of other complications.

epoll is notoriously monstrous, http://cvs.schmorp.de/libev/ev_epoll.c?view=markup so having an officially supported abstraction would be a great start.

Wouldn't having these abstractions allow building this by hand using existing Fiber?

#48 [ruby-core:85170] Updated by normalperson (Eric Wong) 22 days ago

sam.saffron@gmail.com wrote:

Hmmm, what about just bringing in the IO Manager APIs
including Ruby helpers prior to re-introducing the green
threads?

One big problem I notice with existing IO manager APIs
(libev/libevent/EventMachine) is that multi-threading was an
afterthought to them. As in, throw a lock around a
single-threaded event loop and call it a day.

Ruby was this way, too; but I want to work towards changing that
and embracing the multi-thread friendliness baked into APIs
provided by kqueue and epoll.

Btw, some of the discussion/planning around this started in:
https://public-inbox.org/ruby-core/20170402023514.GB30476@dcvr/t/

As it stands kqueue/epoll abstractions always require another
fat dependency and there is no official API to consume them.

I don't know if exposing a new API around them is desirable.
For human-friendliness, it seems desirable to keep the Ruby API
synchronous even if internal bits become async.

I think it's also desirable to be able to change some/most
existing Thread uses to auto-Fiber/Threadlet/Thriber without
having to re-design things, just changing "Thread.new" to
something else.

Even just solving this problem is enough of a hornets nest
prior to introduction of other complications.

epoll is notoriously monstrous,
http://cvs.schmorp.de/libev/ev_epoll.c?view=markup so having
an officially supported abstraction would be a great start.

I disagree. IMHO, Lehmann's notes and complaints against epoll
are either out-of-date or his mental model went wrong somewhere.
Fwiw, fs/eventpoll.c is straightforward and easy-to-understand in
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Wouldn't having these abstractions allow building this by hand
using existing Fiber?

One question is, how painful will it be in Ruby?

I've kinda soured on _nonblock APIs in Ruby over the years. For
example, in https://bugs.ruby-lang.org/issues/14404 I don't
think there's a non-painful Ruby way to resume a partial writev.
Doable, of course, but it requires extra allocations and copies.
Resuming a partial write_nonblock today without writev isn't great,
either...

With a synchronous interface (IO#write), dealing with partial
writev in C is only a few adds/subtracts; and we won't expose
pointer arithmetic in Ruby :)
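
To make the cost concrete, here is a minimal sketch of resuming a partial write_nonblock from Ruby today. The helper name write_all_nonblock is made up for illustration; it only uses IO#write_nonblock, IO#wait_writable and String#byteslice. Each resume after a short write allocates a new String, which is the extra allocation and copying mentioned above.

```ruby
require "io/wait"

# Hypothetical helper: write all of +data+ with write_nonblock,
# resuming by hand after every partial write.
def write_all_nonblock(io, data)
  remaining = data.b                 # binary copy so offsets are in bytes
  until remaining.empty?
    result = io.write_nonblock(remaining, exception: false)
    if result == :wait_writable
      io.wait_writable               # block until the kernel buffer drains
    else
      # A short write: copy the unwritten tail into a new String.
      remaining = remaining.byteslice(result..-1)
    end
  end
  data.bytesize
end

r, w = IO.pipe
payload = "x" * 200_000              # larger than a typical pipe buffer
reader = Thread.new { r.read }
written = write_all_nonblock(w, payload)
w.close
received = reader.value
```

With a synchronous IO#write the same loop lives in C, where skipping the written bytes is pointer arithmetic and needs no copy.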

And then there's also stuff like IO.copy_stream not having
a _nonblock analogue...

#49 [ruby-core:85171] Updated by normalperson (Eric Wong) 22 days ago

danieldasilvaferreira@gmail.com wrote:

When Koichi referring to threads functionality in ruby land writes and says:

"But most (many? some? a few?) of ruby programmer (including me) can not write correct code I believe."

These sentences have a very big impact.

They should not have a big impact. Really, make up your
own mind on these things instead of just believing somebody;
even if they are a leader of this project.

I suspect if you look at any development archives for any major
projects; you will see similar statements from major contributors.

The key word for me here is a clear message that could say with confidence:

"Ruby is rock solid for async because..."

Saying something like that would open us up to lawsuits.
The following (or similar) disclaimer is in every project
I work on:

  1. THIS SOFTWARE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

So, no, I'm never going to say anything I work on is "rock solid".

If we don't succeed to pass this message to the world of
programming ruby will slowly be replaced by other languages.

In my experience, people left Ruby because incompatibilities got
painful and memory usage was too high.

Ruby 1.8 green Threads were a middle ground. Since 1.9+, Fibers
went in one direction (harder-to-use), while native Threads
went in another direction (too heavy); and there's nothing left
in the middle.

All I aim to do with this feature is fill the void in the middle.
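
As an illustration of the "harder-to-use" end, this is roughly what waiting on IO looks like when done by hand with existing Fibers and IO.select. It is only a sketch of the manual approach, not the patch: the waiting table and the single consumer fiber are invented for the example.

```ruby
# A fiber that needs data parks itself with Fiber.yield, and a
# select() loop resumes it once its IO becomes readable.
reader_io, writer_io = IO.pipe
waiting = {}                          # IO => Fiber parked on that IO
result = nil

consumer = Fiber.new do
  begin
    result = reader_io.read_nonblock(64)
  rescue IO::WaitReadable
    waiting[reader_io] = consumer     # remember who waits on this IO
    Fiber.yield                       # manual yield; nothing is automatic
    retry
  end
end

consumer.resume                       # runs until it would block
writer_io.write("hello")              # data arrives later

until waiting.empty?
  ready, = IO.select(waiting.keys)
  ready.each { |io| waiting.delete(io).resume }
end
```

Every blocking call site needs this rescue/yield/retry dance, which is the bookkeeping an auto-scheduled Fiber would hide behind ordinary synchronous calls.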

Parallelism and concurrency and async will be everywhere in the future.

They already are, and have been for a while.

We need to create the foundations for a post ruby 3 future in
ruby land where async is the standard for the many and not the
exception for the few.

Internal implementation can be async, but the public API will
likely remain synchronous (because redesigning existing
libs is expensive).

New features should always be opt-in, never a requirement.
That said, it should still be easy to port code over to take
advantage of new features; so I want to minimize publicly
visible changes.

#50 [ruby-core:85172] Updated by normalperson (Eric Wong) 22 days ago

danieldasilvaferreira@gmail.com wrote:

For all these reasons I would like to propose for the "Light Thread" feature the name:

"Strand"

No, I don't want to introduce a non-obvious term nobody has seen
before in concurrency. I expect "Strand" will be mistaken for some
thing String-related. (String and Thread are also interchangeable
in English).

Anyways, I think Threadlet is an acceptable name.

#51 [ruby-core:85173] Updated by normalperson (Eric Wong) 22 days ago

Eric Wong normalperson@yhbt.net wrote:

  • Matz doesn't have a strong opinion on the name but prefers 2 words (auto-fiber) than a coined word "Thriber."

Next actions:

  • Give a thread-ish name

Threadlet?

OK, I changed to Threadlet for now.

  • Lock and queue should work with auto-fiber?

I can definitely make Queues work. I think ko1 was mildly
against increasing use of Mutex.

One safety feature I was thinking about was disabling
auto-switching of Fibers while a Mutex is locked, even.

Still TODO; I don't expect much time for more development
until March; but maybe I'll find pockets of time here and
there (much of the other work I do here is while procrastinating)

Anyways, rebased against r62077:

The following changes since commit 46bfa65fccf58cee280bf552193f93388b00d16d:

internal.h: add BITFIELD macro to aid C99 users (2018-01-27 21:04:42 +0000)

are available in the Git repository at:

git://80x24.org/ruby threadlet-r62077

for you to fetch changes up to 6b5c8ba6cbfd33d557748cad6ef4928332893083:

threadlet: non-native fiber support (2018-01-28 10:31:48 +0000)

Raw patches here:
https://80x24.org/spew/20180128103907.12069-2-e@80x24.org/raw
https://80x24.org/spew/20180128103907.12069-3-e@80x24.org/raw

#52 [ruby-core:85174] Updated by dsferreira (Daniel Ferreira) 22 days ago

normalperson (Eric Wong) wrote:

They should not have a big impact.

Playing the ruby developer role here, remember? Do you think most ruby developers don't care about those statements?
What about good and straightforward async guides for regular ruby developers?
Documentation and evangelism are a very important part of a language.
If we have the features but they are not well explained to the general public, they become knowledge for the few.
We need to put a better message out to the public.
I'm planning to work on that as well and help fix it, by the way.

So, no, I'm never going to say anything I work on is "rock solid".

That is called hyperbole, if it was not obvious. Sometimes I use it to emphasise a certain point.

In my experience, people left Ruby because incompatibilities got painful and memory usage was too high.

Fair enough. It would be interesting to research more about the subject. Is there any discussion on the subject in ruby core?
Stoically we are holding our position but losing to Python big time.
If it was me I would target Python as a reference.

Parallelism and concurrency and async will be everywhere in the future.

They already are, and have been for a while.

Yeah. Now I was being defensive. :-)
The "everywhere" might get some people arguing.
We can not fail with ruby 3. That was the main point of my alert.
But I agree with you that ruby is already behind the point where it should be.

Internal implementation can be async, but the public API will likely remain and favor synchronous

That is in line with what I have in mind.
The issue I'm planning to open is about that.

New features should always be opt-in, never a requirement.

Totally agree.

#53 [ruby-core:85180] Updated by ko1 (Koichi Sasada) 22 days ago

On 2018/01/25 6:51, Eric Wong wrote:

Threadlet?
OK, I am liking "threadlet", and it looks like a real word:

https://www.merriam-webster.com/dictionary/threadlet
": a small thread : a delicate filament"

Another idea is to pass option to Thread.new() like
Thread.new(preemption: false).

Note: we can't pass thread options because of args/keywords spec.
It was pseudo code.
Note2: I agree it can be confusing to mix normal Threads and
Threadlets. It is like proc and lambda.
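
A small check of the args/keywords point: Thread.new hands every argument it receives to the block, so an option such as preemption: false (hypothetical, taken from the pseudo code above) arrives as block data instead of configuring the thread.

```ruby
# The keyword is not interpreted by Thread.new; it is passed through
# to the block like any other argument.
received = Thread.new(preemption: false) { |opts| opts }.value
p received
```

So a thread option would need a different entry point (another method or class) rather than a keyword on Thread.new.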

--
// SASADA Koichi at atdot dot net

#54 [ruby-core:85181] Updated by ko1 (Koichi Sasada) 22 days ago

On 2018/01/25 7:01, Eric Wong wrote:

For everything else that serves multiple clients in a single
process, fair sharing is preferable.

Could you elaborate more? Generally, fairness is preferable. But I think
we can document "we don't guarantee fairness scheduling on this
feature", because our motivation is to provide a way to process multiple
connections. Thoughts?

Or does it cause live-lock? (no problem on server-client apps, but
multi-agent programs seem to cause live locking)

--
// SASADA Koichi at atdot dot net

#55 [ruby-core:85183] Updated by ko1 (Koichi Sasada) 22 days ago

On 2018/01/24 2:31, Eric Wong wrote:

  • Lock and queue should work with auto-fiber? I can definitely make Queues work. I think ko1 was mildly against increasing use of Mutex.

One safety feature I was thinking about was disabling
auto-switching of Fibers while a Mutex is locked, even.

If we name it as Thread-like (Threadlet), we can use all synchronization
tools with Threads (I feel it is natural). I'm not sure whether we should
limit their use with Threadlets or not.

  1. Threads and Threadlets can share same synchronization tools
    -> Good: no learning efforts
    -> Bad: People can cause sync issues with mis-using or missing syncs

  2. Introduce Threadlet-specific synchronization tools and introduce
    special rules to communicate with other threads
    -> Good: people can only use good tools (such as Queues)
    -> Bad: we need to learn new tools and rules

If we think Threadlet is a special Thread (and the name indicates it),
then (1) seems nice for me.

With both options, we can enjoy advantages of Threadlet:
(a) lightweight creation
(b) more predictable switching (than preemptive threads)

--
// SASADA Koichi at atdot dot net

#56 [ruby-core:85186] Updated by dsferreira (Daniel Ferreira) 22 days ago

ko1 (Koichi Sasada) wrote:

I'm not sure whether we should limit their use with Threadlets or not.

  1. Threads and Threadlets can share same synchronization tools
    -> Good: no learning efforts
    -> Bad: People can cause sync issues with mis-using or missing syncs

  2. Introduce Threadlet-specific synchronization tools and introduce special rules to communicate with other threads
    -> Good: people can only use good tools (such as Queues)
    -> Bad: we need to learn new tools and rules

I'm all for (2) for the reasons I already mentioned:

  • Especially the big minus that we have in (1): "People can cause sync issues"
  • Using only good tools is a big +.
  • Not causing sync issues is a big ++.
  • The fact that people will be forced to learn new tools and rules is also a big + for me.
    • It draws the border between the old async scenario and the new one we are trying to implement.

If we think Threadlet is a special Thread (and the name indicates it),
then (1) seems nice for me.

I agree Threadlet has that implication.

Since we prefer to use names already in use in the async world, what about calling it:

Lane

  • Lua is always a source of inspiration to me.
  • Lanes is a lightweight, native, lazy evaluating multithreading library for Lua.
  • Lane meaning: a narrow road or division of a road
  • Lane gem (v0.1.0). 247 downloads. https://rubygems.org/gems/lane.

The sense of speed and direction pleases me a lot.

Note:

About Threads vs Lanes in Lua

LuaThread provides thread creation...and need therefore to be guarded against multithreading conflicts. 

Whether this is exactly what you want, or whether a more loosely implemented
multithreading (s.a. Lanes) would be better, is up to you. One can argue that
a loose implementation is easier for the developer, since no application level
lockings need to be considered.

#57 [ruby-core:85189] Updated by normalperson (Eric Wong) 22 days ago

Koichi Sasada ko1@atdot.net wrote:

On 2018/01/25 7:01, Eric Wong wrote:

For everything else that serves multiple clients in a single
process, fair sharing is preferable.

Could you elaborate more? Generally, fairness is preferable. But I think we
can document "we don't guarantee fairness scheduling on this feature",
because our motivation is to provide a way to process multiple connections.
Thoughts?

If I write a multi-process server with many long-lived
connections, it's best to balance those connections to mitigate
bottlenecks/problems which exist in each process. That way, any
slowdown or crash which affects one process only affects
its fair subset of connections.

This is fair sharing across different *nix processes...
Within each process, Threadlets are also round-robin scheduled,
but run until they cannot proceed.

Or does it cause live-lock? (no problem on server-client apps, but
multi-agent programs seem to cause live locking)

It should not; Threadlet is FIFO for "ready" Fibers, and the
epoll and kqueue readiness queues are FIFO internally, too.
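
The scheduling order described here can be sketched in plain Ruby. This is only a model of the policy, not the C implementation: ready fibers sit in a FIFO queue, each one runs until it yields (standing in for "cannot proceed"), and it is re-queued at the tail.

```ruby
# Model of FIFO scheduling for "ready" fibers.
log = []
run_queue = %w[a b c].map do |name|
  Fiber.new do
    2.times do |step|
      log << "#{name}#{step}"
      Fiber.yield              # stand-in for blocking on I/O
    end
  end
end

until run_queue.empty?
  fiber = run_queue.shift      # take from the head (FIFO)
  fiber.resume
  run_queue.push(fiber) if fiber.alive?
end
```

Because every fiber goes back to the tail after it yields, none of them can be starved by the others as long as each eventually yields.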

Blocking accept() mitigates live-lock/thundering herd across
different processes. For non-blocking accept(), I will add
EPOLLEXCLUSIVE support.

#58 [ruby-core:85190] Updated by normalperson (Eric Wong) 22 days ago

danieldasilvaferreira@gmail.com wrote:

ko1 (Koichi Sasada) wrote:

I'm not sure whether we should limit their use with Threadlets or not.

  1. Threads and Threadlets can share same synchronization tools
    -> Good: no learning efforts
    -> Bad: People can cause sync issues with mis-using or missing syncs

  2. Introduce Threadlet-specific synchronization tools and introduce special rules to communicate with other threads
    -> Good: people can only use good tools (such as Queues)
    -> Bad: we need to learn new tools and rules

I'm all for (2) for the reasons I already mentioned:

  • Especially the big minus that we have in (1): "People can cause sync issues"
  • Using only good tools is a big +.
  • Not causing sync issues is a big ++.
  • The fact that people will be forced to learn new tools and rules is also a big + for me.
    • It draws the border between the old async scenario and the new one we are trying to implement.

No, I'm against making major changes. For 2, I mean we limit
usage to queues for now, which is a subset of 1; but I'm also
OK implementing mutex/condvar support for 1.

Having fewer things to learn is better for adoption and improving
usefulness.

If we think Threadlet is a special Thread (and the name indicates it),
then (1) seems nice for me.

I agree Threadlet has that implication.

Since we prefer to use names already in use in the async world,
what about calling it:

Lane

Too obscure and not obvious for me; do non-Lua people know about it?

Terms such as process, thread, task, actor are already in wide use
across several different languages; so it should be obvious.

  • Lane meaning: a narrow road or division of a road

When comparing to physical objects, it seems more appropriate for
something like a channel or pipe.

#59 [ruby-core:85191] Updated by dsferreira (Daniel Ferreira) 22 days ago

normalperson (Eric Wong) wrote:

No, I'm against making major changes. For 2, I mean we limit
usage to queues for now, which is a subset of 1; but I'm also
OK implementing mutex/condvar support for 1.

Having fewer things to learn is better for adoption and improving
usefulness.

I would agree with that comment if the "fewer" doesn't itself imply an overlap of confusions.
What will the documentation look like? We need to think very carefully about that.

Too obscure and not obvious for me; do non-Lua people know about it?

Do we have Threadlets in other languages?
It seems Lua has got something very similar (how similar?) and calls it Lanes.
Am I wrong with this assumption?

When comparing to physical objects, it seems more appropriate for
something like a channel or pipe.

In a dedicated Lane I see "vehicles" moving steady and fast in between the traffic chaos.
I consider it a fortunate choice from Lua people.
The notion of async for me is management of traffic in between the chaos.
Why thread? Because it is a kind of channel or pipe as well, isn't it?

#60 [ruby-core:85193] Updated by normalperson (Eric Wong) 22 days ago

danieldasilvaferreira@gmail.com wrote:

normalperson (Eric Wong) wrote:

No, I'm against making major changes. For 2, I mean we limit
usage to queues for now, which is a subset of 1; but I'm also
OK implementing mutex/condvar support for 1.

Having fewer things to learn is better for adoption and improving
usefulness.

I would agree with that comment if the "fewer" doesn't itself imply an overlap of confusions.
What will the documentation look like? We need to think very carefully about that.

I prefer minimal documentation and having it do obvious/predictable
things which are already familiar to existing users of Thread.

In my experience, too much documentation overwhelms users and
they ignore it.

And about the comments you see from developers here: the vast
majority of Ruby users will never even read or see them.
There's too much to read for most people.

Too obscure and not obvious for me; do non-Lua people know about it?

Do we have Threadlets in other languages?
It seems Lua has got something very similar (how similar?) and calls it Lanes.
Am I wrong with this assumption?

The "let" suffix is commonly associated with a smaller version
of something; and the "Thread" prefix already exists; so it
should be immediately familiar (at least to English speakers)

When comparing to physical objects, it seems more appropriate for
something like a channel or pipe.

In a dedicated Lane I see "vehicles" moving steady and fast in between the traffic chaos.
I consider it a fortunate choice from Lua people.
The notion of async for me is management of traffic in between the chaos.
Why thread? Because it is a kind of channel or pipe as well, isn't it?

Not exactly. Pipes are a type of queue (ring buffer), it is
something which data passes through. Threads/Processes/Fibers
are execution contexts which can use pipes/queues to pass data along.
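
A short sketch of that distinction using only IO.pipe and Thread: the pipe is the queue that data passes through, and the two threads are the execution contexts on either end of it.

```ruby
reader_io, writer_io = IO.pipe

# Execution context 1: pushes data into the pipe.
producer = Thread.new do
  3.times { |i| writer_io.puts("message #{i}") }
  writer_io.close                    # signals EOF to the reader
end

# Execution context 2: drains the pipe until EOF.
consumer = Thread.new { reader_io.each_line.map(&:chomp) }

producer.join
messages = consumer.value
reader_io.close
```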

#61 [ruby-core:85199] Updated by sam.saffron (Sam Saffron) 21 days ago

I am not a huge fan of the name threadlet, it just does not sound right.

What if a new construct is introduced:

pool = ThreadPool.new(concurrency: 100, max_workers: 5) # max_workers is optional

thread = pool.run do
  sleep # thread pool should be aware, this would preempt a context switch to another fiber
end

Using this construct one could manage many pools of fibers.

That can simplify all sorts of stuff, like creating a proxy that can only download 3 streams concurrently

DOWNLOAD_POOL = ThreadPool.new(concurrency: 3)
def proxy_url(url)

   if DOWNLOAD_POOL.queued > 5
     raise "too many things queued"
   end

   done = Queue.new
   t = DOWNLOAD_POOL.run do 
     done << download(url)
   end 
   render body: done.pop
end
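
For comparison, a bounded pool like this can be approximated today with native Threads and Queue. The sketch below is not the proposed ThreadPool: BoundedPool and its methods are hypothetical names, and it uses heavyweight Threads rather than fibers, but it shows the "at most N at once, the rest queued" behaviour.

```ruby
# Hypothetical BoundedPool: at most +concurrency+ jobs run at once;
# the rest wait in a Queue.
class BoundedPool
  def initialize(concurrency:)
    @jobs = Queue.new
    @workers = Array.new(concurrency) do
      Thread.new do
        while (job = @jobs.pop)      # nil is the shutdown signal
          job.call
        end
      end
    end
  end

  # Number of jobs waiting for a free worker.
  def queued
    @jobs.size
  end

  # Schedule a block; returns a Queue that will hold its result.
  def run(&block)
    done = Queue.new
    @jobs << -> { done << block.call }
    done
  end

  def shutdown
    @workers.size.times { @jobs << nil }
    @workers.each(&:join)
  end
end

pool = BoundedPool.new(concurrency: 3)
mutex = Mutex.new
active = 0
peak = 0

results = 10.times.map do |i|
  pool.run do
    mutex.synchronize { active += 1; peak = [peak, active].max }
    sleep 0.01                       # stand-in for a download
    mutex.synchronize { active -= 1 }
    i * 2
  end
end.map(&:pop)

pool.shutdown
```

The sketch does not handle a job that raises; a real pool would need to report the error through the result queue.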

#62 [ruby-core:85204] Updated by normalperson (Eric Wong) 21 days ago

sam.saffron@gmail.com wrote:

I am not a huge fan of the name threadlet, it just does not sound right.

Is "Task" better? Or "CoThread" (like "coroutine").
Actually I don't like "CoThread" much, but "Task" is
short and a somewhat popular name:

https://en.wikipedia.org/wiki/Task_(computing)

What if a new construct is introduced:

pool = ThreadPool.new(concurrency: 100, max_workers: 5) # max_workers is optional

I really don't like that. It's too much up-front cost to having
to declare a pool ahead-of-time. One thing I love about
Fiber/Thread/fork is they can be used anywhere, even when deep
inside libraries.

That said, glibc has internal caching of thread stacks, and Ruby
also caches Fiber stacks internally, but they're completely
transparent to the user. There's also code for an internal
Thread cache for Ruby, but it's broken with fork and disabled, atm

#63 [ruby-core:85206] Updated by sam.saffron (Sam Saffron) 21 days ago

I like Task a lot, it is short and makes much sense.

So conceptually a kernel thread will be allowed to schedule N Tasks.

How would you manage scheduling tasks that are potentially blocking? Should Ruby opt for a goroutine-type implementation where core just handles spawning "enough" underlying threads to handle the work, or would the management be at a higher level and you would spawn N threads and then tasks from said threads?

I think it probably makes sense to always have Tasks coupled tightly with threads initially cause debugging will be much simpler.

If this is coupled to Thread does this make sense?

t = Thread.new do
   sleep
end

t.add_task do
   # my task
end

Thread.current.add_task do
   # some task
end

even when deep inside libraries.

Note, you only get 1500 or so frames these days on Fiber and over 10k or so on Thread, this will be limited to a degree by Fiber design. This should be plenty for Rails apps that love deep stacks cause I don't think we usually pass 400 or so frames deep these days.

#64 [ruby-core:85207] Updated by ko1 (Koichi Sasada) 21 days ago

On 2018/01/29 14:06, sam.saffron@gmail.com wrote:

I like Task a lot, it is short and makes much sense.

I strongly oppose the name Task because it is ambiguous; many languages
(and OSes) use this word for many purposes.

--
// SASADA Koichi at atdot dot net

#65 [ruby-core:85209] Updated by sam.saffron (Sam Saffron) 21 days ago

What about Job?

job = Thread.current.queue do
  sleep 100
end

job.cancel

#66 [ruby-core:85217] Updated by normalperson (Eric Wong) 21 days ago

sam.saffron@gmail.com wrote:

I like Task a lot, it is short and makes much sense.

I guess there's a risk of namespace conflict with existing
code with such a generic name like "Task" or "Job". But,
maybe the class name should not matter as much as adding
new ones can always cause conflict with existing code.

So, based on your add_task proposal; maybe the name of the
class wouldn't even matter, and we can use whatever name,
(I just chose "async") to create it:

foo = Thread.current.async do
   # some task
end

foo.class => RubyVM::ThingWeCannotDecideANameFor

# (Or Thread.async, because only current is supported atm)
foo = Thread.async {}

foo.class => RubyVM::ThingWeCannotDecideANameFor

In other words, API for usage and class name can be orthogonal.

So conceptually a kernel thread will be allowed to schedule N Tasks.

Right.

How would you manage scheduling tasks that are potentially
blocking. Should Ruby opt for a goroutine type implementation
where core just handles spawning "enough" underlying threads
to handle the work, or would the management be at a higher
level and you would spawn N threads and then tasks from said
threads.

That would be M:N threading which I am uncertain about.

Mainly, I want to still be able to do real blocking operations
even when non-blocking operations are supported for sockets:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/85082
https://public-inbox.org/ruby-core/20180124220143.GA5600@80x24.org/

(likewise with recv_io or small IO#sysread on IO.pipe)

So "enough" is difficult to determine (not just CPU count).
I have use cases which involve multiple mount points which
I'd like to be able to optimize for with Ruby.

I think it probably makes sense to always have Tasks coupled
tightly with threads initially cause debugging will be much
simpler.

Yes, it's a requirement at the moment since migrating Fibers
across Threads is not possible.

I think we'd have to give up fast native Fiber switching
(ucontext_t) if we want to migrate Fibers across Threads (maybe
ko1 can confirm).

So that's why "rb_thread_t.afrunq" came to be:

Changes to existing data structures:

rb_thread_t.afrunq   - list of fibers to auto-resume
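
As a rough illustration of the idea (not the patch's C implementation), a
per-thread run queue of fibers can be sketched in plain Ruby. The RunQueue
class and its async/run methods are hypothetical names chosen for this
example; only Fiber itself is existing API:

```ruby
# Hypothetical sketch: a per-thread run queue of fibers, loosely modelled
# on the role described for rb_thread_t.afrunq. Not the real implementation.
class RunQueue
  def initialize
    @ready = []                       # fibers waiting to be resumed, FIFO
  end

  # Queue a block as a new fiber and hand the fiber back to the caller.
  def async(&block)
    fiber = Fiber.new(&block)
    @ready << fiber
    fiber
  end

  # Resume ready fibers round-robin until all of them have finished.
  def run
    until @ready.empty?
      fiber = @ready.shift
      fiber.resume
      @ready << fiber if fiber.alive? # it yielded, so it goes to the back
    end
  end
end

log = []
queue = RunQueue.new
queue.async { log << :a1; Fiber.yield; log << :a2 }
queue.async { log << :b1; Fiber.yield; log << :b2 }
queue.run
p log
```

Since a Fiber can only be resumed by the thread that created it, a queue
like this is naturally tied to one thread, which matches the coupling
described above.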

#67 [ruby-core:85235] Updated by shan (Shannon Skipper) 21 days ago

Looking at naming in few languages that implement a similar feature, there seems to be no consensus:

  • Goroutine (Go)
  • Lane (Lua)
  • Spark on a Haskell Thread (Haskell)
  • Task (Elixir - though there's more going on here, and Process is really closer)
  • Process (Erlang)
  • Fiber (Crystal)

Process and Fiber obviously won't work because Ruby already uses these terms. Of the remaining options, Goroutine is the most widely known. I think RubyRoutine is overly verbose, but it'd be easy to explain that a Routine in Ruby is like a Goroutine.

I think Threadlet, Task, Routine, Lane and Spark are all viable options. Most folk won't know what a Lane or Spark are, so I'm not sure there's a big advantage in using one of those names for the sake of consistency across languages.

#68 [ruby-core:85236] Updated by sam.saffron (Sam Saffron) 21 days ago

I think Routine is a bit tricky to spell so I would recommend avoiding it. In Go people talk about Goroutines but never actually write it in code. That said, this is pretty hidden.

In other words, API for usage and class name can be orthogonal.

I agree with this and do think we can afford some level of extra verbosity here:

I like ThreadTask a lot since these things are coupled with threads. I think ThreadJob works as well.

API wise we can even avoid this altogether with

Thread.current << lambda { }

So we don't even need to think about async vs add_task vs add_job

Yes, it's a requirement at the moment since migrating Fibers
across Threads is not possible.

I would like to hear a bit more about this, could there be an "expensive" thread transfer operator added perhaps that only moves when the fiber is suspended?

j = Thread.current << lambda { sleep }

Thread.new do
   sleep
end.transfer(j)

#69 [ruby-core:85237] Updated by normalperson (Eric Wong) 21 days ago

sam.saffron@gmail.com wrote:

I like ThreadTask a lot since these things are coupled with
threads. I think ThreadJob works as well.

Maybe we can call it what it is: Thread::Green

I suspect using top-level namespace is unnecessary and may
introduce conflicts.

API wise we can even avoid this altogether with

Thread.current << lambda { }

So we don't even need to think about async vs add_task vs add_job

I like that.

One question, is how will Thread#[]/#[]= be handled inside
the lambda?

Yes, it's a requirement at the moment since migrating Fibers
across Threads is not possible.

I would like to hear a bit more about this, could there be an
"expensive" thread transfer operator added perhaps that only
moves when the fiber is suspended?

One problem is the act of suspending it (Fiber.yield)
will need to change. Maybe it could default to fast suspend,
and the migrate operation would:

1. set a flag to indicate migration in progress
2. resume
3. see + clear migration flag
4. suspend again immediately, but slowly for migration

But it's a totally orthogonal issue to auto-fiber

/me goes back to working on non-Ruby stuff...

#70 [ruby-core:85273] Updated by ioquatix (Samuel Williams) 19 days ago

Wouldn't having these abstractions allow building this by hand using existing Fiber?

Yes, it's feasible and already implemented here https://github.com/socketry/async and it's backwards compatible with older Rubies.

Even just solving this problem is enough of a hornets nest prior to introduction of other complications.

I agree with this, but not for the reason stated. I think modern epoll/kqueue/select basically just work. Yes, there are some odd issues you have to deal with, but for the most part things work well and it's as efficient as it's going to get in a general sense.

What I think is a bigger issue is blocking system calls. An example of this would be system name lookup (e.g. DNS). The two main mitigations are using a threadpool (libuv and neverblock do this AFAIK), or having multiple reactors and migrating other Fibers if the reactor is blocked. Even just having a tight loop can cause problems, as can non-blocking IO that never actually blocks and therefore never yields back to the reactor.
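
A minimal sketch of the threadpool mitigation, assuming nothing beyond core
Thread and Queue. TinyPool, submit and shutdown are hypothetical names for
this example (this is not how libuv or neverblock are implemented), and the
sleep stands in for a blocking call such as a name lookup:

```ruby
# Hypothetical sketch: a tiny worker pool so that a blocking call does not
# stall the thread that runs the reactor.
class TinyPool
  def initialize(workers)
    @jobs = Queue.new
    @threads = Array.new(workers) do
      Thread.new do
        # Each worker pulls [job, reply_queue] pairs until it sees nil.
        while (item = @jobs.pop)
          job, reply = item
          reply << job.call
        end
      end
    end
  end

  # Submit a blocking job; returns a Queue that will receive its result.
  def submit(&job)
    reply = Queue.new
    @jobs << [job, reply]
    reply
  end

  # Ask every worker to exit and wait for them.
  def shutdown
    @threads.size.times { @jobs << nil }
    @threads.each(&:join)
  end
end

pool = TinyPool.new(2)
pending = pool.submit { sleep 0.05; :resolved } # pretend this call blocks
main_work = 1 + 1                               # the caller keeps going
result = pending.pop                            # collect the result later
pool.shutdown
p [main_work, result]
```

The sketch ignores error handling: a job that raises would take its worker
thread down with it.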

pool = ThreadPool.new(concurrency: 100, max_workers: 5) # optional

It's a bit surprising to see this, but your example is almost exactly the same as using Async::Reactor, simply replace ThreadPool with Async::Reactor and the code will almost work. Semantically it's about the same as what I think is the ideal solution.

I think that abstracting around the Reactor pattern is a good idea. It provides strong guarantees about the state of the program.

Here is the main entry point for an Async::DNS::Server instance: https://github.com/socketry/async-dns/blob/5ec883c0dd3d69b766668e4e6811561aba847ac6/lib/async/dns/server.rb#L106-L120

Async::Reactor#run handles nesting: https://github.com/socketry/async/blob/4f695ed6e340031f27f6db5100ab86ba139ae3d9/lib/async/reactor.rb#L38-L61

If you call the run method inside an existing reactor, it returns an async task which you can use to stop the server and all async tasks started within the server. If you call it outside of a reactor, it will create a reactor and block forever. In both cases the life cycle is managed correctly.

Simply making a per-thread reactor and making read/write calls non-blocking only solves about 10% of the problem IMHO.

To compare some of the pseudo examples with real code, take a look at the C10k implemented here: https://github.com/socketry/async-io/blob/master/spec/async/io/c10k_spec.rb

#71 [ruby-core:85335] Updated by sam.saffron (Sam Saffron) 17 days ago

Having discussed this with Koichi I think he is wanting to merge this into core but the big blocker here is naming and some small details.

Koichi is not particularly fond of Thread.current << lambda {} cause he feels it is a bit confusing. Especially since we have Thread.current["x"].

I think this works (albeit with some multithreading concerns):

Thread.current.scheduler << lambda {}
Thread.current.scheduler.resume
Thread.current.scheduler.current
Thread.current.scheduler.current&.yield

One question, is how will Thread#[]/#[]= be handled inside the lambda?

I think it should be simply treated as a Thread global so it is shared between the lambdas.

If you need lambda specific storage we could implement something else. Otherwise it complicates stuff.

Regarding:

Simply making a per-thread reactor and making read/write calls non-blocking only solves about 10% of the problem IMHO.

I am not sure if I agree with this. This change will give us a single-threaded reactor and allow us to continue using the exact same API we use elsewhere. It drops into existing Ruby code much more cleanly than introducing new APIs: File#read yields, PG::Connection#exec yields and so on. This is something we have wanted for years. We basically get EventMachine without needing to adhere to the EventMachine API. It would be a great first step.

One big question I have though is how rb_thread_call_with_gvl and rb_thread_call_without_gvl will be handled, cause without magic handling there we don't get free PG / MiniRacer support and many others which is a huge shame.

#72 [ruby-core:85336] Updated by normalperson (Eric Wong) 17 days ago

sam.saffron@gmail.com wrote:

Having discussed this with Koichi I think he is wanting to
merge this into core but the big blocker here is naming and
some small details.

I'm leaning towards Thread::Green, so existing users can do
s/Thread.new/Thread::Green.new/ in many cases.

But, it would be easier if somebody good at API design (matz)
chimed in :>

Meanwhile, I think we should get rid of floating point timeouts:
https://bugs.ruby-lang.org/issues/14431
Then it might be easier to work on Queue/Mutex/... support.

One question, is how will Thread#[]/#[]= be handled inside the lambda?

I think it should be simply treated as a Thread global so it is shared between the lambdas.

If you need lambda specific storage we could implement something else. Otherwise it complicates stuff.

That's probably too incompatible; I think the current Fiber#[]/#[]=
behavior is fine (Thread::Green implemented as subclass of Fiber)
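
For reference, the existing behavior can be checked with plain Fibers:
Thread#[] is fiber-local, while Thread#thread_variable_get/set are shared by
every fiber in the thread. The key names below are made up for the example:

```ruby
# Thread#[]= stores a fiber-local value; thread_variable_set stores a
# value that is visible to every fiber running in this thread.
Thread.current[:request_id] = "outer"
Thread.current.thread_variable_set(:shared, "thread")

seen = Fiber.new do
  [Thread.current[:request_id], Thread.current.thread_variable_get(:shared)]
end.resume

p seen # the fiber-local value is not visible inside the new fiber
```

So a lambda run in its own fiber would not see the caller's Thread#[] values
unless that behavior were changed.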

One big question I have though is how rb_thread_call_with_gvl
and rb_thread_call_without_gvl will be handled, cause without
magic handling there we don't get free PG / MiniRacer support
and many others which is a huge shame.

I expect PG to be able to benefit from rb_wait_for_single_fd when
using sockets. I know mysql2 uses rb_wait_for_single_fd, at least.

rb_thread_call_* is meant for CPU (or FS/memory)-bound tasks,
and wouldn't MiniRacer be CPU-bound? Dunno much about it...

#73 [ruby-core:85353] Updated by sam.saffron (Sam Saffron) 16 days ago

I'm leaning towards Thread::Green, so existing users can do
s/Thread.new/Thread::Green.new/ in many cases.

Yes, I think this works; the problem, though, is that people will
expect this to work like green threads, meaning they also
should auto-yield regularly. You should be allowed to have
two green threads doing expensive computations. One tight loop
in a reactor now and you blow up everything (unlike normal threads)

This would mean you would have to pull in the 1.8 scheduler
or something. But then this stops being a proper reactor :(.

I guess this is the underlying reason you just wanted to call this auto
yielding fibers instead of threads to start with.

A question for Matz and Koichi is if they expect the scheduler from 1.8
to be brought back, if this is "safe" by default and "opt-in" for unsafe.

I expect PG to be able to benefit from rb_wait_for_single_fd when
using sockets. I know mysql2 uses rb_wait_for_single_fd, at least.

I am not sure about this; libpq abstracts all of this stuff away from you,
which is why Sean G wrote a complete binary protocol implementation in Rust
to gain control. The pg gem does not use rb_wait_for_single_fd; it just releases the GVL.

We have to make sure there is some sort of path forward with Postgres here
it is a huge issue.

MiniRacer is CPU-bound; it's basically packaging libv8 into Ruby. I am on the fence here.
On one hand it would be nice to auto-yield so we feel reduced GVL pain and Ruby code
can run while v8 does its thing. On the other hand, the semantics of one thread at 100%
suddenly becoming 2 threads at 100% is not ideal. Hard to decide.

#74 [ruby-core:85362] Updated by normalperson (Eric Wong) 16 days ago

sam.saffron@gmail.com wrote:

I'm leaning towards Thread::Green, so existing users can do
s/Thread.new/Thread::Green.new/ in many cases.

Yes, I think this works; the problem, though, is that people will
expect this to work like green threads, meaning they also
should auto-yield regularly. You should be allowed to have
two green threads doing expensive computations. One tight loop
in a reactor now and you blow up everything (unlike normal threads)

Good point. rb_thread_call_without_gvl could be used to migrate
work to a thread pool (so we end up with M:N threads),
and maybe that's not horrible as a default behavior.
Data migration across native threads would hurt locality-wise
for short-lived tasks (e.g. rb_stat), though...

Fwiw, I was planning on adding a hinting mechanism to
rb_thread_call_without_gvl anyways later on (for GC, maybe);
but hints could be added to prevent/encourage migration based
on the expected duration/bottleneck of the function.

This would mean you would have to pull in the 1.8 scheduler
or something. But then this stops being a proper reactor :(.

I guess this is the underlying reason you just wanted to call this auto
yielding fibers instead of threads to start with.

Right, the predictability of not having a timer switch threads
automatically is appealing, sometimes.

Having rb_thread_call_without_gvl become a scheduling point of
some sort for green threads would be fine, however, since all
callers already assume a context switch will happen.

A question for Matz and Koichi is if they expect the scheduler from 1.8
to be brought back, if this is "safe" by default and "opt-in" for unsafe.

I expect PG to be able to benefit from rb_wait_for_single_fd when
using sockets. I know mysql2 uses rb_wait_for_single_fd, at least.

I am not sure about this; libpq abstracts all of this stuff away from you,
which is why Sean G wrote a complete binary protocol implementation in Rust
to gain control. The pg gem does not use rb_wait_for_single_fd; it just releases the GVL.

We have to make sure there is some sort of path forward with Postgres here
it is a huge issue.

I don't know how expensive it is to parse the Pg protocol;
but I remember in the 1.8 days pg was one of the few gems to
use rb_thread_select and it played nicely with 1.8 green threads.
Can't say I know Pg well these days, it's been over a decade
since I used it with Ruby.

MiniRacer is CPU-bound; it's basically packaging libv8 into
Ruby. I am on the fence here. On one hand it would be nice to
auto-yield so we feel reduced GVL pain and Ruby code can run
while v8 does its thing. On the other hand, the semantics of
one thread at 100% suddenly becoming 2 threads at 100% is not
ideal. Hard to decide.

How long does it release the GVL for? The thread pool /
workqueue idea I mentioned above might be a good fit for this
if the communications overhead can be masked by the length of the
task. Nothing wrong with 2 threads at 100% if they're getting
work done faster than 1 thread at 100%.

I wouldn't want a native pool to be used for something like
getaddrinfo, however, that's hugely inefficient (but exactly
what getaddrinfo_a does internally in glibc).

#75 [ruby-core:85371] Updated by jjyr (jy j) 15 days ago

Excited to see this awesome feature! I implemented fiber auto-scheduling in Ruby userland (light) a few months ago (using monkey patches). Because Ruby's IO API is complex (e.g. getc, getbyte, putc, putbyte), it's hard to implement these methods without C, so the built-in Threadlet or Thread::Green is all I want as a Ruby user. (The bad news for me is that my library no longer has a reason to exist.)

Two opinions:

  • The names Threadlet and Thread::Green are both easy to understand, and their behaviour is easy to guess, so as an application-level user I think both are fine.
  • I think Mutex and ConditionVariable need to be Thread::Green aware, because if I write a thread-safe library using a mutex, it makes no sense if it can't work under Thread::Green.

#76 [ruby-core:85417] Updated by normalperson (Eric Wong) 14 days ago

jjyruby@gmail.com wrote:

Excited to see this awesome feature! I implemented fiber
auto-scheduling in Ruby userland (light) a few months ago
(using monkey patches). Because Ruby's IO API is complex
(e.g. getc, getbyte, putc, putbyte), it's hard to
implement these methods without C, so the built-in Threadlet or
Thread::Green is all I want as a Ruby user. (The bad news for me
is that my library no longer has a reason to exist.)

Thank you for your response.

I agree a lot of the current IO stuff is difficult or costly to
implement outside of C. I hope some dependencies on C can
eventually be reduced; but stuff like supporting writev in
IO#write_nonblock https://bugs.ruby-lang.org/issues/14404
reminds me some things are perhaps best done in C.

Anyways lightio can be counted as another reason to implement
this feature natively in core (along with previous efforts
dating back to Neverblock), so perhaps lightio already served
a great purpose :)

Two opinions:

  • The names Threadlet and Thread::Green are both easy to
    understand, and their behaviour is easy to guess, so as an
    application-level user I think both are fine.
  • I think Mutex and ConditionVariable need to be Thread::Green
    aware, because if I write a thread-safe library using a mutex,
    it makes no sense if it can't work under Thread::Green.

Yes, I am strongly leaning towards making mutex, cv and queues
green-thread aware and I'm working on improving time
representations in core to that end:
https://bugs.ruby-lang.org/issues/14431
https://bugs.ruby-lang.org/issues/14452

#77 [ruby-core:85472] Updated by sam.saffron (Sam Saffron) 11 days ago

How long does it release the GVL for?

I would say it heavily depends on workload, but usually for our loads it is milliseconds for v8 work; in PG's case the shortest duration is probably 0.5ms, with a median more around 4-5ms.

I would like to expand on the auto scheduler question here with a code example:

t1 = Thread::Green.new do
   while true
   end
end

t2 = Thread::Green.new do
   puts "hi"
end

t1.stop

I think the general expectation here is for this to output "hi" just like standard threads do.
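
For comparison, this is the behaviour of today's native threads that the
expectation is based on. It is only a sketch with Thread.new, since
Thread::Green does not exist yet; the bounded join and the kill are added so
the example terminates:

```ruby
out = []

t1 = Thread.new do
  while true # tight loop that never blocks on IO
  end
end

t2 = Thread.new { out << "hi" }

t2.join(5) # bounded wait; timeslice switching lets t2 run despite t1
t1.kill    # stop the spinning thread so the script can exit
t1.join
p out
```

A plain Fiber running the same loop would never return control, because
nothing in the loop yields.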

I think we should probably support a ninja mode

Thread::Green.automatic_scheduling = false

Or something like that if we just want the fiber auto yield and nothing else, but the default should be safe.

Clearly safety is going to have to be somewhat limited until Fibers can move between threads cause you can be lost in C land.

Wondering what Matz and Koichi are thinking here?

Totally support mutex, cv and queue being green thread aware. Also would like to see that native timer which is green thread aware.

#78 [ruby-core:85531] Updated by normalperson (Eric Wong) 6 days ago

sam.saffron@gmail.com wrote:

How long does it release the GVL for?

I would say it heavily depends on workload, but usually for
our loads it is milliseconds for v8 work; in PG's case the shortest
duration is probably 0.5ms, with a median more around 4-5ms.

It looks like currently pg is in the same boat as filesystem
access (which sucks): the GVL release overhead for file.c and
dir.c operations in 2.5 is painful to stomach on fast SSDs; but
they make dealing with HDDs, network FSes and USB/MMC devices
tolerable...

But yeah, synchronously waiting on read/write from the Pg
sockets is a total waste of native thread resources.

(to that end, I still want to get rid of the GVL because it
slows down those operations in single-threaded mode)

I would like to expand on the auto scheduler question here with a code example:

t1 = Thread::Green.new do
   while true
   end
end

t2 = Thread::Green.new do
   puts "hi"
end

t1.stop

I think the general expectation here is for this to output "hi" just like standard threads do.

Earlier messages from ko1 indicated he favors fewer
opportunities where scheduling happens:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/81495
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/81507

I definitely do not like switching at unpredictable points;
I would only want to switch when the current execution context
cannot proceed immediately.
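
A minimal sketch of that rule in plain Ruby, assuming a hand-written driver
in place of the proposed scheduler. read_or_yield is a hypothetical helper;
IO#read_nonblock, IO.select and Fiber are existing API:

```ruby
# Read one chunk from io, yielding the current fiber instead of blocking
# when no data is available yet (the EAGAIN/EWOULDBLOCK case).
def read_or_yield(io)
  loop do
    result = io.read_nonblock(1024, exception: false)
    return result unless result == :wait_readable
    Fiber.yield(io) # tell the driver which IO we are waiting on
  end
end

rd, wr = IO.pipe
events = []

reader = Fiber.new do
  events << :reader_started
  events << read_or_yield(rd)
end

waiting_on = reader.resume # pipe is empty, so the fiber yields rd
events << :main_continues
wr.write("ping")
IO.select([waiting_on])    # wait until the pipe is readable
reader.resume              # now the read can proceed
p events
```

The only switch point is the spot where the read could not make progress;
everything else in the fiber runs without interruption.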

I think we should probably support a ninja mode

Thread::Green.automatic_scheduling = false

Global switches like that probably lead to unpredictable
code across libraries. Maybe per-thread or per-block
options would be better; but even then libraries might
get confused or thrown off by it. However, if existing code
all assumes timeslice-based scheduling; maybe
per-block/per-thread isn't so bad.

Or something like that if we just want the fiber auto yield
and nothing else, but the default should be safe.

"safe" is a relative term :) Working on
https://bugs.ruby-lang.org/issues/14357 was yet another reminder
of why I don't like switching execution contexts at unpredictable
points.

Clearly safety is going to have to be somewhat limited until
Fibers can move between threads cause you can be lost in C
land.

Not sure what you mean by that.

Wondering what Matz and Koichi are thinking here?

ko1 has given some hints on this thread; and I remember reading
a developer's meeting summary where matz didn't want people to
massively rewrite their code to take advantage of this new
feature.

Totally support mutex, cv and queue being green thread aware.
Also would like to see that native timer which is green thread
aware.

  1. Thread::Green will have timeslice scheduling, 100% compatible
    API-wise with built-in mutex/cv/queue/etc and pure-Ruby code

  2. pipe and sockets become O_NONBLOCK by default (as in 1.8)
    when created inside green threads.

  3. rb_thread_blocking_region - uses a native thread pool transparently
    inside green threads.

This pool can auto-grow/shrink but the bound is the total
number of green threads in the system. It's safe to use a
big upper bound for existing applications since they already
expect heavyweight native threads from 1.9+.
We will fix+reuse USE_THREAD_CACHE hidden in current source
to manage this thread pool.

  4. introduce Thread options to give users ability to:
     a. force rb_thread_blocking_region to run in the current native thread
     b. disable timeslice-based switching

Disabling timeslice-based scheduling should become an option
with native threads, too.

While writing this email, I considered making Thread
green-by-default while doing the items 2-4 above; but
C extensions relying on pthread_{get,set}specific would
be broken by the transparent thread pool.

#79 [ruby-core:85575] Updated by ioquatix (Samuel Williams) 4 days ago

How does Process.wait behave in Thread::Green?

#80 [ruby-core:85576] Updated by normalperson (Eric Wong) 4 days ago

samuel@oriontransfer.org wrote:

How does Process.wait behave in Thread::Green?

Process.wait* methods use rb_waitpid internally, so it's
always been a scheduling point which lets other
auto-fibers/green-threads/whatever-we-call-them-this-week run.
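
The same property is easy to observe with native threads today. This is
only an analogy using Thread.new (not auto-fibers), and the child command
is an arbitrary stand-in:

```ruby
require "rbconfig"

order = []
# Child process that stays alive briefly so the parent has to wait for it.
pid = Process.spawn(RbConfig.ruby, "-e", "sleep 0.3")

waiter = Thread.new { Process.wait(pid); order << :child_reaped }
worker = Thread.new { order << :other_work }

[waiter, worker].each(&:join)
p order
```

While one thread is parked in Process.wait, the other is free to run, which
is the behaviour green threads would get from rb_waitpid being a scheduling
point.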

#81 [ruby-core:85585] Updated by ioquatix (Samuel Williams) 4 days ago

The PG gem which uses libpq provides both synchronous and asynchronous APIs, and it is up to the client code to select one or the other. You can already use poll/select with the PG gem, that is not the issue here. Making it transparently async is simply not possible if a client uses the sync APIs.
