Feature #21704
Open
Expose rb_process_status_new to C extensions
Description
A fiber scheduler implementation with a hook for #process_wait needs to return a Process::Status object, but currently it is not possible for a C extension to directly create an instance of Process::Status. The technique currently used in various fiber scheduler implementations is to use pidfd_open, poll for fd readiness, then call Process::Status.wait, which creates the instance.
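The pidfd-based technique described above can be sketched in plain C as follows. This is a minimal illustration, not code taken from any scheduler: wait_via_pidfd is a hypothetical helper, it assumes Linux 5.3 or newer (where the pidfd_open system call exists), and it blocks in poll(2) where a real scheduler would register the descriptor with its event loop. The final waitpid(2) call stands in for Process::Status.wait.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: wait for child `pid` using a pidfd for readiness,
 * then reap it. Returns 0 on success, -1 on failure with errno set. */
static int wait_via_pidfd(pid_t pid, int *status)
{
    assert(status != NULL);
    assert(pid > 0); /* pidfd_open needs one specific pid, not 0 or -1 */

    /* Invoke the raw system call so no libc wrapper is required. */
    int fd = (int)syscall(SYS_pidfd_open, pid, 0);
    if (fd < 0)
        return -1;

    /* The pidfd becomes readable once the process has terminated. */
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready;
    do {
        ready = poll(&pfd, 1, -1);
    } while (ready < 0 && errno == EINTR);

    int saved_errno = errno;
    close(fd);
    if (ready < 0) {
        errno = saved_errno;
        return -1;
    }

    /* Reap the child; this is the step that yields the status. */
    return waitpid(pid, status, 0) == pid ? 0 : -1;
}
```

The assertion on pid makes the limitation visible: a pidfd refers to exactly one process, so this approach has no counterpart for waiting on "any child".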
On recent Linux kernels (6.7 or newer), io_uring_prep_waitid can be used to directly wait for process termination, which provides us with the pid and status of the terminated child process, but there's no way to directly create an instance of Process::Status, required for implementing the #process_wait hook.
Exposing the internal rb_process_status_new function would allow such an implementation. Using io_uring_prep_waitid would also lead to better compatibility of fiber schedulers with calls to Process.wait(0) or Process.wait(-1), as those cannot be done using pidfd_open.
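To illustrate what a waitid-style wait hands back, here is a small sketch using the plain waitid(2) system call as a stand-in, on the assumption that the io_uring variant reports its result through the same siginfo_t structure. Both function names are hypothetical, and the conversion to a waitpid(2)-style status word assumes the Linux encoding (exit code in bits 8-15, terminating signal in the low 7 bits, core-dump flag 0x80). Passing P_ALL corresponds to waiting for any child, as Process.wait(-1) does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: rebuild a waitpid(2)-style status word from the
 * siginfo_t filled in by waitid(2). Assumes the Linux status encoding.
 * Returns -1 for si_code values this sketch does not handle. */
static int status_from_siginfo(const siginfo_t *info)
{
    assert(info != NULL);
    switch (info->si_code) {
    case CLD_EXITED: /* normal exit: si_status is the exit code */
        return (info->si_status & 0xff) << 8;
    case CLD_KILLED: /* killed by signal: si_status is the signal number */
        return info->si_status & 0x7f;
    case CLD_DUMPED: /* killed by signal, core dumped */
        return (info->si_status & 0x7f) | 0x80;
    default:
        return -1;
    }
}

/* Hypothetical helper: wait for any terminated child.
 * Returns 0 and fills *pid and *status on success, -1 on failure. */
static int wait_any_child(pid_t *pid, int *status)
{
    assert(pid != NULL && status != NULL);
    siginfo_t info;
    int rc;
    do {
        memset(&info, 0, sizeof info);
        rc = waitid(P_ALL, 0, &info, WEXITED);
    } while (rc != 0 && errno == EINTR);
    if (rc != 0)
        return -1;

    *pid = info.si_pid;
    *status = status_from_siginfo(&info);
    return 0;
}
```

The pid and status produced here are exactly the two values that currently cannot be turned into a Process::Status from a C extension.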
The associated PR is here: https://github.com/ruby/ruby/pull/15213
A working fiber scheduler implementation of process_wait using io_uring_prep_waitid has been submitted here: https://github.com/socketry/io-event/pull/154
Updated by akr (Akira Tanaka) 4 months ago
rb_process_status_new is declared in the PR as follows.
/**
 * Creates a new instance of Process::Status.
 *
 * @param[in] pid The process ID.
 * @param[in] status The "waitpid status", as returned by waitpid(2). This is NOT the exit status/exit code, see waitpid(2).
 * @param[in] error Error code (if waitpid(2) returned -1).
 * @return VALUE An instance of Process::Status.
 */
VALUE rb_process_status_new(rb_pid_t pid, int status, int error);
I have doubts about the error argument.
If waitpid(2) returns -1, waitpid has failed.
Why is a Process::Status object created in such a situation?
What does it mean?
I feel the API is not well designed.
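For readers unfamiliar with the distinction drawn in the @param status note of the declaration, the following sketch shows that the raw "waitpid status" is an encoded word and not the exit code itself. raw_status_of_exit is a hypothetical helper written only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, reap it and
 * return the raw status word from waitpid(2), or -1 on failure. */
static int raw_status_of_exit(int code)
{
    assert(code >= 0 && code <= 255);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code); /* child: terminate immediately */

    int status = 0;
    pid_t reaped;
    do {
        reaped = waitpid(pid, &status, 0);
    } while (reaped < 0 && errno == EINTR);

    return reaped == pid ? status : -1;
}
```

Calling raw_status_of_exit(3) returns a word for which WIFEXITED is true and WEXITSTATUS yields 3, while the word itself is not 3. It is this undecoded word that the status parameter expects.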
Updated by ioquatix (Samuel Williams) 11 days ago
· Edited
Why is a Process::Status object created in such a situation?
Because Process::Status represents either "the result of waitpid" or "errno of waitpid" and is used to transfer this information until it's finally consumed by the end user, e.g. as an instance of Process::Status or raising an error as per Process.wait etc.
IOW, it captures the full fidelity result of waitpid.
Technically, it might be preferable to write something like:
Process.wait -> Process::Status.new or raise
But due to the internal design, it's not that simple. Things like the scheduler still need to pass both "status" and "errno" back to the caller. When I say "errno", I don't actually mean the global errno, as the error code could be passed back via epoll, kqueue or io_uring data structures. And "errno" isn't always strictly an error, e.g. ECHILD -> no more child processes to wait on.
I'm not strongly in favour of this API but we probably need something like it (accepts both the result of waitpid as well as the "errno").
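The idea of carrying either a status or an error code as one value can be sketched in plain C. struct wait_result and capture_wait are hypothetical names; the three fields simply mirror the pid, status and error arguments of the proposed declaration, and the error is copied out of the global errno only because this sketch uses waitpid(2) directly.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical record mirroring the (pid, status, error) triple. */
struct wait_result {
    pid_t pid;    /* pid returned by waitpid(2), or -1 on failure */
    int   status; /* raw waitpid status; meaningful only if pid > 0 */
    int   error;  /* errno value if waitpid(2) returned -1, else 0 */
};

/* Hypothetical helper: run waitpid(2) and capture its complete outcome,
 * success or failure, in one value that can be handed to a caller. */
static struct wait_result capture_wait(pid_t pid, int flags)
{
    struct wait_result r = { .pid = -1, .status = 0, .error = 0 };

    do {
        r.pid = waitpid(pid, &r.status, flags);
    } while (r.pid < 0 && errno == EINTR);

    if (r.pid < 0) {
        r.error = errno; /* e.g. ECHILD: no child left to wait for */
        r.status = 0;
    }

    /* Either the call succeeded or an error code was recorded. */
    assert(r.pid >= 0 || r.error != 0);
    return r;
}
```

With no children left, capture_wait(-1, 0) yields pid == -1 and error == ECHILD, which is the case a consumer may want to treat as "nothing to wait on" instead of a hard failure.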