Project

General

Profile

Actions

Feature #19057

open

Hide implementation of `rb_io_t`.

Added by ioquatix (Samuel Williams) about 2 years ago. Updated 3 months ago.

Status:
Assigned
Target version:
[ruby-core:110300]

Description

In order to make improvements to the IO implementation like https://bugs.ruby-lang.org/issues/18455, we need to add new fields to struct rb_io_t.

By the way, ending types in _t is not recommended by POSIX, so I'm also trying to rename the internal implementation to drop _t where possible during this conversion.

Anyway, we should try to hide the implementation of struct rb_io. Ideally, we don't expose any of it, but the problem is backwards compatibility.

So, in order to remain backwards compatibility, we should expose some fields of struct rb_io, the most commonly used one is fd and mode, but several others are commonly used.

There are many fields which should not be exposed because they are implementation details.

Current proposal

The current proposed change https://github.com/ruby/ruby/pull/6511 creates two structs:

// include/ruby/io.h
#ifndef RB_IO_T
struct rb_io {
  int fd;
  // ... public fields ...
};
#else
struct rb_io;
#endif

// internal/io.h
#define RB_IO_T
struct rb_io {
  int fd;
  // ... public fields ...
  // ... private fields ...
};

However, we are not 100% confident this is safe according to the C specification. My experience is not sufficiently wide to say this is safe in practice, but it does look okay to both myself, and @Eregon (Benoit Daloze) + @tenderlovemaking (Aaron Patterson) have both given some kind of approval.

That being said, maybe it's not safe.

There are two alternatives:

Hide all details

We can make public struct rb_io completely invisible.

// include/ruby/io.h
#define RB_IO_HIDDEN
struct rb_io;
int rb_ioptr_descriptor(struct rb_io *ioptr); // accessor for previously visible state.

// internal/io.h
struct rb_io {
  // ... all fields ...
};

This would only be forwards compatible, and code would need to feature detect like this:

#ifdef RB_IO_HIDDEN
#define RB_IOPTR_DESCRIPTOR rb_ioptr_descriptor
#else
#define RB_IOPTR_DESCRIPTOR(ioptr) rb_ioptr_descriptor(ioptr)
#endif

Nested public interface

Alternatively, we can nest the public fields into the private struct:

// include/ruby/io.h
struct rb_io_public {
  int fd;
  // ... public fields ...
};

// internal/io.h
#define RB_IO_T
struct rb_io {
  struct rb_io_public public;
  // ... private fields ...
};

Considerations

I personally think the "Hide all details" implementation is the best, but it's also the lest compatible. This is also what we are ultimately aiming for, whether we decide to take an intermediate "compatibility step" is up to us.

I think "Nested public interface" is messy and introduces more complexity, but it might be slightly better defined than the "Current proposal" which might create undefined behaviour. That being said, all the tests are passing.


Related issues 1 (0 open1 closed)

Related to Ruby master - Bug #19704: Unable to install readline-ext since 18e55fc1e1ec20e8f3166e3059e76c885fc9f8f2Closedioquatix (Samuel Williams)Actions

Updated by Eregon (Benoit Daloze) about 2 years ago

I support the Hide all details approach.
With one tweak: not int rb_ioptr_descriptor(struct rb_io *ioptr); but int rb_io_descriptor(VALUE io);.
The goal is to not expose the struct in the C API, so let's not expose it (even if it becomes just an opaque pointer).

Not exposing structs in the C API makes it much easier to support the C API and evolve it.
For example it would make it easier to implement the C API with JNI or so in TruffleRuby, similar to HPy efforts in Python.
Having to emulate structs on GraalVM Sulong is additional complexity and makes C extensions slower to warm up (https://medium.com/graalvm/porting-matplotlib-from-c-api-to-hpy-aa32faa1f0b5).

If we go directly to the Hide all details approach it might be more painful to transition though, i.e., more of a breaking change.

Maybe we can simply deprecate GetOpenFile (which BTW does not have a proper RB_ prefix so more reason to deprecate), or deprecate rb_io_t itself if that's possible?
And then of course we'd already add int rb_io_descriptor(VALUE io); etc.
So for 1-2 releases we wouldn't break anything, just deprecation warnings and let more time for people to switch to int rb_io_descriptor(VALUE io); etc.

Updated by ioquatix (Samuel Williams) about 2 years ago

I agree with Hide all details but I'm more willing to compromise on the interface.

With one tweak: not int rb_ioptr_descriptor(struct rb_io *ioptr); but int rb_io_descriptor(VALUE io);.

Maybe it's CRuby problem, but there is a performance issue with such an implementation which has to extract and follow the pointer to the struct over and over again. If we can accept that, I'm okay with your proposal, but it's also a significant cost to compatibility. If we don't care about backwards compatibility and only forwards compatibility, I'm happy to support that.

Actions #4

Updated by jeremyevans0 (Jeremy Evans) about 2 years ago

  • Tracker changed from Bug to Feature
  • Backport deleted (2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN)

Updated by kjtsanaktsidis (KJ Tsanaktsidis) over 1 year ago

I did a bit of research on this topic this evening.

Firstly, some technical notes r.e. undefined behaviour.

My understanding is that "Current proposal" is really undefined behaviour. This is in the C standard, section 6.5.2.3, point 6 (http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf)

One special guarantee is made in order to simplify the use of unions: if a union contains several
structures that share a common initial sequence (see below), and if the union object currently contains
one of these structures, it is permitted to inspect the common initial part of any of them anywhere
that a declaration of the completed type of the union is visible. Two structures share a common initial
sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a
sequence of one or more initial members

It seems that this "common initial sequence" rule is really only for structures accessed through unions. Another fun gotcha is that, even if you do do all the accesses through a union, the compiler can assume that rb_io_public_t * and rb_io_private_t * should never alias each other (which is the exact opposite of what you want). E.g. - this program has a different result depending on whether optimizations are on or not - http://coliru.stacked-crooked.com/a/b57c8dd9e2ef3a02

I think to be technically correct, we would need to go for the "Nested public interface" approach - structs can be converted to a pointer to their first field according to the C standard (although I guess the reverse cast from rb_io_public_t * back to rb_io_private_t * would be UB?)


I don't think this matters though, because I agree that, looking at the contents of rb_io_t, almost none of this should be public API, and we should strive to encapsulate it fully, like "Hide all details" (or like @Eregon (Benoit Daloze) suggested too).

I did a Github code search for usages of rb_io_t, and all of the usages of it I could find basically fell into this pattern.

VALUE some_io = /* from somewhere */;
rb_io_t *fptr;
GetOpenFile(some_io, fptr)

// call an io.h method that takes rb_io_t *
rb_io_set_nonblock(fptr);
// or read the fd member and do something with that
do_something_external_with_fd(fptr->fd);

I think we could avoid 99% of the breakage, and still hide all the implementation details of rb_io_t, by basically redefining rb_io_t to contain just the FD, and define a new internal, opaque type for internal IO use. We would do this like so:

Firstly, struct RFile is currently in include/ruby/internal/core/rfile.h as:

struct rb_io_t;
struct RFile {
    struct RBasic basic;
    struct rb_io_t *fptr;
};

There are two spare words in there, and we would use one to add a pointer to a new structure. The definition of that structure would not be provided in public headers anyhere. So, we would change RFile in include/ruby/internal/core/rfile.h to be:

struct rb_io_t;
struct rb_io_impl;
struct RFile {
    struct RBasic basic;
    struct rb_io_t *fptr;
    struct rb_io_impl *impl;
};

Then, we would change the definition of struct rb_io_t in include/ruby/io.h to be:

typedef struct rb_io_t {
    VALUE self;
    int fd;
    // _everything_ else is removed
} rb_io_t;

We would move all of its current contents to a new struct in internal/io.h:

struct rb_io_impl {
    // All the juicy implementation details go here - _except_ no need to duplicate `self` and `fd`.
};

The current definition of GetOpenFile essentially does RFILE(obj)->fptr, so that would return the newly-slimmed-down public rb_io_t. Attempts to read the file descriptor off that would continue to work.

Other methods in include/ruby/io.h which take a rb_io_t * as an argument, like e.g. rb_io_set_nonblock, would do RFILE(fptr->self)->impl to get access to the implementation details struct. We could also define variants of these which took the VALUE instead of the rb_io_t * to avoid the pointer-chase of fptr->self if desired.

The downside of this approach is that, as you identified, it costs an extra indirection in the implementation of most IO methods to obtain the impl from the VALUE. But, the benefits are that we get almost total encapsulation whilst maintaining backwards compatibility with almost all existing code. We can also shuffle fields between the private & public structs depending on observed compatibility issues found in the wild too.

Updated by Eregon (Benoit Daloze) over 1 year ago

ioquatix (Samuel Williams) wrote in #note-3:

Maybe it's CRuby problem, but there is a performance issue with such an implementation which has to extract and follow the pointer to the struct over and over again.

It's pretty rare to access multiple fields of rb_io_t in code I have seen, so then performance-wise it is the same if accessing a single field.
When accessing two fields it would be indeed an extra indirection, although I highly doubt this would be measurable overhead.

Coming from the other way, on all Ruby implementations besides CRuby, exposing struct rb_io *ioptr would mean allocating a struct just to support those accesses, for every IO instance, possibly lazily. That's a much much bigger overhead than an extra pointer indirection.

As HPy shows, a fast and portable C API does not expose structs.

Updated by ioquatix (Samuel Williams) over 1 year ago

Given the discussion here, I just want to aim for "Hide all the details".

Thanks everyone, the discussion has been most helpful.

I especially like the discussions around undefined behaviour and the example showing aliasing issues (did not know it could be possible haha).

Updated by ianks (Ian Ker-Seymer) over 1 year ago

With one tweak: not int rb_ioptr_descriptor(struct rb_io *ioptr); but int rb_io_descriptor(VALUE io);.
The goal is to not expose the struct in the C API, so let's not expose it (even if it becomes just an opaque pointer).

I have mixed feelings about this. On one hand, it does successfully hide the implementation. On the other, casting pointers to unsigned ints has implications for pointer provenance models. Would be nice to get the opaqueness without the int-cast.

Updated by Eregon (Benoit Daloze) over 1 year ago

ianks (Ian Ker-Seymer) wrote in #note-8:

With one tweak: not int rb_ioptr_descriptor(struct rb_io *ioptr); but int rb_io_descriptor(VALUE io);.
The goal is to not expose the struct in the C API, so let's not expose it (even if it becomes just an opaque pointer).

On the other, casting pointers to unsigned ints has implications for pointer provenance models.

What do you mean? struct RBasic* to VALUE or vice-versa?
Almost every C API function does that, so that seems a separate issue.

Although maybe the above is not clear? int rb_io_descriptor(VALUE io); would be called with the Ruby (IO) object (like almost every other C API function), not with a pointer (opaque or not) to the rb_io_t struct.

Updated by ioquatix (Samuel Williams) over 1 year ago

  • Status changed from Open to Closed

https://github.com/ruby/ruby/pull/6511 was merged.

In addition, the following related PRs were merged:

A few notes:

  • @Eregon (Benoit Daloze) was mostly correct in that most code was "Get the file descriptor and do something". Very few code paths were getting fptr and doing multiple things.
  • This does break some backwards compatibility but this is considered to be acceptable as we essentially want to hide the implementation and clean up legacy code paths.
  • This allows us to restructure some of the internals of the IO implementation for improved semantics and performance.
  • We will probably want to continue the effort to clean up some of the legacy APIs, e.g. GetOpenFile and MakeOpenFile etc.
  • We probably want to continue to deprecate and remove fptr style functions.
Actions #11

Updated by hsbt (Hiroshi SHIBATA) over 1 year ago

  • Related to Bug #19704: Unable to install readline-ext since 18e55fc1e1ec20e8f3166e3059e76c885fc9f8f2 added

Updated by ioquatix (Samuel Williams) over 1 year ago

Unfortunately this was reverted due to some extensions no longer compiling.

I think it's expected that some dependencies using internal details of rb_io_t should be fixed.

Actions #13

Updated by ioquatix (Samuel Williams) over 1 year ago

  • Status changed from Closed to Assigned

Updated by Eregon (Benoit Daloze) over 1 year ago

I am not sure there is an easier path here, IMO it's OK for some extensions to break while they test against ruby-head, and to adapt to it (and we are not close to December).
It would be nicer to deprecate first, but I am not sure how feasible is that.

It seems maybe GCC & Clang might support deprecating struct members: https://stackoverflow.com/questions/22823477/c-portable-way-to-deprecate-struct-members
Then we could deprecate every member of rb_io_t.
That seems useful, even if the warning only works on those compilers, most people would see it.

That said, even with deprecations for one release I'm pretty sure not all extensions using rb_io_t members will have migrated, so there will be some breakage anyway, just like for every deprecated thing.
But if extensions didn't migrate for more than a year, then it's really on them if they did not address it in time.

Another way could be to deprecate GetOpenFile (or the whole struct type if that's possible).
But that seems difficult because many rb_io_* functions take a rb_io_t*, e.g., FILE *rb_io_stdio_file(rb_io_t *fptr);, so we would then need to add a variant taking VALUE io for all of these (with another name).
There is less value in that, because if rb_io_t is opaque these functions are not really problematic, e.g. TruffleRuby could just return and cast the io for GetOpenFile(io), and it would require larger efforts to migrate.
The last paragraph of https://bugs.ruby-lang.org/issues/19057#note-2 was mentioning similar ideas about deprecation.

Updated by ioquatix (Samuel Williams) over 1 year ago

I'm in favour of eventually deprecating rb_io_t and I think that means anything related to it.

I think deprecations can make things a little easier but I'm not against breaking things that are out of our control.

I'm happy to fix things as they come up. Today we fixed nio4r, and polyphony, along with readline-ext and the other io-* gems.

I agree, once we stop exposing rb_io_t, even CRuby can take the same approach (just return the IO value). Whether we want to or not is another question, but I imagine only a few VALUE io functions are missing now.

Updated by ioquatix (Samuel Williams) over 1 year ago

Okay, here is a PR to introduce deprecations: https://github.com/ruby/ruby/pull/7916

Updated by byroot (Jean Boussier) over 1 year ago

@ioquatix (Samuel Williams) I'm not sure which change exactly is the cause, but it appears that the recent rb_io_t changes broke raindrops

current directory: /usr/local/lib/ruby/gems/3.3.0+0/gems/raindrops-0.20.1/ext/raindrops
make DESTDIR\= sitearchdir\=./.gem.20230609-25-giurc0 sitelibdir\=./.gem.20230609-25-giurc0 clean

current directory: /usr/local/lib/ruby/gems/3.3.0+0/gems/raindrops-0.20.1/ext/raindrops
make DESTDIR\= sitearchdir\=./.gem.20230609-25-giurc0 sitelibdir\=./.gem.20230609-25-giurc0
compiling linux_inet_diag.c
In file included from /usr/include/x86_64-linux-gnu/bits/libc-header-start.h:33,
                 from /usr/include/stdio.h:27,
                 from /usr/local/include/ruby-3.3.0+0/ruby/defines.h:16,
                 from /usr/local/include/ruby-3.3.0+0/ruby/ruby.h:25,
                 from /usr/local/include/ruby-3.3.0+0/ruby.h:38,
                 from linux_inet_diag.c:1:
/usr/include/features.h:187:3: warning: #warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE" [-Wcpp]
  187 | # warning "_BSD_SOURCE and _SVID_SOURCE are deprecated, use _DEFAULT_SOURCE"
      |   ^~~~~~~
In file included from linux_inet_diag.c:4:
my_fileno.h:9:7: warning: "HAVE_RB_IO_T" is not defined, evaluates to 0 [-Wundef]
    9 | #if ! HAVE_RB_IO_T
      |       ^~~~~~~~~~~~
my_fileno.h:16:8: warning: "HAVE_RB_IO_T" is not defined, evaluates to 0 [-Wundef]
   16 | #  if !HAVE_RB_IO_T || (RUBY_VERSION_MAJOR == 1 && RUBY_VERSION_MINOR == 8)
      |        ^~~~~~~~~~~~
my_fileno.h: In function ‘my_fileno’:
my_fileno.h:10:19: error: unknown type name ‘OpenFile’; did you mean ‘GetOpenFile’?
   10 | #  define rb_io_t OpenFile
      |                   ^~~~~~~~
my_fileno.h:25:2: note: in expansion of macro ‘rb_io_t’
   25 |  rb_io_t *fptr;
      |  ^~~~~~~
In file included from my_fileno.h:3,
                 from linux_inet_diag.c:4:
/usr/local/include/ruby-3.3.0+0/ruby/io.h:395:55: warning: assignment to ‘int *’ from incompatible pointer type ‘struct rb_io *’ [-Wincompatible-pointer-types]
  395 | #define RB_IO_POINTER(obj,fp) rb_io_check_closed((fp) = RFILE(rb_io_taint_check(obj))->fptr)
      |                                                       ^
/usr/local/include/ruby-3.3.0+0/ruby/io.h:401:21: note: in expansion of macro ‘RB_IO_POINTER’
  401 | #define GetOpenFile RB_IO_POINTER
      |                     ^~~~~~~~~~~~~
my_fileno.h:30:2: note: in expansion of macro ‘GetOpenFile’
   30 |  GetOpenFile(io, fptr);
      |  ^~~~~~~~~~~
/usr/local/include/ruby-3.3.0+0/ruby/io.h:395:55: warning: passing argument 1 of ‘rb_io_check_closed’ from incompatible pointer type [-Wincompatible-pointer-types]
  395 | #define RB_IO_POINTER(obj,fp) rb_io_check_closed((fp) = RFILE(rb_io_taint_check(obj))->fptr)
      |                                                  ~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                                                       |
      |                                                       int *
/usr/local/include/ruby-3.3.0+0/ruby/io.h:401:21: note: in expansion of macro ‘RB_IO_POINTER’
  401 | #define GetOpenFile RB_IO_POINTER
      |                     ^~~~~~~~~~~~~
my_fileno.h:30:2: note: in expansion of macro ‘GetOpenFile’
   30 |  GetOpenFile(io, fptr);
      |  ^~~~~~~~~~~
/usr/local/include/ruby-3.3.0+0/ruby/io.h:638:34: note: expected ‘rb_io_t *’ {aka ‘struct rb_io *’} but argument is of type ‘int *’
  638 | void rb_io_check_closed(rb_io_t *fptr);
      |                         ~~~~~~~~~^~~~
In file included from linux_inet_diag.c:4:
my_fileno.h:17:41: error: request for member ‘f’ in something not a structure or union
   17 | #    define FPTR_TO_FD(fptr) fileno(fptr->f)
      |                                         ^~
my_fileno.h:31:7: note: in expansion of macro ‘FPTR_TO_FD’
   31 |  fd = FPTR_TO_FD(fptr);
      |       ^~~~~~~~~~
linux_inet_diag.c: At top level:
cc1: warning: unrecognized command line option ‘-Wno-self-assign’
cc1: warning: unrecognized command line option ‘-Wno-parentheses-equality’
cc1: warning: unrecognized command line option ‘-Wno-constant-logical-operand’
make: *** [Makefile:248: linux_inet_diag.o] Error 1

Aside from the warnings (which do seem bad) the hard error seem to come from:

#include <ruby.h>
#ifdef HAVE_RUBY_IO_H
#  include <ruby/io.h>
#else
#  include <stdio.h>
#  include <rubyio.h>
#endif

#if ! HAVE_RB_IO_T
#  define rb_io_t OpenFile
#endif

Updated by kamil-gwozdz (Kamil Gwóźdź) over 1 year ago

byroot (Jean Boussier) wrote in #note-17:

@ioquatix (Samuel Williams) I'm not sure which change exactly is the cause, but it appears that the recent rb_io_t changes broke raindrops

I've noticed the same thing while trying to install kgio with the latest ruby HEAD.

Show

Updated by k0kubun (Takashi Kokubun) over 1 year ago

I've noticed the same thing while trying to install kgio with the latest ruby HEAD.

It looks like @byroot (Jean Boussier) sent a patch for this at https://yhbt.net/kgio-public/B5843088-6EC1-4D1B-A45A-2699CA75DD7F@gmail.com/T/#u.

Both patches are still not released. Unless @normalperson (Eric Wong) cuts a timely release for kgio and raindrops, we will not be able to use Unicorn on Ruby 3.3 because of this Feature.

Updated by nobu (Nobuyoshi Nakada) over 1 year ago

k0kubun (Takashi Kokubun) wrote in #note-20:

It looks like @byroot (Jean Boussier) sent a patch for this at https://yhbt.net/kgio-public/B5843088-6EC1-4D1B-A45A-2699CA75DD7F@gmail.com/T/#u.

rb_convert_type(io, T_FILE, "IO", "to_io") can be rb_io_get_io(io).

Actions #22

Updated by naruse (Yui NARUSE) over 1 year ago

  • Target version set to 3.3

Updated by k0kubun (Takashi Kokubun) over 1 year ago

cool.io is another gem broken by this feature. FYI, I filed a pull request here https://github.com/tarcieri/cool.io/pull/79.

Updated by ioquatix (Samuel Williams) over 1 year ago

Status of several projects that are affected by this change:

Updated by naruse (Yui NARUSE) over 1 year ago

I'll release Ruby 3.3.0 preview 2 soon.
I'm concerning that those three projects don't support the preview yet.
A preview release is to allow people to test their applications, but without support of those foundational projects it's hard to achieve the goal.
If those projects are not fixed before the preview release, I'll revert changes related to Feature #19057 for preview 2.

Updated by ioquatix (Samuel Williams) over 1 year ago

@naruse (Yui NARUSE) Here is the compatibility fix that will allow kgio, unicorn and so on to compile with no changes: https://github.com/ruby/ruby/pull/8286

Please feel free to merge that before doing the preview 2 release, if @normalperson (Eric Wong) has not released updated gems.

However, bear in mind this delays the inevitable - I'd still like to propose we remove this completely in 3.4 - 3.3 has deprecations, and 3.4 is removed. Does that sound okay to you?

Updated by byroot (Jean Boussier) about 1 year ago

Eric just said he'll cut new unicorn and kgio releases: https://yhbt.net/kgio.git/dbf5290cf9f89174f6b35a597af9a4226633d79b/s/

Will push out a release once docs are updated to more strongly
discourage the use of kgio and unicorn.

Actions #28

Updated by hsbt (Hiroshi SHIBATA) 12 months ago

  • Target version changed from 3.3 to 3.4

Updated by ioquatix (Samuel Williams) 11 months ago

Latest update:

Actions #31

Updated by Anonymous 11 months ago

"ioquatix (Samuel Williams) via ruby-core" wrote:

Still working on writing release notes and docs.
sharing common struct fields example (more below)

However, we are not 100% confident this is safe according to
the C specification. My experience is not sufficiently wide to
say this is safe in practice, but it does look okay to both
myself, and @Eregon (Benoit Daloze) + @tenderlovemaking (Aaron Patterson) have both given some
kind of approval.

That being said, maybe it's not safe.

It's safe. Normal (unpacked) C structs have a well-defined and
stable ABI. Ruby has been relying on it with RVALUE union and
RBasic->flags for decades. FFI users all rely on this C ABI
stability for calling C code w/o a C compiler.

gcc and clang have a transparent_union attribute designed
to make migrations like this easy and painless for old code.

rb_io_descriptor also introduces extra function call overhead.


Updated by ioquatix (Samuel Williams) 11 months ago

Are you saying that in general:

struct A {
  int x;
  float y;
  char z;
  struct S s;
};

and

struct B {
  int x;
  float y;
};

That reinterpreting a pointer to A as a pointer to B, and accessing field B::y is never undefined? And that generalises? The only rule I knew was the first field, if a struct, is guaranteed to be at offset=0. Are you able to link me to the standard?

While I understand what you are saying may be true, I was not able to find confirmation of it in the standard.

Actions #33

Updated by Anonymous 11 months ago

"ioquatix (Samuel Williams) via ruby-core" wrote:

Are you saying that in general:

struct A {
  int x;
  float y;
  char z;
  struct S s;
};

and

struct B {
  int x;
  float y;
};

That reinterpreting a pointer to A as a pointer to B, and accessing
field B::y is never undefined? And that generalises? The only rule I
knew was the first field, if a struct, is guaranteed to be at offset=0.
Are you able to link me to the standard?

(oops, battery died last week and I forgot about Ruby entirely :x)

As long as it's the same HW arch, yes. AFAIK C itself doesn't say anything
about it, it's up to every ABI. https://refspecs.linuxfoundation.org/elf/
has a bunch of ABIs. abi386-4.pdf (via pdftotext) states:

Each member is assigned to the lowest available offset with the appropriate
alignment. This may require internal padding, depending on the previous
member.

While I understand what you are saying may be true, I was not able to
find confirmation of it in the standard.

It's a lot of arch-specific ABIs to read through and I haven't
read every one; but keep in mind FFI/Rust interop would not work
at all without a predictable C ABI. I do similar things via
Perl syscall wrapper and pack (same as Ruby Array#pack) on
various architectures the GCC Compiler Farm (cfarm) provides
with no problems.

You can always add STATIC_ASSERT on the offsets to ensure new
platforms are behaving. I don't expect Ruby will be able to run
at all if there's a hypothetical platform with unpredictable offsets.

All this only applies to C and normal C types (not C++)


Updated by Eregon (Benoit Daloze) 11 months ago

Agreed, from the hardware POV it has to store the struct in a reasonable/consistent manner in memory.
I would be more worried about the C compiler, if the C compiler figures out a pointer is casted to "unrelated" struct types it might consider that as undefined behavior and do anything.
But I suspect that's explicitly not undefined behavior, because C has no notion of struct types being related or not, and there is likely tons of software using structs "inheriting/extending" another smaller struct.
It might not explicitly be allowed either because I guess the fully correct way to do this is using a union of both structs, and not casting to a specific struct directly then.

Actions #35

Updated by Anonymous 10 months ago

"Eregon (Benoit Daloze) via ruby-core" wrote:

I would be more worried about the C compiler, if the C
compiler figures out a pointer is casted to "unrelated" struct
types it might consider that as undefined behavior and do
anything.

It's not undefined, every platform defines its own C ABI.
Any toolchain which ignores that ABI cannot interoperate at all
(which is fine for cases where you don't need interop)

Consider 1) how many languages/runtimes call C libraries.
Now consider 2) how many languages/libraries get called by C.

1 is common, 2 is rare and needs explicit instructions in
the non-C language (e.g. extern "C" in C++ headers).

gcc and clang generate objects from C which can safely be linked
by the others' linker. This isn't true for C++ at all.

FFI is completely dependent on a stable ABI.

But I suspect that's explicitly not undefined behavior,
because C has no notion of struct types being related or not,
and there is likely tons of software using structs
"inheriting/extending" another smaller struct.

It might not explicitly be allowed either because I guess the
fully correct way to do this is using a union of both structs,
and not casting to a specific struct directly then.

FFI and Rust would not be able to use C libraries if the
C toolchain were told to violate its platform ABI.

Consider "struct sockaddr *" use in C standard library:

Functions like accept, connect, bind, getpeername, getsockname,
etc. all take a "struct sockaddr *" arg, but callers are expected
to allocate one of sockaddr_{in,in6,un,storage,...} to call them.

No unions are forced on a user for those sockaddr functions.
(I end up defining unions myself in what little C code I maintain,
but that's my choice for maintainability).

Same for POSIX fcntl locks, various ioctls, getsockopt/setsockopt,
or any other function which takes multiple struct types.

You can use FFI or syscall+pack/unpack on all of those
functions.

To make things more explicit, gcc and clang has
transparent_union which is intended to help document and
preserve ABI compatibility:
https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Common-Type-Attributes.html#index-transparent_005funion-type-attribute


Updated by mame (Yusuke Endoh) 9 months ago

This change has broken our internal CI. Because our CI contributes to assure the quality of Ruby master, it is a shame that it will not work until the release of unicorn, which we do not know when (and whether) it will be released. I hope to postpone the change at least until unicorn etc. is released.

Updated by ioquatix (Samuel Williams) 9 months ago

@mame (Yusuke Endoh) are you able to point it at the git head?

e.g.

gem "unicorn", git: "https://yhbt.net/unicorn.git"
# or
gem "unicorn", git: "https://github.com/socketry/unicorn.git"

That should get things moving again.

I hope Eric will release an updated unicorn soon. However, if that doesn't happen, I see a couple of options:

  • We can release unicorn2 gem.
  • Companies with a vested interest could fork unicorn for internal use.
  • Companies could contact Eric and offer incentives for him to make a release.

Updated by byroot (Jean Boussier) 9 months ago

are you able to point it at the git head?

You can't just do that because Unicorn has a bunch of files that need to be generated and are not compiled. So you need a custom fork: https://github.com/k0kubun/unicorn/commit/6215d4cad7a964bf4b8bdef48edadf334eae73ee

Updated by ioquatix (Samuel Williams) 9 months ago

I don't use unicorn so thanks for the clarification. Maybe someone who does know, can write some documentation about the current state of affairs and how to use the current head. My trivial gem build and gem install seemed to work, but I didn't test it beyond that.

Updated by mame (Yusuke Endoh) 9 months ago

Why don't you reconsider the "nested public interface" approach?

From the reaction to this ticket, it is clear that forcing the "hide all the details" approach could destroy the Ruby ecosystem. And there is no need to force it because you have a more moderate alternative approach.

Updated by matz (Yukihiro Matsumoto) 9 months ago

I agree with @mame (Yusuke Endoh). This change would break too many tests, apps, etc. We cannot accept the change at the moment.
Can we be more conservative?

Matz.

Updated by ioquatix (Samuel Williams) 9 months ago

The simplest option right now is to revert this change and try again later.

@mame (Yusuke Endoh) are there any other gems apart from unicorn you are concerned about? In other words, if unicorn makes a release, then you don't have a problem with this change right?

@matz (Yukihiro Matsumoto) is this also your position? Is unicorn the only blocker that you are concerned about?

Updated by Eregon (Benoit Daloze) 9 months ago

From the reaction to this ticket, it is clear that forcing the "hide all the details" approach could destroy the Ruby ecosystem.

"destroy the Ruby ecosystem" seems an exaggeration if it's just unicorn not working, because there was no release in 2+ years.
Are we going to never remove an API because one gem seems not maintained anymore and relies on it?

It might even be useful if people notice unicorn is no longer maintained actively, e.g. if security issues are found there might not be a release fixing them either.
Note that I would be very happy to be proven wrong about unicorn being no longer maintained.

Updated by ioquatix (Samuel Williams) 9 months ago · Edited

As an alternative (optional) course of action, I've released the latest head of unicorn as unicorn-maintained gem. You can use this instead of unicorn and it will work on Ruby head.

https://github.com/unicorn-ruby/unicorn

Anyone who wants to help contribute/maintain it, I am happy to add you to the organisation. It also includes raindrops-maintained as this has not been released.

Updated by mame (Yusuke Endoh) 9 months ago

ioquatix (Samuel Williams) wrote in #note-43:

Here is the revert PR: https://github.com/ruby/ruby/pull/10283

Thanks!

Just FYI, I think the precedence in Ruby's decision is basically:

  • beauty of the language (conciseness and intuitiveness for users) >= compatibility > runtime performance and efficiency >>> beauty of the implementation (simplicity for the core developers).

Compared to the past, compatibility has become much more of a priority, and performance has become somewhat more important. The order is sometimes reversed in rare cases, but the beauty of the implementation has always been the lowest priority.

The "beauty of the implementation", such as hiding the rb_io_t implementation, can beat "compatibility" only in limited and special cases, e.g., only when there is no other way at all, or when the practical impact is very small (no affected gem is identified, or it is only an issue with minor gems). In this case, unicorn, which is undeniably an important gem and still part of the Ruby ecosystem, is affected and there are known implementation workarounds. In that case, there is no reason not to choose the workaround.

Updated by Eregon (Benoit Daloze) 9 months ago

Re beauty of the implementation this change is more than that, we can see in https://github.com/ruby/ruby/pull/10283/files that rb_io_t exposes way too much internals, which in turns prevents changing/evolving these internals without affecting extensions, can easily break ABI unintentionally, can have negative performance impact, etc.
It's also rather brittle with the internal duplicate definition of rb_io_t, and could easily cause segfaults if the structs don't match (e.g. when working on this code and then wasting a bunch of time due to that, or maybe subtle enough that the CI would not catch and it would only be found later).

I hope we can still land this change, if it needs to wait 3.5 or so then that sounds a good compromise.

Updated by ioquatix (Samuel Williams) 9 months ago · Edited

At @mame's request, I've reverted this change. The reversion itself was straightforward, yet it does not mitigate the underlying concerns that have come to light through this work. My attempt to enhance Ruby's IO implementation has highlighted a vulnerability within our ecosystem: the infrequency of updates and the ambiguity surrounding future releases from the maintainers of essential gems, such as unicorn.

This places the Ruby community in a precarious position, not just in terms of adopting improvements but more importantly, in our ability to respond to security vulnerabilities. The reliance on these gems by a significant segment of our community increases the potential impact of any unresolved issues.

@mame (Yusuke Endoh) if you truly believe a malfunctioning unicorn can "destroy the Ruby ecosystem", then I fear we have bigger problems that need to be discussed - it would only take a security issue or two to put us all companies that depend on unicorn in a very difficult position. Because I care about the future of Ruby, I made the decision to make a community fork of unicorn as I don't see any other realistic option. I wish there was another way.

Updated by k0kubun (Takashi Kokubun) 9 months ago · Edited

I don't see any other realistic option. I wish there was another way.

FWIW, @byroot (Jean Boussier) built a fork https://github.com/shopify/pitchfork, which has a reforking feature Unicorn doesn't have, and Shopify uses that for our largest application in production.

The reliance on these gems by a significant segment of our community increases the potential impact of any unresolved issues.

I'd say it's unrealistic to expect all existing users to switch to something else before the Ruby 3.4 release. For example, Sprockets v3 is much more popular than v4 (source) while it was released 6 years ago and is way past its EOL. Since that version depends on the legacy interface of ERB (which I maintain), the gem's legacy interface (deprecated 6 years ago) is very much stuck, like rb_io_t. But I'd rather not break existing Rails applications unless it's absolutely necessary. I don't think it's more important to improve/hide implementation details than to keep existing applications running.

Since the author himself strongly discourages the use of Unicorn, users should, however, eventually stop using Unicorn. For those who are looking for Unicorn alternatives, you may use Pitchfork, or Puma with 1 thread per process (for Unicorn compatibility).

Actions #50

Updated by Anonymous 8 months ago

ioquatix (Samuel Williams) wrote:

  • Companies could contact Eric and offer incentives for him to make a release.

That's not possible, https://yhbt.net/unicorn/ISSUES states:

The author of unicorn must never be allowed to profit off the damage it's
done to the entire Ruby world.

I'm 100% banned for life from ever profitting off anything related to unicorn.

"mame (Yusuke Endoh) via ruby-core" wrote:

Why don't you reconsider the "nested public interface" approach?

Samuel: please do this. Ruby even has (Linux||ccan) `container_of'
macro as another option:

struct rb_io_private {
	struct rb_io { // public ABI
		int fd;
		// any other public fields in used in real-world
	} io_pub;
	// private stuff here
	// private fields can go above `io_pub', too
};

Then only expose the `io_pub' field to public structs and access
rb_io_private via ccan_container_of.

But the previously discussed ways are valid C since every known
platform has a stable ABI (otherwise FFI would never work)

I expect there are other gems and private extensions affected by
this C API change (if they survived the 1.8 -> 1.9 changes).

From the reaction to this ticket, it is clear that forcing the
"hide all the details" approach could destroy the Ruby
ecosystem. And there is no need to force it because you have a
more moderate alternative approach.

Too bad that's already happened over the decades I've been
around Ruby. Ruby lost numerous users due to a neverending
stream of incompatibilities introduced every year. The only way
I can maintain the legacy Ruby code I still have is to rewrite
tests in a different language (e.g. Perl or POSIX sh (NOT bash))

I'm completely burned out with having to constantly deal with a
never ending stream of incompatibilities over the past ~20
years. This mentality has propagated to the entire ecosystem;
e.g. Rack::Chunked was deprecated and my proposed patches sent
to to maintain compatibility were
completely ignored in Sep 2022.

frozen_string_literal will be another major pain point, and
the nagging from chilled strings won't do much to make things
better (I thought that was decided against a decade ago).

Finally, MFA on Rubygems is a misguided corporate attempt at
security. I'm an amateur volunteer refuse to be held
responsible for the security of multi-billion dollar
corporations. I've never claimed any professional or academic
qualifications. Nobody knows me, nobody ever will; I only show
you code and that's all anybody should need for security.

I'll probably end up self-hosting my own gems and only put
future releases on a self-hosted server. Of course, I claim no
qualifications in security or systems administration.

Users are welcome to fork (and pitchfork exists) if they'd
rather live under the boot of corporate rule and Terms of
Service that can change at any time. I'm not going to put
myself in a position where I can't contribute to code I still
depend on. I'm already effectively banned from 99.9% of
projects due to draconian corporate terms of service and
high HW requirements.


Updated by ioquatix (Samuel Williams) 8 months ago · Edited

Why don't you reconsider the "nested public interface" approach?

My assessment of this approach is that it would require a rewrite of internal code that accesses rb_io_t private outer struct. Rewriting code isn't a problem, but it takes a lot of time and effort that I don't have in abundance right now. Additionally, it's still only a partial solution. One of the issues is accessing the file descriptor directly and the handling of IO#close from a different thread/fiber. Using the file descriptor directly can be problematic.

Ruby lost numerous users due to a never-ending stream of incompatibilities introduced every year.

Compatibility and progress are always something to balance out. I think on the whole, Ruby does a pretty decent job at it. Lack of forward progress also causes users to look elsewhere as the language stagnates. So this isn't just a matter of "preserve compatibility at all costs" as the outcome can be equally bad. The hard part is finding the middle ground of compatibility and progress. Both compatibility at any cost, and progress at any cost, are naive statements.

Updated by mame (Yusuke Endoh) 8 months ago

Eric, thanks for your comment. I understand your concern about Ruby's incompatibility.

Sometimes it is inevitable to introduce incompatibilities to improve a language specification (you may not agree with me here), but I believe that we shoud make reasonable efforts to avoid other incompatibilities.

I am not familiar with the situation of Rack, but I have heard a rumor that the incompatibilities in Rack 3 are so large that the migration has not progressed. That's unfortunate.
Also, frozen_string_literal change was unfortunately approved, which I personally still believe that is the wrong direction. It is really a shame to have an inconvenient and inconsistent language specification for a performance reason.

I am also sorry about RubyGems MFA. I think the policy like Eric's is not very uncommon in the OSS world. I can do nothing because I am not involved in the RubyGems operation, but I hope the RubyGems team will give some consideration to a policy like yours.

Updated by byroot (Jean Boussier) 8 months ago

the nagging from chilled strings

Eric, I understand that it won't make your annoyance go away, but just in case you didn't know, if you wish to get rid of these warnings easily, you can add # frozen_string_literal: false at the top of all Unicorn's source files, like so:

https://github.com/unicorn-ruby/unicorn/commit/23da1fc46a18ed0330081b55892b34476e7910e5.patch

Actions #54

Updated by Anonymous 8 months ago

"ioquatix (Samuel Williams) via ruby-core" wrote:

Issue #19057 has been updated by ioquatix (Samuel Williams).

Why don't you reconsider the "nested public interface" approach?

My assessment of this approach is that it would require a rewrite of all internal code that accesses rb_io_t internals. Rewriting code isn't a problem, but it takes a lot of time and effort that I don't have in abundance right now. Additionally, it's still only a partial solution. One of the issues is accessing the file descriptor directly and the handling of IO#close from a different thread/fiber. Using the file descriptor directly can be problematic.

Again, why not the container_of solution I proposed?

It shouldn't be that much for you since the relevant core
internals could be divorced from extensions and significantly
less work for stressed out and overworked downstream maintainers.

Cross-thread IO#close or close(2) is a PITA to deal with
portably if another thread is using that FD in any way.
This applies to pure C projects, too, not just Ruby ones.
Either:

  1. ensure only one thread operates on a given FD and closes it;
    this is typical for stream FDs.

  2. get all threads to abandon the FD before closing it.
    I do this for thread-shared packetized FDs which are
    analogous to Queue (O_DIRECT pipe, SOCK_SEQPACKET,
    listen sockets, kqueue/epoll FDs, etc...).

Atomics + out-of-band (pthread_kill) signaling work for both
cases. I've also used sentinel values in-band for packetized
FDs (analogous to pushing `:stop' messages into a thread-shared
Queue, like forcing a connect() in one thread to get blocking
accept() to wake up; or adding an (eventfd||pipe) into a
(kqueue||epoll) from another thread (it's 100% safe to
EPOLL_CTL_ADD||EV_ADD from different threads and I've been
abusing this safety for well over a decade)

Ruby lost numerous users due to a never-ending stream of incompatibilities introduced every year.

Compatibility and progress are always something to balance out. I think on the whole, Ruby does a pretty decent job at it. Lack of forward progress also causes users to look elsewhere as the language stagnates. So this isn't just. a matter of "preserve compatibility at all costs", as those costs will be the same. The hard part is finding the middle ground of compatibility and progress. Both compatibility at any cost, and progress at any cost, are naive statements.

git and Linux kernel do a pretty good job at maintaining
compatibility while adding new features. OTOH some of the
incompatibility proposals for Perl 5 have made me look into
writing more C...


Updated by nobu (Nobuyoshi Nakada) 3 months ago

I found rb_io_open_descriptor is a bad design, it can results in FD-leaks, because a file descriptor isn't a Ruby object with finalizer.
It should be "allocate then open & set" instead of "open then allocate & initialize".

Updated by nobu (Nobuyoshi Nakada) 3 months ago

OK, unless FMODE_EXTERNAL is given descriptor will be closed when a new object could not be allocated.
I thought about pty extension, and it seems to work at least for pty.

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like1Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like1Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like1Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like1Like0Like0Like0Like0Like0Like0Like0