Project

General

Profile

Misc #16803

Discussion: those internal macros reside in public API headers

Added by shyouhei (Shyouhei Urabe) 2 months ago. Updated about 2 months ago.

Status:
Open
Priority:
Normal
Assignee:
-
[ruby-core:97992]

Description

A while ago I merged https://github.com/ruby/ruby/pull/2991 ("Split ruby.h"). This seems working. But the changeset raised several questions.

The biggest one relates to those newly publicised macros and inline functions. For instance RUBY3_STATIC_ASSERT is a macro that expands to either _Static_assert (for C) or static_assert (for C++). A similar mechanism has been implemented inside of our repository for a while. The pull request moved the definition around to be visible outside.

Discussion #1

Is it a good idea or a bad idea, to make them visible worldwide?

Discussion #2

Why not publicise everything? For instance debuggers could benefit from ruby internal symbols.

Discussion #3

It is relatively hard for us to change public APIs (doing so could break 3rd party gems). We don't want that happen for internal APIs. How do we achieve future flexibility?

Updated by Eregon (Benoit Daloze) 2 months ago

I think as much as feasible macros accessing internals should not be in public headers.
OTOH, I think macros purely for portability between platforms, compilers, C/C++ compatibility are fine.

RUBY3_STATIC_ASSERT seems harmless (just a help for a messy C/C++ language and portability) but anything poking at the internals of, e.g., RString is harmful.
Harmful because it prevents further optimizations,
it increases maintenance cost as it become time-consuming to remove/change,
it increases the cost for alternative implementations to support it (e.g., TruffleRuby),
and it increase the chance for 3rd-party extensions misusing it (= more bugs).

As we've seen with the so-called "intern.h" header it's now considered public API, so 3rd-party extensions seem to take for granted that anything in public headers is public API.

Could you give some pointers or example of other macros introduced in that change?
Are they mostly about portability, or also about accessing internals?

Updated by shyouhei (Shyouhei Urabe) 2 months ago

Eregon (Benoit Daloze) wrote in #note-1:

Could you give some pointers or example of other macros introduced in that change?
Are they mostly about portability, or also about accessing internals?

Almost every new macros are for portability. For instance RUBY3_HAS_BUILTIN is another macro that got publicised this time. Its former name was __has_builtin, which of course mimics clang behaviour for everybody else. There are also newly introduced inline functions. They are refactored bits of former macros, hence they touch internals of e.g. RString.

If you want the complete list, new macros are:

git grep '^# *define RUBY3.*[^H]$'

and new inline functions are:

git grep '^ruby3_.*)$'

Also, I left this paragraph to every new header files introduced:

Symbols prefixed with either RUBY3 or ruby3 are
implementation details. Don't take them as canon. They could
rapidly appear then vanish. The name (path) of this header file
is also an implementation detail. Do not expect it to persist
at the place it is now. Developers are free to move it anywhere
anytime at will.

I believe I made it clear that 3rd parties shall never use those names. No enforcement though.

Updated by Eregon (Benoit Daloze) 2 months ago

shyouhei (Shyouhei Urabe) wrote in #note-2:

If you want the complete list, new macros are:

git grep '^# *define RUBY3.*[^H]$'

I took a look, these look fine to me, they seem almost all about portability, except RUBY3_ANYARGS_DISPATCH*.

I think most of those could even be public portability helper macros (with a different prefix then, like RB_), and they wouldn't be hard to support long term as MRI needs them anyway.

and new inline functions are:

git grep '^ruby3_.*)$'

Few of these, but the naming seems like they are more likely to be used by native extensions.
For example, intuitively ruby3_str_new_cstr() sounds like a "the Ruby 3 way/the new C-API to create a Ruby String from a C String".
But it's not, it's just an internal function and implementation detail.

I think the only issue here is that ruby3_/RUBY3_ is fairly unclear, unless people read that comment.
But I can easily imagine they might miss that comment due to looking at the middle of the file, or just ignore it.

I think we should consider other prefixes which might be clearer.
ruby3_/RUBY3_ sounds too much like "the new way to do things/the new C-API" when it's not.

Maybe ruby_internal_/RUBY_INTERNAL_?
That's quite long, but those macros/functions are already long, and maybe a longer name is intuitively clearer it shouldn't be used in native extensions (plus, you know, there is "internal" in it).

Any thought on using ruby_internal_/RUBY_INTERNAL_?

Updated by shyouhei (Shyouhei Urabe) 2 months ago

Thank you for looking at the new macros/functions.

Eregon (Benoit Daloze) wrote in #note-3:

I took a look, these look fine to me, they seem almost all about portability, except RUBY3_ANYARGS_DISPATCH*.

Yes. ANYARGS related ones were explicitly converted to have RUBY3_* prefix, to make them something non-standard. I don't think we want people to use them directly.

I think most of those could even be public portability helper macros (with a different prefix then, like RB_), and they wouldn't be hard to support long term as MRI needs them anyway.

Or maybe they could be separated into different header-only library, like ccan. I guess they are ruby-agnostic.

For example, intuitively ruby3_str_new_cstr() sounds like a "the Ruby 3 way/the new C-API to create a Ruby String from a C String".
But it's not, it's just an internal function and implementation detail.

I think the only issue here is that ruby3_/RUBY3_ is fairly unclear, unless people read that comment.
But I can easily imagine they might miss that comment due to looking at the middle of the file, or just ignore it.

I think we should consider other prefixes which might be clearer.
ruby3_/RUBY3_ sounds too much like "the new way to do things/the new C-API" when it's not.

Maybe ruby_internal_/RUBY_INTERNAL_?
That's quite long, but those macros/functions are already long, and maybe a longer name is intuitively clearer it shouldn't be used in native extensions (plus, you know, there is "internal" in it).

Any thought on using ruby_internal_/RUBY_INTERNAL_?

I chose RUBY3/ruby3 only to avoid jamming my fingers by typing some long names. I feel for instance ruby_internal_right_shift_is_arithmetic_p() is a bit too long. So while I don't stick at 3, I would prefer something shorter than _internal.

Looking at other projects, Boost for instance uses detail for this purpose. It seems there are other projects using impl. What about those names, like ruby_impl or even, rbimpl?

Updated by Eregon (Benoit Daloze) 2 months ago

shyouhei (Shyouhei Urabe) wrote in #note-4:

What about those names, like ruby_impl or even, rbimpl?

I think those are already much clearer than ruby3_.
ruby_internal_ might be even clearer but indeed it's long.

I think rbimpl_ is a good compromise.
Would it be OK to change the prefix to that?
I think it's a great improvement over the ruby3_ prefix.

To help comparing visually:

static inline long
rb_fix2long(VALUE x)
{
    if (ruby3_right_shift_is_arithmetic_p()) {
        return ruby3_fix2long_by_shift(x);
    }

    if (ruby_impl_right_shift_is_arithmetic_p()) {
        return ruby_impl_fix2long_by_shift(x);
    }

    if (rbimpl_right_shift_is_arithmetic_p()) {
        return rbimpl_fix2long_by_shift(x);
    }

    if (ruby_internal_right_shift_is_arithmetic_p()) {
        return ruby_internal_fix2long_by_shift(x);
    }

Updated by ioquatix (Samuel Williams) 2 months ago

If it was me, I'd literally call it ruby/internal/file.h to make it very clear, along with ruby3 -> rb_internal_/RB_INTERNAL_

Assuming impl means implementation, it's not clear that it's private, but internal does mean a certain kind of private.

Updated by shyouhei (Shyouhei Urabe) 2 months ago

ioquatix (Samuel Williams) wrote in #note-7:

If it was me, I'd literally call it ruby/internal/file.h to make it very clear, along with ruby3 -> rb_internal_/RB_INTERNAL_

Assuming impl means implementation, it's not clear that it's private, but internal does mean a certain kind of private.

That arrangement could be confusing when nobu (Nobuyoshi Nakada) merges his https://github.com/nobu/ruby/tree/feature/src-dir branch. You should persuade him first.

Updated by Eregon (Benoit Daloze) 2 months ago

shyouhei (Shyouhei Urabe) wrote in #note-6:

https://github.com/ruby/ruby/pull/3079

Looks a lot clearer to me :)

I think indeed having ruby/internal/attr/const.h instead of ruby/impl/attr/const.h would be even clearer,
and for the directory name I expect the length to type doesn't matter much.

Regarding nobu (Nobuyoshi Nakada)'s branch, I see it moves internal/vm.hinclude/internal/vm.h and so we'd have include/internal/ and include/ruby/internal/.
Maybe slightly confusing but I also think it makes sense: both are internal, and one of them is make install-ed and the other not, because include/ruby/internal/ needs to be public headers to compile correctly.

Updated by ioquatix (Samuel Williams) about 2 months ago

also think it makes sense: both are internal, and one of them of make install-ed and the other not, because include/ruby/internal/ needs to be public headers to compile correctly.

Agreed. I think because nobu (Nobuyoshi Nakada) was going to use internal we should also use internal here to be consistent.

Updated by shyouhei (Shyouhei Urabe) about 2 months ago

OK then, let's move the directory from impl to internal. Will update the pull request.

Updated by ko1 (Koichi Sasada) about 2 months ago

Please respect current convention rb_/RB_ are called in ruby interpreter (such as rb_str_new()) and ruby_/RUBY_ can be called from outside of ruby interpreter (such as ruby_xmalloc()). RUBY3_... are already renamed, but I want to note it here.

PS. I like rb_internal_ because it is clearer. rb_internal_ functions are not exposed, so I don't think the length is important.

Updated by Eregon (Benoit Daloze) about 2 months ago

ko1 (Koichi Sasada) wrote in #note-12:

Please respect current convention rb_/RB_ are called in ruby interpreter (such as rb_str_new()) and ruby_/RUBY_ can be called from outside of ruby interpreter (such as ruby_xmalloc()). RUBY3_... are already renamed, but I want to note it here.

I think very few people know this, is this documented somewhere? It would be nice to document it e.g., in ruby.h or ruby/ruby.h.

PS. I like rb_internal_ because it is clearer. rb_internal_ functions are not exposed, so I don't think the length is important.

rb_internal_ looks nice, but it also begins with rb_ which gives a "public API" feeling. So rbimpl_ seems better visually as "don't use (externally)".

Also available in: Atom PDF