Project

General

Profile

Actions

Feature #19521

closed

Support for `Module#name=` and `Class#name=`.

Added by ioquatix (Samuel Williams) almost 2 years ago. Updated over 1 year ago.

Status:
Closed
Assignee:
-
Target version:
-
[ruby-core:112773]

Description

See https://bugs.ruby-lang.org/issues/19450 for previous discussion and motivation.

This proposal introduces Module#name= (and thus also Class#name=) to set the temporary class name. The name assignment has no effect if the module/class already has a permanent name.

c = Class.new do
  self.name = "fake name"
end

c = Class.new
c.name = "fake name"

Alternatively, we could use set_name:

Class.new do
  set_name "fake_name"
end

Setting the name of a class changes its current name, irrespective of whether it's been assigned a permanent name, or has nested modules/classes which have cached a previous name. We might like to limit the cases where a name is set, e.g. only once, only if the name is nil, or only if it's not already permanent. There is no real harm in any of those options, just inconsistency.

Example usage

The current Ruby test suite has code which shows the usefulness of this new method:

  def labeled_module(name, &block)
    Module.new do
      singleton_class.class_eval {
        define_method(:to_s) {name}
        alias inspect to_s
        alias name to_s
      }
      class_eval(&block) if block
    end
  end
  module_function :labeled_module

  def labeled_class(name, superclass = Object, &block)
    Class.new(superclass) do
      singleton_class.class_eval {
        define_method(:to_s) {name}
        alias inspect to_s
        alias name to_s
      }
      class_eval(&block) if block
    end
  end
  module_function :labeled_class

The updated code would look like this:

  def labeled_module(name, &block)
    Module.new do
      self.name = name
      class_eval(&block) if block
    end
  end

  def labeled_class(name, superclass = Object, &block)
    Class.new(superclass) do
      self.name = name
      class_eval(&block) if block
    end
  end
  module_function :labeled_class

Because the name cannot be set as part of .new, we have to have a separate block to set the name, before calling class_eval. I think the ergonomics and performance of this are slightly worse than the counter proposal.


Related issues 3 (0 open3 closed)

Related to Ruby master - Feature #19520: Support for `Module.new(name)` and `Class.new(superclass, name)`.RejectedActions
Related to Ruby master - Feature #19450: Is there an official way to set a class name without setting a constant?ClosedActions
Related to Ruby master - Bug #20892: `ObjectSpace.dump` can produce broken JSON for classes with temporary namesClosedActions
Actions #1

Updated by ioquatix (Samuel Williams) almost 2 years ago

  • Description updated (diff)
Actions #2

Updated by ioquatix (Samuel Williams) almost 2 years ago

  • Description updated (diff)
Actions #3

Updated by ioquatix (Samuel Williams) almost 2 years ago

  • Description updated (diff)
Actions #4

Updated by ioquatix (Samuel Williams) almost 2 years ago

  • Description updated (diff)
Actions #5

Updated by Eregon (Benoit Daloze) almost 2 years ago

  • Related to Feature #19520: Support for `Module.new(name)` and `Class.new(superclass, name)`. added
Actions #6

Updated by Eregon (Benoit Daloze) almost 2 years ago

  • Related to Feature #19450: Is there an official way to set a class name without setting a constant? added

Updated by Eregon (Benoit Daloze) almost 2 years ago

At least it should be forbidden to change the name of a Module multiple times.
So from that POV that, #19520 is better.

Also the name of a Module get cached into other modules "namespaced" under that first Module.
So it seems highly problematic to allow changing the name of a Module.

In general I'm against this functionality as explained in https://bugs.ruby-lang.org/issues/19450#note-14.
It can lead to confusion and lies about the program state, e.g., pretend there is Foo::Bar when there isn't, even when remove_const/const_set are not used.

Updated by ioquatix (Samuel Williams) almost 2 years ago

At least it should be forbidden to change the name of a Module multiple times.
So from that POV that, #19520 is better.

We can make assignment fail if the name is not nil.

Also the name of a Module get cached into other modules "namespaced" under that first Module.
So it seems highly problematic to allow changing the name of a Module.

Yes, that's correct, the temporary name gets cached into nested modules,

c = Class.new
c.class_eval("class Nested; end")

c::Nested
=> #<Class:0x00007f7f11ae6aa0>::Nested

c.class_eval("class Nested2; end")

c::Nested
=> #<Class:0x00007f7f11ae6aa0>::Nested
c::Nested2
=> fake::Nested2

I agree, #19520 is better in this regard.

It can lead to confusion and lies about the program state, e.g., pretend there is Foo::Bar when there isn't, even when remove_const/const_set are not used.

As discussed, this is already the case, and can be easily replicated:

Foo = foo = Class.new
=> Foo

Foo.class_eval("class Bar; end")
Foo::Bar
=> Foo::Bar

Object.send(:remove_const, "Foo")
=> Foo

Foo = b = Class.new
=> Foo

a::Bar.name
=> "Foo::Bar"

Foo::Bar
uninitialized constant Foo::Bar (NameError)

I agree that the proposal makes it easier to fake class names, but that functionality is already possible and thus I don't see that as an issue being introduced by this proposal.

Updated by Eregon (Benoit Daloze) almost 2 years ago

ioquatix (Samuel Williams) wrote in #note-8:

It can lead to confusion and lies about the program state, e.g., pretend there is Foo::Bar when there isn't, even when remove_const/const_set are not used.

As discussed, this is already the case, and can be easily replicated:

I just said: "even when remove_const/const_set are not used".
So sure it's possible with remove_const with a made-up example. Is it done in practice? I don't think so.

I agree that the proposal makes it easier to fake class names, but that functionality is already possible and thus I don't see that as an issue being introduced by this proposal.

It makes it extremely easy to have fake names, in fact I would expect with this feature many people would do the mistake to use a constant-like name and so break the symmetry between module names and constant paths.
In the current situation, it's extremely rare remove_const is used and you would still have access to that module, e.g. I know of no gem doing that.

Updated by ioquatix (Samuel Williams) almost 2 years ago

So sure it's possible with remove_const with a made-up example. Is it done in practice? I don't think so.

You can't argue that the existing implementation is sound if it's not, and you can't argue that this proposal breaks soundness if it's already broken. The argument doesn't stand up to scrutiny.

I can certainly understand the appeal of eval(class.name) == class always being true. However, it's trivial to demonstrate that it's not the case, even without this proposal. If you think that's important, then we should consider how to fix it, because it's an entirely separate issue from this, even if they are tangentially related.

Updated by matz (Yukihiro Matsumoto) over 1 year ago

I agree with adding an ability to name classes/modules. But I am against the method name name=. It's too short and handy (and tempting to abuse).
If we could have a better name, I accept this proposal.

Matz.

Updated by ioquatix (Samuel Williams) over 1 year ago

Thanks Matz, are you happy with #set_temporary_name which is how it's referred to internally (and in contrast to assigning a permanent name when you assign to an actual constant).

An alternative would be #set_anonymous_name but I think it's a bit confusing (what is an anonymous name, if something is named, it's no longer anonymous).

Updated by matz (Yukihiro Matsumoto) over 1 year ago

Unfortunately, I am not 100% satisfied for both names. For set_temporary_name, the name implies limited scope or time of name availability, but it's not.
For set_anonymous_name, the name assumes the target is an anonymous (nameless) class or module, but it's not.
If I have to choose one from above two, I'd take set_temporary_name, but I still want to seek a better name.

Matz.

Updated by ioquatix (Samuel Williams) over 1 year ago

For more context, CRuby itself refers to permanent and temporary names, and anonymous components/names.

Here: https://github.com/ruby/ruby/blob/d6bddcb0137d5a640eb22fbd17f9aa83f71fbd48/variable.c#L111

/**
 * Returns +classpath+ of _klass_, if it is named, or +nil+ for
 * anonymous +class+/+module+. A named +classpath+ may contain
 * an anonymous component, but the last component is guaranteed
 * to not be anonymous. <code>*permanent</code> is set to 1
 * if +classpath+ has no anonymous components. There is no builtin
 * Ruby level APIs that can change a permanent +classpath+.
 */
static VALUE
classname(VALUE klass, bool *permanent) {...}

and here: https://github.com/ruby/ruby/blob/d6bddcb0137d5a640eb22fbd17f9aa83f71fbd48/internal/class.h#L71

struct rb_classext_struct {
    // ...
    bool permanent_classpath : 1;
    // ...
    VALUE classpath;
};

It's also referred to as a temporary path here: https://github.com/ruby/ruby/blob/d6bddcb0137d5a640eb22fbd17f9aa83f71fbd48/variable.c#L138

static VALUE
make_temporary_path(VALUE obj, VALUE klass)
{
    VALUE path;
    switch (klass) {
      case Qnil:
        path = rb_sprintf("#<Class:%p>", (void*)obj);
        break;
      case Qfalse:
        path = rb_sprintf("#<Module:%p>", (void*)obj);
        break;
      default:
        path = rb_sprintf("#<%"PRIsVALUE":%p>", klass, (void*)obj);
        break;
    }
    OBJ_FREEZE(path);
    return path;
}

I don't think there is a better name if you want to be consistent with the existing implementation and documentation.

I will wait for a week to see if anyone has any further feedback, otherwise I'll use the name set_temporary_name. Does that seem reasonable?

Updated by Dan0042 (Daniel DeLorme) over 1 year ago

matz (Yukihiro Matsumoto) wrote in #note-11:

I agree with adding an ability to name classes/modules. But I am against the method name name=. It's too short and handy (and tempting to abuse).

What if name= was a no-op or raised an error if the class already has a permanent name? That way it would be hard to abuse.

Updated by ufuk (Ufuk Kayserilioglu) over 1 year ago

If we are adding a setter method, do we really need the set_ prefix? Moreover, does assigning a "temporary name" change the result of calling #name?

IMO, we should add #temporary_name= and #temporary_name methods, and a non-nil value for "temporary name" should only change the behaviour of #to_s and #inspect, not #name.

Updated by Dan0042 (Daniel DeLorme) over 1 year ago

Note that #name already returns the temporary name

m1 = Module.new
m1::NAMED = m2 = Module.new
m2.name #=> "#<Module:0x00005606bc180560>::NAMED"

So if you're talking about changing that current behavior, it may be ok but I'm not sure.

Updated by ioquatix (Samuel Williams) over 1 year ago

@matz (Yukihiro Matsumoto) do you mind clarifying what kind of abuse you are concerned about and whether limitations on Module#name= as outlined above are sufficient to consider using #name=?

What if name= was a no-op or raised an error if the class already has a permanent name? That way it would be hard to abuse.

Updated by timcraft (Tim Craft) over 1 year ago

What about Module#label= and Class#label= etc?

  • If the value e.g. "fake name" cannot be used to reference the class then is that really a name?
  • The naming of the #labeled_module and #labeled_class methods posted in the description suggest label.
  • A string that contains descriptive text for human / display purposes and isn't meaningful otherwise... in user interfaces that's commonly called a label. (A similar alternative would be #display_name but that feels clunkier than #label.)
  • With the implementation @ufuk suggests (#to_s and #inspect changing behaviour if the attribute is not nil), that would seem to satisfy the use case of nicer debugging output, without needing to affect #name.

Updated by ioquatix (Samuel Williams) over 1 year ago

@timcraft thanks for your input. All those issues have already been discussed, and it won't work for nested classes or existing code that uses class names without a huge retrofit. I also don't like introducing label as a new concept since it's entirely unrelated to anything that already exists in the documentation and semantically doesn't really align up with what is known internally as a temporary class/module name.

Updated by ioquatix (Samuel Williams) over 1 year ago

Based on the discussion with Matz, here is the updated proposal:

c = Class.new
# => #<Class:0x00000001031b99d8>
c.set_temporary_name("Hello")
# => Hello

C = c
# => C
C.set_temporary_name("Hello")
# `set_temporary_name': can't change permanent name (RuntimeError)

After the class is assigned a permanent name, the temporary name can no longer be set.

Updated by ioquatix (Samuel Williams) over 1 year ago

I've updated the PR with documentation and tests: https://github.com/ruby/ruby/pull/7483

Updated by matz (Yukihiro Matsumoto) over 1 year ago

Accepted. Thank you.

Matz.

Updated by Eregon (Benoit Daloze) over 1 year ago

@matz (Yukihiro Matsumoto) Are you really OK with things like:

c = Class.new
c.set_temporary_name "String"
c.new.foo # => undefined method 'foo' for #<String:0x00007efc38711fc0> (NoMethodError)

c = Class.new
c.set_temporary_name "Object"
c.new.foo # => undefined method 'foo' for #<Object:0x00007efc38711fc0> (NoMethodError)

The potential to use it wrong seems extremely high to me.
I think a simple way to avoid this confusion is to raise an exception if the name given to set_temporary_name starts with an upper-case letter.

Updated by ioquatix (Samuel Williams) over 1 year ago

@Eregon (Benoit Daloze) I understand your point, but why would anyone do that? Isn't that kind of shooting yourself in the foot?

And in any case, your example is already trivially possible without this PR/change:

c = Class.new
String = c
c.new.foo # => undefined method `foo' for #<String:0x00000001308896c8> (NoMethodError)

Updated by byroot (Jean Boussier) over 1 year ago

Aside from the feature itself, I must say I'm very surprised by the set_ prefix of that method, it's quite rare in Ruby.

Updated by ioquatix (Samuel Williams) over 1 year ago

My understanding is, because it's not strictly an assignment, it shouldn't look like that of the user. Also, it's more convenient to use:

my_class = Class.new do
  set_temporary_name "my_class"
end

Using name= or temporary_name= would always be prefixed by self as you know it would otherwise introduce a local variable. So, I think all things considered, it's okay.

Updated by Eregon (Benoit Daloze) over 1 year ago

ioquatix (Samuel Williams) wrote in #note-25:

@Eregon (Benoit Daloze) I understand your point, but why would anyone do that? Isn't that kind of shooting yourself in the foot?

Well, even the docs and specs you added in that PR do that, it's inviting people to do it without telling them the big problem or protecting against it.

And in any case, your example is already trivially possible without this PR/change:

Nope, that is fine (besides of course the bad idea to override a core constant, but at least you get a warning for that),
because the class of that object can in fact be referenced by String or at least it has been related to the constant path String at some point.

Class.new.tap { _1.set_temporary_name "Foo" } has never been related to the constant Foo. It's a lie.

And if you look at usages of labeled_module/labeled_class e.g. with git grep labeled_ test you can see for almost all of them it would be confusing to use set_temporary_name.
First they should use a non-valid constant name to make it clear e.g., what is currently labeled as klass = EnvUtil.labeled_class("Parent") cannot be accessed through Parent (otherwise good luck to write tests or any Ruby if you regularly cannot trust the output of a NoMethodError due to this new feature/bug).

Also a problem for things as basic as p obj while debugging, the class name it shows could be a lie if there is no restriction here, likely wasting huge amounts of time due to confusion.

Updated by Dan0042 (Daniel DeLorme) over 1 year ago

@ioquatix (Samuel Williams), how would you feel about raising an error if the temporary name is a valid constant name?

Updated by ufuk (Ufuk Kayserilioglu) over 1 year ago

ioquatix (Samuel Williams) wrote in #note-25:

And in any case, your example is already trivially possible without this PR/change:

c = Class.new
String = c
c.new.foo # => undefined method `foo' for #<String:0x00000001308896c8> (NoMethodError)

There is a big difference with what's possible today and what would be possible after this change:

Curently:

c = Class.new
String = c
# both of the following show basically the same error, since `String` now means something else.
c.new.upcase # => undefined method `upcase` for #<String:0x00000001308896c8> (NoMethodError)
String.new.upcase # => undefined method `upcase` for #<String:0x00000001308896ca> (NoMethodError)

After this change:

c = Class.new
c.set_temporary_name "String"
# this is now a confusing error
c.new.upcase # => undefined method `upcase` for #<String:0x00000001308896c8> (NoMethodError)
# since this is not an error at all
String.new.upcase # => ""

In my opinion, it is very confusing to refer to something as "String" when actually referring to the String constant gives you different behaviour.

I believe this exactly what @Eregon (Benoit Daloze) is trying to prevent, and I am in agreement. The current "temporary names" are things like "<Module:0x00005606bc180560>" which are clearly not references to constants, due to the naming, and any other temporary should also be explicitly non-constant names to prevent any confusion like the above.

Updated by ioquatix (Samuel Williams) over 1 year ago

I am okay with restricting names to be not-constant-names, e.g.

        if (rb_is_const_name(name)) {
            rb_raise(rb_eArgError, "name must not be valid constant name");
        }

However, this will prevent labeled_class/labeled_module from using set_temporary_name, at least until those labels are updated to be "not a constant name" or we remove this restriction. When I did this change, ~35 tests failed. I don't know if all my use cases will be satisfied, so I can try to introduce this limitation, and if I'm satisfied, I can leave it. Otherwise, I might revisit it if such a restriction turns out to be an issue.

Ultimately, I don't disagree that such things can be confusing. It's pretty obvious how such a feature can be used incorrectly, as can many features of Ruby.

@ufuk your example is reasonable, but it seems like it's fairly trivial to make it confusing as the first:

old_string = String
c = Class.new
String = c
s = c.new
String = old_string
p (s.upcase rescue $!)
p String.new.upcase

I have yet to see an example of something that can be done with set_temporary_name that can't be done with a few lines of code that work on Ruby 3.2 today. Anyone can package the above code into a method, e.g.

def set_name(thing, name)
  thing.set_temporary_name(name)

  return thing
rescue ArgumentError
  Object.class_eval do
    if current = const_get(name)
      remove_const(name)
    end

    const_set(name, thing)
    remove_const(name)
  ensure
    if current
      const_set(name, current)
    end
  end

  return thing
end

Such an implementation is pretty trivial and bypasses any protections we can add to set_temporary_name. Yes, maybe we make it a little harder? But there are valid use cases which we now can't support directly. Ruby is a dynamic language, and this is a consequence of such dynamic behaviour which cannot, realistically, be completely prevented.

Updated by Dan0042 (Daniel DeLorme) over 1 year ago

However, this will prevent labeled_class/labeled_module from using set_temporary_name, at least until those labels are updated to be "not a constant name" or we remove this restriction. When I did this change, ~35 tests failed. I don't know if all my use cases will be satisfied, so I can try to introduce this limitation, and if I'm satisfied, I can leave it. Otherwise, I might revisit it if such a restriction turns out to be an issue.

Ah interesting, I didn't realize it was already used that way. Since the ruby main repo already demonstrates how to "lie" about the class name in a sensible and useful way, I believe this is perfectly fine.

Ruby is a dynamic language, and this is a consequence of such dynamic behaviour which cannot, realistically, be completely prevented.

FWIW, I 100% agree. As Matz himself said:

The possibility to make code cryptic itself should not be the reason to withdraw a feature.

Updated by Eregon (Benoit Daloze) over 1 year ago

Dan0042 (Daniel DeLorme) wrote in #note-32:

Ah interesting, I didn't realize it was already used that way. Since the ruby main repo already demonstrates how to "lie" about the class name in a sensible and useful way, I believe this is perfectly fine.

I think there is a misunderstanding, currently there is no lying in exception messages showing the module name of such a labeled module:

# labeled_class from the description
c = labeled_class("Parent")
c.new.foo # => undefined method `foo' for #<#<Class:0x00007f1c4c1f5878>:0x00007f1c4c1f50a8> (NoMethodError)

If set_temporary_name is unrestricted, it would be (for labeled_class("Parent")):

c.new.foo # => undefined method `foo' for #<Parent:0x00007f1c4c1f50a8> (NoMethodError)

which would naturally make you look for a class called Parent except either there isn't one or it's not the same class, which is rather confusing.

However, this will prevent labeled_class/labeled_module from using set_temporary_name, at least until those labels are updated to be "not a constant name" or we remove this restriction.

This seems very easy to address, labeled_module/labeled_class could wrap the name in <...> or #<...> or #<Class:...> or the callers could start with a lowercase letter, etc. All these make it clear those are not the constant paths (and never have been) of these anonymous modules, which is what I believe is most important here.

Every Rubyist expects the name of a module if it looks like a constant path to be a way to reach it, let's preserve that. Yes it's not a guarantee and yes it's possible to break it, but production code doing that would just be broken code. And this unrestricted set_temporary_name would make super easy to break it, without even realizing that, that I believe would make it a very confusing and dangerous feature, because it breaks something that every Rubyist relies on every day, not on purpose.

It's easy to add the restriction that set_temporary_name can only be used with a name which is not a valid constant path now. It's impossible to add it later. I am sure many Rubyists will regret it if we don't add this restriction which preserves the design of Ruby module naming.

Updated by ioquatix (Samuel Williams) over 1 year ago

  • Status changed from Open to Closed

Thanks everyone for your input.

To start with, we adopted the limitation of only allowing non-constant names.

I think this is a reasonable limitation but it prevents some usage for existing cases.

If it becomes a problem, we can revisit it.

Updated by sawa (Tsuyoshi Sawada) over 1 year ago

Given that we have const_set, the name for this feature should be temporary_name_set, not set_temporary_name. Don't you think so?

Updated by Dan0042 (Daniel DeLorme) over 1 year ago

@sawa there are also methods that start with set_, for example set_trace_func and set_encoding. I think one difference is that most xyz_set methods tend to also have a xyz_get counterpart. Which is not the case for set_temporary_name.

ObjectSpace.each_object(Module){ |c| a=c.instance_methods(false).grep(/\A[gs]et_/); p c=>a unless a.empty? }
{Exception=>[:set_backtrace]}
{#<Class:Kernel>=>[:set_trace_func]}
{Thread=>[:set_trace_func]}
{IO::Buffer=>[:get_value, :get_values, :set_value, :set_values, :get_string, :set_string]}
{ARGF.class=>[:set_encoding]}
{IO=>[:set_encoding, :set_encoding_by_bom]}

ObjectSpace.each_object(Module){ |c| a=c.instance_methods(false).grep(/_[gs]et\z/); p c=>a unless a.empty? }
{Enumerable=>[:to_set]}
{Kernel=>[:instance_variable_get, :instance_variable_set]}
{Module=>[:const_get, :class_variable_get, :const_set, :class_variable_set]}
{Thread=>[:thread_variable_get, :thread_variable_set]}
{Binding=>[:local_variable_get, :local_variable_set]}
{#<Class:Gem::Requirement>=>[:source_set]}
Actions #37

Updated by byroot (Jean Boussier) 2 months ago

  • Related to Bug #20892: `ObjectSpace.dump` can produce broken JSON for classes with temporary names added
Actions

Also available in: Atom PDF

Like3
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like1Like0Like0