Project

General

Profile

Actions

Feature #19450

closed

Is there an official way to set a class name without setting a constant?

Added by ioquatix (Samuel Williams) about 1 year ago. Updated about 1 year ago.

Status:
Closed
Assignee:
-
Target version:
-
[ruby-core:112494]

Description

This is the best I could come up with:

klass = Class.new

Object.const_set("Klass", klass)
Object.send(:remove_const, "Klass")

puts klass.new
# => #<Klass:0x0000000100a9d688>

Can we do better?

What about something like:

Class.new(name: "Klass")

or

Class.new do
  def self.name
    "Klass"
  end
end

etc


Related issues 2 (0 open2 closed)

Related to Ruby master - Feature #19521: Support for `Module#name=` and `Class#name=`.ClosedActions
Related to Ruby master - Feature #19520: Support for `Module.new(name)` and `Class.new(superclass, name)`.RejectedActions
Actions #1

Updated by ioquatix (Samuel Williams) about 1 year ago

  • Description updated (diff)

Updated by nobu (Nobuyoshi Nakada) about 1 year ago

  • Description updated (diff)

ioquatix (Samuel Williams) wrote:

Can we do better?

I'm not sure what you want to do exactly.

Class.new do
  def self.name
    "Klass"
  end
end

This is what EnvUtil#labeled_class does.

Updated by ioquatix (Samuel Williams) about 1 year ago

Thanks @nobu (Nobuyoshi Nakada) let me check it. Well, the implementation of labeled_class looks a bit tricky but if it works, it's good enough.

Updated by ioquatix (Samuel Williams) about 1 year ago

Okay, so I tried it out, but it doesn't work consistently:

def labeled_class(name, superclass = Object, &block)
  Class.new(superclass) do
    singleton_class.class_eval {define_method(:to_s) {name}; alias inspect to_s}
    class_eval(&block) if block
  end
end

foo = labeled_class("foo")
foo.class_eval("class Bar; end")

p foo
p foo::Bar

So, basically, given the above, I still prefer an interface like Class.new(..., name: ...). The name I want to use is something like "Controller(path/to/file.rb)" or "Test(path/to/file.rb)" or something like that.

Updated by nobu (Nobuyoshi Nakada) about 1 year ago

ioquatix (Samuel Williams) wrote in #note-4:

So, basically, given the above, I still prefer an interface like Class.new(..., name: ...).

Although I'm fine with this interface,

The name I want to use is something like "Controller(path/to/file.rb)" or "Test(path/to/file.rb)" or something like that.

These names look tricky too.

Updated by ioquatix (Samuel Williams) about 1 year ago

Sure, but it's better than

irb(main):053:0> p foo::Bar
#<Class:0x00007fd0ad4b3860>::Bar
=> #<Class:0x00007fd0ad4b3860>::Bar
irb(main):054:0> foo::Bar.new
=> #<#<Class:0x00007fd0ad4b3860>::Bar:0x00007fd0ad81ec70>

Updated by Eregon (Benoit Daloze) about 1 year ago

I don't think Class#inspect should ever lie, so IMO there shouldn't be a way.

In the comment above, the problem is foo is not assigned to a proper constant, that's the obvious fix, to assign foo to a constant.
The file in which the class is defined is another feature, maybe we should add Module#source_location (TruffleRuby already tracks that for interop).

Updated by zverok (Victor Shepelev) about 1 year ago

Just an aside note: ActiveRecord models redefine #inspect (and not #name) to achieve "informative description" effect:

p User
User(id: integer, email: citext, name: string, time_zone: string, ...)

For me, it seems more reasonable than redefining #name (like, Class#name has an implicit contract of "being reachable by the name, considered as a name of a constant containing the class).

Updated by matheusrich (Matheus Richard) about 1 year ago

Semi-related (maybe it should be a separate issue?) but assigning a class to a constant via rightward assignment results in NameError

Class.new => Klass
# uninitialized constant Klass (NameError)

Updated by jeremyevans0 (Jeremy Evans) about 1 year ago

matheusrich (Matheus Richard) wrote in #note-9:

Semi-related (maybe it should be a separate issue?) but assigning a class to a constant via rightward assignment results in NameError

Class.new => Klass
# uninitialized constant Klass (NameError)

Not a bug. "Rightward assignment" is not assignment, but pattern matching:

1 => Bar     # raises NameError, because Bar is not defined
1 => Integer # nil
1 => String  # NoMatchingPatternError, because String === 1 is not true 

Updated by Eregon (Benoit Daloze) about 1 year ago

What is the use-case here?
To name otherwise-anonymous test classes/modules?
Why not simply assigning them to a real constant, so their name is truthful and not a lie?

Having a filename in Module#name like in https://bugs.ruby-lang.org/issues/19450#note-4 is almost certainly a bad idea and will break various things.

Updated by ioquatix (Samuel Williams) about 1 year ago

The use case is explained in https://bugs.ruby-lang.org/issues/19450#note-6 - to make it more descriptive when class names are printed which have anonymous ancestors. An example of this is web frameworks which load files into anonymous modules, those modules can be named by path or mounted location in the web application. My test framework sus uses anonymous classes for tests, and I want to give them better names. RSpec uses the "assign to global constant approach" and this makes for poor names and complicated code. My web framework utopia loads controllers into anonymous modules and classes (for live reloading they are not assigned globally) and I want to give them better names. The labeled_class and labeled_module helpers are used in over 100 different places in the Ruby test suite alone. There seem endless examples of code that would benefit from a simple and correct approach to specifying the class name.

I'm also pretty sure assigning to a constant is incompatible with Ractor in some cases..

Why not simply assigning them to a real constant, so their name is truthful and not a lie?

Not all useful names are valid constant names, and not all modules should have a specific place in the global namespace.

Whether or not you agree with it, it's already happening in a lot of places, e.g. overriding Class#name is fairly common, but doesn't cover all the cases unfortunately due to the internal design/cache of class names. So you end up with odd edge cases where things don't work. This PR seeks to simplify that.

Updated by Eregon (Benoit Daloze) about 1 year ago

ioquatix (Samuel Williams) wrote in #note-13:

An example of this is web frameworks which load files into anonymous modules, those modules can be named by path or mounted location in the web application.

Which web framework? Who wants model/user.rb::User instead of User/MyApp::User? Nobody I suspect.
Also this just wouldn't work if any class needs to refer to another class, they need to be in the same namespace anyway.
You could name the outer module, that would make a lot more sense and be usable to refer to the class from Ruby code.

I think it's just a bad idea to use anonymous modules/classes for important functionality.
I think anonymous modules/classes are basically only good for tests/mocks or quick hacks.

Whether or not you agree with it, it's already happening in a lot of places, e.g. overriding Class#name is fairly common, but doesn't cover all the cases unfortunately due to the internal design/cache of class names.

Where? I don't count labeled_{class,module}, as there is no need for that in ruby/spec.
I think some test framework used to/does something like that, but it seems generally accepted as a mistake.

Updated by Eregon (Benoit Daloze) about 1 year ago

I'm also pretty sure assigning to a constant is incompatible with Ractor in some cases..

No, modules and classes are always shared between Ractors. Assigning to a constant needs to be in the main Ractor but so does loading any code anyway.

My web framework utopia loads controllers into anonymous modules and classes (for live reloading they are not assigned globally) and I want to give them better names.

I suppose they share the outer module then?
Then the names already look like #<Module:0x00007f416ffb7950>::C, no?
That's good enough IMO and a good representation of what is being done.


The fundamental thing here is Module#name must return a String which is how to access that module, unless some part is anonymous and then of course it cannot be accessed via a constant path.
This feature would break that, and I think no feature is worth breaking that.

Updated by ioquatix (Samuel Williams) about 1 year ago

The fundamental thing here is Module#name must return a String which is how to access that module, unless some part is anonymous and then of course it cannot be accessed via a constant path.
This feature would break that, and I think no feature is worth breaking that.

It's already the case that Module#name can return a string which is not how you access that module (or class). Rails already does that for ActiveRecord models. Anyone can override that method, so I don't see how that's relevant.

Also, the proposed implementation does not change the behaviour of permanent names, so it does not break non-anonymous classes.

x = Class.new(Object, "foo")
x.name # => "foo"
X = x
x.name # => "X"

Then the names already look like #Module:0x00007f416ffb7950::C, no? That's good enough IMO and a good representation of what is being done.

I basically disagree with this. Object.const_set to assign a name to what should otherwise be anonymous modules, is a hack. The proposed PR provides a mechanism to give anonymous modules and classes a name, and there are literally hundreds of examples of people trying to achieve that.

Updated by Eregon (Benoit Daloze) about 1 year ago

ioquatix (Samuel Williams) wrote in #note-16:

It's already the case that Module#name can return a string which is not how you access that module (or class). Rails already does that for ActiveRecord models. Anyone can override that method, so I don't see how that's relevant.

As @zverok (Victor Shepelev) pointed above, Rails only overrides #inspect, and does super first in all cases. That's quite fair.

Yes, anyone can redefine name on their module.
But it doesn't affect Module#name and error messages, which IMO is good, so error messages don't lie about this.
And BTW it seems useless to put the file path in the module name for exceptions, since those will show the relevant path in the backtrace anyway.

Related, Marshal and other places use rb_mod_name() (or similar) to find the class name.
If this new feature affects that, it's basically a security issue as it could load the wrong class/module all too easily.
Also whether a module is named is even more complex with this, because these fake names shouldn't be considered to be actually fully named by a constant path.

Also, the proposed implementation does not change the behaviour of permanent names, so it does not break non-anonymous classes.

One requirement for that is the given name should not be a valid constant name then (not start with a uppercase letter).
That would at least helpful to identify these fake module names.

Then the names already look like #Module:0x00007f416ffb7950::C, no? That's good enough IMO and a good representation of what is being done.

I basically disagree with this. Object.const_set to assign a name to what should otherwise be anonymous modules, is a hack.

I see the other side of it, this feature is a hack, the way to name a module in Ruby is and has always been to assign to a constant.
That's great because it means consistency between name and the way to access that module.

The proposed PR provides a mechanism to give anonymous modules and classes a name, and there are literally hundreds of examples of people trying to achieve that.

Yeah, and I think in all these cases it's worth rethinking the approach. Either name it, or accept you are using anonymous modules and classes and so don't try to hide the fact they are anonymous by giving a fake constant path.

Updated by ioquatix (Samuel Williams) about 1 year ago

Here are some of the places in my own code I'd like to use such a feature:

There are other systems that would benefit from this, e.g. RSpec, or any test framework which loads tests into anonymous modules, or any system that basically creates Module.new/Class.new and subsequently loads code from one or more files into them.

But it doesn't affect Module#name and error messages, which IMO is good, so error messages don't lie about this.

I disagree, the anonymous #<#<Class:0x00007fd0ad4b3860>::Bar:0x00007fd0ad81ec70> names are near useless. Where did it come from? Where was it defined?

Related, Marshal and other places use rb_mod_name() (or similar) to find the class name.

In my PR, it's still considered an anonymous class. So it can't be marshalled. I don't think the security is any worse than existing usage of Marshal.

One requirement for that is the given name should not be a valid constant name then (not start with a uppercase letter).

That's not how it works internally, there is actually a flag for anonymous/permanent name.

Updated by ioquatix (Samuel Williams) about 1 year ago

I wondered about whether your point of class name consistent was valid, but I found cases like this:

irb(main):005:0> k = Class.new
=> #<Class:0x000000010488b9b8>
irb(main):006:0> K = k
=> K
irb(main):007:0> k.name
=> "K"
irb(main):008:0> Object.send(:remove_const, :K)
=> K
irb(main):009:0> k.name
=> "K"
irb(main):010:0> j = Class.new
=> #<Class:0x000000010488c318>
irb(main):011:0> K = j
=> K
irb(main):012:0> K.name
=> "K"
irb(main):013:0> k.name
=> "K"
irb(main):014:0> j.name
=> "K"

So, it's not possible that the class path is actually logical... and the same applies to nested classes.

irb(main):015:1* class k::J
irb(main):016:0> end
=> nil
irb(main):017:0> k::J
=> K::J
irb(main):018:0> k.name
=> "K"
irb(main):019:0> k == K
=> false
irb(main):020:0> K::J
(irb):20:in `<main>': uninitialized constant K::J (NameError)
    from <internal:kernel>:187:in `loop'                  
    from -e:1:in `<main>'  

Updated by Eregon (Benoit Daloze) about 1 year ago

ioquatix (Samuel Williams) wrote in #note-18:

One requirement for that is the given name should not be a valid constant name then (not start with a uppercase letter).

That's not how it works internally, there is actually a flag for anonymous/permanent name.

It's a requirement for humans to always be able to tell the difference. That flag is not exposed or shown.

I wondered about whether your point of class name consistent was valid, but I found cases like this:

Yes, as I already mentioned in https://github.com/ruby/ruby/pull/7376#issuecomment-1445057920.
But remove_const or overriding a constant is very rarely used, there is a reason they are private or warn.
And in practice they aren't used much, so (the original) Module#name does reflect in 99+% how to access a named module, and that is valuable.
This feature hurts that, and Module#name is now possibly a "random user-supplied value" or worse a "looks like a constant path but is fake and was never assigned" (the latter one can be addressed by the requirement above).


I don't get why you want to show the file path in Module#name, this information is already available in the backtrace and in fact is more precise in the backtrace:

Module.new do
  self::C = Class.new
  c = self::C.new
  c.foo
end
---
t.rb:4:in `block in <main>': undefined method `foo' for #<#<Module:0x00007f735567e880>::C:0x00007f73673fa1b0> (NoMethodError)

  c.foo
   ^^^^
	from t.rb:1:in `initialize'
	from t.rb:1:in `new'
	from t.rb:1:in `<main>'

How is #<t.rb::C:0x00007f73673fa1b0> more useful? (it might look better to you, but what is the actual usefulness of it?)
The relevant information here is t.rb:4 which is already there.
I have the feeling you hate the notation #<#<Module:0x1234>::C:0x5678>, which yes that's a bit cryptic, but it's missing the point this is hardly relevant in an exception such as the one above.

For the example in https://bugs.ruby-lang.org/issues/19450#note-6 you can just define inspect as an instance method of that class and be done with it.
You can customize inspect on the class itself like Rails does if you think people will do a lot of p self.class in sus or so.

In other words, a module name and a file path are separate things.

Updated by jeremyevans0 (Jeremy Evans) about 1 year ago

I have tests where I am overriding the name method for anonymous classes, so I understand the desire for this feature. I'm on the fence about whether it's worth adding a core method for this, as I haven't yet seen a non-contrived example where overriding name will not work. Examples with invalid constant names already show how this feature could easily be abused.

Even if we do accept the general idea of setting class/module names without assigning to a constant, I'm strongly against modifying {Class,Module}#initialize for this, due to how rarely this feature is needed. If we are going to add it, we could use a separate method(s) for it with a more descriptive name such as Class.labeled_class/Module.labeled_module.

Updated by ioquatix (Samuel Williams) about 1 year ago

I haven't yet seen a non-contrived example where overriding name will not work

Any time you have nested class or modules in an anonymous module, it gets very ugly.

Examples with invalid constant names already show how this feature could easily be abused.

Already a possible attack surface as demonstrated.

due to how rarely this feature is needed

Since this feature doesn't exist yet, I don't think you can argue that it's rarely used.

As stated above, a similar feature is used over 100 times in Ruby's own test suite. Other anecdotal evidence suggests this is a fairly common issue as there are a variety of blog posts and SO answers about various ways to achieve it (i.e. overriding #name). I've enumerated several key places where I wished this interface existed. However, as stated, overriding #name is insufficient for nested classes/modules.

I don't see why "rarity of usage" (which is yet to be proven) is a strong argument for "bespoke method names". Can you explain why those two things are related? Are you concerned something is going to break?

Updated by jeremyevans0 (Jeremy Evans) about 1 year ago

ioquatix (Samuel Williams) wrote in #note-22:

I haven't yet seen a non-contrived example where overriding name will not work

Any time you have nested class or modules in an anonymous module, it gets very ugly.

I haven't come across a case where you cannot also define name on those nested classes/modules, though it is possible such a case exists.

Examples with invalid constant names already show how this feature could easily be abused.

Already a possible attack surface as demonstrated.

Apologies, it but was not obvious to me where you demonstrated this. Could you link to where you think you already demonstrated this?

It's true that you can define name to return an invalid constant name, but doing so does not currently automatically infect nested modules.

due to how rarely this feature is needed

Since this feature doesn't exist yet, I don't think you can argue that it's rarely used.

The rarity I'm referring to is the need to define a name for an anonymous class or module without assigning it to a constant. My guess would be at somewhere between 1% and 0.1% of anonymous classes and modules need to override name. Are you of the opinion that the need to assign a name to an anonymous module or class without assigning it to a constant is more common than 1%?

As stated above, a similar feature is used over 100 times in Ruby's own test suite. Other anecdotal evidence suggests this is a fairly common issue as there are a variety of blog posts and SO answers about various ways to achieve it (i.e. overriding #name). I've enumerated several key places where I wished this interface existed. However, as stated, overriding #name is insufficient for nested classes/modules.

For the anonymous classes/modules currently overriding name, this feature would be helpful, I said that up front. In my experience, there is only a small percentage of cases where {Class,Module}.new is used and name is also overridden.

I don't see why "rarity of usage" (which is yet to be proven) is a strong argument for "bespoke method names". Can you explain why those two things are related? Are you concerned something is going to break?

In my opinion, it's a bad idea to add positional arguments for things that are rarely needed. For one, it makes adding future positional arguments much more cumbersome. This is especially true if the methods already take optional positional arguments, as Class.new does.

For arguments that are rarely needed, keyword arguments are better. For one, use of a positional argument for the temporary name in Class.new means that you must manually specify Object as the first argument, even in cases where it would be the default.

So if we had to accept changes to Module.new and Class.new for temporary names, I think it would be better to use keyword arguments:

Class.new(Array, name: 'Foo')
Class.new(name: 'Foo')
Module.new(name: 'Foo')

I still think that based on the rarity of need, it would be better to have a separate method:

Class.labeled_class('Foo', Array)
Class.labeled_class('Foo')
Module.labeled_module('Foo')

The decision of whether to add a keyword argument to an existing method or instead add an additional method is always a judgement call. Adding a keyword argument makes the method API more complex, especially in cases where the method does not already use keyword arguments. For cases where it would be somewhat common to use the keyword argument, I think adding a keyword argument makes sense. However, for cases where using the keyword would be rare, I think 2 separate methods, both with a simpler API, may be better than 1 method with a more complex API.

Updated by ioquatix (Samuel Williams) about 1 year ago

I haven't come across a case where you cannot also define name on those nested classes/modules, though it is possible such a case exists.

It's not reasonable, and in fact cumbersome, to expect users to override Class#name for every class defined in an anonymous namespace. However, as an alternative/addition to this proposal, having Class#name invoke superclass.name rather than using the internally defined class path would probably fix some of the shortcomings of overriding Class#name.

Apologies, it but was not obvious to me where you demonstrated this. Could you link to where you think you already demonstrated this?

https://bugs.ruby-lang.org/issues/19450#note-19

Are you of the opinion that the need to assign a name to an anonymous module or class without assigning it to a constant is more common than 1%?

Yes, based on my own usage and analysis of other projects on GitHub. A lot of code uses assignment to unique constant names, but this itself is overly complex. The reason for doing that is to give it a meaningful name, not because a constant in Object is desired. A common example of this is RSpec: https://github.com/rspec/rspec-core/blob/d722da4a175f0347e4be1ba16c0eb763de48f07c/lib/rspec/core/example_group.rb#L842-L848 but this type of usage is fairly common. I myself gave 5 example cases where I'd immediately benefit from such an interface: https://bugs.ruby-lang.org/issues/19450#note-18 and I'm sure I can find more if I go looking for them.

The decision of whether to add a keyword argument to an existing method or instead add an additional method is always a judgement call.

I understand. My initial proposal was to use a keyword argument, but the implementation was more complex as you suggest, so I opted for an optional positional argument.

I would say, the keyword argument is slightly more ergonomic, but at the cost of implementation complexity and perhaps a little bit of performance.

I will leave it to the developer meeting to decide what interface makes the most sense. I also appreciate the use of methods like labeled_class and labeled_module but those don't compose well long term, i.e. if there are other optional arguments, you'd end up with a lot of messy methods. So, I'd personally vote for keyword arguments or optional positional arguments.

Updated by mame (Yusuke Endoh) about 1 year ago

Please write the use case in the proposal description. It is difficult for us to understand and discuss it in the dev meeting in a limited time.

Updated by ioquatix (Samuel Williams) about 1 year ago

Sorry @mame (Yusuke Endoh), I did not have a concrete proposal when I originally suggested it and this was more of a discussion to get feedback. I've updated the dev meeting proposal with specific details.

Updated by ioquatix (Samuel Williams) about 1 year ago

  • Status changed from Open to Closed

Thanks for all the discussion, here are the proposals:

They are alternatives and also complementary, i.e. we could do both. @matz (Yukihiro Matsumoto) can you please check each proposal carefully and let me know your thoughts.

Actions #28

Updated by Eregon (Benoit Daloze) about 1 year ago

  • Related to Feature #19521: Support for `Module#name=` and `Class#name=`. added
Actions #29

Updated by Eregon (Benoit Daloze) about 1 year ago

  • Related to Feature #19520: Support for `Module.new(name)` and `Class.new(superclass, name)`. added
Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0