Bug #18622
closedconst_get still looks in Object, while lexical constant lookup no longer does
Description
There is some inconsistency here between literal constant lookup and the meta API (const_get).
Lexical constant lookup no longer uses a special case for Object, and this is good as it avoids surprises: #11547
However, const_get
still looks in Object, even though that's confusing, inconsistent and IMHO shouldn't really happen.
module ConstantSpecsTwo
Foo = :cs_two_foo
end
module ConstantSpecs
end
p ConstantSpecs.const_get("ConstantSpecsTwo::Foo") # => :cs_two_foo
p ConstantSpecs::ConstantSpecsTwo::Foo # => const_get.rb:9:in `<main>': uninitialized constant ConstantSpecs::ConstantSpecsTwo (NameError)
I think we should change it so both behave the same (i.e., NameError).
It's like if cd /foo/bar
would go to /bar
if /foo/bar
does not exist and /bar
does.
const_get
is a meta API so it cannot know the surrounding Module.nesting
, but so I think it should consider the receiver of const_get
as the only nesting (so just ConstantSpecs
in this case, not Object
).
Note this does not affect nested constants inside the const_get
like Foo
above (another way to look at the inconsistency that the first component is treated differently).
Updated by Eregon (Benoit Daloze) almost 3 years ago
- Related to Feature #11547: remove top-level constant lookup added
Updated by NuriYuri (Youri Nouri) almost 3 years ago
I always believed it was expected because ConstantSpecs.const_get("ConstantSpecsTwo::Foo") behave like:
module ConstantSpecs
p ConstantSpecsTwo::Foo
end
If you call ConstantSpecs.const_get("ConstantSpecsTwo::Foo", false)
it'll be equivalent to ConstantSpecs::ConstantSpecsTwo::Foo
.
ConstantSpecs.const_get("ConstantSpecsTwo::Foo", false)
# => (irb):8:in `const_get': uninitialized constant ConstantSpecs::ConstantSpecsTwo (NameError)
Updated by Eregon (Benoit Daloze) almost 3 years ago
NuriYuri (Youri Nouri) wrote in #note-2:
I always believed it was expected because ConstantSpecs.const_get("ConstantSpecsTwo::Foo") behave like:
Interesting, I didn't think of it like that, but I can see that point of view.
The thing there is that code has a lookup nesting of [ConstantSpecs, Object]
(Module.nesting
doesn't show Object but it's there internally).
If you call
ConstantSpecs.const_get("ConstantSpecsTwo::Foo", false)
it'll be equivalent toConstantSpecs::ConstantSpecsTwo::Foo
.
Not really because lexical lookup does look in the ancestor chains, while that does not (but unused for this example).
e.g. class D; A=:f; end; class C < D; p A; end; p C::A
works and prints :f
twice.
IMHO the problematic/inconsistent/surprising behavior is this:
Enumerable.const_get(:Process) # => Process, BAD
Enumerable::Process # => uninitialized constant Enumerable::Process (NameError), GOOD
Object.const_get("Enumerable::Process") # => uninitialized constant Enumerable::Process (NameError), GOOD
i.e., it seems very similar to the pitfall explained in #11547.
Updated by fxn (Xavier Noria) almost 3 years ago
I do not know which is the correct definition of const_get
, Ruby core knows :).
However, let me say that I have always thought about const_get
as what happens during a relative constant lookup. That is, in my mind, a relative constant lookup is basically: 1) Check the nesting + 2) const_get
.
Take for example:
Module.new.const_get(:String) # => String
You get String
there, but it is not in the ancestors. Why? Because it is checking Object
by hand, since the receiver is a module, like the relative constant lookup algorithm does.
I believe
String.const_get(:Hash) # => Hash
has to be understood that way.
Updated by NuriYuri (Youri Nouri) almost 3 years ago
If the current definition is correct, could we add as a feature an additional lookup parameter (or method) that says "do not lookup in Object" so we can get a sort of Module.const_get(:Constant)
strictly equivalent to Module::Constant
. (I believe I also have cases like that and it's not practical to have to do Object.const_get("#{module_to_look_up}::#{const_to_lookup}")
...)
Updated by fxn (Xavier Noria) almost 3 years ago
It would not be enough.
When you do a relative constant lookup in which the cref is a module, Object
is checked by hand before calling const_missing
. Take for example:
module M
String
end
Why is String
found there? It is not in the nesting. It is not in the ancestors. So? Ruby checks Object
by hand and finds it there.
But, Ruby does not check Object
by hand in constant paths:
M::String #=> NameError
However,
M.const_get(:String) #=> String
My point is that const_get
implements the logic used in relative constant lookups. And in that point of view, checking Object
is not an inconsistency or bug, but expected.
Updated by Eregon (Benoit Daloze) almost 3 years ago
fxn (Xavier Noria) wrote in #note-6:
Why is
String
found there? It is not in the nesting. It is not in the ancestors. So? Ruby checksObject
by hand and finds it there.
Actually I'd argue it's in the nesting (just not shown by Module.nesting
), the outermost (implicit) "root" lexical constant scope is always Object (and in fact it's implemented that way in TruffleRuby, maybe in CRuby too).
What's special about this root scope is it's checked after the ancestors of the innermost module (and so for classes there is no need as Object is in their ancestors, but for modules that's not the case).
From the implementation POV, about half of usages look in Object and the other half not, I'm not sure there is a general rule for it, it seems a bit ad-hoc currently.
Maybe the rule is if we're looking after a ::
then don't look in Object but otherwise do?
And A.const_get(:B)
is not considered like A::B
but like module A; B; end
?
This issue is mostly about do we still need to look in Object, in which cases, and could there be simpler overall rules for the various constant lookups?
Updated by fxn (Xavier Noria) almost 3 years ago
Actually I'd argue it's in the nesting (just not shown by Module.nesting), the outermost (implicit) "root" lexical constant scope is always Object (and in fact it's implemented that way in TruffleRuby, maybe in CRuby too).
I believe that is confounding the contract with your implementation.
From the POV of the programmer, Module.nesting
tells you the nesting, Object
is not in the nesting. Relative constant lookup checks first the nesting, if not found there, then you check the ancestors. If not found there, and the cref is a module, you additionally check Object
This is consistent with what Flanagan & Matz says on page 261:
Ruby first attempts to resolve a constant in the lexical scope of the reference. This means that it first checks the class or module that encloses the constant reference to see if that class or module defines the constant. If not, it checks the next enclosing class or module. This continues until there are no more classes or modules. Note that top-level or "global" constants are not considered part of the lexical scope and are not considered during this part of constant lookup. The class method Module.nesting returns the list of classes and modules that are searched in this step, in the order they are searched.
If no constant definition is found in the lexically closing scope, Ruby next tries to resolve the constant in the inheritance hierarchy by checking the ancestors of the class or module that referred to the constant.
If no constant definition is found in the inheritance hierarchy, then top-level constant definitions are checked.
Now, I don't know how TruffleRuby implements lookup, but I am pretty certain Object
does not belong to the nesting, and that the public contract, conceptually, is that one.
And A.const_get(:B) is not considered like A::B but like module A; B; end?
This is what I am trying to say in my previous comments. And that is why looking at Object
is not inconsistent. What happens in A::B
is not relevant for const_get
.
This issue is mostly about do we still need to look in Object, in which cases, and could there be simpler overall rules for the various constant lookups?
You look at Object
by hand, because people expect String
to be available everywhere (except in BasicObject
, which is an edge case). So, whether the ancestors have Object
or not is a technicality, from the POV of the user, you want module M; String; end
to work. And that means you need to introduce a special rule for modules.
Personally, I think the lookup is pretty intuitive and easy to understand once documented.
Updated by fxn (Xavier Noria) almost 3 years ago
For completeness, let me add that that the described lookup is incomplete. Like all the lookup descriptions I've seen, it obviates Module#autoload
. Also, the nesting is not strictly lexical. You can push an anonymous module to the nesting of a script with
load script, true
also, you can push an arbitrary class or module to the nesting with
A::B::C.eval_family_of_methods(STRING)
In any case, my conclusion is that const_get
does what it is supposed to do. It is not inconsistent.
Updated by Eregon (Benoit Daloze) almost 3 years ago
That makes sense, I think we should improve const_get
docs to says it's like module A; B; end
and not A::B
(which I'd think I'm not the only one to assume).
For example I think it's nice to be able to implement mod.const_lookup("B::C")
(i.e., before const_get handled ::
paths) as path.split('::').inject(mod) { _1.const_get(_2) }
but that's actually not fully equivalent.
Regarding Object in the nesting, I feel it's consistency with the general situation for the top-level:
-
default definee
: Object (clearly the case) -
self
: main, an Object (clearly the case) -
cref
: Object (clearly the case, but that's not same as being in the nesting unfortunately)
So the top-level is very similar to:
class Object
private
self = main # not instance_eval, that would change the default definee
<source code>
end
But indeed with more subtleties for constant lookup so Object
is looked last, after the module/class's ancestors.
I wish we could model constant lookup with just a nesting and have the invariant cref == nesting.first
, but that doesn't hold for the top-level (AFAIK only for that case), unless the root constant scope is treated specially (as it is in TruffleRuby, and as we see the top-level constant scope is also special semantically anyway).
Updated by fxn (Xavier Noria) almost 3 years ago
That makes sense, I think we should improve const_get docs to says it's like module A; B; end and not A::B (which I'd think I'm not the only one to assume).
Agree :). The documentation of the constants API has a lot of room for improvement. Also, lookup algorithms, you basically need to reverse engineer them (or read source code). I remember some 10 years ago in Amsterdam, talking about this, that @headius (Charles Nutter) told me: "You are going through what I had to go through myself to implement JRuby".
I documented quite a bit in the old Rails guide (https://guides.rubyonrails.org/v5.2/autoloading_and_reloading_constants.html#constants-refresher), because in order to understand the caveats of the old system, users had to understand how things work. Still, not a spec, the full precise details would not fulfill that didactic goal.
However, let me add a nuance. The current documentation says:
Checks for a constant with the given name in mod. If inherit is set, the lookup will also search the ancestors (and Object if mod is a Module).
Therefore, it is implicitly saying it is not mimicking A::B
, because A::B
does not check Object
and does not check modules by hand.
But I agree with you, this could be more clear.
Updated by matz (Yukihiro Matsumoto) over 2 years ago
For the sake of consistency, it is nice to start searching from the receiver class/module. I really like to fix the behavior, if there's no compatibility issue.
But in reality, changing the behavior possibly affect numerous usage of const_get
(4000+ appearance by gem search).
I am not sure this consistency really worth the possible compatibility issues.
Matz.
Updated by austin (Austin Ziegler) over 2 years ago
matz (Yukihiro Matsumoto) wrote in #note-12:
For the sake of consistency, it is nice to start searching from the receiver class/module. I really like to fix the behavior, if there's no compatibility issue.
But in reality, changing the behavior possibly affect numerous usage ofconst_get
(4000+ appearance by gem search).
I am not sure this consistency really worth the possible compatibility issues.
I think it would be worth an experimental flag, TBH. I have a few places where I am calling const_get
in some gems and my intent is just to search in the space where I am (if bare) or in the receiver class/module (if the receiver is explicit).
Updated by fxn (Xavier Noria) over 2 years ago
I you want to check only in the receiver, that is mod.const_get(cname, false)
already.
If you want to check from the receiver up the ancestors chain, that is mod.const_get(cname)
already.
If the receiver is a module, and the constant is not found in the ancestors the method checks Object
manually as it happens in relative constant lookups. This is documented.
So, a relative constant lookup could be seen as checking the nesting + const_get
+ const_missing
. This is how I see const_get
and in that sense I am personally fine with the current behavior.
@Eregon (Benoit Daloze) opened this issue because for him, the manual check in Object
is surprising (conceptually, regardless of documentation).
Updated by mame (Yusuke Endoh) over 2 years ago
FYI: Here is an example that will be affected by the proposed change
namespace = ["Google", "Cloud", "Retail", "V2"]
parent_config = while namespace.any?
parent_name = namespace.join "::"
parent_const = const_get parent_name
break parent_const.configure if parent_const.respond_to? :configure
namespace.pop
end
Updated by Eregon (Benoit Daloze) over 2 years ago
The fix of that would be to use Object.const_get parent_name
instead of const_get parent_name
, or to use const_get "::#{parent_name}
.
The current code seems rather fragile and would break if there is any nested Google module for instance (with so many nested namespaces it does not sound too unlikely).
I'm unsure how much code this change would break.
It might also help reveal and avoid bugs with similar cases as shown in #11547.
One way to get an idea would be to make a PR, test a few popular gems locally to see if they work or not.
If that does not reveal too many issues, I think the best would be to emit a deprecation warning for one or a few versions when going through the special Object lookup for a module, and then in a later version actually remove that extra lookup.
Updated by fxn (Xavier Noria) over 2 years ago
@Eregon (Benoit Daloze) in your proposal
Module.new.const_get(:String)
raises.
Would
Class.new.const_get(:String)
raise as well like mod::String
does for mod != Object
?
Updated by Eregon (Benoit Daloze) over 2 years ago
@fxn (Xavier Noria) A good point, it applies both to module and classes, it seems mod::Name
always ignores constants of Object, whether mod is module or class.
I'd like A.const_get("B::C")
to behave exactly the same as A::B::C
.
That also has the nice property that A.const_get("B::C")
= A::B.const_get("C")
= A.const_get("B").const_get("C")
.
If we compare to literal ::
:
Class.new::String # => uninitialized constant #<Class:0x0000000001b98330>::String (NameError)
Class.new.const_get :String # => String currently, but I'd like NameError
Module.new::String # => uninitialized constant #<Module:0x00000000019456f0>::String (NameError)
Module.new.const_get :String # => String currently, but I'd like NameError
So ::
doesn't look into Object
(String
is a constant of Object
) (since #11547 I think).
In other words, with ::
classes do look into ancestors except for Object, and modules only look in their own ancestors (which never include Object
).
That only applies to Object
itself, Kernel
constants for instance are considered simply whether they are in the ancestors or not.
module Kernel; K = :kernel; end
Class.new::K # => :kernel
Class.new.const_get :K # => :kernel
Module.new::K # => uninitialized constant #<Module:0x0000000001e7ba20>::K (NameError)
Module.new.const_get :K # => :kernel currently, but I'd like NameError
Module.new.include(Kernel)::K # => :kernel
Module.new.include(Kernel).const_get :K # => :kernel
I think I also understand better your relative and absolute lookups terms now.
In A::B::C
, A
is a relative lookup/scope lookup/nesting lookup.
And B
and C
are both using "absolute lookup"/"only ancestors and always ignoring constants of Object to avoid going back to the root scope inadvertently".
Updated by fxn (Xavier Noria) over 2 years ago
Exactly :).
Backwards compatibility, adding a flag, a new method, etc., is something for core to consider. But in any case, I like your proposal.
Updated by jeremyevans0 (Jeremy Evans) over 1 year ago
@matz (Yukihiro Matsumoto) This was reviewed during the June 2023 developer meeting and is still waiting for your response: https://github.com/ruby/dev-meeting-log/blob/master/2023/DevMeeting-2023-06-08.md#bug-18622-const_get-still-looks-in-object-while-lexical-constant-lookup-no-longer-does-jeremyevans0
Updated by matz (Yukihiro Matsumoto) 7 months ago
- Status changed from Open to Closed
Sorry, I forgot to reply. Due to the compatibility concern, I'd like to keep this behavior even though it's inconsistent with direct constant accesses.
Matz.