Feature #17753

Add Module#namespace

Added by tenderlovemaking (Aaron Patterson) over 1 year ago. Updated 5 months ago.

Status:
Open
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:103044]

Description

Given code like this:

module A
  module B
    class C; end
    class D; end
  end
end

We can get from C to B like C.outer_scope, or to A like
C.outer_scope.outer_scope.

I want to use this in cases where I don't know the outer scope, but I
want to find constants that are "siblings" of a constant. For example,
I can do A::B::C.outer_scope.constants to find the list of "sibling"
constants to C. I want to use this feature when walking objects and
introspecting. For example:

ObjectSpace.each_object(Class) do |k|
  p siblings: k.outer_scope.constants
end

I've attached a patch that implements this feature, and there is a pull request on GitHub here.


Files

0001-Add-Module-outer_scope.patch (5.93 KB), tenderlovemaking (Aaron Patterson), 03/26/2021 07:19 PM
0001-Add-Module-namespace.patch (5.89 KB), tenderlovemaking (Aaron Patterson), 03/27/2021 09:51 PM

Updated by sawa (Tsuyoshi Sawada) over 1 year ago

What would you expect if a module has multiple names?

module E; end
E::F = A::B::C

Should A::B::C.outer_scope return A::B or E?

Updated by Eregon (Benoit Daloze) over 1 year ago

@sawa (Tsuyoshi Sawada) I'd say first assignment to a named constant wins, just like for Module#name.
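The "first assignment wins" behavior of Module#name can be checked with existing Ruby. This is a small sketch reusing the A, B, C, E, and F names from the examples above; it shows only current Module#name behavior, not the proposed method.

```ruby
module A
  module B
    class C; end
  end
end

module E; end
E::F = A::B::C               # a second constant pointing at the same class object

puts E::F.equal?(A::B::C)    # true: one object, two constants
puts E::F.name               # "A::B::C": the name from the first assignment is kept
```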

I agree with the feature.
I'd suggest Module#namespace for the name though.
For example, I'd say the namespace of Process::Status is Process.

scope feels too general to me, and there are many other scopes, so I think namespace is a more precise term for it.

namespace is also the term used in https://github.com/ruby/ruby/blob/master/doc/syntax/modules_and_classes.rdoc#label-Modules

Updated by tenderlovemaking (Aaron Patterson) over 1 year ago

Eregon (Benoit Daloze) wrote in #note-2:

@sawa (Tsuyoshi Sawada) I'd say first assignment to a named constant wins, just like for Module#name.

Yes, this is what I would expect too (and implemented). 😄

I agree with the feature.
I'd suggest Module#namespace for the name though.
For example, I'd say the namespace of Process::Status is Process.

Yes, this is a much better name. I've updated the patch to use "namespace".

Updated by sawa (Tsuyoshi Sawada) over 1 year ago

This feature is reminiscent of Module.nesting. The difference is that the former has dynamic scope and the latter lexical scope. Besides that, I do not see any reason to make them different in any way. What about returning an array of the nested modules (perhaps including self) rather than just the direct parent?

module A; module B; class C; Module.nesting end end end # => [A::B::C, A::B, A]

A::B::C.outer_scope # => [A::B::C, A::B, A]
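To illustrate what "lexical scope" means for Module.nesting, here is a small sketch. The constant names NESTED and COMPACT are made up for the illustration. The same class reports a different nesting depending on how the surrounding code is written.

```ruby
module A
  module B
    class C
      NESTED = Module.nesting    # captured inside three nested definitions
    end
  end
end

class A::B::C
  COMPACT = Module.nesting       # captured inside a single compact definition
end

p A::B::C::NESTED    # [A::B::C, A::B, A]
p A::B::C::COMPACT   # [A::B::C]
```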

Updated by byroot (Jean Boussier) over 1 year ago

Besides that, I do not see any reason to make them different in any way

Well, Module.nesting can't be chained because of its scope semantics. Module.nesting.nesting would be problematic.

The proposed feature is very easily chainable:

A::B::C.namespace # => A::B
A::B::C.namespace.namespace # => A

So returning an array doesn't give anything that's not already achievable, and it causes an array allocation that some users would rather avoid in some situations.

Updated by tenderlovemaking (Aaron Patterson) over 1 year ago

  • Subject changed from Add Module#outer_scope to Add Module#namespace

Updated by fxn (Xavier Noria) over 1 year ago

I like the direction this is going towards, however, let me record some remarks for the archives.

Java has namespaces. Ruby does NOT have namespaces. That first sentence in the module docs, also present in books, is a super naive entry point to modules. But it is an abuse of language that later on you should correct.

Ruby does not have syntax for types either.

Ruby has storage (variables, constants, etc.), and objects. That is all. Variables, constants, and module objects are totally decoupled except for the fact that you get a name in the first constant assignment. A name that does not reflect the nesting, that is not guaranteed to be unique, that does not mean the object is reachable via that constant path, and that some classes change by overriding the name method. It is just a string.

A library like Zeitwerk or Active Support can take some licenses "you know what I mean" because they are libraries and they work on the assumption of projects structured in a certain way. But a programming language has to be consistent with itself. Module#constants is consistent, in my view Module#namespace is not (with the current model).

So, if Ruby core wants to go in this direction and help normalize the mental model a bit, I am on board. But we have to be conscious that this is introducing something that is going to leak one way or another.

Updated by fxn (Xavier Noria) over 1 year ago

Let me add some edge cases that are possible, also for the archives:

module M
  module N
  end
end

M::N.namespace # => A::B::C, constant M stores the same object as A::B::C
M.namespace # => M, module is namespace of itself
M::N.namespace # => M
M.namespace    # => M::N, cycles of arbitrary depth
X = M::N
# ...
X.namespace # => The module that was once in M has been garbage collected (assuming a weak ref for backwards compat)

I am sure I can come up with more if I think more about it.

The Ruby model of this is extremely flexible and decoupled, and that is the public interface. Constant assignment, constants API, instantiation of anonymous modules, etc.

Updated by fxn (Xavier Noria) over 1 year ago

Also, in case my comments above are too generic, let's take the use case in the description of the ticket:

I can do A::B::C.outer_scope.constants to find the list of "sibling" constants to C.

Let's consider

module A
  module B
    class C; end
    class D; end
  end
end

module X
  Y = A::B::C
  Z = 1
end

In what sense is A::B::D a sibling of the class object stored in A::B::C and X::Z is not?

Take now

module X; end
module Y; end
module Z; end

c = Class.new
X::C = c
Y::C = c
Z::C = c

For Ruby, that's all objects and storage; where c is stored has no relevance. It is no different from

module X; end
module Y; end
module Z; end

X::C = 1
Y::C = 1
Z::C = 1

Yes, c.name is "X::C", but as I said above, that is just a string.

If our input is a class object, as in the ObjectSpace example, you have no information that allows you to jump from it to the possibly multiple places in which the object may be stored. The original constant may also be gone, and those places can be elsewhere (as happens with stale class objects cached during Rails initialization after a reload).
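A short sketch of that point, using only existing Ruby APIs. The local variable klass is illustrative. The name recorded at the first assignment survives even after the constant that produced it is removed.

```ruby
module X; end
module Y; end

klass = Class.new
X::C = klass                    # first assignment: the name becomes "X::C"
Y::C = klass                    # second storage location, name unchanged

p klass.name                    # "X::C"

X.send(:remove_const, :C)       # the original constant is gone
p klass.name                    # still "X::C"
p X.const_defined?(:C, false)   # false: the name no longer resolves
p Y::C.equal?(klass)            # true: the object is still stored under Y
```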

On the other hand, if you are in a very specific situation where you can assume that loop makes sense for all k, you can always name.sub(/::\w+$/, '') and const_get, modulo details. Or you can use ObjectSpace.each_object(Module) and inspect constants.

In a project, in a library, you may have constraints in place that you can exploit. In Ruby, the language, you don't.
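The name.sub and const_get approach mentioned above could look roughly like the following. This is a sketch only: parent_by_name is a hypothetical helper, not part of Ruby or the attached patch, and it assumes that name has not been overridden and that the constant path still resolves.

```ruby
# Hypothetical helper: derive the enclosing module from Module#name.
def parent_by_name(mod)
  name = mod.name
  return nil if name.nil?                 # anonymous module: no name to parse

  parent_path = name.sub(/::\w+\z/, '')   # strip the last "::Segment"
  return Object if parent_path == name    # top-level constant such as "A"

  Object.const_get(parent_path)
rescue NameError
  nil                                     # the path no longer resolves
end

module A
  module B
    class C; end
    class D; end
  end
end

p parent_by_name(A::B::C)             # A::B
p parent_by_name(A::B::C).constants   # the "siblings" of C, including C itself
p parent_by_name(A)                   # Object
p parent_by_name(Class.new)           # nil
```

The rescue clause covers the stale-name case: if the recorded path cannot be resolved any more, the helper returns nil instead of raising.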

Updated by tenderlovemaking (Aaron Patterson) over 1 year ago

Yes, c.name is "X::C", but as I said above, that is just a string.

It's also a way to inform the user where that constant lives. The contents of the string have meaning.

On the other hand, if you are in a very specific situation where you can assume that loop makes sense for all k, you can always name.sub(/::\w+$/, '') and const_get, modulo details.

This would work if I could trust the name method on a class (I can't, especially in a Rails project).

Of course there are some edge cases with redefinition, but since the "namespace" method would line up with what the "name" method is supposed to return, I think it would be easy to understand the behavior.

Updated by Eregon (Benoit Daloze) over 1 year ago

I think those edge cases are pretty rare.
Module#namespace would refer to the lexical parent when the module is created (with module Name) or when first assigned to a constant (Name = Module.new).

The first example of https://bugs.ruby-lang.org/issues/17753#note-8 would already need extremely contrived code like:

module A
  module B
    module C
    end
  end
end

module M
  N = A::B::C
  module N
  end
  p N.namespace
end

and even then the value could still be useful.

In the end, the exact same caveats exist for Module#name and yet it's fine in practice.

A module is a namespace of constants.

Updated by fxn (Xavier Noria) over 1 year ago

It's also a way to inform the user where that constant lives. The contents of the string have meaning.

The numerous people that have had to deal with stale objects in Rails for years know that is not entirely true. The class objects have a name, but that constant path no longer gives you the object at hand, but some other object that happens to have the same name.

Benoit, but a programming language is a formal system. It has to define things that are consistent with its model! It does not matter in my view if the examples are statistically rare. They are only ways to demonstrate the definition does not match the way Ruby works.

A module is a namespace of constants.

Yes, but it is dynamic because of const_set and remove_const, and your APIs and concepts need to reflect that.

If you wanted namespaces, you'd have a different programming language where everything is consistent with that concept. But Ruby is not that way.

Same way Ruby does not have types. Admin::User is not a type (we all in this thread know that), it is a constant path. That is all you got, constants and objects, and constants API.

Updated by fxn (Xavier Noria) over 1 year ago

BTW, we were discussing yesterday with Aaron that the flag I am raising is about the name namespace. What we are defining given a class object is:

  1. If the class object is anonymous, return nil.
  2. Otherwise, it was assigned to a constant at least once. Let namespace be the module object that stored that constant at assignment time if it is an alive object, nil if the original object is gone (possible depending on whether the reference is weak or not).

We do not have a good name for that.

Another thing Aaron is exploring is to define Module#namespaces, which would return all modules that store the class object in one of their constants. That is a bit closer to the Ruby model, I believe.
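One way to approximate the Module#namespaces idea with today's APIs is to scan every live module for constants that hold the target object. The sketch below is an assumption about how that could be done in plain Ruby, not the implementation being explored; storing_modules is a hypothetical name, and the scan is linear in the number of modules and constants.

```ruby
# Hypothetical helper: every module that stores `target` in one of its own constants.
def storing_modules(target)
  ObjectSpace.each_object(Module).select do |mod|
    mod.constants(false).any? do |const_name|
      next false if mod.autoload?(const_name, false)   # do not trigger autoloads

      begin
        target.equal?(mod.const_get(const_name, false))
      rescue NameError, LoadError
        false                                          # unresolvable constant: ignore
      end
    end
  end
end

module X; end
module Y; end
module Z; end

klass = Class.new
X::C = klass
Y::C = klass
Z::C = klass

p storing_modules(klass).map(&:name).sort   # ["X", "Y", "Z"]
```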

Updated by fxn (Xavier Noria) over 1 year ago

To me, the ability of a namespace being namespace of itself

m = Module.new
m::M = m

is one clear indicator that the name is not quite right. That is not the kind of property you expect a namespace to have. And it is not quite right because we are dealing with storage and objects. In the world of storage and objects that example squares perfectly, there is no surprise.
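The same storage-and-objects view also allows self-reference and cycles with named modules. A minimal sketch with made-up constant names (Itself, Up), using only existing constant assignment:

```ruby
module M; end
M::Itself = M                    # a module stored in one of its own constants

module P
  module Q; end
end
P::Q::Up = P                     # the inner module points back at the outer one

p M::Itself.equal?(M)            # true
p M::Itself::Itself.equal?(M)    # true: the path can be repeated indefinitely
p P::Q::Up::Q::Up.equal?(P)      # true: a cycle through two modules
p M.name                         # "M": the name is unaffected
```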

Updated by fxn (Xavier Noria) over 1 year ago

Oh, let me say something explicitly: You guys are Ruby committers, you are the ones that have the vision for what makes sense in the language.

I am raising a flag because this does not square to me, and makes me feel it is introducing a leaky abstraction not backed by the language model. It is an abstraction you could tolerate in Active Support with documented caveats, but not one that I personally see in Ruby itself.

However, if once the feedback has been listened to you believe this API squares with your vision of Ruby, by all means go ahead :).

Updated by Eregon (Benoit Daloze) over 1 year ago

I see, the name namespace is what we're disagreeing on.
Maybe you have another suggestion for the name of that method?

Maybe outer_module/outer_scope would work too, but I feel namespace is nicer.
All these 3 names imply some kind of lexical relationship. And even though that can be broken, in most cases it is respected and the intent of the user using this new method is clearly to go one step up in the lexical relationship.
So we should mention in the docs this might return unusual results for e.g. anonymous modules that are later assigned to a constant.

FWIW internally that's named "lexical parent module" in TruffleRuby, but that doesn't make a nice method name.
Indeed, it's not always "lexical" but in the vast majority of cases the intent is that and we would say A::B is namespaced or defined under (rb_define_module_under) module/class A.

The way I see it is modules are the closest thing to a namespace that Ruby has. And therefore Module#namespace feels natural to me.
From the other direction, I agree namespaces often have different/stricter semantics than Ruby modules in other languages.
Yet I think it's OK to have a slightly different meaning for namespace in Ruby, and that seems already established in docs.

Updated by fxn (Xavier Noria) over 1 year ago

The lexical parent module happens to be just the object from which you set the name, which does not even reflect the scope/nesting at assignment time (as you know):

module A::B
  module X::Y
    class C
      name # => "X::Y::C"
    end
  end
end

If modules are namespaces, why isn't A::B there?

Yeah, we see it differently. There, I only see a constant assignment. You see it like "most of the time, you can think of it that way because that's what most Ruby looks like". That difference in points of view is fine :).

Updated by fxn (Xavier Noria) over 1 year ago

BTW, you all know AS has this concept right? https://github.com/rails/rails/blob/f1e00f00295eb51a64a3008c7b1f4c4f46e902e3/activesupport/lib/active_support/core_ext/module/introspection.rb#L20-L37

We say "according to its name", have the X example to clearly see the assumptions, and case closed.

As I said before, AS can take licenses, it is not Ruby itself. And in the context of Rails (the most common case for AS), you can assume some structure.

Updated by mame (Yusuke Endoh) over 1 year ago

This ticket was discussed at the dev meeting. @matz (Yukihiro Matsumoto) said that (1) the use case is not clear to him, and that (2) he wants to keep the keyword namespace for another feature in the future. outer_scope is also weird because the return value is not a "scope".

Updated by fxn (Xavier Noria) over 1 year ago

In my view, the way to implement the use case that matches Ruby is to go downwards.

Module has many constants, that is the Ruby model, so instead of

ObjectSpace.each_object(Class) do |k|
  k.outer_scope.constants
end

you'd write

ObjectSpace.each_object(Module) do |mod|
  mod.constants.each do |constant|
    # Do something with constant.
  end
end

Alternatively, recurse starting at Object (would miss anonymous modules with constants).
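The recursive alternative could be sketched as follows. each_reachable_module is a hypothetical helper written for this illustration; it follows constants downwards from Object, visits each module object once (so cycles terminate), and, as noted, never sees modules that are only reachable through an anonymous module.

```ruby
# Hypothetical helper: yield every module reachable from `root` via constants.
def each_reachable_module(root = Object, seen = {}.compare_by_identity, &block)
  return if seen.key?(root)

  seen[root] = true
  yield root

  root.constants(false).each do |const_name|
    next if root.autoload?(const_name, false)   # do not trigger autoloads

    value = begin
      root.const_get(const_name, false)
    rescue NameError, LoadError
      next                                      # unresolvable constant: skip it
    end

    each_reachable_module(value, seen, &block) if Module === value
  end
end

module Outer
  module Inner; end
end

anonymous = Module.new
anonymous.const_set(:Hidden, Module.new)

found = []
each_reachable_module { |mod| found << mod }

p found.any? { |mod| mod.equal?(Outer::Inner) }                  # true
p found.any? { |mod| mod.equal?(anonymous.const_get(:Hidden)) }  # false
```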

Updated by ioquatix (Samuel Williams) 5 months ago

@tenderlovemaking (Aaron Patterson) what about some kind of "uplevel" concept for name:

class A::B::C::MyClass; end

A::B::C::MyClass.name(0) # -> "MyClass"
A::B::C::MyClass.name(1) # -> "C::MyClass"
A::B::C::MyClass.name(-1) # -> "A::B::C"
A::B::C::MyClass.name(-2) # -> "A::B"

etc

Updated by sawa (Tsuyoshi Sawada) 5 months ago

ioquatix (Samuel Williams) wrote in #note-21:

class A::B::C::MyClass; end

A::B::C::MyClass.name(0) # -> "MyClass"
A::B::C::MyClass.name(1) # -> "C::MyClass"
A::B::C::MyClass.name(-1) # -> "A::B::C"
A::B::C::MyClass.name(-2) # -> "A::B"

What is the rule behind what the argument represents? To me, your four examples except for the first one seem to suggest:

  1. The nesting levels (achieved by separating the full name by ::) can be referred to by an index as if they were placed in an array.
  2. a. If the argument is negative, then remove the nesting levels from the one indexed by the argument up to the last one.
    b. If the argument is non-negative, then remove the nesting levels from the first one up to the one indexed by the argument.
  3. Join the remaining nesting levels with ::.

But, then I would expect:

A::B::C::MyClass.name(0) # -> "B::C::MyClass"

contrary to what you wrote.

What is your intended logic? Is it coherent?

Updated by ioquatix (Samuel Williams) 5 months ago

class Class
  def name(offset = nil)
    return super() unless offset

    parts = super().split('::')

    if offset >= 0
      parts = parts[(parts.size - 1 - offset)..-1]
    else
      parts = parts[0...(parts.size + offset)]
    end

    return parts.join('::')
  end
end

module A
  module B
    module C
      class MyClass
      end
    end
  end
end

pp A::B::C::MyClass.name(0) # -> "MyClass"
pp A::B::C::MyClass.name(1) # -> "C::MyClass"
pp A::B::C::MyClass.name(-1) # -> "A::B::C"
pp A::B::C::MyClass.name(-2) # -> "A::B"

Something like this.

Updated by sawa (Tsuyoshi Sawada) 5 months ago

@ioquatix (Samuel Williams)

The non-negative part of your code looks rather convoluted. To simplify your code (and define it on Module rather than on Class), it would be essentially this:

Module.prepend(Module.new do
  def name(offset = nil)
    return super() unless offset

    super().split('::').then do
      if offset >= 0
        _1.last(offset + 1)
      else
        _1[...offset]
      end
    end.join('::')
  end
end)

This indicates that you are essentially using the argument offset:

  • to specify the number of elements when offset is non-negative, and
  • to specify the ending position (index) of the elements otherwise

which is incoherent. At least to me, your proposal is difficult to understand because of this. I think it should be unified so that offset expresses either the number of elements in both cases or the position in both cases. Or perhaps you can limit offset to non-negative values.

Updated by ioquatix (Samuel Williams) 5 months ago

@sawa (Tsuyoshi Sawada) Thanks for your feedback and the improved code.

Based on my own needs and other code (see https://apidock.com/rails/ActiveSupport/Inflector/demodulize and https://apidock.com/rails/ActiveSupport/Inflector/deconstantize for example) I see two main use cases:

(1) Get some part of the namespace starting from the left. The most common use case is "The entire module namespace without the class name" but it will also be convenient to cut off more than just the class name in some cases. Since how deeply nested we are is usually not known, cutting from the right hand side makes sense.
(2) Get some part of the class name starting from the right. The most common case is "Just the class name without any module namespace" but it will also be convenient to include some of the nested modules expanding towards the right in some cases.

To me it's consistent within the requirements of solving those two problems and maps nicely to negative and non-negative integers respectively. I realise that between the negative and non-negative offset, there is no continuity but this is by design to satisfy user needs rather than theoretical purity. If you have a better idea, please share it!

In more detail, I don't think this offset should be impacted by changes to module nesting, i.e.

# This should be the same:
A::B::C::MyClass.name(1) # C::MyClass
Z::A::B::C::MyClass.name(1) # C::MyClass

# This should always be full module namespace:
A::B::C::MyClass.name(-1) # A::B::C
Z::A::B::C::MyClass.name(-1) # Z::A::B::C

In addition, the user won't know ahead of time the nesting level of the class, so this proposed interface needs to satisfy the most common use cases without any extra computation; otherwise, the user is forced to do string manipulation again. In theory, this proposed interface should also be efficient to implement.

Updated by austin (Austin Ziegler) 5 months ago

ioquatix (Samuel Williams) wrote in #note-25:

To me it's consistent within the requirements of solving those two problems and maps nicely to negative and non-negative integers respectively. I realise that between the negative and non-negative offset, there is no continuity but this is by design to satisfy user needs rather than theoretical purity. If you have a better idea, please share it!

Since name doesn’t currently accept any arguments, why not make it a keyword instead of a simple integer?

A::B::C::MyClass.name(tail: 1) # C::MyClass
A::B::C::MyClass.name(head: 1) # A::B::C

I don’t know what the name of the keywords should be.
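For discussion, the keyword variant could be prototyped as a standalone helper rather than by changing Module#name. Everything here is an assumption: name_slice is a hypothetical name, and tail: and head: follow the semantics of the two examples above (tail: n keeps the last n + 1 segments, head: n drops the last n segments).

```ruby
# Hypothetical helper mirroring the tail:/head: examples; not a Ruby API.
def name_slice(mod, tail: nil, head: nil)
  raise ArgumentError, "pass at most one of tail: or head:" if tail && head

  name = mod.name
  return name if name.nil? || (tail.nil? && head.nil?)

  parts = name.split('::')
  selected =
    if tail
      parts.last(tail + 1)                     # class name plus `tail` enclosing segments
    else
      parts.first([parts.size - head, 0].max)  # drop `head` segments from the right
    end
  selected.join('::')
end

module A
  module B
    module C
      class MyClass; end
    end
  end
end

p name_slice(A::B::C::MyClass, tail: 0)   # "MyClass"
p name_slice(A::B::C::MyClass, tail: 1)   # "C::MyClass"
p name_slice(A::B::C::MyClass, head: 1)   # "A::B::C"
p name_slice(A::B::C::MyClass, head: 2)   # "A::B"
```

Using separate keywords keeps the two use cases apart, so neither needs the sign of the argument to select a different rule.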
