Feature #6373

public #self

Added by Thomas Sawyer almost 2 years ago. Updated 3 months ago.

[ruby-core:44704]
Status: Feedback
Priority: Normal
Assignee: Yukihiro Matsumoto
Category: core
Target version: next minor

Description

=begin
This was recently suggested to me as an extension:

class Object
  # An identity method that provides access to an object's 'self'.
  #
  # Example:
  #   [1,2,3,4,5,1,2,2,3].group_by(&:identity)
  #   #=> {1=>[1, 1], 2=>[2, 2, 2], 3=>[3, 3], 4=>[4], 5=>[5]}
  #
  def identity
    self
  end
end

First, is such a method commonly useful enough to warrant existence?

Second, it makes me wonder if #self should be a public method in general.
=end
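
For comparison, the proposed method would stand in for an explicit pass-through block. A minimal sketch, assuming the `identity` name from the description (a hypothetical method defined here, not something Ruby provides under that name):

```ruby
# Hypothetical method from the proposal; defined here only for illustration.
class Object
  def identity
    self
  end
end

values = [1, 2, 3, 4, 5, 1, 2, 2, 3]

# Explicit block that returns each element unchanged.
with_block = values.group_by { |x| x }

# The same grouping via Symbol#to_proc and the proposed method.
with_method = values.group_by(&:identity)

puts with_block == with_method  # prints "true"
```

Both forms group equal elements together; the method form only saves writing the block.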

self.pdf (77.8 KB) Marc-Andre Lafortune, 08/31/2013 06:59 AM


Related issues

Related to ruby-trunk - Feature #6817: Partial application Open 07/31/2012

History

#1 Updated by Yusuke Endoh almost 2 years ago

  • Status changed from Open to Assigned
  • Assignee set to Yukihiro Matsumoto

#2 Updated by Martin Dürst almost 2 years ago

#identity (or whatever it's called) is quite important in functional languages. It's handy to pass to another function that, e.g., uses it as an argument to map. So I think it's a good idea to add it.

The name should be identity or id (Haskell) or some such.

Making self public sounds impressive, but first, it doesn't expose anything new (it's the object itself, which is already accessible) and second, I feel it's the wrong name, because #identity essentially only makes sense in locations in a program where the actual object isn't directly around.

#3 Updated by Yui NARUSE almost 2 years ago

Use id.

#4 Updated by Charles Nutter almost 2 years ago

At first I found this laughable, but then I realized there's no clear method you can call against an object that simply returns the object. It is a small thing, but turns out to be very useful.

For example, suppose you want a chain of method calls that successively transform some data. Those transformations, in the form of folds or filters or what have you, may accept an object and return some new object by calling a method. In that case, the simplest transformation is to simply return the object unmodified. There is no such method on Object or BasicObject right now, so rather than having the uniformity of a method call you would need a special filter form that calls nothing and returns its one argument. Identity is an easy, simple functional form that should be available.

I would suggest it should be called self or similar, since it is a key core method nobody should ever override. But I do think there is utility in having it always available.
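
The uniform-method-call argument can be sketched as follows. This is only an illustration under assumptions: `identity` is the hypothetical method under discussion, and `apply_steps` is a made-up helper, not an existing API.

```ruby
# Hypothetical identity method; the name is one of several proposed.
class Object
  def identity
    self
  end
end

# Made-up helper: apply each named method to the running value, in order.
def apply_steps(value, steps)
  steps.reduce(value) { |acc, step| acc.public_send(step) }
end

apply_steps("  Ruby ", [:strip, :downcase])  #=> "ruby"
apply_steps("  Ruby ", [:identity])          #=> "  Ruby "
```

With an identity method available, "do nothing" is just another step name, so the helper needs no special case for it.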

#5 Updated by Charles Nutter almost 2 years ago

Actually, I just realized there is already a method you can call to just return the object itself, albeit in a really gross way: #tap.

obj.tap{} #returns obj; block is required and invoked

#6 Updated by Ilya Vorontsov almost 2 years ago

enum.map(&:identity) can be replaced with enum.to_a
But I think Object#identity is useful. Object#tap requires a block, so it's not an option in many cases. I can't imagine how to replace enum.group_by(&:identity) with tap instead of identity.
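
A short sketch of why tap does not fit here: Symbol#to_proc calls the method without a block, and tap yields unconditionally.

```ruby
numbers = [1, 2, 2]

# An explicit empty block works, but it is noisy.
numbers.group_by { |n| n.tap {} }  #=> {1=>[1], 2=>[2, 2]}

# &:tap calls tap with no block, which raises LocalJumpError.
begin
  numbers.group_by(&:tap)
rescue LocalJumpError => e
  puts "tap needs a block: #{e.message}"
end
```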

I think that tap method can be improved to be used without block yielding self. For readability, however, it should still be aliased to identity.

self looks like a method every programmer'd avoid.

#7 Updated by Yukihiro Matsumoto almost 2 years ago

id returns object_id number, identity here is supposed to return itself.
I agree with introducing method to return self, but not fully satisfied with the name 'identity'.
Any opinion?

Matz.

#8 Updated by Thomas Sawyer almost 2 years ago

Public #self seems like the most obvious choice. Is there some reason not to use it?

#9 Updated by Alex Young almost 2 years ago

On 28/04/2012 16:10, matz (Yukihiro Matsumoto) wrote:

Issue #6373 has been updated by matz (Yukihiro Matsumoto).

id returns object_id number, identity here is supposed to return itself.
I agree with introducing method to return self, but not fully satisfied with the name 'identity'.
Any opinion?

"itself"?

--
Alex

Matz.


Feature #6373: public #self
https://bugs.ruby-lang.org/issues/6373#change-26296

Author: trans (Thomas Sawyer)
Status: Assigned
Priority: Normal
Assignee: matz (Yukihiro Matsumoto)
Category: core
Target version: 2.0.0


#10 Updated by Pablo Herrero almost 2 years ago

What if we borrow the #yourself message name from Smalltalk?

#11 Updated by Alexey Muranov almost 2 years ago

Thomas, i think an argument against public #self is that 'self' is a reserved word, which moreover is used more as an object name than as a method name. So 'self' would need to stop being a keyword and become a public 'predefined' method of BasicObject (it cannot be defined without the 'self' keyword, i guess).

I like #itself or #yourself, but i do not know which one is a proper way to talk to my objects.

#12 Updated by Thomas Sawyer almost 2 years ago

=begin
Like many of Ruby's keywords, it can still be used to define a public method:

class X
  def self; "x"; end
end

x = X.new
x.self #=> "x"

=end

#13 Updated by Robert A. Heiler almost 2 years ago

Perhaps Smalltalk has the best suggestion. :)

#14 Updated by Alexey Muranov almost 2 years ago

Another option: #the_self. The same number of symbols as in #yourself, but harder to type :(.

#15 Updated by Benoit Daloze almost 2 years ago

On 28 April 2012 17:54, Alex Young alex@blackkettle.org wrote:

"itself"?

I agree, #itself is the best to me.

#16 Updated by Marc-Andre Lafortune almost 2 years ago

I second the addition of Object#self.

For the objection that self is a keyword, so is class. And there wouldn't ever be a need to call self.self :-)

Do we need a slide-show for this?

#17 Updated by Thomas Sawyer almost 2 years ago

Does it really need a slide? Does someone at the developers' meeting want to bring it up? It's such a simple thing. It's probably a one-line addition to the code to make self available as a public method. The only question is the name, and it seems to me: why have a different public name than the private one? Go with #self. But if it's necessary for the public and private names to differ, everyone seems okay with #itself.

#18 Updated by Alexey Muranov almost 2 years ago

I've heard that the underscore _ is commonly used for ignored block variables. Maybe this "public self" can be considered as an "ignored method", and called Object#_?

#19 Updated by Thomas Sawyer almost 2 years ago

_ is used by irb. Also, I don't really see why. Code would look much more "perlish" using _.

Please, what's wrong with public #self?

#20 Updated by Alexey Muranov almost 2 years ago

trans (Thomas Sawyer) wrote:

Please, what's wrong with public #self?

Nothing, just was wondering how to use _ :). I didn't know IRB uses it.

#21 Updated by Michael Kohl over 1 year ago

FWIW, I'm the one who suggested this method as an addition to Facets, mainly for the reason headius mentions above, it's the simplest filter available. I'm still torn on the name, but for some reason #self didn't seem right. For my own extension library I finally went with #it.

#22 Updated by Michael Kohl over 1 year ago

i also found a previous issue where this behavior would come in handy: http://bugs.ruby-lang.org/issues/2172

#23 Updated by Alexey Muranov over 1 year ago

How about merging this with feature request #6721 for #yield_self?

Object#self can optionally accept a block, yield self to the block if block given, and return the result of the block. What do you think?

#24 Updated by Thomas Sawyer over 1 year ago

Strikes me as a very good idea! I forgot about that. In Facets it is called #ergo. Essentially,

def ergo
  return yield(self) if block_given?
  self
end

Call it #self instead and we get two features for the price of none!

Not sure whether it should take additional *args or not, but it could.
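
A usage sketch of the combined behaviour, assuming the `ergo` definition quoted in this comment is installed on Object (done here only for illustration):

```ruby
class Object
  # Sketch of the method quoted in the comment; the name follows Facets.
  def ergo
    return yield(self) if block_given?
    self
  end
end

"ruby".ergo                   #=> "ruby"  (no block: returns the receiver)
"ruby".ergo { |s| s.upcase }  #=> "RUBY"  (block: returns the block's value)
[1, 2, 2].group_by(&:ergo)    #=> {1=>[1], 2=>[2, 2]}
```

The last line works because Symbol#to_proc calls the method without a block, so the identity branch is taken.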

#25 Updated by Boris Stitnicky over 1 year ago

matz (Yukihiro Matsumoto) wrote:

id returns object_id number, identity here is supposed to return itself.
I agree with introducing method to return self, but not fully satisfied with the name 'identity'.
Any opinion?

Matz.

I did some thinking, and there is hardly anything better than 'identity'.
'identity_function' would be hypercorrect, and 'id' is commonly understood
as identifier. In my personal library, I use 'ɪ' (small cap Unicode I) for
identity mapping, though...

#26 Updated by Alexey Muranov over 1 year ago

boris_stitnicky (Boris Stitnicky) wrote:

matz (Yukihiro Matsumoto) wrote:

id returns object_id number, identity here is supposed to return itself.
I agree with introducing method to return self, but not fully satisfied with the name 'identity'.
Any opinion?

Matz.

I did some thinking, and there is hardly anything better than 'identity'.
'identity_function' would be hypercorrect, and 'id' is commonly understood
as identifier. In my personal library, I use 'ɪ' (small cap Unicode I) for
identity mapping, though...

ID as an identifier or a piece of identification and Id as the identity function are two different meanings, as far as i understand. Also it does not look to me like "method" and "function" have exactly the same semantics.

#27 Updated by Anonymous over 1 year ago

On Fri, Aug 10, 2012 at 4:25 PM, matz (Yukihiro Matsumoto)
m...@ruby-lang.org wrote:

id returns object_id number, identity here is supposed to return itself.
I agree with introducing method to return self, but not fully satisfied with the name 'identity'.
Any opinion?

How about allowing Object#tap to take no block, simply returning self?
This syntactic sugar would allow

 this.tap

instead of

 this.tap {}

If a new method must be added, please for the love of sanity don't
call it id or identity or identifier. This space is already chronically
overloaded by frameworks, the need for entities to have publicly
visible identity notwithstanding.

Perhaps __object__, which would feel natural to anyone already
familiar with __id__? Failing that, Object#object, although I'd expect
this to break a fair amount of existing code. :-(

Ciao,
Sheldon.

#28 Updated by Thomas Sawyer over 1 year ago

Why no answer for: "Why not just public #self"? Why add YAMS?

(YAM = Yet Another Method)

#29 Updated by Yukihiro Matsumoto over 1 year ago

  • Status changed from Assigned to Feedback
  • Target version changed from 2.0.0 to next minor

The point is when we see the code like:

[1,2,3,4,5,1,2,2,3].group_by(&:self)

sometimes it would be less intuitive that self refers to elements in the array, not self in the scope.
I think this causes YAM syndrome here. We haven't reached name consensus yet (as usual), so I postpone to next minor.

Matz.
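
The ambiguity can be sketched like this, assuming a public `self` method were defined on Object (hypothetical, done here only for illustration; `Report` is a made-up class):

```ruby
class Object
  # Hypothetical public method named after the keyword.
  def self; self; end
end

class Report
  def grouped(values)
    # The keyword `self` here is the Report instance, but the symbol
    # :self is sent to each element of `values`, not to the Report.
    values.group_by(&:self)
  end
end

Report.new.grouped([1, 2, 2, 3])  #=> {1=>[1], 2=>[2, 2], 3=>[3]}
```

A reader skimming `grouped` might take `self` to mean the Report, which is the readability concern raised in the comment.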

#30 Updated by Alexey Chernenkov 11 months ago

Quote: "I think that tap method can be improved to be used without block yielding self."

+1

It is a VERY useful feature! Can't understand why #tap still needs to be used with block only.

#31 Updated by Matthew Kerwin 11 months ago

laise (Alexey Chernenkov) wrote:

Quote: "I think that tap method can be improved to be used without block yielding self."

+1

It is a VERY useful feature! Can't understand why #tap still needs to be used with block only.

Because it's called "tap." Tap doesn't "return self", it taps into an execution flow, extracting an intermediate value for inspection without interrupting the original flow. The analogy is literally tapping a hole in a pipe, to extract liquid samples at various phases in a process. (Also think of tapping a phone line.) Changing its semantics to a straight-up "returns self" method would just make it idiosyncratic, instead of metaphoric.
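
The "tapping into a flow" usage looks roughly like this; `sorted_and_doubled` is a made-up helper for the sketch:

```ruby
# Made-up helper: sort, record the intermediate value, then double.
def sorted_and_doubled(values, seen)
  values
    .sort
    .tap { |sorted| seen << sorted.dup }  # sample the intermediate value
    .map { |n| n * 2 }
end

seen = []
sorted_and_doubled([3, 1, 2], seen)  #=> [2, 4, 6]
seen                                 #=> [[1, 2, 3]]
```

The chain's result is the same as it would be without the tap call; the block only observes the value passing through.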

Matz wrote:

The point is when we see the code like:

[1,2,3,4,5,1,2,2,3].group_by(&:self)

sometimes it would be less intuitive that self refers to elements in the array, not self in the scope.

I have trouble imagining such a scenario. I actually think the some_array.group_by(&:self) example is a strong plus for this feature.

I'm also not entirely convinced it's really YAM, since 'self' is already a word in the language, all we're doing is pushing it from keyword to keyword+method, i.e. making it more easily accessible from outside the instance. Even more so since this use-case is really the only place it will ever show up; if there was another way to to_proc the 'self' keyword to make it easy to pass to a method like group_by I'd be for that as well/instead.

#32 Updated by Boris Stitnicky 10 months ago

trans (Thomas Sawyer) wrote:

Why no answer for: "Why not just public #self"? Why add YAMS?

(YAM = Yet Another Method)

Because explicit identity element is a VIP of the functional space.
Summary of the discussion thus far + my opinions follow:

Proposal:      My opinion:   Remark:

#self          +1            proposed and defended by OP, objected by Matz
#identity      +1            strong objection by Anonymous
#id             0
#self          -1
#yourself      -1
#itself        +3
#_             -1
#the_self      -1
#it            -1
#__object__    -1
#id            -1            not proposed, Anonymous strongly objects this option
#tap           -2            me and phluid61 strongly object

I used to favor #identity, but today, I favor #itself. I already advertised this
method, and #ergo method (#6721) on http://stackoverflow.com/questions/16932711 ,
where these features are in demand. To make the naming decision more difficult,
let me append Freudian #ego to the list, which would go nicely with #ergo.

#33 Updated by Charles Nutter 10 months ago

phluid61 (Matthew Kerwin) wrote:

laise (Alexey Chernenkov) wrote:

It is a VERY useful feature! Can't understand why #tap still needs to be used with block only.

Because it's called "tap." Tap doesn't "return self", it taps into an execution flow, extracting an intermediate value for inspection without interrupting the original flow. The analogy is literally tapping a hole in a pipe, to extract liquid samples at various phases in a process. (Also think of tapping a phone line.) Changing its semantics to a straight-up "returns self" method would just make it idiosyncratic, instead of metaphoric.

I still like #tap.

  1. It would only require removal of the block-yielding requirement.
  2. It's functionally equivalent to tapping into execution flow but doing nothing, as in tap {}. I think the objection that it no longer means we're tapping execution flow is a bit pedantic; we are tapping execution flow, but the only operation we seek is the reference to the object.

My favorite remains #self, but I can appreciate objections, especially confusion over "self" and "obj.self" and especially "self.self" if #self can be overridden. It does have the advantage that there's probably fewer libraries that define their own "self" method than any other suggestion here.

Other systems I know would use #identity. It also maps to functional programming language for a function that just returns its sole argument (in this case, the 0th argument, the object itself). I would support #identity as well.

Matz wrote:

The point is when we see the code like:

[1,2,3,4,5,1,2,2,3].group_by(&:self)

sometimes it would be less intuitive that self refers to elements in the array, not self in the scope.

I have trouble imagining such a scenario. I actually think the some_array.group_by(&:self) example is a strong plus for this feature.

I'm also not entirely convinced it's really YAM, since 'self' is already a word in the language, all we're doing is pushing it from keyword to keyword+method, i.e. making it more easily accessible from outside the instance. Even more so since this use-case is really the only place it will ever show up; if there was another way to to_proc the 'self' keyword to make it easy to pass to a method like group_by I'd be for that as well/instead.

Another option based on matz's objection: #reference. We want a method that returns the reference to the object we're calling against. #reference seems logical.

[1,2,3,4,5,1,2,2,3].group_by(&:reference)

Variations on this might be #self_reference, #selfref, #self_ref. Also #self_object, #selfobj, #self_obj.

#self and #identity are probably the most likely to be guessed by a new user.

#34 Updated by Charles Nutter 10 months ago

Another argument why "tap" is fine...

If tap were defined in a functional style, it would be simply

def tap(obj, &block)
  block.call(obj)
  obj
end

Anywhere you can pass a function you should be able to pass a no-op function, so tap could be defined as

def tap(obj, &block)
  block = proc{} unless block
  block.call(obj)
  obj
end

So defining tap such that it defaults to a no-op function (i.e. does not yield if block not given) seems perfectly valid to me.
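
A runnable version of that idea, as a sketch only. The function is renamed `fn_tap` (a hypothetical name) because defining a top-level `tap(obj, &block)` would shadow the built-in Object#tap for every object:

```ruby
# Functional-style tap with a default no-op block.
# Named fn_tap (hypothetical) so it does not shadow Object#tap.
def fn_tap(obj, &block)
  block ||= proc {}   # fall back to a function that does nothing
  block.call(obj)
  obj
end

calls = []
fn_tap("ruby") { |s| calls << s.upcase }  #=> "ruby"; calls is now ["RUBY"]
fn_tap("ruby")                            #=> "ruby"
```

In both calls the argument comes back unchanged; the block, when present, is run only for its side effect.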

#35 Updated by Matthew Kerwin 10 months ago

headius (Charles Nutter) wrote:

Another option based on matz's objection: #reference. We want a method that returns the reference to the object we're calling against. #reference seems logical.

[1,2,3,4,5,1,2,2,3].group_by(&:reference)

+1. It's sensible (i.e. anyone who knows OOP knows what 'reference' means), there's no overloading of names, and the intention is clear.

So defining tap such that it defaults to a no-op function (i.e. does not yield if block not given) seems perfectly valid to me.

I know I'm throwing a lot of paint at this shed, but while I agree that a default noop #tap is valid, I still strongly believe it makes ary.group_by(&:tap) seem like voodoo. I like #reference a lot.

#36 Updated by Charlie Somerville 10 months ago

I think out of all the options proposed, 'identity' is the most readable/quickly understandable.

For example, I think the use of 'identity' reads very nicely in [1,2,3,4].group_by(&:identity)

#37 Updated by Matthew Kerwin 10 months ago

charliesome (Charlie Somerville) wrote:

I think out of all the options proposed, 'identity' is the most readable/quickly understandable.

For example, I think the use of 'identity' reads very nicely in [1,2,3,4].group_by(&:identity)

Except that #identity seems to imply the same thing as #id , and "a".id is not necessarily == "a".id , as Matz said earlier.

The advantage of using #reference is that there's no existing method or concept we're overloading; there are no such things as "reference" objects (or "pointers") in Ruby -- it is understood that all references are automagically dereferenced when operated on -- so returning the references and then comparing them should be understood by most rubyists as effectively the same as comparing the objects directly (whatever "directly" means).

tl;dr:
Defining #identity that conflicts with #id is confusing, however slightly it might be.
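
To illustrate the distinction, assuming #id in this comment refers to the object ID (available today as object_id): two equal strings are distinct objects, so grouping by ID and grouping by the object itself give different results.

```ruby
a = String.new("a")
b = String.new("a")

a == b                      #=> true  (same content)
a.object_id == b.object_id  #=> false (two distinct objects)

[a, b].group_by(&:object_id).size  #=> 2
[a, b].group_by { |s| s }.size     #=> 1
```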

#38 Updated by Charles Nutter 10 months ago

phluid61 (Matthew Kerwin) wrote:

charliesome (Charlie Somerville) wrote:

I think out of all the options proposed, 'identity' is the most readable/quickly understandable.

For example, I think the use of 'identity' reads very nicely in [1,2,3,4].group_by(&:identity)

Except that #identity seems to imply the same thing as #id , and "a".id is not necessarily == "a".id , as Matz said earlier.

I have this concern as well. Having "a".id/object_id and "a".identity return drastically different things feels like it will just be confusing.

#39 Updated by Joel VanderWerf 10 months ago

=begin
Another argument against #identity: it is used by several libraries for something completely different. For example, in narray:

NMatrix.float(2,2).identity
=> NMatrix.float(2,2):
[ [ 1.0, 0.0 ],
  [ 0.0, 1.0 ] ]

It's also used in celluloid-zmq:

s1 = PubSocket.new
s1.identity = "publisher-A"

I vote for "itself" or "self", which are unlikely to be defined anywhere with some fundamentally different meaning.

=end

#40 Updated by Marc-Andre Lafortune 8 months ago

Slide attached.

I hope to win the prize for simplest slide too.

#41 Updated by Charlie Somerville 8 months ago

marcandre: I think you made a mistake in your slide. It says "Returns the class of obj", but it should say "Returns obj"

#42 Updated by Yukihiro Matsumoto 8 months ago

I can accept #itself. I want to see that it doesn't conflict with existing methods.

Matz.

#43 Updated by Andrew Vit 3 months ago

Rails ActiveSupport includes a similar method called presence. There is also a request to add block support to it, for a similar purpose: https://github.com/rails/rails/pull/13416#issuecomment-32636227

#44 Updated by Marc-Andre Lafortune 3 months ago

Andrew Vit wrote:

Rails ActiveSupport includes a similar method called presence.

Mmm, no, that's quite different. "".presence # => nil for example.

#45 Updated by Fuad Saud 3 months ago

Wouldn’t such method accepting a block remove the need to have Object#tap at all? As I understand this method is just a tap that doesn’t need a block. 

#46 Updated by Tsuyoshi Sawada 3 months ago

I would like to propose receiver as the method name.

#47 Updated by Matthew Kerwin 3 months ago

On Jan 21, 2014 12:29 AM, "Fuad Saud" fuadksd@gmail.com wrote:

Wouldn’t such method accepting a block remove the need to have Object#tap
at all? As I understand this method is just a tap that doesn’t need a
block.

That depends on the contract. I was under the impression that #itself (or
whatever name) in block form would return the value of the block. e.g:

def tap
  yield self if block_given?
  self
end

def itself
  if block_given?
    yield self
  else
    self
  end
end
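
The difference between the two contracts shows up in the return value when a block is given. A sketch using hypothetical names (`tap_like`, `itself_like`) in a module, so that no existing method is redefined:

```ruby
# Hypothetical names; the bodies mirror the two definitions in the comment.
module BlockContracts
  def tap_like
    yield self if block_given?
    self                        # always the receiver
  end

  def itself_like
    block_given? ? yield(self) : self  # the block's value, if any
  end
end

word = "ruby".dup.extend(BlockContracts)

word.tap_like { |s| s.upcase }     #=> "ruby"
word.itself_like { |s| s.upcase }  #=> "RUBY"
word.itself_like.equal?(word)      #=> true
```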

#48 Updated by Fuad Saud 3 months ago

That is interesting behaviour for chaining; not sure if consistent though.



#49 Updated by Andrew Vit 3 months ago

Mmm, no, that's quite different. "".presence # => nil for example.

Yes, I was aware of the differences from ActiveSupport presence, I just wanted to point out the similar need for chaining "itself" with a block.

The proposed "itself" method also looks similar to "Enumerable#map" but for a single object, effectively:

["ruby"].map {|n| n.upcase }.first
# same as:
"ruby".itself {|n| n.upcase }

But I suppose "map" as a method name would be out of the question.

Has anyone considered "yield" as the method name? It also seems to fit well, and would likely not conflict with anything:

class Object
  def yield
    if block_given?
      yield self
    else
      self
    end
  end
end

# yield as a noun means "the result": an object's result (its yield) is itself
"ruby".yield  #=> "ruby"

# yield as a verb means "give way to" or "produce": the object gives way to the block
"ruby".yield {|s| s.upcase } #=> "RUBY"

Although there might be confusion between yield as a keyword and self.yield as a method, I do like this symmetry:

yield self # calls block with self, returns result
self.yield(&block) # calls block with self, returns result
self.yield # implicit identity, just like: {|obj| obj }
