Feature #15897

`it` as a default block parameter

Added by mame (Yusuke Endoh) 15 days ago. Updated 1 day ago.

Status:
Open
Priority:
Normal
Target version:
-
[ruby-core:92981]

Description

How about considering "it" as a keyword for the block parameter only if it is in the form of a local variable reference and if there is no variable named "it"?

[1, 2, 3].map { it.to_s } #=> ["1", "2", "3"]

If you are familiar with Ruby's parser, this explanation is more useful: NODE_VCALL to "it" is considered as a keyword.

Examples:

public def it(x = "X")
  x
end

[1, 2, 3].map { it.to_s }    #=> ["1", "2", "3"]
[1, 2, 3].map { self.it }    #=> ["X", "X", "X"] # a method call because of a receiver
[1, 2, 3].map { it() }       #=> ["X", "X", "X"] # a method call because of parentheses
[1, 2, 3].map { it "Y" }     #=> ["Y", "Y", "Y"] # a method call because of an argument
[1, 2, 3].map { it="Y"; it } #=> ["Y", "Y", "Y"] # there is a variable named "it" in this scope

it = "Z"
[1, 2, 3].map { it.to_s }    #=> ["Z", "Z", "Z"] # there is a variable named "it" in this scope
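The three method-call rows above do not depend on the proposal, so they can be checked in a current Ruby. A minimal sketch, assuming a hypothetical class `Demo` that defines `it` the same way as the example (`Demo`, `call_forms` and `forms` are illustrative names, not part of the patch):

```ruby
# Hypothetical class for illustration; only the method-call forms are exercised.
class Demo
  def it(x = "X")
    x
  end

  # Every block below calls the method `it`, never a block parameter.
  def call_forms
    {
      receiver: [1, 2, 3].map { self.it },  # explicit receiver
      parens:   [1, 2, 3].map { it() },     # empty parentheses
      argument: [1, 2, 3].map { it "Y" },   # explicit argument
    }
  end
end

forms = Demo.new.call_forms
```

The bare `it` case is deliberately left out, because its meaning is exactly what this ticket proposes to change.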

Pros:

  • it is the best word for the feature (according to matsuda (Akira Matsuda))
  • it is reasonably compatible; RSpec won't break because their "it" requires an argument

Cons:

  • it actually brings incompatibility in some cases
  • it is somewhat fragile; "it" may refer to a wrong variable
  • it makes the language semantics dirty

Fortunately, it is easy to fix the incompatible programs: just replace it with it(). (Off topic: it is similar to super().)
Just inserting an assignment to a variable "it" may affect other code. This is bad news, but, IMO, a variable named "it" is not so often used. If this proposal is accepted, I guess people will gradually avoid the variable name "it" (like "p").
The dirtiness is the most serious problem for me. Thus, I don't like my own proposal so much, honestly. But it would be much better than Perlish @1. (Note: I don't propose the removal of @1 in this ticket. It is another topic.) In any case, I'd like to hear your opinions.

An experimental patch is attached. The idea is inspired by jeremyevans0 (Jeremy Evans)'s proposal of @.

P.S. It would be easy to use _ instead of it. I'm unsure which is preferable.


Files

its.patch (4.92 KB) mame (Yusuke Endoh), 06/04/2019 05:07 AM

Related issues

Related to Ruby trunk - Misc #15723: Reconsider numbered parameters (Feedback)

History

Updated by mame (Yusuke Endoh) 15 days ago

  • Related to Misc #15723: Reconsider numbered parameters added

Updated by Hanmac (Hans Mackowiak) 15 days ago

_ can't be used as default block parameter because it already has a special meaning when using block variables like {|a,_,_,b| }
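The special treatment of `_` that this comment refers to can be checked in current Ruby. A small sketch (the variable names `picked` and `duplicate_error` are illustrative only):

```ruby
# A repeated `_` is accepted in a block's parameter list.
picked = [[1, 2, 3, 4]].map { |a, _, _, b| [a, b] }

# Any other repeated name is rejected by the parser.
# eval is used so that the rest of this file still parses.
duplicate_error =
  begin
    eval("proc { |a, x, x, b| }")
    nil
  rescue SyntaxError => e
    e.class
  end
```

Here `picked` keeps only the first and last elements, and `duplicate_error` records that the duplicated `x` is a syntax error.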

Updated by mame (Yusuke Endoh) 15 days ago

Hanmac (Hans Mackowiak),

You cannot use both it and the ordinary parameter |a,_,b| simultaneously. It will cause a SyntaxError like this.

$ ./miniruby -e '1.times {|a,_,b| it }'
-e:1: ordinary parameter is defined

Updated by Hanmac (Hans Mackowiak) 15 days ago

mame (Yusuke Endoh) you got me wrong, i mean you might not use _ for this like you said in your P.S.

Updated by shevegen (Robert A. Heiler) 15 days ago

I was about to write a very lengthy reply, but I think it would be too difficult to read for others, so this is a somewhat shorter variant. So just the main gist:

(1) I don't quite like the name "it", mostly due to semantics (the name does not tell me much at all), but one advantage is that "it" is short to type. I do not have a good alternative name, though. _ as a name would be even shorter, and avoids some of the semantic-meaning problem, but may have other small issues - see a bit later what I mean here.

(2) Even though I do not like the name "it", to me personally, it would be better to see that BOTH @1 @2 and "it" would be added, in the sense that then people could just pick what they prefer. Obviously I see no problem with @1 @2 at all, so I am biased. But that way people could just use whatever they prefer. (There could be another name, of course ... I thought about some way for accessing block/procs data ... such as BLOCK or ProcData or something like that. One problem is that this is all longer to type than e. g. "it" or @1 but perhaps we could use some generic way to access what a proc/block represents anyway, even aside from the proposal itself, similar to ... LINES or method or FILE or so. Just so that we may have a concept for it; although it will probably not be widely used. But I have not really thought about this much. Note that I was wondering about something like this for case/when structures as well, so that we could access them more easily, also from "outside" of methods, but I digress here).

IF only "it" alone were to be added, then I would rather prefer that neither @1 @2 nor "it" would be added, and we'd all only use the oldschool way (which is fine, too). But as said, I am biased so I think @1 @2 etc... are perfectly fine.

I think this is also a problem I have with the "this is perl" comments in general - to me it is not like perl anywhere. And the old variant such as:

foo.each {|a,b,c|

just continues to work fine; so to me the change is primarily about adding more flexibility. This was a problem I have had with the use cases that were mentioned - people would be very eager to point out what they dislike (understandably so), but at the same time would not want to mention use cases that may be useful. So I think in general, it would be better to be more objective in giving examples, even if a certain functionality is disliked. Use cases should ideally be extensive, not just focusing on what the ruby user at hand may dislike the most (since they may tend to focus on that first, which is understandable, and then ignore anything else; I tend to do so myself sometimes).

To clarify this - my primary problem with "it" is the name itself, not the functionality that is associated with it.

Using _ avoids the name/semantic issue a bit, to some extent, but I think _ has some slight other issues.

For example, the _ variable is used quite a lot in ruby code out there, or at the least in some code bases, so I am not sure it would be a good name here. Note that I use _ as "throwaway" variable a lot in my own code. This should be kept in mind, at the least for when ruby users do something similar (I have no idea whether it is common or not, though). I think hanmac sort of referred to this too in a way.

Since I like _ a lot, I'd rather see "it" be added than _, because I will most likely not use "it" in my own code ;D , whereas I would still use _ fine, possibly even within blocks, and possibly @1 @2 too, at the least for quick debugging, if it were to stay/remain, which I hope it will. But I am biased. ;)

I should also note that while I think @1 @2 are perfectly fine, I also don't have a big problem if it were not to be added permanently, even though I think it is fine if it would, evidently. The oldschool way is the best.

Since I see @1 @2 mostly as a convenience feature, though, I can continue to work with ruby just fine. I just don't think that all prior statements in particular in the other thread(s) made a whole lot of sense; and I have no problem if "it" would be added either as well, since I can avoid it, and just use @1 @2 for debugging. ;)

I think, realistically, I assume that most ruby users will continue to just use ruby code like it used to be, like:

foobar.some_method {|a, b, c, d, _, f|

I am quite certain that I will keep on using the above variant, and that neither "it" nor @1 @2 would persist in my own code - but for quick debugging, in particular for longer names, I think @1 @2 is really great. I don't think "it" would be equivalent to this, though; at the least to me, "it" is not the same as e. g. @1 @2 in several ways. But as said, I have no problem at all if both variants would be added.

Of course I am not naive - when features are added/offered, people will use it and play around with it; adults are kids after all, just play with different things. ;) I just don't think that the primary focus for dislike should be limited to just some use cases, without considering situations such as e. g. @1 @2 not be used in production code, but just for debugging purposes alone. I don't have a problem with the scenario where we can avoid naming parameters, but to me this is not the primary use case I would like to focus myself - for me the "pp @1; pp @3" variant really is the more important aspect of the suggestion. When you come from this point of view then I think it is easy to understand that "it", aside from the name, is not exactly the same.

Last but not least, as mame wrote - I think if you have not yet commented on either @1 @2 (in other issues) and/or "it" (here in this proposal), it may be good to comment on the suggestion/idea itself here. Matz actually asked for feedback before, not only in the other thread but also the old(er) ones predating these.

Updated by phluid61 (Matthew Kerwin) 15 days ago

Hanmac (Hans Mackowiak) wrote:

mame (Yusuke Endoh) you got me wrong, i mean you might not use _ for this like you said in your P.S.

Isn't {|_| _ } no more or less conflicting than {|it| it } ? You can either use positional args, or a default arg, but not both. So {|_| _ } means what it currently means, irrespective of this proposal.

Unless you mean there's a chance of an outer scope that already uses _ as an arg, creating a conflict in the inner scope?

Updated by phluid61 (Matthew Kerwin) 15 days ago

shevegen (Robert A. Heiler) wrote:

I was about to write a very lengthy reply, but I think it would be too difficult to read for others, so this is a somewhat shorter variant.

Good grief.

Updated by mikegee (Michael Gee) 14 days ago

RSpec won't break because their "it" requires an argument

Unfortunately this is not accurate. RSpec has a shorthand style like this:

subject { fortytwo }
it { is_expected.to eq 42 }
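In this shorthand `it` is called with a block and no positional argument. A minimal sketch of that call shape, using a hypothetical `MiniSpec` class as a stand-in (this is not RSpec's actual implementation):

```ruby
# Hypothetical stand-in for a spec DSL; not RSpec's real code.
class MiniSpec
  attr_reader :examples

  def initialize
    @examples = []
  end

  # Accepts an optional description plus a block, and records the block's result.
  def it(description = nil, &block)
    @examples << [description, block.call]
  end
end

spec = MiniSpec.new
spec.instance_eval do
  it("has a description") { 1 + 1 }
  it { 42 }  # no description: only a block is passed
end
```

Both lines inside `instance_eval` are ordinary method calls on `spec`, so each one appends an entry to `spec.examples`.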

Updated by jeremyevans0 (Jeremy Evans) 14 days ago

mikegee (Michael Gee) wrote:

RSpec won't break because their "it" requires an argument

Unfortunately this is not accurate. RSpec has a shorthand style like this:

subject { fortytwo }
it { is_expected.to eq 42 }

That's a block argument :). In any case, the parser treats it differently as NODE_ITER/NODE_FCALL, not as NODE_VCALL:

RubyVM::AbstractSyntaxTree.parse("it").children
# => [[], nil, #<RubyVM::AbstractSyntaxTree::Node:VCALL@1:0-1:2>]

RubyVM::AbstractSyntaxTree.parse("it{}").children
# => [[], nil, #<RubyVM::AbstractSyntaxTree::Node:ITER@1:0-1:4>]

RubyVM::AbstractSyntaxTree.parse("it{}").children.last.children
# => [#<RubyVM::AbstractSyntaxTree::Node:FCALL@1:0-1:2>, #<RubyVM::AbstractSyntaxTree::Node:SCOPE@1:2-1:4>]
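The same inspection can be wrapped in a small helper to compare several spellings side by side. This is a sketch for CRuby only, since `RubyVM::AbstractSyntaxTree` is not available on other implementations; `top_node_type` and `node_types` are hypothetical names introduced here:

```ruby
# CRuby only. Returns the node type of the single top-level expression in `source`.
def top_node_type(source)
  RubyVM::AbstractSyntaxTree.parse(source).children.last.type
end

# Node type for each way of writing `it` at the top level.
node_types = {
  "it"   => top_node_type("it"),    # bare identifier
  "it()" => top_node_type("it()"),  # empty parentheses
  "it 1" => top_node_type("it 1"),  # with an argument
  "it{}" => top_node_type("it{}"),  # with a block
}
```

Only the bare identifier should come back as a VCALL, which is the one node type the proposal reinterprets.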

Regarding the proposal itself, the dirtying of the semantics bothers me about this as well. However, I can see where people would find it cleaner than @ in terms of syntax, so this is really a tradeoff between the cleanliness of semantics and syntax. I don't have a strong opinion on it compared to @, but I think either is preferable to @1 or _.

Updated by Eregon (Benoit Daloze) 14 days ago

I like the proposal and I think it reads nicer than @.
It's a bit magical that giving it an argument changes the semantics, but that's somewhat similar to having a p local variable, and I think it's worth the better readability and syntax.
Not so many people seem confused by p so I guess it would not be too surprising and just intuitive in common cases.
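The comparison with p can be made concrete in current Ruby. A short sketch (the method name `p_shadowing_demo` is illustrative only):

```ruby
def p_shadowing_demo
  p = "local"
  from_variable = p   # a bare name resolves to the local variable
  from_method = p()   # parentheses force Kernel#p, which returns nil when given no arguments
  [from_variable, from_method]
end

result = p_shadowing_demo
```

The bare name and the parenthesized call resolve differently here, which is the same distinction the proposal draws between it and it().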

I also like _ because _ is "unnamed" (rather than abstract like it) and a "placeholder for the missing argument name" and this whole feature is about removing the need to name the block argument.
Showing them in code for an easy comparison:

[1, 2, 3].map { @ * 3 }
[1, 2, 3].map { @1 * 3 }
[1, 2, 3].map { _ * 3 }
[1, 2, 3].map { it * 3 }
[1, 2, 3].map { |n| n * 3 }

In the case of nested unnamed block arguments ([1].map { _ * 3.then { _ } }), both _ and it would refer to the outer block's argument, and consider the inner block(s) have no arguments.
That could be confusing, it might be worth warning or rejecting such cases, although they are probably rare.
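To make the nested case concrete, here is the same expression written with explicit parameters, first as this comment describes the proposed behaviour and then with an inner block that binds its own argument. It is only a sketch of the two readings, with hypothetical names `outer` and `inner`:

```ruby
# Reading described above: the inner block reuses the outer block's argument.
shared = [1].map { |outer| outer * 3.then { outer } }

# For contrast: the inner block binds its own argument (the receiver 3).
separate = [1].map { |outer| outer * 3.then { |inner| inner } }
```

The two readings give different results (`shared` multiplies 1 by 1, `separate` multiplies 1 by 3), which is why a warning for this case is suggested.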

But it would be much better than Perlish @1

Strongly agreed.

Updated by shugo (Shugo Maeda) 14 days ago

Thus, I don't like my own proposal so much, honestly. But it would be much better than Perlish @1.

I don't like both proposals, but I prefer @1 to it because @1 looks ugly and may help prevent overuse.
Furthermore, the proposed it may be more Perlish in the sense that it depends on the context.

Updated by Eregon (Benoit Daloze) 13 days ago

shugo (Shugo Maeda) wrote:

I don't like both proposals, but I prefer @1 to it because @1 looks ugly and may help prevent overuse.

I think we should never purposefully introduce something ugly in the language.
Preventing overuse is I think best done by limiting to a single argument (as argued in #15723).

Updated by shugo (Shugo Maeda) 5 days ago

Eregon (Benoit Daloze) wrote:

shugo (Shugo Maeda) wrote:

I don't like both proposals, but I prefer @1 to it because @1 looks ugly and may help prevent overuse.

I think we should never purposefully introduce something ugly in the language.

So let's reject both proposals.

Preventing overuse is I think best done by limiting to a single argument (as argued in #15723).

I guess it will be overused when a block takes only one argument.

Updated by Eregon (Benoit Daloze) 4 days ago

shugo (Shugo Maeda) wrote:

I think we should never purposefully introduce something ugly in the language.

So let's reject both proposals.

That's not what I meant. I'd rather not have something ugly in the language at all.
But I think we can make it not ugly, either with _ or it proposed here.

I think readability matters a lot to many people, we typically read code more often than we write.
_ or it seem much better for readability than @ or @1.

Preventing overuse is I think best done by limiting to a single argument (as argued in #15723).

I guess it will be overused when a block takes only one argument.

Maybe, but that harm would be IMHO very little, because it and _ read nicely and easily,
compared to spreading multiple numbered Perlish variables in Ruby code.

Updated by janosch-x (Janosch Müller) 4 days ago

Kotlin has implemented it like this (docs).

From purely personal experience, after doing just a little bit of Kotlin, I often feel a temptation to use it when writing Ruby, just to notice that I can't. I found it in Kotlin natural, easy to get used to, and easy to parse visually and understand when re-reading my code after a while.

Updated by shugo (Shugo Maeda) 1 day ago

Eregon (Benoit Daloze) wrote:

shugo (Shugo Maeda) wrote:

I think we should never purposefully introduce something ugly in the language.

So let's reject both proposals.

That's not what I meant. I'd rather not have something ugly in the language at all.
But I think we can make it not ugly, either with _ or it proposed here.

it doesn't look ugly at first glance, but it makes the language semantics dirty as mame admitted in his proposal.

I think readability matters a lot to many people, we typically read code more often than we write.
_ or it seem much better for readability than @ or @1.

If it is a normal reserved word, I agree with you.
However, the semantics of it depends on the context, and therefore @1 is more readable for me.

Updated by sawa (Tsuyoshi Sawada) 1 day ago

I propose to use a new keyword item.

  • I feel that using a keyword spelt in letters is the right way here since keywords like self are used in other cases where we reach for things out of the blue without receiving them through argument signature.
  • "Item" is close enough to "it", so we may achieve sympathy from some of the people opting for "it", but is not "it", so it does not have the problem that "it" has.
  • \item is used in LaTeX as a command to introduce bullet points in listed structures, which is analogous to elements in a block led by each, map and their kin.

    [1, 2, 3].map{item ** 2} # => [1, 4, 9]
    
  • At the same time, "item" does not exclusively mean "element". It is a neutral term regarding that. So it would not be unnatural to be used in blocks led by methods like then, tap and their kin.

    "foo".then{item + "bar" + item} # => "foobarfoo"
    
  • I have a concern that "it" somewhat implies the receiver since it means something whose referent has been fixed in the context. In fact, the method named "itself" returns the receiver.

Updated by janosch-x (Janosch Müller) 1 day ago

sawa (Tsuyoshi Sawada) wrote:

I propose to use a new keyword item.

I think that is a great proposal.

it is nice to read when passed to methods of other objects or when used with binary operators:

strings.each { puts it }
pathnames.map { File.read it }
numbers.map { it + 2 }

unfortunately, it is quite awkward to read when calling its own methods (which is probably the more common case in Ruby):

strings.each { it.chomp!('foo') }
pathnames.map { it.read }
numbers.map { it.next.next }

item works well for both cases:

strings.each { puts item }
pathnames.map { File.read item }
numbers.map { item + 2 }

strings.each { item.chomp!('foo') }
pathnames.map { item.read }
numbers.map { item.next.next }

Updated by mame (Yusuke Endoh) 1 day ago

Don't assume that this proposal can be applied to any word. A common name is much more dangerous than a pronoun like it because it is much more frequently used as a method name.

Actually, the count of "def item()" is 20 times more than "def it()" in gem-codesearch result.

$ csearch "^\s*def it\(?\)?$" | wc -l
12
$ csearch "^\s*def item\(?\)?$" | wc -l
225
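For readers unfamiliar with the search pattern, here is what it does and does not match, checked with Ruby's own regexp engine. The sample lines are hypothetical; the counts above came from the gem-codesearch index, not from this snippet:

```ruby
# Hypothetical sample lines for illustration only.
sample_lines = [
  "def it",
  "  def it()",
  "def it(description)",
  "def item",
  "def item()",
]

# Same patterns as in the csearch commands above.
it_pattern   = /^\s*def it\(?\)?$/
item_pattern = /^\s*def item\(?\)?$/

it_count   = sample_lines.count { |line| line.match?(it_pattern) }
item_count = sample_lines.count { |line| line.match?(item_pattern) }
```

Only definitions without parameters match, so a method such as `def it(description)` is not counted.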

Furthermore, I found some code that will be broken if "item" becomes a soft keyword.

https://github.com/ginty/cranky/blob/ca7176da2b8e69c37669afa03fee1a242338e690/lib/cranky/job.rb#L49
https://github.com/carlosipe/mercado-libre/blob/ebb912a7c4e942eb38e649d8e11a005c288ebc92/test/mercadolibre.rb#L44
