Feature #18980

Re-reconsider numbered parameters: `it` as a default block parameter

Added by k0kubun (Takashi Kokubun) over 1 year ago. Updated 3 months ago.

Status: Closed
Target version:
[ruby-core:109707]

Description

Problem

Numbered parameters (_1, _2, ...) look like unused local variables and I don't feel motivated to use them, even though I need this feature very often and always come up with _1.

[1, 2, 3].each { puts _1 }

I have barely used it in the last 2~3 years because it looks like a compromised syntax. I even hesitate to use it on IRB.

Why I don't use _1

I'm not clever enough to remember the order of parameters. Therefore, when a block has multiple parameters, I'd always want to name those parameters because which is _1 or _2 is not immediately obvious. Thus I would use this feature only when a block takes a single argument, which is actually pretty common.

If I use _1, it feels like there might be a second argument, and you might waste time thinking about _2 even if _2 doesn't exist, which is cognitive overhead. If you use `it`, it kinda implies there's only a single argument, so you don't need to spend time remembering whether _2 exists or not. It is important for me that there's no number in `it`.

Proposal

  • Ruby 3.3: Warn on `it` method calls without a receiver, arguments, or a block.
  • Ruby 3.4: Introduce `it` as follows.

Specification

[1, 2, 3].each { puts it }

  • `it`'s behavior should be as close to _1 as possible.
  • `it` should treat array arguments in the same way as _1.
  • `it` doesn't work in a block when an ordinary parameter is defined.
  • `it` is implemented as a special case of the getlocal insn, not a method.
  • `it` without an argument is considered _1, or a normal local variable if defined.
  • `it` is considered a method call only when it has any positional/keyword/block arguments.

Full specification
# Ruby 3.4
def foo
  it #=> method call
  _1 #=> method call
  1.times do
    p it #=> 0
    it = "foo"
    p it #=> "foo"

    p _1 #=> 0
    _1 = "foo" # Syntax Error
    p _1 #=> N/A

    p foo #=> method call
    foo = 1
    p foo #=> local var

    it "foo" do # method call (rspec)
    end
  end
  1.times do ||
    p _1 # Syntax Error
    it # method call
  end
  1.times do
    ["Foo"].any? {|| it } # method call
  end
  yield_1_and_2 do # yield 1, 2
    p _1 #=> 1
    p it #=> 1
  end
  yield_ary do # yield [1, 2]
    p _1 #=> [1, 2]
    p it #=> [1, 2]
  end
  1.times do
    p [_1, it] # Syntax Error
    p [_2, it] # Syntax Error
  end
end

Past discussions

Compatibility

`it` has not necessarily been rejected by Matz; he just said it's difficult to keep compatibility, and `it` or `this` could break existing code. It feels like everybody thinks `it` is the most beautiful option but is not sure whether it breaks compatibility. But, in reality, does it?

The following cases have been discussed:

  • `it` method, most famously in RSpec: You almost always pass a positional and/or block argument to RSpec's `it`, so the conflict is avoided with my proposal. You virtually never use a completely naked `it` (comment).
  • `it` local variable: With the specification in my proposal, existing code can continue to work if we consider `it` a local variable when defined.

With the specification in my proposal, existing code seems to break if and only if you call a method #it without an argument. But it seems pretty rare (reminder: a block given to an RSpec test case is also an argument). It almost feels like people are too afraid of compatibility problems that barely exist or have not really thought about options to address them.
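
The two compatibility cases above can be checked with a small sketch. The `it` method defined here is a hypothetical stand-in for a test framework's method (not RSpec's implementation), and `labels` is an illustrative helper. Neither usage is a bare `it` with no local variable in scope, so both should behave the same with or without the proposed change.

```ruby
# Hypothetical stand-in for a test framework's `it` (not RSpec itself).
def it(description, &block)
  "#{description}: #{block.call}"
end

# Passing a positional argument and a block makes this a method call.
example_result = it "adds numbers" do
  1 + 1
end

# Hypothetical helper: `it` is assigned first, so inside the block it is
# read as an ordinary local variable, not as a block parameter.
def labels(count)
  it = "item"
  count.times.map { it }
end

label_list = labels(2)
```

The only form left ambiguous is a bare `it` with no arguments, no block, and no local variable of that name, which is the case the proposal redefines.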

Also, you could always experiment with just showing warnings, which doesn't break any compatibility. Even if it takes 2~3 years of a warning period, I'd be happy to use that in 3 years.

Confusion

We should separately discuss incompatible cases and "works but confusing" cases. Potential confusion points:

  • RSpec's it "tests something" do ... end vs it inside the do ... end
  • it could be a local variable or _1, depending on the situation

My two cents: You'd rarely need to write `it` directly under RSpec's `it` block, and you would just name a block argument for that case. In a nested block under a test case, I don't think you'd feel `it` is RSpec's. When you use a local variable `it = 1`, you'd use the local variable in a very small scope or within a few lines, because otherwise it'd be very hard to figure out what the local variable holds anyway. So you'd likely see the assignment `it = 1` near the use of the local variable, and you could easily notice it is not _1. If not, such code would be confusing and fragile even without this feature. The same applies when `it` is a method/block argument.

I believe it wouldn't be as confusing as some people think, and you can always choose to not use it in places where it is confusing.


Related issues 3 (0 open, 3 closed)

Related to Ruby master - Feature #4475: default variable name for parameter - Closed - nobu (Nobuyoshi Nakada)
Related to Ruby master - Misc #15723: Reconsider numbered parameters - Feedback - matz (Yukihiro Matsumoto)
Is duplicate of Ruby master - Feature #15897: `it` as a default block parameter - Closed - matz (Yukihiro Matsumoto)
Actions #1

Updated by k0kubun (Takashi Kokubun) over 1 year ago

  • Is duplicate of Feature #15897: `it` as a default block parameter added
Actions #2

Updated by k0kubun (Takashi Kokubun) over 1 year ago

  • Related to Feature #4475: default variable name for parameter added
Actions #3

Updated by k0kubun (Takashi Kokubun) over 1 year ago

  • Related to Misc #15723: Reconsider numbered parameters added
Actions #4

Updated by k0kubun (Takashi Kokubun) over 1 year ago

  • Description updated (diff)
Actions #5

Updated by k0kubun (Takashi Kokubun) over 1 year ago

  • Description updated (diff)
Actions #6

Updated by k0kubun (Takashi Kokubun) over 1 year ago

  • Description updated (diff)

Updated by Eregon (Benoit Daloze) over 1 year ago

I regularly use _1, _2, maybe it's a question of habit and getting used to it?
_1 seems already used in many gems (gem-codesearch '\b_1\b' | grep '\.rb:\s').
It's also a syntax used in other languages, and _ is already established as a "special variable" in Ruby since ages in IRB.

Numbered parameters (_1, _2, ...) look like unused local variables

Unused local variables are more like _[a-z]\w+, the number makes the variable "anonymous" so it can't be an unused local variable (also if you remove the leading _ it's no longer a variable).

I dislike it, because it looks like a local variable (or method call) and doesn't make it clear it's special syntactic sugar with wider effect than a local variable (i.e., it adds a parameter to the surrounding block).
If it is used in a block it does not stand out, it just looks like any other variable/vcall. OTOH _1/_2 is fairly visible and makes it easy to see how many parameters the block has.

Also it only works for a single argument so it feels inconsistent and incomplete.
There is probably also little chance to syntax highlight it correctly in IDEs/editors as something special, because it wouldn't be a keyword, it's already used as a method (in RSpec/MSpec/etc), and it seems too complicated for most editors to be able to discern which meaning of it it is based on the context (unless a LSP is used maybe, but many editors don't).

Updated by k0kubun (Takashi Kokubun) over 1 year ago

I regularly use _1, _2, maybe it's a question of habit and getting used to it?

Matz also made that argument in the Ruby Committers vs the World 2019, saying many people started using -> {} after almost 100% of the people objected to the idea (ref). I wrote this proposal after listening to that, which is why I wrote "I have barely used it in the last 2~3 years" to make a point that I never get used to it.

It's also a syntax used in other languages

it is too.

_ is already established as a "special variable" in Ruby since ages in IRB.

What do you mean? I thought we were talking about _1, not _. If this feature were introduced as _, I would use it much more often than _1, but Matz finds _ confusing, and I like it more anyway.

Also it only works for a single argument so it feels inconsistent and incomplete.

I'd like to note that I'm not proposing to delete numbered parameters, and I thought it's nice to re-raise this topic because we no longer need to discuss "what about second arguments?" now that we already have _2. You already have a solution to your use case, and I still don't have one.

So, here's the use case I have but you don't seem to share with me. I'm not clever enough to remember the order of parameters. Therefore, when a block has multiple parameters, I'd always want to name those parameters because which is _1 or _2 is not immediately obvious. Thus I would use this feature only when a block takes a single argument, which is actually pretty common. If I use _1, it feels like there might be a second argument, and you might waste time to think about _2, even if _2 doesn't exist, which is a cognitive overhead. If you use it, it kinda implies there's only a single argument, so you don't need to spend time remembering whether _2 exists or not. It is important for me that it is "incomplete".

There is probably also little chance to syntax highlight it correctly in IDEs/editors as something special, because it wouldn't be a keyword, it's already used as a method (in RSpec/MSpec/etc), and it seems too complicated for most editors to be able to discern which meaning of it it is based on the context (unless a LSP is used maybe, but many editors don't).

Kotlin's it doesn't really stand out like a keyword on IntelliJ (it's bold white, not really different from non-bold white method calls or variables) compared to other yellow ones, but I never felt it's a problem because it is typically used when it's a one liner and what it means is obvious without fancy highlighting.

Updated by Eregon (Benoit Daloze) over 1 year ago

k0kubun (Takashi Kokubun) wrote in #note-8:

What do you mean? I thought we were talking about _1, not _. If this feature were introduced as _, I would use it much more often than _1, but Matz finds _ confusing, and I like it more anyway.

I just think _, _1 and _foo are related, they are all "special local variables".
The first one is "last result in IRB", the second is "numbered parameters" and the third is "unused variable".
So I find it consistent that numbered parameters use one of these "special local variables".

So, here's the use case I have but you don't seem to share with me. I'm not clever enough to remember the order of parameters. Therefore, when a block has multiple parameters, I'd always want to name those parameters because which is _1 or _2 is not immediately obvious. Thus I would use this feature only when a block takes a single argument, which is actually pretty common. If I use _1, it feels like there might be a second argument, and you might waste time to think about _2, even if _2 doesn't exist, which is a cognitive overhead. If you use it, it kinda implies there's only a single argument, so you don't need to spend time remembering whether _2 exists or not. It is important for me that it is "incomplete".

This is a good argument, I think it would be good to add to the issue description.

I agree if e.g. _1 and _2 are far apart or the block is long it hurts readability, and IMHO one should use named parameters instead for long blocks or if arguments are used far apart.
A typical case I use _1 and _2 would be hash.each_pair { p [_1, _2] } or so, where the block is very small and the overhead to see if there is a _2 seems very low.
I also think that in most cases it should be clear how many arguments the method given the block passes to it, and in the majority of cases it's a single one.

Kotlin's it doesn't really stand out like a keyword on IntelliJ (it's bold white, not really different from non-bold white method calls or variables) compared to other yellow ones, but I never felt it's a problem because it is typically used when it's a one liner and what it means is obvious without fancy highlighting.

My worry with that is then it's even harder to notice than _1/_2/etc and so it's unclear if the block takes 0 or 1 argument; one needs to look for an innocuous word (it) in the middle of the rest of the expression, which is typically a bunch of other words (method calls & variables).
I think it's a bit like string interpolation, that's visible thanks to the sigils and syntax highlighting. If it was a word instead of #{} it would be a total nightmare.

Updated by Eregon (Benoit Daloze) over 1 year ago

innocuous word

I misused innocuous here. I meant a "common word", and it's hard to notice because it's the same syntax as a local variable or method call.

Actions #11

Updated by k0kubun (Takashi Kokubun) over 1 year ago

  • Description updated (diff)

Updated by k0kubun (Takashi Kokubun) over 1 year ago

This is a good argument, I think it would be good to add to the issue description.

I'm glad it made sense to you. Sure, I updated the issue description to include that.

There is probably also little chance to syntax highlight it correctly in IDEs/editors as something special, because
it's already used as a method (in RSpec/MSpec/etc)

I'm not sure if it's too difficult to distinguish "a method (in RSpec/MSpec/etc)" taking arguments from it without an argument. Even if you don't have an AST, a peek of the next token (it "returns xxx" or it { is_expected xxx } vs it.xxx or xxx(it)) would give you a fairly reliable guess.
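
The next-token peek described above could look roughly like the following sketch. `guess_it_kind` and `MODIFIERS` are hypothetical names, not part of any editor or of Ruby, and the regular expressions are assumptions about what counts as an argument. A real highlighter would also need to handle strings, comments, and more keywords.

```ruby
# Keywords that may follow a value, so `it if ready` is not a call with arguments.
MODIFIERS = /(?:if|unless|while|until|and|or|rescue|then|end)\b/

# Hypothetical helper: classify the first `it` token in a snippet by peeking
# at what follows it. A rough guess only, not a parser.
def guess_it_kind(snippet)
  match = snippet.match(/\bit\b(?<rest>.*)/m)
  return :none unless match

  case match[:rest]
  when /\A\s*(?:\{|do\b)/              then :method_call      # it { ... }, it do ... end
  when /\A\(/                          then :method_call      # it("...")
  when /\A\s+(?!#{MODIFIERS})["':\w]/  then :method_call      # it "returns xxx"
  else                                      :block_parameter  # it.xxx, xxx(it), it + 1
  end
end

spec_style  = guess_it_kind('it "returns xxx" do')
block_style = guess_it_kind('users.map { it.name }')
```

Because it only looks at text, this guess would also color a local variable named `it` as if it were the block parameter.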

it wouldn't be a keyword

I don't think this is a blocker. For example, while require is not a keyword, my VSCode highlights it as if it's a keyword. It seems to just match the text in an identifier instead of checking whether it's an actual Ruby keyword or not. So highlighting it as "special local variables" seems feasible.

it seems too complicated for most editors to be able to discern which meaning of it it is based on the context (unless a LSP is used maybe, but many editors don't).

Note that this general problem is not specific to it. Bringing up the require example again, let's say your method has a keyword argument require: false, my VSCode highlights the argument as if it's a require method. But it doesn't make you think "let's not introduce require unless it becomes a keyword!", right? We use basic text matching and can live with it.

My worry with that is then it's even harder to notice than _1/_2/etc and so it's unclear if the block takes 0 or 1 argument

So my suggestion to this problem is to use the same color as "special local variables" when an identifier not taking arguments is named it. This could of course highlight a local variable it as if it's _1, but as I discussed in the "Confusion" section, a non-meaningful variable name it would be used when you can easily figure out the meaning of it, because otherwise you'd give a meaningful name to it to make it understandable. So you should typically be able to understand it's a local variable it even with the imperfect syntax highlight.

Updated by jeremyevans0 (Jeremy Evans) over 1 year ago

If we are considering an alternative to _1, I'm going to vote again for @ (bare at sign) (originally proposed in https://bugs.ruby-lang.org/issues/4475#note-10). This is currently invalid syntax, so there is no possibility of backwards compatibility issues when using it. I think it would also do a better job of standing out than it, which appears to be a normal local variable.

Updated by hanazuki (Kasumi Hanazuki) over 1 year ago

I also avoid using solo _1 as I have to search for _2 when I see _1 in a block. It's not because of a feeling caused by the number but because the behavior of _1 changes depending on the occurrence of _2.

Given that we have a block that takes a single argument, when only _1 is used in the block, _1 refers to the argument itself. Otherwise, _1 is bound to the first element of the argument.
That is:

  • proc { _1 }.call([1, 2]) #=> [1, 2]
  • proc { _2; _1 }.call([1, 2]) #=> 1

So when you encounter _1 in code, you always have to scan the entire block (though it should be short) to see if _2 occurs to determine how _1 works. (This problem has been discussed in #16178)

Syntax that requires this kind of arbitrary lookahead to understand is rare in Ruby, so I hesitate to use it in order to keep my code simple. But blocks like {|x| x.something(...) } and {|y| something(y) } are actually common, and I hope Ruby will have a variant of _1 that is always bound to the first argument (not its first element).
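
A minimal sketch of the behavior described above, runnable on Ruby 2.7 or later. The variable names are illustrative; the point is that merely mentioning _2 changes what _1 receives, while a named single parameter does not depend on the rest of the block.

```ruby
# Only _1 is used: the block takes one parameter, so _1 is the whole array.
whole_argument = proc { _1 }

# _2 also appears: the block now takes two parameters, so the array is
# destructured and _1 is its first element.
first_element = proc { [_1, _2].first }

# An explicitly named single parameter is always bound to the argument itself.
named = proc { |x| x }

whole = whole_argument.call([1, 2])   # => [1, 2]
first = first_element.call([1, 2])    # => 1
same  = named.call([1, 2])            # => [1, 2]
```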

Updated by graywolf (Gray Wolf) over 1 year ago

jeremyevans0 (Jeremy Evans) wrote in #note-13:

If we are considering an alternative to _1, I'm going to vote again for @ (bare at sign) (originally proposed in https://bugs.ruby-lang.org/issues/4475#note-10). This is currently invalid syntax, so there is no possibility of backwards compatibility issues when using it. I think it would also do a better job of standing out than it, which appears to be a normal local variable.

This sounds like a good idea. Fully backwards compatible, stands out and easy to highlight (since that seems to be relevant based on the debate here).

Updated by k0kubun (Takashi Kokubun) over 1 year ago

For your information, @ was a candidate discussed and rejected when @1 was changed to _1. I'm raising `it` partly because `it` hasn't been explicitly rejected. However, for the same reason discussed above, @ is still a better compromise for me than _1 if we choose not to introduce `it`.

https://docs.google.com/document/d/1XypDO1crRV9uNg1_ajxkljVdN8Vdyl5hnz462bDQw34/edit#heading=h.s5b1eox1ywa3

Updated by baweaver (Brandon Weaver) over 1 year ago

Aliasing

While I understand that _1 is not necessarily clear and immediately obvious it has seen a lot of use. Even if we were to introduce it or any other syntax it would effectively be an alias so as to not break compatibility.

If that were the case then the benefit gained from introducing it would mostly be in having another way to express the same idea, albeit clearer in meaning potentially.

@1 syntax

I do not see a distinct enough difference to justify @1, and would reiterate previous arguments that even if it is technically illegal syntax it will still be confused for instance variables, which presents more problems than _1 which is assumed to be local.

2+ Args

The problem I have with it is how we express more than one argument. Blocks accept the full range of -> (pos_req, pos_opt = 1, *rest, key_req:, key_opt: 1, **key_rest, &block) so dealing with all of those as a singular splatted it (guessing implied -> *it { it }) seems to be less flexible.

How would you propose we deal with such cases?

Updated by k0kubun (Takashi Kokubun) over 1 year ago

@1 syntax

In case it wasn't clear, the alternative that @jeremyevans0 (Jeremy Evans), @graywolf (Gray Wolf), and I talked about was not @1 but @. They're different.

How would you propose we deal with such cases?

I discussed that at https://bugs.ruby-lang.org/issues/18980#note-8.

2+ args cases are already solved thanks to _1 and _2. You already have a solution to that use case and I don't think we need every other feature to support that.

Instead, I want to have an expression to say "this block has only a single parameter so I'm gonna use it" because _1 implies _2 possibly exists, which is a cognitive overhead when reading the code.

Updated by jeremyevans0 (Jeremy Evans) over 1 year ago

baweaver (Brandon Weaver) wrote in #note-17:

I do not see a distinct enough difference to justify @1, and would reiterate previous arguments that even if it is technically illegal syntax it will still be confused for instance variables, which presents more problems than _1 which is assumed to be local.

To be clear, I am proposing just @ as an alias for _1 (as an alternative to it). I'm not proposing @1, @2, etc.. We already have _1, _2, etc and I believe this proposal is not to deprecate the current numbered parameters support, but merely offer a more readable alias for _1.

Updated by zverok (Victor Shepelev) over 1 year ago

My 5c: I came to (almost) peace with _1, and we use it extensively in the codebase, and find it quite convenient.

Choosing the designation for it is a hard choice, and after years of consideration, I believe that the current solution, even if it looks somewhat "weird", is a very well-balanced choice, and I still haven't seen any that is better.

From my PoV, the design space can be described this way:

  1. It should be something following the rules of scoping by its name (e.g. _1 is a valid name of local variable, implying locality). This rules out @1, which implies "some special instance variable"
  2. It should look special. Even if you don't know its meaning (just learning Ruby, or haven't upgraded your knowledge of the language for a long time), it should immediately imply "it is not just a regular name like all other names". This rules out it. Even besides "somebody could've used this name already" (and somebody could indeed, besides RSpec, I saw codebases which used this abbreviation to mean "iterator", "item", or "i(ndex of) t(ime point)"), it just doesn't give a strong feeling of "this is a local name, but also a special name".
  3. As far as I understand, we are currently quite reluctant towards introducing "Perlisms" like "this character means something new in that context." For operators (verbs) we seek to recombine existing ones (like in and => for pattern matching), and for variables/methods, we seek to stay with existing naming schemes.
  4. (Have mixed feelings about this one) It probably should allow a sequence of similar names (like _1/_2 do)

TBH, I can't think of much better naming scheme which would satisfy at least 1-3.

About 4: like @k0kubun (Takashi Kokubun), I find myself not very comfortable with _1 meaning different things depending on the presence of _2. As a consequence, I use _1 extensively in obvious cases (when it is the only one), but while iterating on hashes, I frequently prefer to name the block params explicitly.

That actually might be a greater obstacle for adoption (at least for some) than the particular naming, which is also highlighted by @k0kubun (Takashi Kokubun) in the "Why I don't use _1" section.

In our current codebase (quite large), that switched to Ruby 2.7 a year ago, there are ~200 entries of _1, and 6 (six) usages of _2 (and no usages of _3, _4 etc.).
And while _1 is used extensively and really helps to write shorter and DRYer blocks, our usages of _2 are mostly "clever-isms" like this:

users.sort { _2.last_seen_at <=> _1.last_seen_at }
# the "clever" way of doing just...
users.sort_by(&:last_seen_at).reverse
# ...because we still don't have `reverse_sort_by`

So, the thinking-out-of-the-box solution might be to preserve _1, but deprecate the rest :)
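
For readers who want to try the comparison above, here is a self-contained sketch. The `User` struct and its sample data are hypothetical; with distinct `last_seen_at` values both forms produce the same ordering.

```ruby
# Hypothetical record type and data, only for illustration.
User = Struct.new(:name, :last_seen_at)
users = [User.new("ann", 30), User.new("bob", 10), User.new("cy", 20)]

# Two numbered parameters: the reader has to work out that _2 <=> _1 means descending.
clever = users.sort { _2.last_seen_at <=> _1.last_seen_at }

# The same ordering using only a method reference.
plain = users.sort_by(&:last_seen_at).reverse

clever_names = clever.map(&:name)   # => ["ann", "cy", "bob"]
plain_names  = plain.map(&:name)    # => ["ann", "cy", "bob"]
```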

Updated by Eregon (Benoit Daloze) over 1 year ago

The logical sigil for a single arguments would be _ of course (given we have _1, _2, etc).
I forgot, is _ problematic in practice?

The usage of it in irb/pry should not be an issue given the variable is explicitly declared there I'd imagine.
So for cases where it's already a local variable then it just reads that local variable instead of being a numbered parameter.

Updated by zverok (Victor Shepelev) over 1 year ago

I forgot, is _ problematic in practice?

_ is a very widespread name for "I don't need that". TBH, till 5 min ago I thought it was processed specially in the case of parameter repetition:

[[1, 2, 3], [4, 5, 6]].map { |i, _, _| i } # => works
[[1, 2, 3], [4, 5, 6]].map { |i, x, x| i } #duplicated argument name
# ..., 3], [4, 5, 6]].map { |i, x, x| i }
# ...                              ^

But actually, to my surprise, this works too (anything starting with _):

[[1, 2, 3], [4, 5, 6]].map { |i, _x, _x| i } # => works

Anyway, _ has a strong association with "drop that".
One might argue that _1 does, too, but I think it looks differently (and, well, it is two years already, I know a lot of people who have their brain rewired to distinguish _1)

Updated by k0kubun (Takashi Kokubun) over 1 year ago

Unused local variables are more like _[a-z]\w+, the number makes the variable "anonymous" so it can't be an unused local variable

Anyway, _ has a strong association with "drop that". One might argue that _1 does, too, but I think it looks differently

It's interesting that "Numbered parameters (_1, _2, ...) look like unused local variables" doesn't seem to resonate with you. To me, _1 looks almost exactly like _l, which is, did you notice, an unused local variable.

Updated by zverok (Victor Shepelev) over 1 year ago

It's interesting that "Numbered parameters (_1, _2, ...) look like unused local variables" doesn't seem to resonate with you.

It did initially, but let's say I got over it. There were several factors in play:

  1. I am very concerned about readability and brevity, but I actually don't believe any "core" name choice makes something "totally unreadable". Syntax structures might, but names are just sigils you soon get used to. Say, I considered yield_self so wrong naming choice I spent the good part of that year fighting for its renaming, but at the same time I started to use it (and it DID make code better, as my colleagues agreed, you just needed to get used to the weird name). I frequently call for name compromises (let's stop on one name and move forward instead of five more years of discussion).
  2. At the moment of introduction of numbered args, I was more concerned that it is a principally wrong feature (it somewhat overshadowed the idea of shortening blocks with method references, with method references then being dropped altogether). But then I tried it reluctantly and turned out it made life much easier.
  3. In practice, it was really easy to get used to. _1 is easy to remember and recognize in others' code. And I never met code that uses _1 before (and rarely the code that uses _l, TBH), so it wasn't like I really needed to rewire my mind to stop recognizing it as "unused variable."
  4. I did the "design space analysis" from above at feature introduction, and I still believe that while _1 looks weird(ish) for those not used to it, it is a reasonable choice. At least in my book, it is arguably better than @1 (associates with non-local name), it (looks like a "regular" name, not a distinguished special thing), or _ (this one is really used as an "unused argument" regularly).

So... yeah.

Actions #25

Updated by k0kubun (Takashi Kokubun) over 1 year ago

Say, I considered yield_self so wrong naming choice I spent the good part of that year fighting for its renaming, but at the same time I started to use it (and it DID make code better, as my colleagues agreed, you just needed to get used to the weird name).

Haha, I rarely used yield_self because it looked like a compromised name, and then I liked and used it so much when `then` was added. It's a nice analogy.

(let's stop on one name and move forward instead of five more years of discussion)

As a Rubyist who was born in the same year as Ruby and could use Ruby for tens of more years, possibly longer than Matz, experimenting with _1 for 3 years and discussing it again to fix compromises feels like a good use of our time TBH.

I frequently call for name compromises
I still believe that while _1 looks weird(ish) for those not used to it

The general idea of numbered parameters supporters in this ticket seems to be "you'll get used to it". Let's forget the fact that I didn't get used to it after 3 years, if you were able to design Ruby from the ground up, would you still use a syntax that you think is a name compromise and looks weird for newcomers? I wouldn't, and that thinking leads me to pick it for this feature.

Updated by zverok (Victor Shepelev) over 1 year ago

if you were able to design Ruby from the ground up, would you still use a syntax that you think is a name compromise and looks weird for newcomers?

My "design space analysis" was sourced by the same idea. I can say that I would NOT choose it (or any other name looking "regularly"), unless it would be a keyword (like self), having exactly one meaning always, but even then— I am not sure.

I'd say that in this case, I would like _1 even more: it still keeps the balance between "looks special" and "corresponds to the rules for local names, so it is visible it is local" + allows to define several consequential variables; but, be it in the language from the beginning, it wouldn't have a sour feeling "but before, underscore meant 'drop this var'!"

I'd actually say that it is less confusing for newcomers than for seasoned Rubyists; I regularly observe younger colleagues being introduced to the notion and getting used to it in a blink of an eye.

In hindsight, I am sadder that _0 was dropped (to mean "all args unsplatted" and distinguish from _1 which would always be "first of args, splatted").

Updated by ufuk (Ufuk Kayserilioglu) over 1 year ago

zverok (Victor Shepelev) wrote in #note-26:

My "design space analysis" was sourced by the same idea. I can say that I would NOT choose it (or any other name looking "regularly"), unless it would be a keyword (like self), having exactly one meaning always, but even then— I am not sure.

One thing that this "design space analysis" keeps ignoring, though, is the proposal by @jeremyevans0 (Jeremy Evans) to use a bare @ as a synonym for _1, as in:

[1, 2, 3].each { puts @ }

which (you can see from the syntax highlighting above) is currently a syntax error.

While, I, personally, don't like the line noise introduced by yet another sigil, I do understand the concern that _1 cognitively implies _2, _3, etc, and special syntax for the single argument case would be better for code readability overall. I have no concerns with using it, since "it" will hardly have many compatibility problems if implemented in the way @k0kubun (Takashi Kokubun) is suggesting. However, if it is deemed not suitable, then I think @ is a good fallback.

Updated by zverok (Victor Shepelev) over 1 year ago

One thing that this "design space analysis" keeps ignoring, ...

It doesn't (though I by no means consider it exhaustive or definitive, it is just "what I am thinking about when considering options"):

As far as I understand, we are currently quite reluctant towards introducing "Perlisms" like "this character means something new in that context." For operators (verbs) we seek to recombine existing ones (like in and => for pattern matching), and for variables/methods, we seek to stay with existing naming schemes.

It doesn't mention @ directly, but it covers this proposal.

(One additional consideration is that @.size and @size would be both valid, easy to mistype/misread, and mean completely different things, and as far as I understand, Matz tries to avoid such situations.)

Updated by Dan0042 (Daniel DeLorme) over 1 year ago

I'm also one of those who didn't manage to get used to this syntax. I'll use it sometimes in IRB, or in code samples in this tracker, but never in any code I commit... it just feels unclean for some reason.

So far I count k0kubun and hanazuki who have similar thoughts. Not sure about Jeremy. It would be very informative to know who else wasn't able to get used to the numbered parameters syntax in those 2~3 years.

Everyone has their favorite alternative though, so while k0kubun prefers it and Jeremy prefers @, I'm still a superfan of #16120 Omitted block argument. Some things are hard to let go of.

zverok (Victor Shepelev) wrote in #note-20:

In our current codebase (quite large), that switched to Ruby 2.7 a year ago, there are ~200 entries of _1

Out of curiosity, how many of those ~200 are used as the first token in the block (like { _1.foo }) vs. how many used in another way?

Updated by austin (Austin Ziegler) over 1 year ago

Dan0042 (Daniel DeLorme) wrote in #note-29:

I'm also one of those who didn't manage to get used to this syntax. I'll use it sometimes in IRB, or in code samples in this tracker, but never in any code I commit... it just feels unclean for some reason.

I haven’t used it, but that’s because I’m still maintaining libraries that support versions of Ruby that are currently EOL, and I can’t really change that without bumping the major version…

Even in a library that I’m prepping for release, I can’t use some conveniences like foo(...) because JRuby doesn’t yet support it (at least in the version I get with GitHub Actions).

As far as "not liking it", I would have preferred Jeremy’s suggestion of @, @1, @2, etc. over _1, _2, but having used them a few times…they’re OK. I do most of my software development in Elixir, JS, or TypeScript these days (Ruby is my fallback, but is only about 20% of our production apps, and none of them Rails; we use Roda instead). Elixir’s closest case is on anonymous capture functions, & &1 (the identity closure: the first & introduces the closure, the second &1 is for the parameters). Unlike Ruby, you can’t use &2 without using &1 somewhere in the anonymous capture function (where you could do _2 without using _1).
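To illustrate the Ruby side of that comparison, here is a minimal sketch of a block that references _2 without ever mentioning _1. The hash and variable names are made up for illustration and do not come from the thread.

```ruby
# Illustrative data only; not taken from the discussion.
pairs = { a: 1, b: 2 }

# Referencing _2 alone makes the block take two parameters, so each
# [key, value] pair is destructured and _2 is the value.
values_only = pairs.map { _2 * 10 }

# Referencing only _1 makes the block take a single parameter,
# so it receives the whole [key, value] pair.
whole_pairs = pairs.map { _1 }

p values_only
p whole_pairs
```

This requires Ruby 2.7 or later, where numbered parameters were introduced.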

I’m not a fan of it, and would probably use _1 over it most times I reached for something like that.

Updated by zverok (Victor Shepelev) over 1 year ago

@Dan0042 (Daniel DeLorme)

Out of curiosity, how many of those ~200 are used as the first token in the block (like { _1.foo }) vs. how many used in another way?

I am afraid I don't see a pattern that you hope to see (e.g., quoting #16120, that it would mostly be { _1.something(args) }).
Here are some:

.tap { _1 << :date if time_zone.present? }
, format_with: -> { _1 == UserOrganization::Active }
.then { disabled ? _1 : _1.reject(&:disabled?) }
.find{ _1['error'] == 'task_comment_limit' }
.map { {id: _1.id, text: _1.name} }.sort_by { _1[:text] }
.to_h { [_1.to_s.pluralize.to_sym, [activity[_1]]] }
.map { preprocess_row(_1.symbolize_keys) }

(I have tons of those, it is not that I've spent a lot of time choosing, it is just a quick selection from grep _1 {app,lib} -r results)

Updated by k0kubun (Takashi Kokubun) over 1 year ago

Now I'm almost giving it up because there are more people who dislike it than I expected. I thought compatibility was the only blocker for it, but that doesn't seem to be the case.

However, even if it is not gonna make it, the problem will still remain in Ruby:

Why I don't use _1

I'm not clever enough to remember the order of parameters. Therefore, when a block has multiple parameters, I'd always want to name those parameters because which is _1 or _2 is not immediately obvious. Thus I would use this feature only when a block takes a single argument, which is actually pretty common.

If I use _1, it feels like there might be a second argument, and you might waste time to think about _2, even if _2 doesn't exist, which is a cognitive overhead. If you use it, it kinda implies there's only a single argument, so you don't need to spend time remembering whether _2 exists or not. It is important for me that there's no number in it.

I still want some new syntax to do the same thing as _1 without saying 1 because it feels like there's a 2nd parameter, even if I want to say there's no such thing. Looking at #note-9 and #note-20, it seems like even people using _1 agree that it's a valid problem.

Can we at least agree that there should be a way to do _1 without saying 1? Or would you rather have only a single way to write it?

Updated by byroot (Jean Boussier) over 1 year ago

Now I'm almost giving it up because there are more people who dislike it than I expected.

Is it really a problem though? This proposal is basically about a new alias for an existing feature. I don't think it logically needs to please everyone given that the feature is already accessible by another syntax. It costs nothing not to use it, so it seems to me that the bar should be whether it's helping a sizeable enough portion of the user base (which is super subjective, but still).

And I for one am very much in favor of it as an alias for _1, I just didn't think I had anything to add to this discussion until now, since I was basically in total agreement with your proposal.

I also want to concur with your numbering argument: I never got used to _1 because referencing arguments by position rather than name is a regression to me; it brings me back to function arguments in shell scripts. I'm quite sure I haven't used it a single time since it was introduced, not even for quick IRB experiments.

Given that the vast majority of blocks out there take either 1 or 0 arguments, having a more pleasant and meaningful alias for that overwhelming majority of cases seems perfectly sensible to me.

As for the proposed alternatives, my subjective stances on them are:

  • Bare @ screams "INSTANCE VARIABLE" at my brain, and I doubt I'd ever get used to it. There's also the typo-inducing @.size vs @size mentioned previously.
  • _: I like it less than it, but I'm OK with it. I don't think its current usage as "ignored argument" really conflicts, because if you use it, well, it's obviously used.

Updated by Dan0042 (Daniel DeLorme) over 1 year ago

zverok (Victor Shepelev) wrote in #note-31:

I am afraid I don't see a pattern that you hope to see (e.g., quoting #16120, that it would mostly be { _1.something(args) }).

Thank you. It's surprising to me, but very enlightening. Nothing beats actual data about live usage. Although I would really love to see actual counts, e.g.

egrep _1 | wc -l
egrep '\{ *_1' | wc -l
egrep '\{ *_1\.' | wc -l

Updated by zverok (Victor Shepelev) over 1 year ago

@Dan0042 (Daniel DeLorme)

$ egrep _1 {app,lib} -r --include \*.rb | wc -l
378
$ egrep '\{ *_1' {app,lib} -r --include \*.rb | wc -l
151
$ egrep '\{ *_1\.' {app,lib} -r --include \*.rb | wc -l
92

And even of those 92, not all are suitable for { .method(args) } syntax, there are some like...

.all? { _1.data.dig(_1.widget_id.to_sym, :score).nil? }
.then { _1.start_with?(%r{^(https|http)://?}) ? _1 : _1.prepend('https://') }
.each { _1.update!(allowed_hours: weekly_limit) unless _1.start_date == date }
.filter_map { _1.id if _1.has_virtual_jobs? }

etc
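For readers without a shell at hand, the three egrep counts above can be sketched in Ruby. This is only an illustration: the sample lines are a handful of snippets quoted earlier in this thread standing in for a real codebase, and count_usages and the pattern names are hypothetical helpers, not part of any tool.

```ruby
# Count how many lines match each pattern; returns a hash of name => count.
def count_usages(lines, patterns)
  patterns.transform_values do |pattern|
    lines.count { |line| line.match?(pattern) }
  end
end

# A few snippets from this thread, used as stand-in input.
sample_lines = [
  ".tap { _1 << :date if time_zone.present? }",
  ".map { preprocess_row(_1.symbolize_keys) }",
  ".filter_map { _1.id if _1.has_virtual_jobs? }",
  ".find{ _1['error'] == 'task_comment_limit' }",
]

# The same three regular expressions as the egrep commands.
patterns = {
  any_use:        /_1/,       # _1 appears anywhere on the line
  first_token:    /\{ *_1/,   # _1 is the first token in the block
  first_receiver: /\{ *_1\./, # _1 is the first token and a method receiver
}

counts = count_usages(sample_lines, patterns)
p counts
```

Like egrep piped to wc -l, this counts matching lines rather than individual occurrences, so a line using _1 twice is counted once.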

Updated by Dan0042 (Daniel DeLorme) over 1 year ago

@zverok (Victor Shepelev) Again, thank you very much, this is super informative.

Updated by matheusrich (Matheus Richard) over 1 year ago

I don't have much to contribute here, but I'll give my personal experience with this subject.

I like the convenience of numbered params, but I've rarely used them (outside of throwaway scripts) because they look off.
I find them even a bit confusing in some cases where you might confuse them with numbers (syntax highlighting does help, though):

(1..).take(10).map { _1 ** 2 }

I feel like it would be much more readable in this case

(1..).take(10).map { it ** 2 }

Although sometimes I'd wish we could use its so it reads even more like English, but that would be a minor convenience:

# for each user, get its name
user_names = users.map { its.name }

Updated by matheusrich (Matheus Richard) over 1 year ago

Now I'm almost giving it up because there are more people who dislike it than I expected.

I'm not sure how well the people here represent the whole community. My bet is most people don't even notice these feature requests unless they appear on something like RubyWeekly.

The fact that many here are maintainers of Ruby implementations also has a biased effect on new features, as they might represent a burden on them. I'm not saying this is a bad thing, I love the diversity of points of view that this brings! OTOH, it's fair that people that do take time to discuss things here have a bigger influence on the direction that Ruby follows.

Updated by adiel (Adiel Mittmann) over 1 year ago

I would like to provide a data point.

I have been using Ruby for 10+ years and for a long time I missed a syntax that would allow me to map things quickly. Over time I decided to always use a variable named x for such purposes: map.keys.select{|x| x =~ /foo/}. While I was happy when I heard about the _1 syntax, I do think that the case when there's only one item to be processed is special. I find that _1 does imply _2 in a way and it looks out of place when there's really one item, the item.

Furthermore, I find that _1, _2 must be carefully used so as to keep the code readable. And, in a way, when I'm using both _1 and _2, I often feel that I've gone too far and it's better to just give names to the variables.
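As a small sketch of the styles being compared here, the following shows the |x| form, the equivalent _1 form, and a block that uses both _1 and _2. The settings hash is hypothetical sample data standing in for the map variable mentioned above.

```ruby
# Hypothetical data standing in for the `map` hash in the comment above.
settings = { "foo_a" => 1, "bar" => 2, "foo_b" => 3 }

# Explicit single-letter block parameter.
with_x = settings.keys.select { |x| x =~ /foo/ }

# The same selection with a numbered parameter (Ruby 2.7+).
with_numbered = settings.keys.select { _1 =~ /foo/ }

# Two numbered parameters: the reader has to remember that
# _1 is the key and _2 is the value.
swapped = settings.map { [_2, _1] }

p with_x
p with_numbered
p swapped
```

The first two forms return the same result; the third is the kind of block where naming the parameters may read more clearly.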

In our current code base there are:

  • 268 instances of the |x| syntax, which we didn't care enough to update.
  • 515 instances of _1, of which 440 are the first token in a block.
  • 2 instances of _2.

I'm in favor of any syntax which doesn't force me to use a number when my brain is definitely not thinking of any kind of ordering, and in particular I like @jeremyevans0 (Jeremy Evans) 's @ syntax.

Updated by maedi (Maedi Prichard) over 1 year ago

How about _@? It's your friendly neighborhood local-instance variable. I'm half joking but it is a local variable that refers to instances.

It's nice and illegal:

[1, 2, 3].each { puts _@ }

Looks better when it's not an outlaw:

[1, 2, 3].each { puts _@ }

I've always liked just @ but I see the point that it can be confused with instance variables, especially when it's written as @.size (is _@.size better?). But I'm hoping that the local nature of _ can help combat that, and it fits within the existing style of _1, _2 and _3, yet actually feels unique and like something you would want to use. Call it a "local at"?

Updated by maedi (Maedi Prichard) over 1 year ago

Or _$:

[1, 2, 3].map { puts _$ }

Or even just $ by itself:

[1, 2, 3].map { puts $ }

In my opinion $ is underutilised in Ruby compared to other languages. In writing day to day code you don’t often use global variables, so $ could provide an entirely unique local “this” variable without mentally switching to the global context.

Updated by ufuk (Ufuk Kayserilioglu) over 1 year ago

@maedi (Maedi Prichard) _$, and especially $, have the same problem as @: it is super easy to confuse, and hard to differentiate, $.size and $size, as mentioned here https://bugs.ruby-lang.org/issues/18980#note-28

Updated by funny_falcon (Yura Sokolov) over 1 year ago

Ruby has taken on so much syntax in recent years. I fear it. Let's not strain our lovely language, please.

Updated by maedi (Maedi Prichard) over 1 year ago

I see that $. is already a pre-defined variable which would make $.method_name difficult to parse. Then _$ looks a little bit too close to the pre-defined variable $_. Though I still like _@.

Can anyone think of a solution? Anything is better than _1. I personally like @ and it and this and that but there always seems to be some conflict with existing variables and methods. What about $this or $it? If someone somewhere ever named their global variable $it and used it in a block then I'm sure they would forgive us, if this unlikely situation ever happened at all.

[1, 2, 3].each { puts $it }

Another thing you could do which is very Ruby is provide a special block param that acts as an object and interact with that:

[1, 2, 3].each { |$|.method_name }
[1, 2, 3].each { puts |$| }
[1, 2, 3].each { method_name(|$|) }

Updated by rubyFeedback (robert heiler) about 1 year ago

If I recall correctly I suggested @1 @2 and so forth.

At a later time _1 _2 and so forth was added, which is not entirely the same. I
then realised that the suggestion was actually much older than my suggestion,
and the ruby core team often said they may build upon suggestions and modify
them.

What surprised me was that one of my use cases was not implemented, at the least
back then.

Which was:

@large_collection_as_an_array.each {|cats_with_a_fluffy_tail, dogs_with_cute_ears, ships_with_red_painting, cars_without_a_door|
}

For such a situation, in particular in IRB as well as in quick debugging, I wanted to
be able to refer to the element at hand quickly, without having to remember the name
as such.

So I could then do:

pp @1
pp @3

And so on.

Unfortunately it seems as if that was not considered, and it was (back then at
the least) forbidden to assign proper (long) names to the block variables.
The syntax was also changed; I find _1 _2 a bit hard to read.

I should note that I remember the other proposal about "it". I don't have a strong
opinion against it. But I also don't feel particularly committed towards wanting
to use it. The whole point of @1 @2 was actually to help in prototyping, e. g. to
be faster when you write code initially. It's not a huge saving of time, admittedly
so, but it does help, and I think in particular for IRB it can really be helpful
(I'd wish we could still use _1 _2 and so forth together with real names of the
variables; my use case was that I am fine with the initial block names, and I
would keep them, but in the middle of writing more code, I just want to refer
to the variable without necessarily always needing to remember the name. I have
to scroll back with my crappy editor.)

maedi wrote:

I see that $. is already a pre-defined variable which would
make $.method_name difficult to parse.

All $-variables are quite hard to remember. With @1 or _1 this is
a bit different because, like in a regex, you just refer to some
specific group by number rather than a name. (Or just syntax, as is
the case with the various $: $< and what not.)

maedi wrote:

Though I still like _@.

That one trips my brain up. I think we should be conservative about
syntax. Perlisms can be useful due to being succinct, but readability
is also a useful metric to have.

What surprised me the most is that people began to include _1 _2 and
so forth in real production code. To me it is purely a method of
testing and debugging aid; but I guess the moment you add functionality
someone is going to use it, and a few of these change their style
to include these.

maedi wrote:

Another thing you could do which is very Ruby is provide a
special block param that acts as an object and interact
with that:

I have no real problem with "it", even though I most likely won't
use it. But we could use "it" also as reference object via
[] method e. g. it[0], it[1] in addition to "normal" use of "it".

adiel wrote:

Over time I decided to always use a variable named x for such
purposes: map.keys.select{|x| x =~ /foo/}.

I kind of use longer names such as:

map.keys.select {|entry| entry =~ /foo/}

But using short names is also understandable. I tend to use _
most of the time, but sometimes I also use short variable
names such as "x".

Matheus wrote:

Although sometimes I'd wish we could use its so it reads
even more like English, but that would be a minor convenience:

# for each user, get its name
user_names = users.map { its.name }

That is similar to the File.exist? versus File.exists? situation.

In plain english the "its name" may be more correct. In an OOP
centric point of view, you are asking a specific object via a
method call (sending a message). Although if "it" were to be
added, we could also add an alias to it called "its". ;)

Updated by k0kubun (Takashi Kokubun) 5 months ago

  • Description updated (diff)

Updated by k0kubun (Takashi Kokubun) 3 months ago

  • Description updated (diff)
  • Status changed from Open to Assigned
  • Assignee set to k0kubun (Takashi Kokubun)
  • Target version set to 3.4

In today's Developers Meeting, @matz (Yukihiro Matsumoto) agreed to warn about it in Ruby 3.3 and to add it in Ruby 3.4. Quote from the meeting notes:

  • matz: accept it on Ruby 3.4.
    • ruby 3.3 will warn and ruby 3.4 will use the new semantics
    • The warning should be printed always (even without $VERBOSE)

I also copied the discussed specification to the ticket description. In Ruby 3.3, a warning should be issued for it only where it will behave like _1 in Ruby 3.4.

Updated by k0kubun (Takashi Kokubun) 3 months ago

  • Status changed from Assigned to Closed

Applied in changeset git|44592c4e20a17946b27c50081aee96802db981e6.


Implement it (#9199)

[Feature #18980]

Co-authored-by: Yusuke Endoh
