Feature #17056

Array#index: Allow specifying the position to start search as in String#index

Added by TylerRick (Tyler Rick) 4 months ago. Updated 2 months ago.

Status:
Open
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:99380]

Description

I have a use case of finding the first matching line within a given section in a file. After finding the line number of the start of the section, I want to find the first match after that line.

My workaround for now is to use with_index:

lines = pathname.read.lines
section_start_line = lines.index {|line| line.start_with?(/#* #{section_name}/) }
lines.index.with_index {|line, i| i > section_start_line && line.include?(sought) }

I'd like to do it in a more concise way using a feature of Array#index that I propose here, which is analogous to String#index.

If the second parameter of String#index is present, it specifies the position in the string to begin the search:

'abcabc'.index('a') # => 0
'abcabc'.index('a',2) # => 3

I would expect to also be able to do:

'abcabc'.chars.index('a') # => 0
'abcabc'.chars.index('a', 2)

With such a feature, I would be able to do:

lines.index(sought, section_start_line)
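A minimal sketch of the proposed positional form, emulated in current Ruby. The helper name index_from is hypothetical and not part of Ruby; it assumes a non-negative start and, like Array#index(obj), compares elements with ==. Two differences from the workaround above are worth keeping in mind: the workaround matches with include? rather than ==, and its i > section_start_line check excludes the start line, whereas the start position of String#index is inclusive.

```ruby
# Hypothetical helper (not a Ruby built-in): emulates the proposed
# Array#index(obj, start) for a non-negative start position.
def index_from(array, obj, start = 0)
  # Scan only the indices from start onward; returns nil when nothing matches.
  (start...array.size).find { |i| array[i] == obj }
end

chars = 'abcabc'.chars
first   = index_from(chars, 'a')      # same result as chars.index('a')
second  = index_from(chars, 'a', 2)   # mirrors 'abcabc'.index('a', 2)
missing = index_from(chars, 'z', 2)   # no match after position 2
```

With this sketch, first is 0, second is 3, and missing is nil, matching what String#index returns for the same inputs.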

This would give Ruby better parity with other programming languages like Python:

>>> list('abcabc')
['a', 'b', 'c', 'a', 'b', 'c']
>>> list('abcabc').index('a')
0
>>> list('abcabc').index('a', 2)
3

End index

We can further think of an optional parameter to specify the position to end the search. The following languages allow specifying both start and end indexes:

Ruby's String#index does not have one, so we could make a separate proposal to add end to both methods at the same time.

#1

Updated by TylerRick (Tyler Rick) 4 months ago

  • Description updated (diff)
#2

Updated by TylerRick (Tyler Rick) 4 months ago

  • Description updated (diff)
#3

Updated by sawa (Tsuyoshi Sawada) 4 months ago

  • Description updated (diff)
  • Subject changed from Array#index: Allow specifying start index to search like String#index does to Array#index: Allow specifying the position to start search as in String#index

Updated by marcandre (Marc-Andre Lafortune) 3 months ago

👍

I'd like to have optional start and stop arguments for find_index, find, bsearch and bsearch_index.

As mentioned, a typical use case is to repeat a lookup, but another one is to look up a range of indices (e.g. which elements of a sorted array are between 10 and 20).

I've had to iterate on the indices instead but it is not elegant and is less performant.
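For reference, the sorted-array range lookup can be sketched in current Ruby with two calls to Array#bsearch_index, each searching the whole array. The sample data and variable names are illustrative assumptions; an optional start argument would presumably let the second search begin at the first result instead of at index 0.

```ruby
# Illustrative data: a sorted array, as bsearch_index requires.
sorted = [1, 5, 10, 12, 17, 20, 20, 31]

# Index of the first element >= 10 (nil if there is none).
first = sorted.bsearch_index { |x| x >= 10 }

# Index of the first element > 20, or the array size if every element is <= 20.
past_last = sorted.bsearch_index { |x| x > 20 } || sorted.size

# Elements between 10 and 20 inclusive; empty when nothing is >= 10.
in_range = first ? sorted[first...past_last] : []
```

Here first is 2, past_last is 7, and in_range is [10, 12, 17, 20, 20].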

Updated by fatkodima (Dima Fatko) 3 months ago

I have implemented an offset parameter for Array#index - https://github.com/ruby/ruby/pull/3448
Will adjust to more methods if asked.

Updated by matz (Yukihiro Matsumoto) 2 months ago

Accepted.

What do you think about an end index? Do we need it? If so, should we add an end index to String#index as well?

Matz.

Updated by Eregon (Benoit Daloze) 2 months ago

What if a block is given, and one wants to use a start index? (For efficiency, so that the block is not run for the first start elements.)

ary.index(start) { |i| ... } seems confusing.

Probably keyword arguments are better:
ary.index(from: start) { |i| ... } or ary.index(start: start) { |i| ... }

Although personally I'm not convinced we need these complications.
One can do 'abcabc'.chars[2..].index('a') + 2 instead of 'abcabc'.chars.index('a', 2).
And the [2..] is quite cheap considering that arrays use copy-on-write.

It can also be done with ary = 'abcabc'.chars; (2...ary.size).find { |i| ary[i] == 'a' }.
That's a little bit more complicated, but it's also usable in many more situations than just index.
I would expect it's fairly rare to need a start offset, so I think there is no need for a shortcut.
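The two alternatives above can be compared side by side in a short sketch. One caveat, stated as an observation rather than a criticism: the slice form as written raises NoMethodError when the element is absent, because index returns nil and nil has no + method, so the sketch guards against that. Variable names are illustrative.

```ruby
ary = 'abcabc'.chars
start = 2

# Slice form: search the tail, then shift the result back by start.
# The guards cover a start past the end (nil slice) and a missing element.
offset = ary[start..]&.index('a')
via_slice = offset && offset + start

# Range form: iterate over indices directly, with no slice and no offset arithmetic.
via_range = (start...ary.size).find { |i| ary[i] == 'a' }

# A missing element yields nil in the guarded slice form.
missing_offset = ary[start..]&.index('z')
missing = missing_offset && missing_offset + start
```

Both via_slice and via_range are 3 here, and missing is nil.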

Updated by fatkodima (Dima Fatko) 2 months ago

Eregon (Benoit Daloze) wrote in #note-7:

What if a block is given, and one wants to use a start index? (For efficiency, so that the block is not run for the first start elements.)

ary.index(start) { |i| ... } seems confusing.

Probably keyword arguments are better:
ary.index(from: start) { |i| ... } or ary.index(start: start) { |i| ... }

Agreed.

Eregon (Benoit Daloze) wrote in #note-7:

It can also be done with ary = 'abcabc'.chars; (2...ary.size).find { |i| ary[i] == 'a' }.
That's a little bit more complicated, but it's also usable in many more situations than just index.
I would expect it's fairly rare to need a start offset, so I think there is no need for a shortcut.

Personally, I have needed a start index a couple of times. Most of the time, I have seen developers just slice the array to the needed range (allocating a new array; is there a CoW here?). A start index argument would be convenient, and it would be consistent with String#index.

matz (Yukihiro Matsumoto) wrote in #note-6:

Accepted.

What do you think about an end index? Do we need it? If so, should we add an end index to String#index as well?

Matz.

As for an end index, I think it would truly be rarely needed, and it can be simulated with something like ...with_index ... { |..., index| ... break if index > end_index ... }
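One way to fill in that pattern, as a sketch in current Ruby: Array#index without arguments returns an enumerator, with_index supplies the position, and break ends the search with nil once the position passes end_index. The sample array and the inclusive treatment of end_index are assumptions made for illustration.

```ruby
ary = %w[a b c a b c]

# Search for 'c' but stop after index 1: the match at index 2 is out of reach.
end_index = 1
not_found = ary.index.with_index do |element, i|
  break if i > end_index   # break makes the whole expression return nil
  element == 'c'
end

# Same search with the end index moved to 2: the match is now in range.
end_index = 2
found = ary.index.with_index do |element, i|
  break if i > end_index
  element == 'c'
end
```

Here not_found is nil and found is 2.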

As marcandre (Marc-Andre Lafortune) pointed out, other methods would probably also benefit from such arguments, but to avoid updating all of them, I would prefer to just add a start index argument to Array#index, for consistency with String#index. It can then be used in user code to emulate the other methods, like

# find with start index
index = array.index(start: 10) { |e| e % 10 == 0 }
item = array[index] if index

Updated by mame (Yusuke Endoh) 2 months ago

Hi,

fatkodima (Dima Fatko) wrote in #note-8:

to avoid updating all of them, I would prefer to just add a start index argument to Array#index, for consistency with String#index,

I agree with your approach. However, your PR changes not only Array#index but also Array#find_index. This brings another inconsistency: Enumerable#find_index does not accept "start", but Array#find_index does.

We discussed this ticket at today's dev-meeting, and ko1 (Koichi Sasada) proposed removing Array#find_index so that ary.find_index invokes Enumerable#find_index instead of keeping it as an alias to Array#index, and matz agreed with the removal.
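To illustrate the distinction, the sketch below calls Enumerable#find_index on an array explicitly, which is roughly what ary.find_index would dispatch to if Array's own definition were removed. It uses only existing reflection methods (instance_method, bind); the sample data is illustrative, and the results for plain lookups are the same either way.

```ruby
ary = %w[a b c a b c]

# Fetch Enumerable's implementation and bind it to the array, bypassing
# Array's own find_index.
enumerable_find_index = Enumerable.instance_method(:find_index)

via_enumerable_value = enumerable_find_index.bind(ary).call('a')
via_enumerable_block = enumerable_find_index.bind(ary).call { |e| e == 'c' }

# Array#index, for comparison.
via_array_value = ary.index('a')
via_array_block = ary.index { |e| e == 'c' }
```

Both value lookups return 0 and both block lookups return 2.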
