Feature #11090

closed

Enumerable#each_uniq and #each_uniq_by

Added by Hanmac (Hans Mackowiak) almost 10 years ago. Updated over 8 years ago.

Status:
Closed
Target version:
-
[ruby-core:68969]

Description

Currently, if you want to iterate over the first few unique elements, you either need to call uniq, which creates a potentially big array, or keep track of the elements yourself.
If you have an Enumerable of indefinite size (maybe something built with cycle, or something you can't rewind), calling Array#uniq might not be what you want.

The idea is to add each_uniq, which iterates only over the elements that have not already been yielded (it keeps track for you).
A second method, each_uniq_by, works similarly to chunk: it takes a block and uses a generated Enumerator.

IceDragon200 wrote the following gist/sample in Ruby; it might be rewritten in C later to make it faster/better: https://gist.github.com/IceDragon200/5b1c205b4b38665c308e. For easier viewing I have also added it as an attachment.
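
For readers without access to the attachment, here is a minimal sketch of the behaviour described above. It is not the code from the gist. It assumes each_uniq yields an element only the first time it is seen, and that each_uniq_by takes a key block and returns an Enumerator, in the manner of chunk. The module name EachUniqSketch is hypothetical; the proposal would put the methods on Enumerable itself.

```ruby
require 'set'

# Hypothetical module, for illustration only.
module EachUniqSketch
  # Yields each element the first time it is seen.
  # Returns an Enumerator when called without a block.
  def each_uniq
    return to_enum(:each_uniq) unless block_given?

    seen = Set.new
    each do |value|
      # Set#add? returns nil when the value is already in the set.
      yield value if seen.add?(value)
    end
    self
  end

  # Assumed semantics: the block computes the uniqueness key, and the
  # unique elements are exposed through the returned Enumerator.
  def each_uniq_by(&key_block)
    raise ArgumentError, 'no block given' unless key_block

    Enumerator.new do |yielder|
      seen = Set.new
      each do |value|
        yielder << value if seen.add?(key_block.call(value))
      end
    end
  end
end

list = [1, 2, 1, 3, 2].extend(EachUniqSketch)
list.each_uniq { |x| puts x }

# Works on an endless source as long as the consumer stops early.
endless = [1, 2, 3].cycle.extend(EachUniqSketch)
first_three = endless.each_uniq.first(3)
```

Note that asking the endless source for more unique elements than it contains (for example first(4) here) would never return, and that the set of seen elements grows with the number of distinct values.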


Files

each_uniq.rb (830 Bytes), Hanmac (Hans Mackowiak), 04/23/2015 07:37 AM

Related issues: 1 (0 open, 1 closed)

Related to Ruby master - Feature #1153: Enumerable#uniq (Closed, matz (Yukihiro Matsumoto))

Updated by prijutme4ty (Ilya Vorontsov) over 8 years ago

  • Assignee set to nobu (Nobuyoshi Nakada)

Why introduce one more method if we can just implement #uniq (with or without block, sticking to Array#uniq semantics) for Enumerable and Enumerator::Lazy? With Enumerator::Lazy we do not need to create an array.

module Enumerable
  def uniq
    result = []
    uniq_map = {}
    if block_given?
      each do |value|
        key = yield value
        next if uniq_map.has_key?(key)
        uniq_map[key] = true
        result << value
      end
    else
      each do |value|
        next if uniq_map.has_key?(value)
        uniq_map[value] = true
        result << value
      end
    end
    result
  end
end

class Enumerator::Lazy
  def uniq
    uniq_map = {}
    if block_given?
      Enumerator::Lazy.new(self) do |yielder, value|
        key = yield value
        next if uniq_map.has_key?(key)
        uniq_map[key] = true
        yielder << value
      end
    else
      Enumerator::Lazy.new(self) do |yielder, value|
        next if uniq_map.has_key?(value)
        uniq_map[value] = true
        yielder << value
      end
    end
  end
end

olympics = {1896 => 'Athens', 1900 => 'Paris', 1904 => 'Chicago', 1906 => 'Athens', 1908 => 'Rome'}
each_city_first_time = olympics.uniq{|k,v| v }
# => [[1896, "Athens"], [1900, "Paris"], [1904, "Chicago"], [1908, "Rome"]]

(1..Float::INFINITY).lazy.uniq{|x| (x**2) % 10 }.first(6)
# => [1, 2, 3, 4, 5, 10]

While I propose a different solution to the problem, I totally agree that we need a way to work with the unique elements of a collection without creating an intermediate array. In heavy data processing this is a very common problem.

Updated by matz (Yukihiro Matsumoto) over 8 years ago

As Ilya proposed, Enumerable#uniq and Enumerator::Lazy#uniq are reasonable.

Matz.

Updated by nobu (Nobuyoshi Nakada) over 8 years ago

  • Status changed from Open to Closed

Applied in changeset r55709.


enum.c: Enumerable#uniq

  • enum.c (enum_uniq): new method Enumerable#uniq.
    [Feature #11090]
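
A short usage sketch of the resulting methods, assuming Ruby 2.4 or later, where Enumerable#uniq and Enumerator::Lazy#uniq are built in. The data and variable names are made up for illustration.

```ruby
# Enumerable#uniq works on any Enumerable, not just Array, and returns an Array.
olympics = { 1896 => 'Athens', 1900 => 'Paris', 1906 => 'Athens' }
first_times = olympics.uniq { |_year, city| city }
p first_times

# Enumerator::Lazy#uniq filters lazily, so no intermediate array is built
# and an endless source is fine as long as the consumer stops early.
# Squares only end in 0, 1, 4, 5, 6 or 9, so first(6) can be satisfied.
last_digit_firsts = (1..Float::INFINITY).lazy.uniq { |x| (x**2) % 10 }.first(6)
p last_digit_firsts
```

The lazy form still has to remember every key it has seen, so memory grows with the number of distinct keys rather than with the length of the source.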

Updated by shyouhei (Shyouhei Urabe) over 5 years ago
