Bug #16270
Closed
Strange behavior on Hash's #each and #select method.
Description
The following is some example code:
sample_hash = {
  "246" => {
    "price" => "8000",
    "note" => ""
  },
  "247" => {
    "price" => "8000",
    "note" => ""
  },
  "248" => {
    "price" => "8000",
    "note" => ""
  }
}
sample_hash.each {|e| p e}
# The following is p's output. We can see that e is a hash entry, passed to the block as a two-element [key, value] array.
# This is probably the expected behavior, since a hash can be viewed as an array of [key, value] pairs.
["246", {"price"=>"8000", "note"=>""}]
["247", {"price"=>"8000", "note"=>""}]
["248", {"price"=>"8000", "note"=>""}]
sample_hash.select {|e| p e }
# Weird(?). Why is e different this time from what each yields?
"246"
"247"
"248"
The following is the source code for each:
static VALUE
rb_hash_each_pair(VALUE hash)
{
    RETURN_SIZED_ENUMERATOR(hash, 0, 0, hash_enum_size);
    if (rb_block_arity() > 1)
        rb_hash_foreach(hash, each_pair_i_fast, 0);
    else
        rb_hash_foreach(hash, each_pair_i, 0);
    return hash;
}
The following is the source code for select:
VALUE
rb_hash_select(VALUE hash)
{
    VALUE result;
    RETURN_SIZED_ENUMERATOR(hash, 0, 0, hash_enum_size);
    result = rb_hash_new();
    if (!RHASH_EMPTY_P(hash)) {
        rb_hash_foreach(hash, select_i, result);
    }
    return result;
}
I don't understand C well, and I don't know why these two Hash methods are inconsistent, but I find the difference confusing.
Updated by shevegen (Robert A. Heiler) over 4 years ago
I am not entirely sure where there is a lack of consistency or why the
C code is necessary. I assume that you may have been confused about
Kernel#p perhaps? It is rare that people combine .select with p,
whereas this behaviour is more frequently seen with .each, where people
may output all or some elements.
What helps me personally is when I "split" up the Hash into "key" and
"value" pairs, such as:
sample_hash.select {|key, value| p key }
The above still does not make a whole lot of sense to me, but from the
code alone I think it became more clear what you, as a user of ruby,
actually want to do. Of course I may have misunderstood you as well.
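As a rough illustration of the key/value split suggested above: when the block names both parameters, each and select hand over the same data. The hash below is a shortened variant of the reporter's sample, with one price changed to "9000" purely so that select has something to filter out (that value is made up for this sketch).

```ruby
# Shortened, partly made-up variant of the sample hash from the report.
sample_hash = {
  "246" => { "price" => "8000", "note" => "" },
  "247" => { "price" => "9000", "note" => "" }
}

# With two block parameters, each receives key and value separately.
each_seen = []
sample_hash.each { |key, value| each_seen << [key, value["price"]] }

# With two block parameters, select receives the same key and value.
select_seen = []
cheap = sample_hash.select do |key, value|
  select_seen << [key, value["price"]]
  value["price"] == "8000"
end

p each_seen
p select_seen
p cheap
```

Naming both parameters sidesteps the question of what a single-parameter block receives, which is where the two methods differ.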
(The reason why I wrote that I do not see a lack of consistency is because
.each and .select have different behaviour on purpose. What I often do
is modify the internal dataset of a class, before I may then go on to
report it to the user after the dataset was modified. It's a bit like MVC
but nowhere as strict as MVC separates stuff.)
Updated by jeremyevans0 (Jeremy Evans) over 4 years ago
- Status changed from Open to Closed
You need to look at the functions passed to rb_hash_foreach:
static int
each_pair_i(VALUE key, VALUE value, VALUE _)
{
    rb_yield(rb_assoc_new(key, value));
    return ST_CONTINUE;
}

static int
select_i(VALUE key, VALUE value, VALUE result)
{
    if (RTEST(rb_yield_values(2, key, value))) {
        rb_hash_aset(result, key, value);
    }
    return ST_CONTINUE;
}
each_pair_i yields a single array argument with the key and the value, select_i yields 2 arguments (key and value separately).
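The difference between the two yield styles can be imitated in plain Ruby. This is only a sketch: yield_as_array and yield_as_two_values are hypothetical helpers written for this illustration, not part of Ruby's Hash API, and they stand in for each_pair_i and select_i respectively.

```ruby
# Hypothetical helper: imitates each_pair_i, which yields ONE argument,
# a [key, value] array (like rb_yield(rb_assoc_new(key, value))).
def yield_as_array(hash)
  hash.keys.each { |key| yield [key, hash[key]] }
end

# Hypothetical helper: imitates select_i, which yields TWO arguments
# (like rb_yield_values(2, key, value)).
def yield_as_two_values(hash)
  hash.keys.each { |key| yield key, hash[key] }
end

h = { "246" => "a", "247" => "b" }

# A one-parameter block receives the whole array in the first style...
one_arg_from_array = []
yield_as_array(h) { |e| one_arg_from_array << e }

# ...but only the first value (the key) in the second style;
# the extra argument is silently dropped.
one_arg_from_values = []
yield_as_two_values(h) { |e| one_arg_from_values << e }

# A two-parameter block sees key and value either way, because a single
# yielded array is automatically spread across multiple block parameters.
two_args_from_array = []
yield_as_array(h) { |k, v| two_args_from_array << [k, v] }

p one_arg_from_array   # [["246", "a"], ["247", "b"]]
p one_arg_from_values  # ["246", "247"]
p two_args_from_array  # [["246", "a"], ["247", "b"]]
```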
This is inconsistent, and could potentially be changed so that blocks passed to Hash#select that accept a single argument are yielded an array. However, that may cause backwards compatibility issues with existing code that expects the key to be yielded to the block (instead of an array with the key and value).
I don't think the current behavior is a bug, so I'm going to close this. If you would like Hash#select behavior changed when passing a block that accepts a single argument, please submit a feature request for that.
Updated by Dan0042 (Daniel DeLorme) over 4 years ago
Of all Hash methods, only select/reject/select!/reject!/keep_if/delete_if have that behavior. That's fairly inconsistent, but more importantly it doesn't seem like this inconsistency is on purpose. There is no test for this case in test/ruby/test_hash.rb.
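A quick way to see the split on a given interpreter is to pass a one-parameter block to a few methods and record what it receives. This is a sketch that assumes the behavior described in this thread; it checks only four methods and is not an exhaustive survey.

```ruby
h = { "246" => "a" }

# Record what a one-parameter block receives from each method.
received = {}

h.each   { |e| received[:each] = e }
h.map    { |e| received[:map] = e }
h.select { |e| received[:select] = e; true }
h.reject { |e| received[:reject] = e; false }

# each and map pass the [key, value] pair; select and reject pass only the key.
received.each { |name, arg| puts "#{name}: #{arg.inspect}" }
```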
I tried changing rb_yield_values(2 to rb_yield(rb_assoc_new( and I got only one failure in the specs, in test_delete_if, because of h.delete_if {|*a| where *a becomes [[1,"one"]] instead of [1,"one"]. And that is more related to #16166 than the current issue.
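The splat difference behind that one failure can be reproduced without touching hash.c. In this sketch, yield_pair_as_array and yield_pair_as_values are hypothetical stand-ins for the patched and the current yielding style; they are not real Hash internals.

```ruby
# Hypothetical stand-in for the patched behavior: one array argument.
def yield_pair_as_array
  yield [1, "one"]
end

# Hypothetical stand-in for the current behavior: two separate arguments.
def yield_pair_as_values
  yield 1, "one"
end

# A splat parameter collects ALL yielded arguments into an array,
# so a single yielded array ends up nested one level deeper.
splat_from_array = nil
yield_pair_as_array { |*a| splat_from_array = a }

splat_from_values = nil
yield_pair_as_values { |*a| splat_from_values = a }

p splat_from_array   # [[1, "one"]]
p splat_from_values  # [1, "one"]
```

A block written as {|*a| ...} is therefore one place where switching between the two yield styles is visible to existing code.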