Feature #20163
openIntroduce #bit_count method on Integer
Description
This feature request is to implement a method called #bit_count on Integer that returns the number of ones in the binary representation of the absolute value of the integer.
n = 19
n.bit_count #=> 3
(-n).bit_count #=> 3
This is often useful when you use an integer as a bitmask and want to count how many bits are set.
This would be equivalent to
n.to_s(2).count("1")
However, this can be outperformed by
def bit_count(n)
count = 0
while n > 0
n &= n - 1 # Flip the least significant 1 bit to 0
count += 1
end
count
end
I think this would be a useful addition because it would fit alongside the other bit-related methods defined on integer: #bit_length,
#allbits?
, #anybits?
, #nobits?
. Also, when working with bitmasks, a minor upgrade to performance often results in a significant improvement.
Similar methods from other languages:
https://docs.python.org/3/library/stdtypes.html#int.bit_count
https://doc.rust-lang.org/std/primitive.i32.html#method.count_ones
Updated by Eregon (Benoit Daloze) over 1 year ago
The name sounds too close to #bit_length
, and length
and count
are often quite close in Ruby
(e.g. Enumerable#count without arguments is the same as #size/#length after Enumerable#to_a or on an Array, Hash, etc).
I think count_ones
is a better name because there is no ambiguity.
popcount
might be another common name for it, but that seems less clear unless the term or the assembly instruction is already known.
Also I think abbreviations are in general suboptimal in method names.
Updated by shyouhei (Shyouhei Urabe) over 1 year ago
count_ones could be something different from what the OP wants, because (-19).bit_count is expected to be 3.
PS. i64::count_ones(-19)
is 62
for Rust.
PPS. There is no such thing like a 64bit signed integer in ruby.
Updated by nobu (Nobuyoshi Nakada) over 1 year ago
GCC has __builtin_popcount
and Ruby defines rb_popcount
function family internally.
Updated by byroot (Jean Boussier) over 1 year ago
I also think popcount
makes sense. Yes it's a bit of a cryptic name, but if you are dealing with bits, you are likely to be familiar with that name.
Updated by nobu (Nobuyoshi Nakada) over 1 year ago
- Is duplicate of Feature #8748: Integer#popcount (Fixnum#popcount and Bignum#popcount) added
Updated by garrison (Garrison Jensen) over 1 year ago
I like popcount
but I also like count_ones
because it sounds friendlier, however I have no strong preferences for the chosen name.
Also, I want to share my performance tests in case they are helpful for the discussion. As you can see, the built-in method is significantly faster.
(0..10_000_000).each { |x| x.to_s(2).count('1') }
processing time: 3.714706s
(0..10_000_000).each { |x| ruby_pop_count(x) }
processing time: 3.367775s
(0..10_000_000).each { |x| x.pop_count }
processing time: 0.515767s
Here are my implementations:
def ruby_pop_count(n)
n = n.abs
count = 0
while n > 0
n &= n - 1
count += 1
end
count
end
unsigned int pop_count(long value) {
#ifdef __GNUC__
// Use GCC built-in
return __builtin_popcountl(value);
#else
// Fallback method for compilers without the built-in
unsigned int count = 0;
while (value) {
count += value & 1;
value >>= 1;
}
return count;
#endif
}
// Wrapper function for Ruby
VALUE rb_pop_count(VALUE self) {
long value = NUM2LONG(self);
unsigned int count = pop_count(labs(value));
return UINT2NUM(count);
}
Updated by tenderlovemaking (Aaron Patterson) 3 days ago
I would also like a popcount
method. I prefer popcount
over bit_count
. I wrote my own here (but it only deals with single bytes).
Updated by akr (Akira Tanaka) 3 days ago
What is the behavior for negative values?
The proposal describes two implenentations that returns different values.
p (-5).to_s(2).count("1") #=> 2
def bit_count(n)
count = 0
while n > 0
n &= n - 1 # Flip the least significant 1 bit to 0
count += 1
end
count
end
p bit_count(-5) #=> 0
Updated by byroot (Jean Boussier) 3 days ago
What is the behavior for negative values?
IMO, the only behavior that makes sense in the context of a language with arbitrary size integers is to ignore the sign bit.
Updated by mame (Yusuke Endoh) 3 days ago
What are the intended use cases for this proposal?
My experience (in other languages) involves two use cases of popcount:
- Bitboards for game AI (like Reversi) to count pieces.
- Succinct data structures (like LOUDS Tries) for rank operations.
In both scenarios, integers are treated as unsigned bitsets.
Does anyone have a use case where popcount on a negative number is necessary?
If not, I guess raising an exception would be the best behavior.
Updated by tompng (tomoya ishida) 3 days ago
I also think popcount of a negative number should raise error because of the ambiguity.
One way to extend popcount to negative number is using a relationship below, derived from the fact that -5 == 0b111...11111011
has 1 fewer bits compared to -1 == 0b111...11111111
.
x.popcount == popcount_of_minus_one - (~x).popcount
# popcount_of_minus_one: 0 or -1 or something else representing infinity bits
I'm not sure if this definition is useful, but anyway, extending to negative number has ambiguity.
If someone want a popcount of abs, n.abs.popcount
is clearer.
Updated by ahorek (Pavel Rosický) 2 days ago
x64 and ARM have specialized CPU instructions
https://godbolt.org/z/xvGvzsvd9
and Ruby already uses it internally, for instance
https://github.com/ruby/ruby/blob/dc555a48e750b4d50eb7a7000ca1bfb927fa9459/string.c#L2209
That said, Ruby isn’t the ideal choice for implementing memory allocators, SIMD masks, parity checks, GCD calculations, UTF parsers, or prime sieving… but since many other languages, even Python, provide support for popcount, why not?
Updated by byroot (Jean Boussier) 2 days ago
but since many other languages, even Python, provide support for popcount, why not?
Usually a higher bar than that is required for a new method to be added to Ruby.
I personally don't have an immediate use case to point to (except for Aaron's gem of course). But more generally, in recent years we've tried to eliminate or at least reduce the need for C extensions, and bit operation are often useful in these cases, so I'm sympathetic to more bit oriented capacities.
Updated by tenderlovemaking (Aaron Patterson) 2 days ago
mame (Yusuke Endoh) wrote in #note-10:
What are the intended use cases for this proposal?
My experience (in other languages) involves two use cases of popcount:
- Bitboards for game AI (like Reversi) to count pieces.
- Succinct data structures (like LOUDS Tries) for rank operations.
In both scenarios, integers are treated as unsigned bitsets.
My experience is similar. I've used it for sets (like I linked above) as well as modeling undirected graphs (a bit matrix, but I omitted popcount from the blog post). I've only used unsigned integers, and I think it would be a bug in my code if the integers were signed.
Does anyone have a use case where popcount on a negative number is necessary?
If not, I guess raising an exception would be the best behavior.
I agree, and I think it should raise an exception.
ahorek (Pavel Rosický) wrote in #note-12:
That said, Ruby isn’t the ideal choice for implementing memory allocators, SIMD masks, parity checks, GCD calculations, UTF parsers, or prime sieving…
Not yet! But hopefully someday! 😂
Updated by garrison (Garrison Jensen) 2 days ago
Python ignores the sign. It seems friendlier to match that behavior than throw an exception.
(-x).popcount == x.popcount
Updated by tenderlovemaking (Aaron Patterson) 2 days ago
garrison (Garrison Jensen) wrote in #note-15:
Python ignores the sign. It seems friendlier to match that behavior than throw an exception.
(-x).popcount == x.popcount
When would you use a negative number unless it's a mistake in your code?
Updated by tenderlovemaking (Aaron Patterson) 2 days ago
It seems like the Python folks didn't have too serious a discussion about handling negative numbers.
https://github.com/python/cpython/issues/74068#issuecomment-1093743975
https://github.com/python/cpython/issues/74068#issuecomment-1093743978
I prefer it raises an exception as it's probably a mistake in the code. 🤷🏻♀️
Updated by garrison (Garrison Jensen) 2 days ago
tenderlovemaking (Aaron Patterson) wrote in #note-16:
When would you use a negative number unless it's a mistake in your code?
I don't have a strong argument. Raising an exception sounds good to me.