Feature #20163
openIntroduce #bit_count method on Integer
Description
This feature request is to implement a method called #bit_count on Integer that returns the number of ones in the binary representation of the absolute value of the integer.
n = 19
n.bit_count #=> 3
(-n).bit_count #=> 3
This is often useful when you use an integer as a bitmask and want to count how many bits are set.
This would be equivalent to
n.to_s(2).count("1")
However, this can be outperformed by
def bit_count(n)
  count = 0
  while n > 0
    n &= n - 1 # Flip the least significant 1 bit to 0
    count += 1
  end
  count
end
I think this would be a useful addition because it would fit alongside the other bit-related methods defined on integer: #bit_length, #allbits?, #anybits?, #nobits?. Also, when working with bitmasks, a minor upgrade to performance often results in a significant improvement.
Similar methods from other languages:
https://docs.python.org/3/library/stdtypes.html#int.bit_count
https://doc.rust-lang.org/std/primitive.i32.html#method.count_ones
        
          
          Updated by Eregon (Benoit Daloze) almost 2 years ago
          
          
        
        
      
      The name sounds too close to #bit_length, and length and count are often quite close in Ruby
(e.g. Enumerable#count without arguments is the same as #size/#length after Enumerable#to_a or on an Array, Hash, etc).
I think count_ones is a better name because there is no ambiguity.
popcount might be another common name for it, but that seems less clear unless the term or the assembly instruction is already known.
Also I think abbreviations are in general suboptimal in method names.
        
          
          Updated by shyouhei (Shyouhei Urabe) almost 2 years ago
          
          
        
        
      
      count_ones could be something different from what the OP wants, because (-19).bit_count is expected to be 3.
PS. i64::count_ones(-19) is 62 for Rust.
PPS. There is no such thing like a 64bit signed integer in ruby.
        
          
          Updated by nobu (Nobuyoshi Nakada) almost 2 years ago
          
          
        
        
      
      GCC has __builtin_popcount and Ruby defines rb_popcount function family internally.
        
          
          Updated by byroot (Jean Boussier) almost 2 years ago
          
          
        
        
      
      I also think popcount makes sense. Yes it's a bit of a cryptic name, but if you are dealing with bits, you are likely to be familiar with that name.
        
          
          Updated by nobu (Nobuyoshi Nakada) almost 2 years ago
          
          
        
        
      
      - Is duplicate of Feature #8748: Integer#popcount (Fixnum#popcount and Bignum#popcount) added
 
        
          
          Updated by garrison (Garrison Jensen) almost 2 years ago
          
          
        
        
      
      I like popcount but I also like count_ones because it sounds friendlier, however I have no strong preferences for the chosen name.
Also, I want to share my performance tests in case they are helpful for the discussion. As you can see, the built-in method is significantly faster.
(0..10_000_000).each { |x| x.to_s(2).count('1') }
processing time: 3.714706s
(0..10_000_000).each { |x| ruby_pop_count(x) }
processing time: 3.367775s
(0..10_000_000).each { |x| x.pop_count }
processing time: 0.515767s
Here are my implementations:
def ruby_pop_count(n)
  n = n.abs
  count = 0
  while n > 0
    n &= n - 1 
    count += 1
  end
  count
end
unsigned int pop_count(long value) {
#ifdef __GNUC__
    // Use GCC built-in
    return __builtin_popcountl(value);
#else
    // Fallback method for compilers without the built-in
    unsigned int count = 0;
    while (value) {
        count += value & 1;
        value >>= 1;
    }
    return count;
#endif
}
// Wrapper function for Ruby
VALUE rb_pop_count(VALUE self) {
    long value = NUM2LONG(self);
    unsigned int count =  pop_count(labs(value));
    return UINT2NUM(count);
}
        
          
          Updated by tenderlovemaking (Aaron Patterson) 2 months ago
          
          
        
        
      
      I would also like a popcount method. I prefer popcount over bit_count.  I wrote my own here (but it only deals with single bytes).
        
          
          Updated by akr (Akira Tanaka) 2 months ago
          
          
        
        
      
      What is the behavior for negative values?
The proposal describes two implenentations that returns different values.
p (-5).to_s(2).count("1") #=> 2
def bit_count(n)
  count = 0
  while n > 0
    n &= n - 1 # Flip the least significant 1 bit to 0
    count += 1
  end
  count
end
p bit_count(-5) #=> 0
        
          
          Updated by byroot (Jean Boussier) 2 months ago
          
          
        
        
      
      What is the behavior for negative values?
IMO, the only behavior that makes sense in the context of a language with arbitrary size integers is to ignore the sign bit.
        
          
          Updated by mame (Yusuke Endoh) 2 months ago
          
          
        
        
      
      What are the intended use cases for this proposal?
My experience (in other languages) involves two use cases of popcount:
- Bitboards for game AI (like Reversi) to count pieces.
 - Succinct data structures (like LOUDS Tries) for rank operations.
 
In both scenarios, integers are treated as unsigned bitsets.
Does anyone have a use case where popcount on a negative number is necessary?
If not, I guess raising an exception would be the best behavior.
        
          
          Updated by tompng (tomoya ishida) 2 months ago
          
          
        
        
      
      I also think popcount of a negative number should raise error because of the ambiguity.
One way to extend popcount to negative number is using a relationship below, derived from the fact that -5 == 0b111...11111011 has 1 fewer bits compared to -1  == 0b111...11111111.
x.popcount == popcount_of_minus_one - (~x).popcount
# popcount_of_minus_one: 0 or -1 or something else representing infinity bits
I'm not sure if this definition is useful, but anyway, extending to negative number has ambiguity.
If someone want a popcount of abs, n.abs.popcount is clearer.
        
          
          Updated by ahorek (Pavel Rosický) 2 months ago
          
          
        
        
      
      x64 and ARM have specialized CPU instructions
https://godbolt.org/z/xvGvzsvd9
and Ruby already uses it internally, for instance
https://github.com/ruby/ruby/blob/dc555a48e750b4d50eb7a7000ca1bfb927fa9459/string.c#L2209
That said, Ruby isn’t the ideal choice for implementing memory allocators, SIMD masks, parity checks, GCD calculations, UTF parsers, or prime sieving… but since many other languages, even Python, provide support for popcount, why not?
        
          
          Updated by byroot (Jean Boussier) 2 months ago
          
          
        
        
      
      but since many other languages, even Python, provide support for popcount, why not?
Usually a higher bar than that is required for a new method to be added to Ruby.
I personally don't have an immediate use case to point to (except for Aaron's gem of course). But more generally, in recent years we've tried to eliminate or at least reduce the need for C extensions, and bit operation are often useful in these cases, so I'm sympathetic to more bit oriented capacities.
        
          
          Updated by tenderlovemaking (Aaron Patterson) 2 months ago
          
          
        
        
      
      mame (Yusuke Endoh) wrote in #note-10:
What are the intended use cases for this proposal?
My experience (in other languages) involves two use cases of popcount:
- Bitboards for game AI (like Reversi) to count pieces.
 - Succinct data structures (like LOUDS Tries) for rank operations.
 In both scenarios, integers are treated as unsigned bitsets.
My experience is similar. I've used it for sets (like I linked above) as well as modeling undirected graphs (a bit matrix, but I omitted popcount from the blog post). I've only used unsigned integers, and I think it would be a bug in my code if the integers were signed.
Does anyone have a use case where popcount on a negative number is necessary?
If not, I guess raising an exception would be the best behavior.
I agree, and I think it should raise an exception.
ahorek (Pavel Rosický) wrote in #note-12:
That said, Ruby isn’t the ideal choice for implementing memory allocators, SIMD masks, parity checks, GCD calculations, UTF parsers, or prime sieving…
Not yet! But hopefully someday! 😂
        
          
          Updated by garrison (Garrison Jensen) 2 months ago
          
          
        
        
      
      Python ignores the sign. It seems friendlier to match that behavior than throw an exception.
(-x).popcount == x.popcount
        
          
          Updated by tenderlovemaking (Aaron Patterson) 2 months ago
          
          
        
        
      
      garrison (Garrison Jensen) wrote in #note-15:
Python ignores the sign. It seems friendlier to match that behavior than throw an exception.
(-x).popcount == x.popcount
When would you use a negative number unless it's a mistake in your code?
        
          
          Updated by tenderlovemaking (Aaron Patterson) 2 months ago
          
          
        
        
      
      It seems like the Python folks didn't have too serious a discussion about handling negative numbers.
https://github.com/python/cpython/issues/74068#issuecomment-1093743975
https://github.com/python/cpython/issues/74068#issuecomment-1093743978
I prefer it raises an exception as it's probably a mistake in the code. 🤷🏻♀️
        
          
          Updated by garrison (Garrison Jensen) 2 months ago
          
          
        
        
      
      tenderlovemaking (Aaron Patterson) wrote in #note-16:
When would you use a negative number unless it's a mistake in your code?
I don't have a strong argument. Raising an exception sounds good to me.
        
          
          Updated by akr (Akira Tanaka) 2 months ago
          
          
        
        
      
      I prefer an exception for popcount to negative values.
I think an array of Fixnums (63 bit signed integers) can be used for mutable bit array.
(Ruby's Integer is immutable.  So mutable bit array needs a mutable data structure.)
In this scenario, we would like to count bits in a negative fixnum: (n & 0x7ffffffff).popcount.
This is not n.abs.popcount.
Also, this behavior is not suitable for Integer#popcount because it ignores bits higher than 63th bit.
So, I prefer an exception for negative values.
        
          
          Updated by slewsys (Andrew Moore) about 2 months ago
          
          
        
        
      
      garrison (Garrison Jensen) wrote in #note-6:
unsigned int pop_count(long value) { #ifdef __GNUC__ // Use GCC built-in return __builtin_popcountl(value); #else // Fallback method for compilers without the built-in unsigned int count = 0; while (value) { count += value & 1; value >>= 1; } return count; #endif } // Wrapper function for Ruby VALUE rb_pop_count(VALUE self) { long value = NUM2LONG(self); unsigned int count = pop_count(labs(value)); return UINT2NUM(count); }
Section 1.8 of Matters Computational by Jörg Arndt offers efficient algorithms for popcount, including for arrays. There is also an x86 instruction POPCNT.
Matters Computational: https://www.jjj.de/fxt/fxtbook.pdf
E.g.,
unsigned long popcount (unsigned long value) {
    value -= (value >> 1) & 0x5555555555555555UL;
    value = ((value >> 2) & 0x3333333333333333UL) + (value & 0x3333333333333333UL);
    value = ((value >> 4) + value) & 0x0f0f0f0f0f0f0f0fUL;
    value *= 0x0101010101010101UL;
    return value >> 56;
}
By the way, "popcount" is short for "population count", whereas "pop_count" suggests (to me) a stack operation - a bit (hm) confusing.
        
          
          Updated by YO4 (Yoshinao Muramatsu) about 1 month ago
          
          
        
        
      
      We can use Integer#[] to negative numbers.
-1[0,8].popcount # => 8
-2[0,8].popcount # => 7
-3[0,8].popcount # => 7
In computers that use two's complement for negative numbers, I think this is expected behavior.
If generating sliced values becomes an overhead, we can add arguments to popcount.
        
          
          Updated by mame (Yusuke Endoh) about 1 month ago
          
          
        
        
      
      In computers that use two's complement for negative numbers, I think this is expected behavior.
Yes, but do you want -1.popcount to return Float::INFINITY? That might be theoretically consistent, but useless and even problematic, I think.
        
          
          Updated by YO4 (Yoshinao Muramatsu) about 1 month ago
          
          
        
        
      
      Popcount for negative numbers is useful when equivalent functionality to machine register operations is desired, but I do not consider operations on infinite bit widths to be useful.
When popcount is called on a negative number (without specifying a bit width if popcount has arguments), it may indicate abnormal input or a bug in user code.
I am +1 for raising exception on that case.
        
          
          Updated by matz (Yukihiro Matsumoto) 12 days ago
          
          
        
        
      
      Is there any real-world use-case for Integer#popcount method?
Bit-array?
Matz.
        
          
          Updated by tenderlovemaking (Aaron Patterson) 11 days ago
          
          
        
        
      
      matz (Yukihiro Matsumoto) wrote in #note-24:
Is there any real-world use-case for Integer#popcount method?
Bit-array?Matz.
Yes, I am using it for a bit set gem (I had to implement my own popcount here). I want to use a bit array for representing an undirected graph.