Before using String#setbyte I tested how it would respond to values lower than 0 or greater than 255 and found that it automatically performed the needed modulo 256 operation (at least up to Ruby 2.5.3). Therefore I left out the explicit modulo operation for performance reasons.
Would it make sense to change the String#setbyte implementation to perform the modulo operation? This would restore compatibility with prior Ruby versions and may be what people would expect.
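For reference, here is a minimal sketch of the explicit wrap that callers can apply themselves so the result does not depend on what String#setbyte does with out-of-range values. The helper name set_wrapped_byte is hypothetical and only used for illustration.

```ruby
# Hypothetical helper (not part of Ruby or HexaPDF): reduce any Integer
# to the 0..255 range before handing it to String#setbyte.
def set_wrapped_byte(str, index, value)
  # Integer#% with a positive modulus always yields a non-negative result,
  # so negative inputs wrap around as unsigned bytes.
  str.setbyte(index, value % 256)
  str
end

data = "abc".b                       # binary copy of the literal
set_wrapped_byte(data, 0, 256 + 65)  # 321 % 256 == 65
set_wrapped_byte(data, 1, -1)        # -1 % 256 == 255
p data.bytes                         # => [65, 255, 99]
```

Writing the modulo explicitly costs one extra operation per byte but behaves the same on every Ruby version.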
7213568733f673da0d82f95e8a1bccf79ba3f0d3
Author: shyouhei <shyouhei@b2dd03c8-39d4-4d8f-98ff-823fe69b080e>
Date: Mon Nov 19 09:52:46 2018 +0000
string.c: setbyte silently ignores upper bits
The behaviour of String#setbyte has been depending on the width
of int, which is not portable. Must check explicitly.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@65804 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
:100644 100644 5a5cbb576c31350a416b2121e5efb85aaffa0676 e55c59136a49fb7d2d70805a65a4c7f56519f2e9 M string.c
:040000 040000 2170f3be4ac9d27e26ce361c38beb7874c07436f e71fc9f00ed5602a30e9f504fc015b9bf7b4dad1 M test
Apologies for the previous comment. It showed an IO#ungetbyte example; I confused the two because I fixed them on the same day for the same reason. The same thing happens with String#setbyte.
% ruby -v -e 'p "foo".setbyte(0,18446744073709551616)'
ruby 1.9.3p551 (2014-11-13) [x86_64-darwin15.6.0]
-e:1:in `setbyte': bignum too big to convert into `long' (RangeError)
from -e:1:in `<main>'
I think it's good to be strict here, i.e., to raise RangeError when the value passed to String#setbyte is 256 or higher, as that could very well be a bug in the calling code.
Negative values also sound buggy in most situations.
Why are values higher in HexaPDF? Does it mean multiple bytes need to be written or is it enough to throw away the high bits?
I suspect the latter is rarely correct.
The PNG spec says that all operations are to be taken as unsigned arithmetic modulo 256, and since this is what String#setbyte originally did, I just left out the additional modulo operation.
It is not a huge problem for me and I will release a new HexaPDF version with the fix soon.
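To illustrate the kind of arithmetic involved, here is a minimal sketch of reversing the PNG "Sub" filter with the modulo written out explicitly. This is not HexaPDF's actual code; the method name unfilter_sub! and its parameters are assumptions made for the example.

```ruby
# Hypothetical helper: undo the PNG "Sub" filter on one scanline in place.
# Each byte is the stored value plus the already reconstructed byte one
# pixel to the left, taken as unsigned arithmetic modulo 256.
def unfilter_sub!(row, bytes_per_pixel)
  bytes_per_pixel.upto(row.bytesize - 1) do |i|
    left = row.getbyte(i - bytes_per_pixel)
    # The sum can reach 510, so reduce it explicitly before storing.
    row.setbyte(i, (row.getbyte(i) + left) % 256)
  end
  row
end

row = [200, 100, 50].pack("C*")  # filtered bytes, one byte per pixel
unfilter_sub!(row, 1)
p row.bytes                      # => [200, 44, 94]
```

Here 200 + 100 = 300 wraps to 44, and 44 + 50 = 94 needs no wrapping.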
As for whether the modulo operation should be done by String#setbyte: many methods automatically convert arguments to an expected type if possible, e.g. with #to_str. Converting an integer to a byte using modulo 256 seems like the most straightforward way to do this for String#setbyte.
I had a chance this week to ask matz if he wants to allow bigger inputs or not for those methods. He answered yes. He prefers mod 256 behaviour for larger numbers. I will fix them again.