Bug #18417
Updated by zverok (Victor Shepelev) about 3 years ago
Hello. While working for documentation for `IO::Buffer` (still WIP), I uncovered these problems. I believe at least some of them should be fixed before the release, though I understand not much time left. The list is not final, probably (still on it, but wanted to raise awareness ASAP). **These should be probably fixed/discussed before the release** 1\. I believe `to_str` is an unfortunate name for "read string from buffer, maybe entire buffer. It is Ruby's convention that `to_str` is an implicit String-conversion, and a) I am not sure the Buffer is "string-like" enough to be implicitly converted, and b) it is really unexpected that `#to_s` and `#to_str` do different things: ```ruby buf = IO::Buffer.for('test') "buf: #{buf}" # => "buf: #<IO::Buffer 0x00007fdb7f4e3a20+4 SLICE>" "buf: " + buf # => "buf: test" puts buf # #<IO::Buffer 0x00007fdb7f4e3a20+4 SLICE> puts '' + buf # test ``` Another concern about `#to_str` naming is it is one of the main parts of the buffer interface (alongside `#copy`), but doesn't look this way, and looks like some utility method instead. Maybe copy/to_str pair should be symmetrical, like `write`/`read`, or something like that? 2\. I am not sure that type names (`:U8`, `:S8`, `:U16`, `:u16` and so on) introduced "from scratch" (?) is a good thing. From one point, they are kinda nice, and consistent. From another, we already have abbreviated names for the types in [Array#pack](https://docs.ruby-lang.org/en/master/Array.html#method-i-pack), and having two completely unrelated sets of designation for the same thing in the **core of the language** seems weird to me. I understand that not all `#pack`s designations can be represented by Symbol, and their consistency is questionable (it is `S` for 16-bit unsigned LE, `s` for 16-bit signed LE, and `S>`/`s>` for the same in BE), but the discrepancy is questionable too. 3\. About `flags` as a creation parameter: * `::new`: it seems that the only acceptable value is `MAPPED`; of others, `INTERNAL` is the default, `EXTERNAL`, `IMMUTABLE` and `LOCKED` raise on creation; * `::for`: no flags accepted (though it seems that a method to make such buffer immutable might be of some utility?) * `::map`: `INTERNAL`, `LOCKED`, `EXTERNAL` are ignored, `MAPPED` is the default; seems like the only usage of the parameter is _not_ to pass `IMMUTABLE` which is the default: and for that, you'll need to also pass `size` and `offset`. It seems, that **we shouldn't probably expose "flags" concept to the user** at all, and instead make APIs like `::new(size, mapped: false)` and `::map(io, ..., immutable: true)`? **Probably bugs** * It seems that `#clear` behaves weirdly when `offset` is provided but `length` is not: ```ruby buf = IO::Buffer.new(4) buf.clear(1, 2) # it seems obvious that I want to clear from byte 2 to the end of the buffer # in `clear': Offset + length out of bounds! (RuntimeError) ``` * `IO::Buffer.map('README.md')` (just a string, I experimented "whether it is smart enough") core dumps. * I suspect we might want to set buffer to `IMMUTABLE` if it is created from the frozen string. Currently, this works: ```ruby str = 'test'.freeze buf = IO::Buffer.for(str) buf.set(:U8, 1, 111) str #=> "tost" ``` * It seems `to_str` always returns a string in `US-ASCII` encoding, I am not sure it is the right behavior. For a low-level buffer, I'd rather expect `ASCII-8BIT` (what file read in binary mode produces in Ruby). * (Not sure if a bug or just not very useful behavior?) `#resize` applied to a buffer mapped from file seems to "break" the link between the buffer and file; no further `copy` or `set` would be reflected on file. * ~~`buf = IO::Buffer.new(4); buf.free; buf.inspect` gives a core dump~~ dump~ * ~~`buf = IO::Buffer.for("test"); buf.free` — after that, `buf.get(:U8, 0)` gives "Buffer has been invalidated! (RuntimeError)", but `buf.to_str` works and produces `"\x00\x00\x00\x00"`; `buf.to_str(2)` does a core dump~~ **Inconsistencies and other problems** * The buffer raises `RuntimeError` as the only type of the error. I believe that in some places, it should be other standard errors (for example, "Offset + length out of bounds" could be at least `ArumentError`?..), and in other occurrences, special error classes: for example, "Buffer is locked" would probably be helpful to have as some `IO::Buffer::LockedError` * ~~Current `#inspect` is kinda nice and useful, but it also might be really annoying when used by Ruby itself. Try `IO::Buffer.map(File.open("big file")).unexisting_method`, and you'll get REALLY long error message.~~ * ~~Can we maybe have for `#resize` a signature `resize(new_size, preserve = self.size)`? It seems a good default to allow just `resize(123)`, with preserving everything by default?~~