Project

General

Profile

Actions

Bug #18417

closed

IO::Buffer problems

Added by zverok (Victor Shepelev) about 3 years ago. Updated about 3 years ago.

Status:
Closed
Target version:
-
[ruby-core:106730]

Description

Hello. While working for documentation for IO::Buffer (still WIP), I uncovered these problems. I believe at least some of them should be fixed before the release, though I understand not much time left.

The list is not final, probably (still on it, but wanted to raise awareness ASAP).

These should be probably fixed/discussed before the release

1. I believe to_str is an unfortunate name for "read string from buffer, maybe entire buffer. It is Ruby's convention that to_str is an implicit String-conversion, and a) I am not sure the Buffer is "string-like" enough to be implicitly converted, and b) it is really unexpected that #to_s and #to_str do different things:

buf = IO::Buffer.for('test')

"buf: #{buf}"
# => "buf: #<IO::Buffer 0x00007fdb7f4e3a20+4 SLICE>"
"buf: " + buf
# => "buf: test"

puts buf
# #<IO::Buffer 0x00007fdb7f4e3a20+4 SLICE>
puts '' + buf
# test

Another concern about #to_str naming is it is one of the main parts of the buffer interface (alongside #copy), but doesn't look this way, and looks like some utility method instead. Maybe copy/to_str pair should be symmetrical, like write/read, or something like that?

2. I am not sure that type names (:U8, :S8, :U16, :u16 and so on) introduced "from scratch" (?) is a good thing. From one point, they are kinda nice, and consistent. From another, we already have abbreviated names for the types in Array#pack, and having two completely unrelated sets of designation for the same thing in the core of the language seems weird to me. I understand that not all #packs designations can be represented by Symbol, and their consistency is questionable (it is S for 16-bit unsigned LE, s for 16-bit signed LE, and S>/s> for the same in BE), but the discrepancy is questionable too.

3. About flags as a creation parameter:

  • ::new: it seems that the only acceptable value is MAPPED; of others, INTERNAL is the default, EXTERNAL, IMMUTABLE and LOCKED raise on creation;
  • ::for: no flags accepted (though it seems that a method to make such buffer immutable might be of some utility?)
  • ::map: INTERNAL, LOCKED, EXTERNAL are ignored, MAPPED is the default; seems like the only usage of the parameter is not to pass IMMUTABLE which is the default: and for that, you'll need to also pass size and offset.

It seems, that we shouldn't probably expose "flags" concept to the user at all, and instead make APIs like ::new(size, mapped: false) and ::map(io, ..., immutable: true)?

Probably bugs

  • It seems that #clear behaves weirdly when offset is provided but length is not:
buf = IO::Buffer.new(4)
buf.clear(1, 2) # it seems obvious that I want to clear from byte 2 to the end of the buffer
# in `clear': Offset + length out of bounds! (RuntimeError)
  • IO::Buffer.map('README.md') (just a string, I experimented "whether it is smart enough") core dumps.
  • I suspect we might want to set buffer to IMMUTABLE if it is created from the frozen string. Currently, this works:
str = 'test'.freeze
buf = IO::Buffer.for(str)
buf.set(:U8, 1, 111)
str
#=> "tost"
  • It seems to_str always returns a string in US-ASCII encoding, I am not sure it is the right behavior. For a low-level buffer, I'd rather expect ASCII-8BIT (what file read in binary mode produces in Ruby).
  • (Not sure if a bug or just not very useful behavior?) #resize applied to a buffer mapped from file seems to "break" the link between the buffer and file; no further copy or set would be reflected on file.
  • buf = IO::Buffer.new(4); buf.free; buf.inspect gives a core dump
  • buf = IO::Buffer.for("test"); buf.free — after that, buf.get(:U8, 0) gives "Buffer has been invalidated! (RuntimeError)", but buf.to_str works and produces "\x00\x00\x00\x00"; buf.to_str(2) does a core dump

Inconsistencies and other problems

  • The buffer raises RuntimeError as the only type of the error. I believe that in some places, it should be other standard errors (for example, "Offset + length out of bounds" could be at least ArumentError?..), and in other occurrences, special error classes: for example, "Buffer is locked" would probably be helpful to have as some IO::Buffer::LockedError
  • Current #inspect is kinda nice and useful, but it also might be really annoying when used by Ruby itself. Try IO::Buffer.map(File.open("big file")).unexisting_method, and you'll get REALLY long error message.
  • Can we maybe have for #resize a signature resize(new_size, preserve = self.size)? It seems a good default to allow just resize(123), with preserving everything by default?
Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0