Feature #21700
Open
`IO::Buffer.map`: offset argument is "broken" and needs to be made more useful
Description
This was supposed to be a bug report, but it turned into a feature request halfway through.
IO::Buffer.map supports size and offset arguments to restrict the buffer to a specific part of the mapped file. While size works fine, specifying offset as anything but 0 is a recipe for disaster.
- It is practically undocumented. This is all there is currently: "Optional size and offset of mapping can be specified." While `size` can be whatever the user desires, `offset` can only be set to specific values. This was very surprising when I started testing it, as I expected it to behave similarly to the size and offset of a slice. Which values are allowed? That's the next point.
- Possible values for `offset` are highly platform-dependent.
  - On Linux it must be a multiple of the page size. (https://www.man7.org/linux/man-pages/man2/mmap.2.html)
  - On macOS it can be any value? The manpage doesn't specify exactly. (https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/mmap.2.html; I looked at the manpage on a Mac, and it seems to be the same.)
  - On Windows it must be a multiple of some value other than the page size (the page size itself doesn't work). StackOverflow says it's 64 kB, but that didn't work for me. (See dwFileOffsetLow in https://learn.microsoft.com/en-us/windows/win32/api/memoryapi/nf-memoryapi-mapviewoffile)
  - On other systems? Who knows.
- On macOS, a buffer successfully created with an offset crashes on read. I don't know why: it's exactly the same file, size and offset that work on Ubuntu, and the read is within bounds.
- Currently, if size is unspecified, passing offset doesn't reduce the buffer's size, leading to crashes. I will be making a PR addressing this, but still.
- All of this makes offset impractical to test, leading to all kinds of problems.
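For comparison, here is the byte-granular behaviour of `IO::Buffer#slice` that the first point alludes to, next to a check for the kind of alignment rule a mapping offset is subject to. This is only a minimal sketch: `page_aligned?` is a hypothetical helper, not part of `IO::Buffer`, and it assumes the Linux rule (a multiple of the page size), which per the list above does not hold everywhere.

```ruby
# Slices accept any in-bounds byte offset and length.
buffer = IO::Buffer.for("0123456789")
part = buffer.slice(3, 4)
puts part.get_string # prints "3456"

# Hypothetical helper: true if `offset` is a multiple of the page size.
# ASSUMPTION: page-size alignment is the Linux rule; other platforms differ.
def page_aligned?(offset, page_size = IO::Buffer::PAGE_SIZE)
  (offset % page_size).zero?
end

page_aligned?(0)                         # => true
page_aligned?(3)                         # => false
page_aligned?(2 * IO::Buffer::PAGE_SIZE) # => true
```

An offset of 3 is perfectly fine for a slice, but it would not pass a page-size check on any platform.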
So the problem is: this argument is a very low-level detail that requires significant knowledge of all platforms your code may ever be deployed to, even if strictly Ruby is used (no C API), with unexpectedly high footgun potential. I understand that IO::Buffer is a lower-level class, but this seems like too much.
I propose turning the offset argument into a convenience for users, allowing any reasonable value (i.e. inside the file). Behind the scenes, we can use the native offset for efficiency on platforms where it works well, but the user API would be decoupled from this. However, I'm not particularly familiar with Ruby internals, so I can't tell how viable this is.
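To make the proposal concrete, here is a rough sketch in plain Ruby of what such a decoupled API could do: round the requested offset down to an aligned native offset, map a slightly larger window, and hand back a slice. `map_at` and its `granularity:` keyword are hypothetical names, not an existing or planned API. The default granularity assumes the Linux rule (page size), and the code still relies on the native offset working at all, which the macOS crash above calls into question.

```ruby
require "tmpdir"

# Hypothetical helper, not part of IO::Buffer: maps `size` bytes of `file`
# starting at ANY in-file byte `offset`.
# ASSUMPTION: the native offset must be a multiple of `granularity`, which
# defaults to the page size (the Linux rule); other platforms may differ.
def map_at(file, size = nil, offset = 0, granularity: IO::Buffer::PAGE_SIZE)
  file_size = file.size
  raise ArgumentError, "offset is outside the file" unless (0...file_size).cover?(offset)

  size ||= file_size - offset # default: everything from offset to end of file
  raise ArgumentError, "size must be positive" unless size.positive?
  raise ArgumentError, "range extends past end of file" if offset + size > file_size

  skip = offset % granularity # bytes between the aligned start and the requested start
  # Map a window that starts on the aligned boundary...
  window = IO::Buffer.map(file, size + skip, offset - skip, IO::Buffer::READONLY)
  # ...and expose only the bytes the caller asked for.
  window.slice(skip, size)
end

# Usage: read 6 bytes straddling the boundary between the 2nd and 3rd page.
Dir.mktmpdir do |dir|
  page = IO::Buffer::PAGE_SIZE
  path = File.join(dir, "data.bin")
  File.binwrite(path, "a" * page + "b" * page + "c" * page)

  File.open(path, "rb") do |file|
    view = map_at(file, 6, 2 * page - 3)
    puts view.get_string # last 3 bytes of page 2 followed by first 3 bytes of page 3
  end
end
```

The slice keeps the caller-facing offset and size byte-granular, while the alignment requirement stays an internal detail. A real implementation would presumably do this in C and pick the granularity per platform.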
Updated by trinistr (Alexander Bulancov) about 1 hour ago
> Currently, offset doesn't change the buffer's size if unspecified, leading to crashes. I will be making a PR addressing this but still.
The pull request in question: https://github.com/ruby/ruby/pull/15264