Accept IO object as stdin data in Open3.capture
Currently Open3.capture3, Open3.capture2, and Open3.capture2e accept a :stdin_data option, which lets you write a String to the subprocess's standard input. This patch adds the ability to also pass an IO-like object (any object that responds to #read) as :stdin_data, which will then be streamed to standard input.
Open3.capture3("file", "--mime-type", "--brief", "-", stdin_data: File.open("image.jpg"))
Open3.capture3("ffprobe", "-print_format", "json", "-i", "pipe:0", stdin_data: File.open("video.mp4"))
This is convenient when you want to pass files (images, videos, etc.) to standard input: you don't have to load the whole file into memory, because the file contents are streamed efficiently into the subprocess's standard input.
Another advantage is that many command-line tools stop reading standard input once they have enough data. In both examples above, the subprocess stops reading standard input as soon as it gets the information it needs (the image MIME type or the video metadata), which in both cases turns out to be about 1-2 MB. This isn't that useful if the IO object represents a file on the filesystem (where reading is fast), but it becomes very useful when the IO object represents a file from a database or a remote file over HTTP. That way you don't need to guess how much data the subprocess needs. You can just give it the IO object, it will read as much as it needs, and only that amount will be retrieved from the database or downloaded from the Internet.
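The behaviour described above can be approximated with Open3.popen2 and IO.copy_stream. What follows is only a rough sketch under stated assumptions, not the patch itself: `capture2_streaming` and `CountingReader` are hypothetical names invented for illustration, and the child process is a small Ruby one-liner standing in for a tool such as `file` or `ffprobe` that stops reading early.

```ruby
require "open3"
require "rbconfig"
require "stringio"

# Hypothetical wrapper that counts how many bytes are pulled from the source.
# It responds only to #read, which is what IO.copy_stream falls back to when
# the source does not respond to #readpartial.
class CountingReader
  attr_reader :bytes_read

  def initialize(io)
    @io = io
    @bytes_read = 0
  end

  # IO.copy_stream calls read(length, outbuf) and expects outbuf to be filled.
  def read(length = nil, outbuf = +"")
    data = @io.read(length, outbuf)
    @bytes_read += data.bytesize if data
    data
  end
end

# Hypothetical helper (not part of Open3): streams `source` into the
# command's standard input and returns [stdout, status].
def capture2_streaming(*cmd, source:)
  Open3.popen2(*cmd) do |stdin, stdout, wait_thr|
    # Read stdout in a separate thread so the child never blocks on output.
    reader = Thread.new { stdout.read }
    begin
      IO.copy_stream(source, stdin)
    rescue Errno::EPIPE
      # The subprocess stopped reading early; there is nothing more to send.
    end
    stdin.close
    [reader.value, wait_thr.value]
  end
end

total  = 4 * 1024 * 1024
source = CountingReader.new(StringIO.new("x" * total))

# The child reads only the first 5 bytes of its standard input and exits.
output, status = capture2_streaming(
  RbConfig.ruby, "-e", "print STDIN.read(5)", source: source
)

puts "child printed #{output.inspect}, exit ok: #{status.success?}"
puts "pulled #{source.bytes_read} of #{total} bytes from the source"
```

How many bytes are actually pulled from the source depends on the pipe buffer size of the platform and on the chunk size used by IO.copy_stream, so the second number varies between systems. The point is only that it stays well below the total once the child exits.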
Updated by akr (Akira Tanaka) over 2 years ago
- Status changed from Assigned to Closed
Updated by janko (Janko Marohnić) over 2 years ago
Thank you for the patch!
Since IO.copy_stream also accepts IO objects that respond only to #read (and not #readpartial), would it be possible to also permit those objects as :stdin_data (maybe check that the object responds to either #read or #readpartial)?
That was my use case: being able to pass any #read-able object as the standard input. In my case, the IO-like objects I work with don't respond to #readpartial, because #read is enough.
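To illustrate the request, here is a minimal sketch of the suggested check together with an object that responds only to #read. `ReadOnlySource` and `stdin_data_kind` are hypothetical names used for illustration; they are not taken from the committed change, and the actual condition in Open3 may differ.

```ruby
require "stringio"

# Hypothetical IO-like object that implements #read but not #readpartial.
class ReadOnlySource
  def initialize(string)
    @io = StringIO.new(string)
  end

  # Same (length, outbuf) signature that IO.copy_stream uses for non-IO sources.
  def read(length = nil, outbuf = +"")
    @io.read(length, outbuf)
  end
end

# Hypothetical helper showing the proposed check: treat the value as a stream
# if it responds to either #readpartial or #read, otherwise as a String.
def stdin_data_kind(stdin_data)
  if stdin_data.respond_to?(:readpartial) || stdin_data.respond_to?(:read)
    :stream # would be copied with IO.copy_stream
  else
    :string # would be written with IO#write
  end
end

source = ReadOnlySource.new("abc")
dst    = StringIO.new

# IO.copy_stream falls back to #read when the source lacks #readpartial.
copied = IO.copy_stream(source, dst)

puts stdin_data_kind(source)  # stream
puts stdin_data_kind("abc")   # string
puts "copied #{copied} bytes: #{dst.string.inspect}"
```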