Feature #13568
closedFile#path for O_TMPFILE fds has no meaning
Description
By using File::TMPFILE (O_TMPFILE) allows us to create a file without directory entries.
While open(2) with O_TMPFILE don't create a file without directory entries, it still requires a directory name to determine a file system to create a file.
Current Ruby implementation holds such directory names in fptr->pathv and retrievable via File#path.
But such paths are useless and may raise errors. For example, some code 1 checks File#path availability then when available, it attempts to use the path to open a file in different fd, finally raises Errno::EISDIR.
This patch changes File#path (fptr->pathv) not to return String if a fd is opened with O_TMPFILE.
Files
Updated by Hanmac (Hans Mackowiak) over 7 years ago
not pro or against it but what do you think of that part?
#ifdef O_TMPFILE
if (!(oflags & O_TMPFILE))
#endif
{
fptr->pathv = pathv;
}
you might leaveout the {} but i think its more readable that way
Updated by sorah (Sorah Fukumori) over 7 years ago
Let me explain about the background bit detail.
The example I put in the previous post https://github.com/aws/aws-sdk-ruby/blob/v2.9.17/aws-sdk-core/lib/aws-sdk-core/checksums.rb#L15 is from aws-sdk-ruby gem. The core part of this example is like:
require 'digest/sha2'
def upload(body)
checksum = if body.kind_of?(IO) && body.path
Digest::SHA256.file(body.path)
else
# ...
end
# ...
end
As a user of this library, I can head the Errno::EISDIR
like the following code using O_TMPFILE
:
File.open('/tmp', File::RDWR | File::TMPFILE) do |io|
upload io #=> Errno::EISDIR
end
AFAIK There's no way to determine a File
object is made with O_TMPFILE
.
Similar situations (path given by File#path
doesn't exist or is not valid) also can be made by deleting file before closing fd. But path in File
objects made by O_TMPFILE
is completely sure that it doesn't point a file of the object itself, so I think returning String
is meaningless from such objects.
Updated by sorah (Sorah Fukumori) over 7 years ago
Hanmac (Hans Mackowiak) wrote:
not pro or against it but what do you think of that part?
I don't have strong opinion on the style for lines you pointed out...
Updated by Hanmac (Hans Mackowiak) over 7 years ago
i see your change and did read the man page of O_TMPFILE
to check if there is any useful path this could return. (there isn't)
now i am pro this change.
does ruby has a way to use linkat
like to seen there?
http://man7.org/linux/man-pages/man2/open.2.html
hm about Tempfile
, does it already use O_TMPFILE
internal, or can it maybe changed to that?
(might probably for another ticket if not already done)
as for my code change, i thought it would be better if you don't have a duplicated line in there
Updated by sorah (Sorah Fukumori) over 7 years ago
- Status changed from Open to Closed
Applied in changeset trunk|r58734.
[DOC] File#path
result can be inaccurate
-
file.c(rb_file_path): [DOC] Note that the pathname returned by this
method can be inaccurate, for instance file gets moved, renamed,
deleted or is created with File::TMPFILE option.Relates to [Feature #13568]
Updated by sorah (Sorah Fukumori) over 7 years ago
- Status changed from Closed to Open
Reopening due to closed via r58734, a document change.
Updated by sorah (Sorah Fukumori) over 7 years ago
Hanmac (Hans Mackowiak) wrote:
now i am pro this change.
does ruby has a way to uselinkat
like to seen there?
http://man7.org/linux/man-pages/man2/open.2.html
I think we don't have one yet. But I'm not positive to give side effect on File#path
.
hm about
Tempfile
, does it already useO_TMPFILE
internal, or can it maybe changed to that?
(might probably for another ticket if not already done)
It's separate issue.
Updated by sorah (Sorah Fukumori) over 7 years ago
Few additional things pointed out by some conversations:
- We miss a way to retrieve a directory path used to create tmpfile
- Although file system can be identified by
File#stat
- Although file system can be identified by
-
Digest::*.file
don't expect aFile
object, but it's actually working due toFile.open
callsto_path
internally
Updated by Hanmac (Hans Mackowiak) over 7 years ago
yeah your points should be done in extra issues i think.
for Digest
i think it should first check if its an (direct) instant of IO
, or has a to_io
method.
if yes then it should read the file directly instead of trying to open with a new File.open
Updated by sorah (Sorah Fukumori) over 7 years ago
fix proposed for aws-sdk-ruby gem https://github.com/aws/aws-sdk-ruby/pull/1516 but I still think returning nil
from File#path
when path is surely known of its unavailability is happier than current behavior.
Updated by Hanmac (Hans Mackowiak) over 7 years ago
yeah your fix for aws does looks good, but i think that should be done in Digest
directly if able. (for another ticket)
Updated by sorah (Sorah Fukumori) over 7 years ago
Agreed, but my PR sent to aws-sdk-ruby includes different fix to remove incorrect usage of File#path
, so it's necessary.
Updated by sorah (Sorah Fukumori) over 7 years ago
- Status changed from Open to Assigned
- Assignee set to sorah (Sorah Fukumori)
We discussed about this today in DevelopersMeeting20170519Japan. Conclution in this meeting is:
- Return unexist path with informational message, like in Linux procfs symbolic link (
/proc/PID/fd/N
→/tmp/#165106976 (deleted)
). - Other considerations:
- Returning
nil
has no traceability. - Raising error is an alternate way, but it doesn't sound good.
- Returning
Please see the published log for conversation details in the meeting.
Updated by Hanmac (Hans Mackowiak) over 7 years ago
i think returing an unexisting path whould be way problematic if it isnt checked if it exist.
other functions might depend on it.
and trying to open "/tmp/#165106976 (deleted)"
in (read)/writing mode might result in unexpected results.
can it later checked if a file was open with O_TMPFILE
?
if not, when with unexisting path, you can't know if the file can be reopen.
that might be related to #13576
i would prefer it to return nil
.
PS: there might be ways to get the filesystem without using that File#path
method, so return nil
should be okay
Updated by shyouhei (Shyouhei Urabe) over 7 years ago
At the meeting we discussed the use case of File#path
and we thought there are 2 kinds:
- Pass it to
open()
- Pass it to
printf()
or variant
We considered both and concluded it is best to return a nonexistent path, not nil
.
Hanmac (Hans Mackowiak) wrote:
i think returing an unexisting path whould be way problematic if it isnt checked if it exist.
other functions might depend on it.
and trying to open"/tmp/#165106976 (deleted)"
in (read)/writing mode might result in unexpected results.
Trying to open nil
is like this:
irb(main):001:0> open(nil)
TypeError: no implicit conversion of nil into String
from (irb):1:in `open'
OTOH trying to open nonexistent path is:
irb(main):001:0> open("/tmp/#165106976 (deleted)")
Errno::ENOENT: No such file or directory @ rb_sysopen - /tmp/#165106976 (deleted)
from (irb):1:in `initialize'
from (irb):1:in `open'
Do you think nil
is better? I think nonexistent path is far more descriptive.
can it later checked if a file was open with
O_TMPFILE
?
if not, when with unexisting path, you can't know if the file can be reopen.
that might be related to #13576
Yes. We found that quite generally speaking, it is a bad idea to pass a file's path into open. Because such thing is not guaranteed to exist. I proposed prohibiting File#path
but that was rejected; because path is not only meant to be passed to open but also for inspection. When passing to printf
or logger or something like that, something other than nil is desirable. I filed #13576 to focus on "file's path passed to open" situation.
i would prefer it to return
nil
.PS: there might be ways to get the filesystem without using that
File#path
method, so return nil should be okay
(Out of curiousity) are there such ways? What do you have in mind?
Updated by Hanmac (Hans Mackowiak) over 7 years ago
like i said, open with write mode is the problem.
because it didn't fail with the nonexistent path
an way for it to make it work for printf
or logger would be to make it's #inspect
working (with not using path in that case)
about the filesystem, can #stat
be used for something like that?
Updated by nobu (Nobuyoshi Nakada) over 7 years ago
shyouhei (Shyouhei Urabe) wrote:
At the meeting we discussed the use case of
File#path
and we thought there are 2 kinds:
- Pass it to
open()
- Pass it to
printf()
or variant
A correction; The method which open
calls is to_path
, not path
.
These are different.
Do you think
nil
is better? I think nonexistent path is far more descriptive.
The write mode problem is a matter.
It would silently (and wrongly) success to write.
I think that File#to_path
should raise if the path dost not exist, but untouch File#path
.
Updated by naruse (Yui NARUSE) over 7 years ago
- Related to Feature #13577: Digest.file accidentally receives File object but uses file path added
Updated by shyouhei (Shyouhei Urabe) over 7 years ago
nobu (Nobuyoshi Nakada) wrote:
shyouhei (Shyouhei Urabe) wrote:
At the meeting we discussed the use case of
File#path
and we thought there are 2 kinds:
- Pass it to
open()
- Pass it to
printf()
or variantA correction; The method which
open
calls isto_path
, notpath
.
These are different.
Yes, I'm talking about File#path
. Situations like ruby-core:81167 is in my mind. #to_path
is a separate issue.
Do you think
nil
is better? I think nonexistent path is far more descriptive.The write mode problem is a matter.
It would silently (and wrongly) success to write.
What about adding a slash then? like:
irb(main):001:0> open("/tmp/#165106976 (deleted / nonexistent)", "w")
Errno::ENOENT: No such file or directory @ rb_sysopen - /tmp/#165106976 (deleted / nonexistent)
from (irb):1:in `initialize'
from (irb):1:in `open'
I think that
File#to_path
should raise if the path dost not exist, but untouchFile#path
.
For #to_path
, go to #13576. This issue is about #path
, and I think adding description is the best known solution; leave it untouched does not save anything.
Updated by nobu (Nobuyoshi Nakada) over 7 years ago
shyouhei (Shyouhei Urabe) wrote:
nobu (Nobuyoshi Nakada) wrote:
A correction; The method which
open
calls isto_path
, notpath
.
These are different.Yes, I'm talking about
File#path
. Situations like ruby-core:81167 is in my mind.#to_path
is a separate issue.
File#path
is not relevant to that situation at all.
Digest.file
just opens the argument, and open
calls to_path
method on it to implicit conversion.
$ ruby -rdigest -e 'obj = Object.new; def obj.path;raise "path!"; end; Digest::MD5.file(obj)'
Traceback (most recent call last):
4: from -e:1:in `<main>'
3: from /opt/local/lib/ruby/2.5.0/digest.rb:35:in `file'
2: from /opt/local/lib/ruby/2.5.0/digest.rb:50:in `file'
1: from /opt/local/lib/ruby/2.5.0/digest.rb:50:in `open'
/opt/local/lib/ruby/2.5.0/digest.rb:50:in `initialize': no implicit conversion of Object into String (TypeError)
$ ruby -rdigest -e 'obj = Object.new; def obj.to_path;raise "to_path!"; end; Digest::MD5.file(obj)'
Traceback (most recent call last):
5: from -e:1:in `<main>'
4: from /opt/local/lib/ruby/2.5.0/digest.rb:35:in `file'
3: from /opt/local/lib/ruby/2.5.0/digest.rb:50:in `file'
2: from /opt/local/lib/ruby/2.5.0/digest.rb:50:in `open'
1: from /opt/local/lib/ruby/2.5.0/digest.rb:50:in `initialize'
-e:1:in `to_path': to_path! (RuntimeError)
Do you think
nil
is better? I think nonexistent path is far more descriptive.The write mode problem is a matter.
It would silently (and wrongly) success to write.What about adding a slash then? like:
irb(main):001:0> open("/tmp/#165106976 (deleted / nonexistent)", "w") Errno::ENOENT: No such file or directory @ rb_sysopen - /tmp/#165106976 (deleted / nonexistent) from (irb):1:in `initialize' from (irb):1:in `open'
A dedicated exception such as "Unnamed temporary file in some path" from to_path
would be more descriptive and better, I think.
I think that
File#to_path
should raise if the path dost not exist, but untouchFile#path
.For
#to_path
, go to #13576. This issue is about#path
, and I think adding description is the best known solution; leave it untouched does not save anything.
I think this issue is false assertion at the first.
Updated by nobu (Nobuyoshi Nakada) over 7 years ago
$ ./x86_64-linux/bin/ruby -e 'f = open(File.expand_path("/tmp"), File::RDWR|File::TMPFILE); p open(f).read'
Traceback (most recent call last):
2: from -e:1:in `<main>'
1: from -e:1:in `open'
-e:1:in `to_path': unnamed temporary file at /tmp (IOError)
diff --git a/file.c b/file.c
index b234910d5d..261997ecb2 100644
--- a/file.c
+++ b/file.c
@@ -367,7 +367,6 @@ apply2files(void (*func)(const char *, VALUE, void *), int argc, VALUE *argv, vo
/*
* call-seq:
* file.path -> filename
- * file.to_path -> filename
*
* Returns the pathname used to create <i>file</i> as a string. Does
* not normalize the name.
@@ -391,6 +390,38 @@ rb_file_path(VALUE obj)
return rb_obj_taint(rb_str_dup(fptr->pathv));
}
+#ifdef O_TMPFILE
+/*
+ * call-seq:
+ * file.to_path -> filename
+ *
+ * Returns the pathname to open the <i>file</i> as a string.
+ *
+ * The pathname may not point the file corresponding to <i>file</i>.
+ * e.g. file has been moved, deleted, or created with <code>File::TMPFILE</code> option.
+ */
+
+static VALUE
+rb_file_to_path(VALUE obj)
+{
+ rb_io_t *fptr;
+ int flags;
+
+ fptr = RFILE(rb_io_taint_check(obj))->fptr;
+ rb_io_check_initialized(fptr);
+ flags = fcntl(fptr->fd, F_GETFL);
+ if (flags == -1) rb_sys_fail_path(fptr->pathv);
+ if ((flags & O_TMPFILE) == O_TMPFILE) {
+ rb_raise(rb_eIOError, "unnamed temporary file at %"PRIsVALUE,
+ fptr->pathv);
+ }
+ if (NIL_P(fptr->pathv)) return Qnil;
+ return rb_obj_taint(rb_str_dup(fptr->pathv));
+}
+#else
+# define rb_file_to_path rb_file_path
+#endif
+
static size_t
stat_memsize(const void *p)
{
@@ -6140,7 +6171,7 @@ Init_File(void)
rb_define_const(rb_mFConst, "NULL", rb_fstring_cstr(null_device));
rb_define_method(rb_cFile, "path", rb_file_path, 0);
- rb_define_method(rb_cFile, "to_path", rb_file_path, 0);
+ rb_define_method(rb_cFile, "to_path", rb_file_to_path, 0);
rb_define_global_function("test", rb_f_test, -1);
rb_cStat = rb_define_class_under(rb_cFile, "Stat", rb_cObject);
Updated by shyouhei (Shyouhei Urabe) over 7 years ago
nobu (Nobuyoshi Nakada) wrote:
$ ./x86_64-linux/bin/ruby -e 'f = open(File.expand_path("/tmp"), File::RDWR|File::TMPFILE); p open(f).read' Traceback (most recent call last): 2: from -e:1:in `<main>' 1: from -e:1:in `open' -e:1:in `to_path': unnamed temporary file at /tmp (IOError)
Why are you ignoring #13576, after I repeatedly mentioned in this thread?
Updated by duerst (Martin Dürst) over 7 years ago
- Subject changed from File#path for O_TMPFILE fds are unmeaning to File#path for O_TMPFILE fds has no meaning
Updated by nobu (Nobuyoshi Nakada) over 7 years ago
Once #13576 is introduced, this issue becomes stale.
It doesn't make sense to change File#path
.
Updated by shyouhei (Shyouhei Urabe) over 7 years ago
nobu (Nobuyoshi Nakada) wrote:
Once #13576 is introduced, this issue becomes stale.
It doesn't make sense to changeFile#path
.
OK, I see your opinion over File#path. I have different opinion though. I agree with sorah here; a file created using O_TMPFILE can never have a meaningful path.
Updated by Eregon (Benoit Daloze) over 7 years ago
I think an exception is the best way to be clear about the problem here, nil or a fake path are most likely to raise exceptions, but later and in a harder to debug way.
Moreover, this only happens for O_TMPFILE files, so it seems reasonable to have different behavior.
Updated by matz (Yukihiro Matsumoto) over 7 years ago
I vote for raising exceptions (ENOENT?).
Matz.
Updated by sorah (Sorah Fukumori) over 7 years ago
- File 0001-File-path-Raise-IOError-when-a-file-is-O_TMPFILE.patch 0001-File-path-Raise-IOError-when-a-file-is-O_TMPFILE.patch added
Nobu's patch at [ruby-core:81329] fails when fd is closed... Updated the patch.
- Raise IOError when fptr->pathv is Qnil on File#path
- Set Qnil to fptr->pathv when opening file with O_TMPFILE
- File#to_path and FIle#path behave same
Updated by sorah (Sorah Fukumori) over 7 years ago
- Status changed from Assigned to Closed
Applied in changeset trunk|r59704.
File#path: Raise IOError when a file is O_TMPFILE
File#path for a file opened with O_TMPFILE has no meaning.
A filepath returned by this method isn't guarranteed about its accuracy,
but files opened with O_TMPFILE are known its recorded path has no
meaning. So let them not to return any pathname.
After a discussion in ruby-core, just returning Qnil makes guessing the
root cause difficult. Instead, this patch makes the method to raise an
error.
Other consideration is calling fnctl(2) on rb_file_path, but it adds a
overhead, and it's difficult to determine O_TMPFILE status after fd has
been closed.
[Feature #13568]
-
io.c(rb_file_open_generic): Set Qnil to fptr->pathv when opening a
file using O_TMPFILE -
file.c(rb_file_path): Raise IOError when fptr->pathv is Qnil
-
file.c(rb_file_path): [DOC] Update for the new behavior