Feature #4038

IO#advise

Added by Run Paint Run Run over 4 years ago. Updated almost 4 years ago.

[ruby-core:33110]
Status:Closed
Priority:Normal
Assignee:-

Description

=begin
As discussed in #4015, I suggest a wrapper around posix_fadvise(2) named IO#advise. On platforms that don't support this system call, IO#advise is a no-op. Otherwise, it provides a hint to the kernel as to how the given file descriptor will be accessed in the future. This allows the kernel to optimise its page cache accordingly.

io-advise.patch Magnifier (5.23 KB) Run Paint Run Run, 11/09/2010 10:27 AM

io-advise-2.patch Magnifier - Updated to use rb_syserr_fail() (5.22 KB) Run Paint Run Run, 11/09/2010 10:57 AM

io-advise-3.patch Magnifier - Clarified documentation (5.47 KB) Run Paint Run Run, 11/10/2010 01:54 AM

io-advise-4.patch Magnifier (6.87 KB) Run Paint Run Run, 11/11/2010 06:24 AM

io-advise-5.patch Magnifier (7.21 KB) Run Paint Run Run, 11/13/2010 09:07 AM

IO-advise-6-kosaki.patch Magnifier (7.66 KB) Motohiro KOSAKI, 12/16/2010 03:44 AM

IO-advise-7-kosaki.patch Magnifier (7.87 KB) Motohiro KOSAKI, 12/16/2010 04:55 AM


Related issues

Related to Ruby trunk - Feature #4015: File::DIRECT Constant for O_DIRECT Closed 11/03/2010
Related to Ruby trunk - Feature #4204: IO#advise should raise error for unknown symbol Closed 12/25/2010

History

#1 Updated by Run Paint Run Run over 4 years ago

=begin
Updated patch to use rb_syserr_fail().
=end

#2 Updated by Motohiro KOSAKI over 4 years ago

=begin
Hello,

2010/11/9 Run Paint Run Run redmine@ruby-lang.org:

Issue #4038 has been updated by Run Paint Run Run.

File io-advise-2.patch added

Updated patch to use rb_syserr_fail().

http://redmine.ruby-lang.org/issues/show/4038

I like this patch and I've reviewed this.

diff --git a/configure.in b/configure.in
index dc8cb7c..5eb0ef5 100644
--- a/configure.in
+++ b/configure.in
@@ -1266,7 +1266,7 @@ else
fi
AC_CHECK_FUNCS(fmod killpg wait4 waitpid fork spawnv syscall chroot getcwd eaccess\
truncate ftruncate chsize times utimes utimensat fcntl lockf lstat\
- link symlink readlink readdir_r fsync fdatasync fchown\
+ link symlink readlink readdir_r fsync fdatasync fchown posix_fadvise\
setitimer setruid seteuid setreuid setresuid setproctitle socketpair\
setrgid setegid setregid setresgid issetugid pause lchown lchmod\
getpgrp setpgrp getpgid setpgid initgroups getgroups setgroups\
diff --git a/io.c b/io.c
index 8fa6613..af6cd3a 100644
--- a/io.c
+++ b/io.c
@@ -1,3 +1,4 @@
+
/**********************************************************************

io.c -
@@ -1413,6 +1414,107 @@ rb_io_fdatasync(VALUE io)
#define rb_io_fdatasync rb_f_notimplement
#endif

+static VALUE sym_normal, sym_sequential, sym_random,
+ sym_willneed, sym_dontneed, sym_noreuse;
+/*
+ * call-seq:
+ * ios.advise(advice, offset=0, len=0) -> nil
+ *
+ * Announce an intention to access data from the current file in a
+ * specific pattern. On platforms that do not support the
+ * posix_fadvise(2) system call, this method is a no-op.
+ *
+ * advice is one of the following symbols:
+ *
+ * * :normal - No advice to give; the default assumption for an open file.
+ * * :sequential - The data will be accessed sequentially:
+ * with lower offsets read before higher ones.
+ * * :random - The data will be accessed in random order.
+ * * :willneed - The data will be accessed in the near future.
+ * * :dontneed - The data will not be accessed in the near future.
+ * * :noreuse - The data will only be accessed once.

Probably, we have to note detailed meaning is OS dependent. example,
On almost os, POSIX_FADV_DONTNEED mean to add a hint to cache
reclaim logic. But, on linux, it immediately drop page caches.

  • *
  • * "data" means the region of the current file that begins
  • * at offset and extends for len bytes. By default, both offset
  • * and len are 0, meaning that the advice applies to the entire
  • * file.

This doesn't describe offset != 0 && len == 0 case.

  • * If an error occurs, one of the following exceptions will be raised:
  • *
  • * * IOError - The IO stream is closed.
  • * * Errno::EBADF - The file descriptor of the current file is
  • invalid.
  • * * Errno::EINVAL - An invalid value for advice was given.
  • * * Errno::ESPIPE - The file descriptor of the current
  • * * file refers to a FIFO or pipe. (Linux raises Errno::EINVAL
  • * * in this case).

A lot of OS may return os specific errno. so probably we need to note
other Errno::*
may happen.

  • * * TypeError - Either advice was not a Symbol, or one of the
  • other arguments was not an Integer.
  • * * RangeError - One of the arguments given was too big/small.
  • */ + +static VALUE +rb_io_advise(int argc, VALUE *argv, VALUE io) +{
  • off_t offset, len;
  • int advice, rv;
  • VALUE initadvice, initoffset, initlen;
  • rb_io_t *fptr; +
  • rb_scan_args(argc, argv, "12", &initadvice, &initoffset, &initlen);
  • if (TYPE(initadvice) != T_SYMBOL)
  • rb_raise(rb_eTypeError, "advice must be a Symbol"); +
  • offset = NIL_P(initoffset) ? 0 : NUM2OFFT(initoffset);
  • len = NIL_P(initlen) ? 0 : NUM2OFFT(initlen);
  • advice = 0; +
  • if (initadvice == sym_normal) { +#ifdef POSIX_FADV_NORMAL
  • advice = POSIX_FADV_NORMAL; +#endif
  • }
  • else if (initadvice == sym_random) { +#ifdef POSIX_FADV_RANDOM
  • advice = POSIX_FADV_RANDOM; +#endif
  • }
  • else if (initadvice == sym_sequential) { +#ifdef POSIX_FADV_SEQUENTIAL
  • advice = POSIX_FADV_SEQUENTIAL; +#endif
  • }
  • else if (initadvice == sym_willneed) { +#ifdef POSIX_FADV_WILLNEED
  • advice = POSIX_FADV_WILLNEED; +#endif
  • }
  • else if (initadvice == sym_dontneed) { +#ifdef POSIX_FADV_DONTNEED
  • advice = POSIX_FADV_DONTNEED; +#endif
  • }
  • else if (initadvice == sym_noreuse) { +#ifdef POSIX_FADV_NOREUSE
  • advice = POSIX_FADV_NOREUSE; +#endif
  • }
  • else
  • rb_raise(rb_eArgError, "Invalid advice: :%s",
  • RSTRING_PTR(rb_id2str(SYM2ID(initadvice))));

Don't we need following?

if (!advice)
return Qnil;

Or, should we raise not-implement exception?

+
+#ifdef HAVE_POSIX_FADVISE
+ io = GetWriteIO(io);
+ GetOpenFile(io, fptr);
+
+ if (rv = posix_fadvise(fptr->fd, offset, len, advice))
+ /* posix_fadvise(2) doesn't set errno. On success it returns 0; otherwise
+ it returns the error code. /
+ rb_syserr_fail(rv, RSTRING_PTR(fptr->pathv));
+#endif
+ return Qnil;
+}
+
/

* call-seq:
* ios.fileno -> fixnum
@@ -10052,6 +10154,7 @@ Init_IO(void)
rb_define_method(rb_cIO, "binmode", rb_io_binmode_m, 0);
rb_define_method(rb_cIO, "binmode?", rb_io_binmode_p, 0);
rb_define_method(rb_cIO, "sysseek", rb_io_sysseek, -1);
+ rb_define_method(rb_cIO, "advise", rb_io_advise, -1);

 rb_define_method(rb_cIO, "ioctl", rb_io_ioctl, -1);
 rb_define_method(rb_cIO, "fcntl", rb_io_fcntl, -1);

@@ -10220,4 +10323,10 @@ Init_IO(void)
sym_textmode = ID2SYM(rb_intern("textmode"));
sym_binmode = ID2SYM(rb_intern("binmode"));
sym_autoclose = ID2SYM(rb_intern("autoclose"));
+ sym_normal = ID2SYM(rb_intern("normal"));
+ sym_sequential = ID2SYM(rb_intern("sequential"));
+ sym_random = ID2SYM(rb_intern("random"));
+ sym_willneed = ID2SYM(rb_intern("willneed"));
+ sym_dontneed = ID2SYM(rb_intern("dontneed"));
+ sym_noreuse = ID2SYM(rb_intern("noreuse"));
}

=end

#3 Updated by Run Paint Run Run over 4 years ago

=begin

I like this patch and I've reviewed this.

Thank you. I've updated the documentation according to your suggestions. Does it look OK?

Don't we need following?

if (!advice)
return Qnil;

Or, should we raise not-implement exception?

My thinking was that in the bizarre case where none of the POSIX_FADV_* constants were defined, but HAVE_POSIX_FADVISE was, it was acceptable to pass posix_fadvise() 0 for the advice argument. On my system, at least, POSIX_FADV_NORMAL has the value 0, so this makes even more sense. On other systems, posix_fadvise() would presumably return EINVAL in this case, which we would then raise. If this isn't acceptable, perhaps we initialise advice to a sentinel value, then return Qnil if it has this value after the else statement? I'd rather not raise NotImplementedError because otherwise we try to fail silently on platforms without this syscall.

Are there any security issues we need to consider? $SAFE, tainting, trust? #advise already raises a SecurityError when $SAFE=4.
=end

#4 Updated by Motohiro KOSAKI over 4 years ago

=begin
Hi

2010/11/10 Run Paint Run Run redmine@ruby-lang.org:

Issue #4038 has been updated by Run Paint Run Run.

File io-advise-3.patch added

I like this patch and I've reviewed this.

Thank you. I've updated the documentation according to your suggestions. Does it look OK?

Looks good. :)

Don't we need following?

if (!advice)
   return Qnil;

Or, should we raise not-implement exception?

My thinking was that in the bizarre case where none of the POSIX_FADV_* constants were defined, but HAVE_POSIX_FADVISE was, it was acceptable to pass posix_fadvise() 0 for the advice argument. On my system, at least, POSIX_FADV_NORMAL has the value 0, so this makes even more sense. On other systems, posix_fadvise() would presumably return EINVAL in this case, which we would then raise. If this isn't acceptable, perhaps we initialise advice to a sentinel value, then return Qnil if it has this value after the else statement? I'd rather not raise NotImplementedError because otherwise we try to fail silently on platforms without this syscall.

Ah ok, I missed POSIX_FADV_NORMAL=0 case. So, I don't think initialize
advise=0 is good idea. It mean initialize platform dependent meanings.
And personally I prefer that unsupported hint behave as no-op. But,
I'm ok other behavior too if unsupported
hint keeps platform independent meanings.

Are there any security issues we need to consider? $SAFE, tainting, trust? #advise already raises a SecurityError when $SAFE=4.

I think we don't need additional limit. fadvise() is a hint of
read/write operation. and we allow read/write on $SAFE=1/2/3.

Thanks.

=end

#5 Updated by Run Paint Run Run over 4 years ago

=begin

Ah ok, I missed POSIX_FADV_NORMAL=0 case. So, I don't think initialize
advise=0 is good idea. It mean initialize platform dependent meanings.

I've removed it and added some rudimentary tests of the API. (I can't see how functionality can be tested without parsing the output of strace(1), which I assume we don't want to do).

=end

#6 Updated by Eric Wong over 4 years ago

=begin
Run Paint Run Run redmine@ruby-lang.org wrote:

  • if (rv = posix_fadvise(fptr->fd, offset, len, advice))

I would release the GVL when making this call, some implementations may
block (even only on certain advice) the running thread for disk ops.
This is the case with POSIX_FADV_WILLNEED under Linux: the entire range
to be read into the page cache before returning from this function,
making it very noticeable on a slow device/filesystem with large files
such as sshfs.

--
Eric Wong

=end

#7 Updated by Run Paint Run Run over 4 years ago

=begin

  • if (rv = posix_fadvise(fptr->fd, offset, len, advice))

I would release the GVL when making this call, some implementations may
block (even only on certain advice) the running thread for disk ops.
This is the case with POSIX_FADV_WILLNEED under Linux: the entire range
to be read into the page cache before returning from this function,
making it very noticeable on a slow device/filesystem with large files
such as sshfs.

I've attempted this, but it certainly needs review.
=end

#8 Updated by Motohiro KOSAKI about 4 years ago

=begin
I've modified the patch slightly. Now IO#advise makes noop instead undefined behavior if the platform doesn't support the advise.

My point is, If the method can make undefined behavior, many developer strongly want to avoid it and they want to know which
advise is implemented. I'd like to remove such unnecessary fear.

Thanks.

  • kosaki

=end

#9 Updated by Motohiro KOSAKI about 4 years ago

=begin
Fix test failure isssue on !HAVE_POSIX_FADVISE platform (ie windows).

=end

#10 Updated by Yukihiro Matsumoto about 4 years ago

=begin
Hi,

In message "Re: [Ruby 1.9-Feature#4038] IO#advise"
on Thu, 16 Dec 2010 04:55:22 +0900, Motohiro KOSAKI redmine@ruby-lang.org writes:
|
|[1 ]
|Issue #4038 has been updated by Motohiro KOSAKI.
|
|File IO-advise-7-kosaki.patch added
|
|Fix test failure isssue on !HAVE_POSIX_FADVISE platform (ie windows).

Please check-in, but I think symbol initialization (:normal etc.) is
needed only when posix_fadvise(2) is available.

                        matz.

=end

#11 Updated by Run Paint Run Run about 4 years ago

=begin

Please check-in, but I think symbol initialization (:normal etc.) is
needed only when posix_fadvise(2) is available.

The advantage of the current approach is that we can fail earlier.
Otherwise, a script containing io.advise([1,2,3]) would not raise an
exception under Windows, but would if run under Linux.

=end

#12 Updated by Yukihiro Matsumoto about 4 years ago

=begin
Hi,

In message "Re: Re: [Ruby 1.9-Feature#4038] IO#advise"
on Thu, 16 Dec 2010 22:56:45 +0900, Run Paint Run Run runrun@runpaint.org writes:

|> Please check-in, but I think symbol initialization (:normal etc.) is
|> needed only when posix_fadvise(2) is available.
|
|The advantage of the current approach is that we can fail earlier.
|Otherwise, a script containing io.advise([1,2,3]) would not raise an
|exception under Windows, but would if run under Linux.

No, no. I meant lines like

 sym_normal = ID2SYM(rb_intern("normal"));
 sym_sequential = ID2SYM(rb_intern("sequential"));

can be wrapped by #ifdef HAVE_POSIX_FADVISE

                        matz.

=end

#13 Updated by Motohiro KOSAKI about 4 years ago

=begin
2010/12/16 Yukihiro Matsumoto matz@ruby-lang.org:

Hi,

In message "Re: [Ruby 1.9-Feature#4038] IO#advise"
   on Thu, 16 Dec 2010 04:55:22 +0900, Motohiro KOSAKI redmine@ruby-lang.org writes:
|
|[1  ]
|Issue #4038 has been updated by Motohiro KOSAKI.
|
|File IO-advise-7-kosaki.patch added
|
|Fix test failure isssue on !HAVE_POSIX_FADVISE platform (ie windows).

Please check-in, but I think symbol initialization (:normal etc.) is
needed only when posix_fadvise(2) is available.

Yes, sir :)

=end

#14 Updated by Motohiro KOSAKI about 4 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

=begin
This issue was solved with changeset r30229.
Run Paint, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.

=end

Also available in: Atom PDF