Project

General

Profile

Actions

Feature #4038

closed

IO#advise

Added by runpaint (Run Paint Run Run) over 13 years ago. Updated almost 13 years ago.

Status:
Closed
Assignee:
-
Target version:
-
[ruby-core:33110]

Description

=begin
As discussed in #4015, I suggest a wrapper around posix_fadvise(2) named IO#advise. On platforms that don't support this system call, IO#advise is a no-op. Otherwise, it provides a hint to the kernel as to how the given file descriptor will be accessed in the future. This allows the kernel to optimise its page cache accordingly.


Files

io-advise.patch (5.23 KB) io-advise.patch runpaint (Run Paint Run Run), 11/09/2010 10:27 AM
io-advise-2.patch (5.22 KB) io-advise-2.patch Updated to use rb_syserr_fail() runpaint (Run Paint Run Run), 11/09/2010 10:57 AM
io-advise-3.patch (5.47 KB) io-advise-3.patch Clarified documentation runpaint (Run Paint Run Run), 11/10/2010 01:54 AM
io-advise-4.patch (6.87 KB) io-advise-4.patch runpaint (Run Paint Run Run), 11/11/2010 06:24 AM
io-advise-5.patch (7.21 KB) io-advise-5.patch runpaint (Run Paint Run Run), 11/13/2010 09:07 AM
IO-advise-6-kosaki.patch (7.66 KB) IO-advise-6-kosaki.patch kosaki (Motohiro KOSAKI), 12/16/2010 03:44 AM
IO-advise-7-kosaki.patch (7.87 KB) IO-advise-7-kosaki.patch kosaki (Motohiro KOSAKI), 12/16/2010 04:55 AM

Related issues 2 (0 open2 closed)

Related to Ruby master - Feature #4015: File::DIRECT Constant for O_DIRECTClosed11/03/2010Actions
Related to Ruby master - Feature #4204: IO#advise should raise error for unknown symbolClosedkosaki (Motohiro KOSAKI)12/25/2010Actions
Actions #1

Updated by runpaint (Run Paint Run Run) over 13 years ago

=begin
Updated patch to use rb_syserr_fail().
=end

Actions #2

Updated by kosaki (Motohiro KOSAKI) over 13 years ago

=begin
Hello,

2010/11/9 Run Paint Run Run :

Issue #4038 has been updated by Run Paint Run Run.

File io-advise-2.patch added

Updated patch to use rb_syserr_fail().

http://redmine.ruby-lang.org/issues/show/4038

I like this patch and I've reviewed this.

diff --git a/configure.in b/configure.in
index dc8cb7c..5eb0ef5 100644
--- a/configure.in
+++ b/configure.in
@@ -1266,7 +1266,7 @@ else
fi
AC_CHECK_FUNCS(fmod killpg wait4 waitpid fork spawnv syscall chroot getcwd eaccess
truncate ftruncate chsize times utimes utimensat fcntl lockf lstat\

  •      link symlink readlink readdir_r fsync fdatasync fchown\
    
  •      link symlink readlink readdir_r fsync fdatasync fchown posix_fadvise\
         setitimer setruid seteuid setreuid setresuid setproctitle socketpair\
         setrgid setegid setregid setresgid issetugid pause lchown lchmod\
         getpgrp setpgrp getpgid setpgid initgroups getgroups setgroups\
    

diff --git a/io.c b/io.c
index 8fa6613..af6cd3a 100644
--- a/io.c
+++ b/io.c
@@ -1,3 +1,4 @@
+
/**********************************************************************

io.c -
@@ -1413,6 +1414,107 @@ rb_io_fdatasync(VALUE io)
#define rb_io_fdatasync rb_f_notimplement
#endif

+static VALUE sym_normal, sym_sequential, sym_random,

  •         sym_willneed, sym_dontneed, sym_noreuse;
    

+/*

    • call-seq:
    • ios.advise(advice, offset=0, len=0) -> nil
      
    • Announce an intention to access data from the current file in a
    • specific pattern. On platforms that do not support the
    • posix_fadvise(2) system call, this method is a no-op.
    • advice is one of the following symbols:
      • :normal - No advice to give; the default assumption for an open file.
      • :sequential - The data will be accessed sequentially:
    • with lower offsets read before higher ones.
      
      • :random - The data will be accessed in random order.
      • :willneed - The data will be accessed in the near future.
      • :dontneed - The data will not be accessed in the near future.
      • :noreuse - The data will only be accessed once.

Probably, we have to note detailed meaning is OS dependent. example,
On almost os, POSIX_FADV_DONTNEED mean to add a hint to cache
reclaim logic. But, on linux, it immediately drop page caches.

    • "data" means the region of the current file that begins
    • at offset and extends for len bytes. By default, both offset
    • and len are 0, meaning that the advice applies to the entire
    • file.

This doesn't describe offset != 0 && len == 0 case.

    • If an error occurs, one of the following exceptions will be raised:
      • IOError - The IO stream is closed.
      • Errno::EBADF - The file descriptor of the current file is
  •  invalid.
    
      • Errno::EINVAL - An invalid value for advice was given.
      • Errno::ESPIPE - The file descriptor of the current
      • file refers to a FIFO or pipe. (Linux raises Errno::EINVAL
      • in this case).

A lot of OS may return os specific errno. so probably we need to note
other Errno::*
may happen.

      • TypeError - Either advice was not a Symbol, or one of the
  •  other arguments was not an <code>Integer</code>.
    
      • RangeError - One of the arguments given was too big/small.
  • */

+static VALUE
+rb_io_advise(int argc, VALUE *argv, VALUE io)
+{

  • off_t offset, len;
  • int advice, rv;
  • VALUE initadvice, initoffset, initlen;
  • rb_io_t *fptr;
  • rb_scan_args(argc, argv, "12", &initadvice, &initoffset, &initlen);
  • if (TYPE(initadvice) != T_SYMBOL)
  •  rb_raise(rb_eTypeError, "advice must be a Symbol");
    
  • offset = NIL_P(initoffset) ? 0 : NUM2OFFT(initoffset);
  • len = NIL_P(initlen) ? 0 : NUM2OFFT(initlen);
  • advice = 0;
  • if (initadvice == sym_normal) {
    +#ifdef POSIX_FADV_NORMAL
  •  advice = POSIX_FADV_NORMAL;
    

+#endif

  • }
  • else if (initadvice == sym_random) {
    +#ifdef POSIX_FADV_RANDOM
  •  advice = POSIX_FADV_RANDOM;
    

+#endif

  • }
  • else if (initadvice == sym_sequential) {
    +#ifdef POSIX_FADV_SEQUENTIAL
  •  advice = POSIX_FADV_SEQUENTIAL;
    

+#endif

  • }
  • else if (initadvice == sym_willneed) {
    +#ifdef POSIX_FADV_WILLNEED
  •  advice = POSIX_FADV_WILLNEED;
    

+#endif

  • }
  • else if (initadvice == sym_dontneed) {
    +#ifdef POSIX_FADV_DONTNEED
  •  advice = POSIX_FADV_DONTNEED;
    

+#endif

  • }
  • else if (initadvice == sym_noreuse) {
    +#ifdef POSIX_FADV_NOREUSE
  •  advice = POSIX_FADV_NOREUSE;
    

+#endif

  • }
  • else
  •  rb_raise(rb_eArgError, "Invalid advice: :%s",
    
  •       RSTRING_PTR(rb_id2str(SYM2ID(initadvice))));
    

Don't we need following?

if (!advice)
return Qnil;

Or, should we raise not-implement exception?

+#ifdef HAVE_POSIX_FADVISE

  • io = GetWriteIO(io);
  • GetOpenFile(io, fptr);
  • if (rv = posix_fadvise(fptr->fd, offset, len, advice))
  • /* posix_fadvise(2) doesn't set errno. On success it returns 0; otherwise
  •   it returns the error code. */
    
  •  rb_syserr_fail(rv, RSTRING_PTR(fptr->pathv));
    

+#endif

  • return Qnil;
    +}

/*

  • call-seq:
  • ios.fileno    -> fixnum
    

@@ -10052,6 +10154,7 @@ Init_IO(void)
rb_define_method(rb_cIO, "binmode", rb_io_binmode_m, 0);
rb_define_method(rb_cIO, "binmode?", rb_io_binmode_p, 0);
rb_define_method(rb_cIO, "sysseek", rb_io_sysseek, -1);

  • rb_define_method(rb_cIO, "advise", rb_io_advise, -1);

    rb_define_method(rb_cIO, "ioctl", rb_io_ioctl, -1);
    rb_define_method(rb_cIO, "fcntl", rb_io_fcntl, -1);
    @@ -10220,4 +10323,10 @@ Init_IO(void)
    sym_textmode = ID2SYM(rb_intern("textmode"));
    sym_binmode = ID2SYM(rb_intern("binmode"));
    sym_autoclose = ID2SYM(rb_intern("autoclose"));

  • sym_normal = ID2SYM(rb_intern("normal"));

  • sym_sequential = ID2SYM(rb_intern("sequential"));

  • sym_random = ID2SYM(rb_intern("random"));

  • sym_willneed = ID2SYM(rb_intern("willneed"));

  • sym_dontneed = ID2SYM(rb_intern("dontneed"));

  • sym_noreuse = ID2SYM(rb_intern("noreuse"));
    }

=end

Actions #3

Updated by runpaint (Run Paint Run Run) over 13 years ago

=begin

I like this patch and I've reviewed this.

Thank you. I've updated the documentation according to your suggestions. Does it look OK?

Don't we need following?

if (!advice)
return Qnil;

Or, should we raise not-implement exception?

My thinking was that in the bizarre case where none of the POSIX_FADV_* constants were defined, but HAVE_POSIX_FADVISE was, it was acceptable to pass posix_fadvise() 0 for the advice argument. On my system, at least, POSIX_FADV_NORMAL has the value 0, so this makes even more sense. On other systems, posix_fadvise() would presumably return EINVAL in this case, which we would then raise. If this isn't acceptable, perhaps we initialise advice to a sentinel value, then return Qnil if it has this value after the else statement? I'd rather not raise NotImplementedError because otherwise we try to fail silently on platforms without this syscall.

Are there any security issues we need to consider? $SAFE, tainting, trust? #advise already raises a SecurityError when $SAFE=4.
=end

Actions #4

Updated by kosaki (Motohiro KOSAKI) over 13 years ago

=begin
Hi

2010/11/10 Run Paint Run Run :

Issue #4038 has been updated by Run Paint Run Run.

File io-advise-3.patch added

I like this patch and I've reviewed this.

Thank you. I've updated the documentation according to your suggestions. Does it look OK?

Looks good. :)

Don't we need following?

if (!advice)
   return Qnil;

Or, should we raise not-implement exception?

My thinking was that in the bizarre case where none of the POSIX_FADV_* constants were defined, but HAVE_POSIX_FADVISE was, it was acceptable to pass posix_fadvise() 0 for the advice argument. On my system, at least, POSIX_FADV_NORMAL has the value 0, so this makes even more sense. On other systems, posix_fadvise() would presumably return EINVAL in this case, which we would then raise. If this isn't acceptable, perhaps we initialise advice to a sentinel value, then return Qnil if it has this value after the else statement? I'd rather not raise NotImplementedError because otherwise we try to fail silently on platforms without this syscall.

Ah ok, I missed POSIX_FADV_NORMAL=0 case. So, I don't think initialize
advise=0 is good idea. It mean initialize platform dependent meanings.
And personally I prefer that unsupported hint behave as no-op. But,
I'm ok other behavior too if unsupported
hint keeps platform independent meanings.

Are there any security issues we need to consider? $SAFE, tainting, trust? #advise already raises a SecurityError when $SAFE=4.

I think we don't need additional limit. fadvise() is a hint of
read/write operation. and we allow read/write on $SAFE=1/2/3.

Thanks.

=end

Actions #5

Updated by runpaint (Run Paint Run Run) over 13 years ago

=begin

Ah ok, I missed POSIX_FADV_NORMAL=0 case. So, I don't think initialize
advise=0 is good idea. It mean initialize platform dependent meanings.

I've removed it and added some rudimentary tests of the API. (I can't see how functionality can be tested without parsing the output of strace(1), which I assume we don't want to do).

=end

Actions #6

Updated by normalperson (Eric Wong) over 13 years ago

=begin
Run Paint Run Run wrote:

  • if (rv = posix_fadvise(fptr->fd, offset, len, advice))

I would release the GVL when making this call, some implementations may
block (even only on certain advice) the running thread for disk ops.
This is the case with POSIX_FADV_WILLNEED under Linux: the entire range
to be read into the page cache before returning from this function,
making it very noticeable on a slow device/filesystem with large files
such as sshfs.

--
Eric Wong

=end

Actions #7

Updated by runpaint (Run Paint Run Run) over 13 years ago

=begin

  • if (rv = posix_fadvise(fptr->fd, offset, len, advice))

I would release the GVL when making this call, some implementations may
block (even only on certain advice) the running thread for disk ops.
This is the case with POSIX_FADV_WILLNEED under Linux: the entire range
to be read into the page cache before returning from this function,
making it very noticeable on a slow device/filesystem with large files
such as sshfs.

I've attempted this, but it certainly needs review.
=end

Actions #8

Updated by kosaki (Motohiro KOSAKI) over 13 years ago

=begin
I've modified the patch slightly. Now IO#advise makes noop instead undefined behavior if the platform doesn't support the advise.

My point is, If the method can make undefined behavior, many developer strongly want to avoid it and they want to know which
advise is implemented. I'd like to remove such unnecessary fear.

Thanks.

  • kosaki

=end

Actions #9

Updated by kosaki (Motohiro KOSAKI) over 13 years ago

=begin
Fix test failure isssue on !HAVE_POSIX_FADVISE platform (ie windows).

=end

Actions #10

Updated by matz (Yukihiro Matsumoto) over 13 years ago

=begin
Hi,

In message "Re: [ruby-core:33736] [Ruby 1.9-Feature#4038] IO#advise"
on Thu, 16 Dec 2010 04:55:22 +0900, Motohiro KOSAKI writes:
|
|[1 <text/plain (quoted-printable)>]
|Issue #4038 has been updated by Motohiro KOSAKI.
|
|File IO-advise-7-kosaki.patch added
|
|Fix test failure isssue on !HAVE_POSIX_FADVISE platform (ie windows).

Please check-in, but I think symbol initialization (:normal etc.) is
needed only when posix_fadvise(2) is available.

						matz.

=end

Actions #11

Updated by runpaint (Run Paint Run Run) over 13 years ago

=begin

Please check-in, but I think symbol initialization (:normal etc.) is
needed only when posix_fadvise(2) is available.

The advantage of the current approach is that we can fail earlier.
Otherwise, a script containing io.advise([1,2,3]) would not raise an
exception under Windows, but would if run under Linux.

=end

Actions #12

Updated by matz (Yukihiro Matsumoto) over 13 years ago

=begin
Hi,

In message "Re: [ruby-core:33751] Re: [Ruby 1.9-Feature#4038] IO#advise"
on Thu, 16 Dec 2010 22:56:45 +0900, Run Paint Run Run writes:

|> Please check-in, but I think symbol initialization (:normal etc.) is
|> needed only when posix_fadvise(2) is available.
|
|The advantage of the current approach is that we can fail earlier.
|Otherwise, a script containing io.advise([1,2,3]) would not raise an
|exception under Windows, but would if run under Linux.

No, no. I meant lines like

 sym_normal = ID2SYM(rb_intern("normal"));
 sym_sequential = ID2SYM(rb_intern("sequential"));

can be wrapped by #ifdef HAVE_POSIX_FADVISE

						matz.

=end

Actions #13

Updated by kosaki (Motohiro KOSAKI) over 13 years ago

=begin
2010/12/16 Yukihiro Matsumoto :

Hi,

In message "Re: [ruby-core:33736] [Ruby 1.9-Feature#4038] IO#advise"
   on Thu, 16 Dec 2010 04:55:22 +0900, Motohiro KOSAKI writes:
|
|[1  <text/plain (quoted-printable)>]
|Issue #4038 has been updated by Motohiro KOSAKI.
|
|File IO-advise-7-kosaki.patch added
|
|Fix test failure isssue on !HAVE_POSIX_FADVISE platform (ie windows).

Please check-in, but I think symbol initialization (:normal etc.) is
needed only when posix_fadvise(2) is available.

Yes, sir :)

=end

Actions #14

Updated by kosaki (Motohiro KOSAKI) over 13 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

=begin
This issue was solved with changeset r30229.
Run Paint, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.

=end

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0