Project

General

Profile

Actions

Bug #20586

open

Some filesystem calls in dir.c are missing error handling and can return incorrect results if interrupted

Added by ivoanjo (Ivo Anjo) 8 days ago. Updated 7 days ago.

Status:
Open
Assignee:
-
Target version:
-
[ruby-core:118346]

Description

Background

Hey! I work for Datadog on the Ruby profiler part of the datadog (previously ddtrace) gem.

A customer reached out with an issue where enabling the profiler made Dir.glob return no files for a given folder:

Without profiler:

irb(main):002:0> Dir.glob('/gcsfuse/t*')
=> ["/gcsfuse/test.html", "/gcsfuse/test.txt"]

With profiler:

irb(main):002:0> Dir.glob('/gcsfuse/t*')
=> []

It turns out the issue is related to missing error handling in dir.c.

The Datadog Ruby profiler, like stackprof, pf2 and vernier, uses unix signals to interrupt the currently-active thread and take a sample (usually SIGPROF). When some system calls get interrupted by a signal, they return an EINTR error code back to the caller.

Consider for instance the implementation of dir_each_entry in dir.c:

static VALUE
dir_each_entry(VALUE dir, VALUE (*each)(VALUE, VALUE), VALUE arg, int children_only)
{
    struct dir_data *dirp;
    struct dirent *dp;
    IF_NORMALIZE_UTF8PATH(int norm_p);

    GetDIR(dir, dirp);
    rewinddir(dirp->dir);
    IF_NORMALIZE_UTF8PATH(norm_p = need_normalization(dirp->dir, RSTRING_PTR(dirp->path)));
    while ((dp = READDIR(dirp->dir, dirp->enc)) != NULL) {
      // ... do things
    }
    return dir;
}

If READDIR returns NULL, then dir_each_entry assumes it has iterated the entire directory. But looking at the man page for readdir we see the following sharp edge (emphasis mine):

It returns NULL on reaching the end of the directory stream or if an error occurred.

So what's happening in this situation is: readdir gets interrupted, returns NULL + sets errno to EINTR. But dir_each_entry doesn't check errno, so rather than raising an exception to flag the issue, it treats it as if the end of the directory has been reached.

How to reproduce

Reproducing this is somewhat annoying, because it's dependent on timing: the signal must arrive at the exact time the dir API is getting executed.

I was able to reproduce this every time by using the google cloud gcsfuse tool. This somewhat makes sense -- a remote filesystem is much slower than a local one, so there's a much bigger window of opportunity for a signal to arrive while the system call is blocked.

Here's an example I included in https://github.com/DataDog/dd-trace-rb/pull/3720:

# Not shown: Set up a trial google cloud account, install gcsfuse, create a cloud storage bucket and put in some test files

$ gcsfuse test_fs_dd_trace_rb fuse-testing/
$ ls fuse-testing/
hello.txt  test.html  test.txt

# Not shown: Add `datadog` gem to `Gemfile`

$ DD_PROFILING_ENABLED=true DD_PROFILING_DIR_INTERRUPTION_WORKAROUND_ENABLED=false bundle exec ddprofrb exec ruby -e "Datadog::Profiling.wait_until_running; pp Dir.children('fuse-testing/')"
[]

Let me know if you'd like me to try to create a reproducer that does not depend on the datadog gem.

Additional notes

I've spent quite some time looking at the dir.c sources, and here's the full list of APIs that suffer from issues:

  • dir_each_entry does not check errno; all of its users have interruption bugs
  • dir_tell will return -1 instead of the correct position (which means that passing -1 to dir_seek/dir_set_pos will cause it to not list the directory properly)
  • do_opendir an error in system calls will only sometimes be turned into a raised exception
    • Indirect callers that pass in rb_glob_error as errfunc: rb_glob, Dir.[], Dir.glob
    • Indirect callers that pass in 0 as errfunc: ruby_glob, ruby_brace_glob
  • glob_opendir does not check errno; all of its users have interruption bugs
  • glob_getent does not check errno; all of its users have interruption bugs
  • nogvl_dir_empty_p does not check errno (of readdir! it actually checks for opendir); all of its users have interruption bugs

Less sure about these:

  • do_stat/do_lstat will turn errors into warnings (unclear if enabled or disabled by default)
  • need_normalization calls fgetattrlist / getattrlist and all errors (ret != 0) are treated in the same way
  • rb_glob_error is and rb_glob_caller leave exceptions as pending and rely on callers to raise them properly
  • Error handling of rb_home_dir_of and rb_default_home_dir are a bit suspicious

As a workaround in the Datadog Ruby profiler, in https://github.com/DataDog/dd-trace-rb/pull/3720 I've added monkey patches to all of the Ruby-level APIs that use the above functions and mask out SIGPROF so these calls are never interrupted.

This solution is does successfully work around the issue, although it prevents the profiler from sampling during these system calls, which will mean less visibility if e.g. these calls are taking a long time. And well, maintaining monkey patches is always problematic for future Ruby compatibility.


Files

readdir-bug-repro.c (2.11 KB) readdir-bug-repro.c ivoanjo (Ivo Anjo), 06/19/2024 01:57 PM

Updated by ivoanjo (Ivo Anjo) 8 days ago

There's a "related" issue in dir.c which is that it sometimes blocking system calls are performed while Ruby is still holding the GVL, thus blocking the entire VM. I've filed a separate ticket for that -- https://bugs.ruby-lang.org/issues/20587 .

Updated by mame (Yusuke Endoh) 8 days ago

How would you like to fix it?

IMO, it would be reasonable to have Dir.glob raise an exception when readdir fails. The spec of readdir says:

Applications wishing to check for error situations should set errno to 0 before calling readdir(). If errno is set to non-zero on return, an error occurred.

Do you want Ruby to automatically retry when readdir returns EINTR? If so, I am unsure because neither the manpage you linked nor the spec indicate that readdir may fail with EINTR.

Have you actually confirmed that readdir is returning EINTR? If so, it could be a bug in the OS. However, if such a bug exists in major platforms, it may be unavoidable for Ruby to handle it.

Updated by ivoanjo (Ivo Anjo) 8 days ago

mame (Yusuke Endoh) wrote in #note-2:

How would you like to fix it?

IMO, it would be reasonable to have Dir.glob raise an exception when readdir fails. The spec of readdir says:

Applications wishing to check for error situations should set errno to 0 before calling readdir(). If errno is set to non-zero on return, an error occurred.

Do you want Ruby to automatically retry when readdir returns EINTR? If so, I am unsure because neither the manpage you linked nor the spec indicate that readdir may fail with EINTR.

I believe the correct thing to do is to raise an exception. This will mean profiling the application could lead to these exceptions showing up where they didn't before, but at least they're an indication that something happened, rather than an incorrect result.

We should avoid retrying, or at least retrying forever, because if some operation always takes 100ms, and the profiler executes every 10ms, then it will keep interrupting the operation and the app will also get stuck.

Have you actually confirmed that readdir is returning EINTR?

I have! I have attached a pure-C reproducer, and when I run it with the gcsfuse folder I get:

$ gcc readdir-bug-repro.c -o readdir-bug-repro -lpthread && ./readdir-bug-repro fuse-testing/
Set up signal handler!
Received 113 signals, calling readdir...
Failed to read directory 'fuse-testing/': Interrupted system call

For reference, I'm running it on:

  • Linux rubyshade 6.5.0-35-generic #35~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue May 7 09:00:52 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
  • Ubuntu 22.04.4 LTS
  • gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)

If so, it could be a bug in the OS. However, if such a bug exists in major platforms, it may be unavoidable for Ruby to handle it.

This is a good question. I'll be honest that I could never reproduce with a local filesystem. I'm not sure if this just means that some filesystems (perhaps those implemented with FUSE?) allow this kind of failure, or if it's just really hard to hit it when there's kernel cache and all those things making sure these calls are very speedy.

Updated by Eregon (Benoit Daloze) 7 days ago

If it returns EINTR, then we should retry, because that's already the behavior for all other blocking syscalls like read(2), etc.

Is the SIGPROF handler installed with sigaction() without the SA_RESTART flag?
Does it still happen if the SA_RESTART is passed to sigaction()? If yes it sounds like a possible bug of gcsfuse.

I suppose it can make sense the SIGPROF handler does not pass SA_RESTART to be able to profile during blocking syscalls.
But depending on the method of profiling it might be unnecessary to interrupt syscalls, in which case it would be better to pass SA_RESTART.

But in any case we should retry in CRuby, because for instance SIGVTALRM, used internally, on purpose interrupts syscalls.

Updated by mame (Yusuke Endoh) 7 days ago

I don't think it's a good idea to retry readdir blindly.

It is probably that the FUSE driver throws EINTR when communicating, but it is not clear what kind of processing is done before and after the communication. It is possible that the internal state is inconsistent after readdir returns EINTR.

Then, should we rewinddir and try again? Or do we need to closedir and opendir again?

I don't think this is a problem we should deal with in our imagination. If we are serious about this problem, we should start by reporting the problem to the Linux kernel and asking them to clarify the behavior on their manpage.

As for Ruby, I think it is good to raise an exception by rb_sys_fail for the time being.

And as for SIGPROF, it would be easy to set SA_RESTART. In fact, venier seems to set it.

https://github.com/jhawthorn/vernier/blob/beb8ca2561c7924b0d7e8b4759681835cf21df80/ext/vernier/vernier.cc#L1606

Updated by ivoanjo (Ivo Anjo) 7 days ago

Eregon (Benoit Daloze) wrote in #note-4:

Is the SIGPROF handler installed with sigaction() without the SA_RESTART flag?
Does it still happen if the SA_RESTART is passed to sigaction()? If yes it sounds like a possible bug of gcsfuse.

Unfortunately the semantics aren't as simple as that :/

Both the reproducer I shared above as well readdir-bug-repro.c as well as the Datadog Ruby profiler set SA_RESTART when adding the signal handler.

The man page has a section mentioning some system calls are not restarted, regardless of the flag.

  The following interfaces are never restarted after being
  interrupted by a signal handler, regardless of the use of
  SA_RESTART; they always fail with the error EINTR when
  interrupted by a signal handler:
  (...)

readdir is not mentioned in this section, but it's also not mentioned in the section about calls that are restarted. So it's in a weird limbo.

mame (Yusuke Endoh) wrote in #note-5:

It is probably that the FUSE driver throws EINTR when communicating, but it is not clear what kind of processing is done before and after the communication. It is possible that the internal state is inconsistent after readdir returns EINTR.

Then, should we rewinddir and try again? Or do we need to closedir and opendir again?

I don't think this is a problem we should deal with in our imagination. If we are serious about this problem, we should start by reporting the problem to the Linux kernel and asking them to clarify the behavior on their manpage.

Yeah -- I suspect the kernel/FUSE should hide a lot of these, otherwise most userland apps will get it wrong, but on the docs I could find it's not clear this behavior is entirely expected.

As for Ruby, I think it is good to raise an exception by rb_sys_fail for the time being.

This is my thinking as well -- if this does turn out to be a linux/FUSE/etc bug, it'll probably take a long time to roll out the fix so having Ruby do the right thing in the face of the weird issue is probably worth it?

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0