Feature #16978

Ruby should not use realpath for __FILE__

Added by vo.x (Vit Ondruch) 12 months ago. Updated 3 months ago.

Status:
Open
Priority:
Normal
Target version:
-
[ruby-core:98920]

Description

This is the simplest test case:

$ mkdir a

$ echo "puts __FILE__" > a/test.rb

$ ln -s a b

$ ruby -Ib -e "require 'test'"
/builddir/a/test.rb

This behavior is problematic, because Ruby should not know anything about the a directory. It was not instructed to use it. It should always refer to the file using the original path and not dig into the underlying details; otherwise, depending on the file system setup, one might be forced to use File.realpath everywhere __FILE__ is used.
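
The two views of the same file can be compared without involving require at all. The following is only a sketch, assuming a POSIX file system with symlink support; the helper name paths_for_symlinked_file is made up for illustration. It rebuilds the a / b layout from the test case in a temporary directory:

```ruby
require "tmpdir"

# Hypothetical helper: recreates "mkdir a; ln -s a b" in a temp dir and
# returns both views of b/test.rb.
def paths_for_symlinked_file
  Dir.mktmpdir do |tmp|
    root = File.realpath(tmp) # the temp dir itself may sit behind a symlink
    Dir.mkdir(File.join(root, "a"))
    File.write(File.join(root, "a", "test.rb"), "puts __FILE__\n")
    File.symlink("a", File.join(root, "b")) # same as `ln -s a b`

    via_link = File.join(root, "b", "test.rb")
    {
      root: root,
      expanded: File.expand_path(via_link), # purely lexical, keeps "b"
      real: File.realpath(via_link)         # resolves the symlink, yields "a"
    }
  end
end

result = paths_for_symlinked_file
puts result[:expanded]
puts result[:real]
```

File.expand_path keeps the b component, while File.realpath reports the a directory, which is the kind of path shown in the output above.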


Related issues

Related to Ruby master - Bug #10222: require_relative and require should be compatible with each other when symlinks are used (Closed)

Updated by jeremyevans0 (Jeremy Evans) 11 months ago

  • Backport deleted (2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN)
  • ruby -v deleted (ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux])
  • Tracker changed from Bug to Feature

I don't think this is a bug. __FILE__ is documented as follows: "The path to the current file." Which path (real, absolute, relative, expanded) is not specified.

Not using the real path would lead to behavior that depends on the first path used when requiring the file.

a/test.rb (b symlinked to a):

def a
  __FILE__
end
ruby -Ia -Ib -rtest -e 'puts a'
# /path/to/a/test.rb
ruby -Ib -Ia -rtest -e 'puts a'
# Current: /path/to/a/test.rb
# Your proposed: /path/to/b/test.rb

What actually happens is not that the file path is converted to a real path, but that the include directory is converted to a real path before the file is required (in rb_construct_expanded_load_path). Changing this to not use a real path would probably break the code that checks that a feature hasn't been required twice. For example, this code would change behavior:

$: << 'a'
require 'test'
$: << 'b'
require 'test'
# Current: not loaded again
# Your proposed: loaded again

If you symlink the file itself and not the include directory, Ruby will attempt to require it as a separate feature.
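
The order dependence described above can be imitated with a toy resolver. This is only a sketch and not Ruby's actual lookup code: resolve_feature and demo_resolution are hypothetical names, and the canonicalize switch merely stands in for "convert the include directory to a real path first".

```ruby
require "tmpdir"

# Toy model only: returns the first existing "<dir>/<feature>.rb".
# With canonicalize: true every load path entry goes through File.realpath
# first (which requires the directory to exist).
def resolve_feature(feature, load_path, canonicalize:)
  load_path.each do |dir|
    dir = canonicalize ? File.realpath(dir) : File.expand_path(dir)
    candidate = File.join(dir, "#{feature}.rb")
    return candidate if File.file?(candidate)
  end
  nil
end

def demo_resolution
  Dir.mktmpdir do |tmp|
    root = File.realpath(tmp)
    a = File.join(root, "a")
    b = File.join(root, "b")
    Dir.mkdir(a)
    File.write(File.join(a, "test.rb"), "")
    File.symlink("a", b)

    orders = [[a, b], [b, a]] # corresponds to -Ia -Ib and -Ib -Ia
    {
      root: root,
      canonical: orders.map { |lp| resolve_feature("test", lp, canonicalize: true) },
      as_given: orders.map { |lp| resolve_feature("test", lp, canonicalize: false) }
    }
  end
end

p demo_resolution
```

With canonicalized directories both orders produce the same path, so a "was this feature already loaded?" check keyed on that path sees one feature. Without canonicalization the result depends on which entry comes first.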

Note that if you provide a path when requiring, Ruby already operates the way you want:

ruby -r./a/test -e 'puts a' # /path/to/a/test.rb
ruby -r./b/test -e 'puts a' # /path/to/b/test.rb

I can certainly see pros and cons from changing the behavior, but I would consider this a feature request and not a bug.

Updated by Eregon (Benoit Daloze) 11 months ago

The "main file" (the file passed to ruby myfile.rb) is also special in that __FILE__ and $0 can both be relative paths (basically the same path as passed on the command line).

Updated by vo.x (Vit Ondruch) 11 months ago

this code would change behavior:

Absolutely, different $LOAD_PATH must result in different behavior.

$: << 'a'
require 'test'
$: << 'b'
require 'test'
# Current: not loaded again
# Your proposed: loaded again

This is actually where I am coming from. I am not 100% sure what the spec of require says, but I would expect require 'test' to work just once, regardless of the current state of $LOAD_PATH. So my proposal is actually:

# Your proposed: not loaded again

Updated by vo.x (Vit Ondruch) 5 months ago

Can this be resolved please? This is another scenario, which should work IMO, but it does not work:

$ mkdir a/
$ echo "require_relative 'b'" > a/test.rb
$ touch b.rb

$ ll
total 4
drwxrwxr-x. 1 vondruch vondruch 14 Jan 22 17:17 a
-rw-rw-r--. 1 vondruch vondruch  0 Jan 22 17:17 b.rb
lrwxrwxrwx. 1 vondruch vondruch  9 Jan 22 17:18 test.rb -> a/test.rb

$ ruby test.rb
Traceback (most recent call last):
    1: from test.rb:1:in `<main>'
test.rb:1:in `require_relative': cannot load such file -- /home/vondruch/tt/a/b (LoadError)

The point is that, in the context of the current directory, b.rb is relative to test.rb and therefore this example should work. It does not really matter that test.rb is a symlink. With require_relative becoming more and more common, this behavior basically prohibits the use of symlinks for Ruby code.
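
The path in the LoadError can be reproduced with plain File calls. This is a rough model, not Ruby's implementation; relative_target and demo_relative_target are hypothetical names. It computes where a relative require of b would be looked up from the symlinked test.rb, once following the symlink and once using the path as given:

```ruby
require "tmpdir"

# Rough model only: directory of the requiring script joined with the name,
# with or without resolving symlinks in the script's path.
def relative_target(script, name, follow_symlinks:)
  base = follow_symlinks ? File.realpath(script) : File.expand_path(script)
  File.join(File.dirname(base), name)
end

def demo_relative_target
  Dir.mktmpdir do |tmp|
    root = File.realpath(tmp)
    Dir.mkdir(File.join(root, "a"))
    File.write(File.join(root, "a", "test.rb"), "require_relative 'b'\n")
    File.write(File.join(root, "b.rb"), "")
    File.symlink("a/test.rb", File.join(root, "test.rb"))

    script = File.join(root, "test.rb")
    {
      root: root,
      resolved: relative_target(script, "b", follow_symlinks: true),
      as_given: relative_target(script, "b", follow_symlinks: false)
    }
  end
end

p demo_relative_target
```

The resolved variant ends in a/b, matching the path in the error message; the as-given variant ends in b next to the symlink, which is where b.rb actually lives in this layout.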

This is an issue because gems typically do not ship their test suites. Therefore, when testing packages in Fedora, we would like to do something like:

$ gem unpack rspec-rails
Fetching rspec-rails-4.0.2.gem
Unpacked gem: '/home/vondruch/tt/rspec-rails-4.0.2'

$ cd rspec-rails-4.0.2/

# Link from git checkout or from separate archive
$ ln -s ~/projects/rspec/rspec-rails/spec/ .

This blows up as soon as require_relative is used in the test suite, because it can no longer refer back to the library: require_relative "../lib/rspec_rails" expands to /home/vondruch/projects/rspec/rspec-rails/spec/../lib/rspec_rails, where either a completely different file is loaded or nothing is found. In Fedora, the test suite is typically extracted from a tarball, so this would crash.
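
The underlying difference is how ".." is interpreted once the spec directory is a symlink. A minimal sketch, assuming a POSIX file system; the checkout directory is a made-up stand-in for ~/projects/rspec/rspec-rails and demo_parent_lookup is a hypothetical helper:

```ruby
require "fileutils"
require "tmpdir"

# Sketch: resolve "../lib" from a symlinked spec directory, once lexically
# and once after resolving the symlink.
def demo_parent_lookup
  Dir.mktmpdir do |tmp|
    root = File.realpath(tmp)
    gem_dir = File.join(root, "rspec-rails-4.0.2")
    checkout = File.join(root, "checkout") # stand-in for the git checkout
    FileUtils.mkdir_p(File.join(gem_dir, "lib"))
    FileUtils.mkdir_p(File.join(checkout, "spec"))
    File.symlink(File.join(checkout, "spec"), File.join(gem_dir, "spec"))

    spec = File.join(gem_dir, "spec")
    {
      gem_dir: gem_dir,
      checkout: checkout,
      lexical: File.expand_path("../lib", spec),                # parent of the link
      physical: File.expand_path("../lib", File.realpath(spec)) # parent of the link target
    }
  end
end

p demo_parent_lookup
```

The lexical result stays inside the unpacked gem, while the physical result points at lib in the checkout, which may contain a different file or nothing at all.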

Updated by akr (Akira Tanaka) 5 months ago

I think we should use realpath.
The location of actual file is more robust.

Also, the ELF dynamic linker has a feature, $ORIGIN, to refer to a shared library using a relative path.
It also resolves symlinks.
https://refspecs.linuxbase.org/elf/gabi4+/ch5.dynamic.html

Updated by Dan0042 (Daniel DeLorme) 5 months ago

+1 for keeping the current behavior.

I remember in 1.8 we had so many problems with double-loading code because a file could be require'd with different paths. I have no wish to go back to that mess.

Updated by vo.x (Vit Ondruch) 5 months ago

I have a problem with double loading precisely because symlinks are resolved into real paths. If it were ensured that require "foo" loads the file just once, double loading could not happen.

In Fedora, to avoid duplication, we have the openssl gem extracted into an independent package, which links back to the StdLib to keep the Ruby functionality. Unfortunately, since this commit, which introduces require_relative, we have issues with double loading. The simple reproducer is ruby --disable-gems -e 'require "openssl"; require "openssl/digest"'.

You could argue that Fedora is doing something unexpected, but I think the package split was done before require_relative was even introduced.

Updated by austin (Austin Ziegler) 5 months ago

Why not put some special handling into require_relative such that it checks both the pre-realpath and realpath versions (perhaps behind a #define flag so that Fedora and other systems integrators can do this without impacting anyone else, assuming require_relative is in C) so that it can handle this case?
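
To make the suggestion concrete, here is a hypothetical sketch in Ruby rather than C. None of these helpers exist in Ruby, and the lookup order (real path first, then the path as given) is my assumption, since the suggestion does not specify one:

```ruby
require "tmpdir"

# Hypothetical helper: candidate files for a relative require, trying the
# directory of the real path first and then the directory of the path as given.
def relative_candidates(script, name)
  dirs = [File.realpath(script), File.expand_path(script)].map { |path| File.dirname(path) }
  dirs.uniq.map { |dir| File.join(dir, "#{name}.rb") }
end

# Returns the first candidate that exists, or nil.
def find_relative(script, name)
  relative_candidates(script, name).find { |path| File.file?(path) }
end

def demo_find_relative
  Dir.mktmpdir do |tmp|
    root = File.realpath(tmp)
    Dir.mkdir(File.join(root, "a"))
    File.write(File.join(root, "a", "test.rb"), "require_relative 'b'\n")
    File.write(File.join(root, "b.rb"), "")
    File.symlink("a/test.rb", File.join(root, "test.rb"))
    { root: root, found: find_relative(File.join(root, "test.rb"), "b") }
  end
end

p demo_find_relative
```

In the layout from the earlier report, a/b.rb does not exist, so the fallback finds b.rb next to the symlink. Whether such a fallback is desirable is exactly what this issue is debating.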

Updated by Dan0042 (Daniel DeLorme) 5 months ago

Interestingly, ruby does not use realpath for __FILE__, only for __dir__ and require_relative

$ cat test.rb
p __dir__ => __FILE__
require_relative "b"

$ cat a/b.rb
p __dir__ => __FILE__
require_relative "c"

$ ln -s a/b.rb

$ touch c.rb 

$ ruby test.rb
{"/home/dan42/16978"=>"test.rb"}
{"/home/dan42/16978/a"=>"/home/dan42/16978/b.rb"}
Traceback (most recent call last):
    3: from test.rb:2:in `<main>'
    2: from test.rb:2:in `require_relative'
    1: from /home/dan42/16978/b.rb:2:in `<top (required)>'
/home/dan42/16978/b.rb:2:in `require_relative': cannot load such file -- /home/dan42/16978/a/c (LoadError)
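
The same observation can be checked in a self-contained way with load instead of require_relative. This is a sketch under the assumption of a POSIX file system; probe_symlinked_file and the global $probe_16978 are made-up names used only to carry the values out of the loaded file:

```ruby
require "tmpdir"

# Sketch: load a file through a symlink and record what __FILE__ and __dir__
# report inside it.
def probe_symlinked_file
  Dir.mktmpdir do |tmp|
    root = File.realpath(tmp)
    Dir.mkdir(File.join(root, "a"))
    File.write(File.join(root, "a", "b.rb"),
               "$probe_16978 = { file: __FILE__, dir: __dir__ }\n")
    link = File.join(root, "b.rb")
    File.symlink("a/b.rb", link) # b.rb -> a/b.rb, as in `ln -s a/b.rb`

    load link
    $probe_16978.merge(root: root, link: link)
  end
end

probe = probe_symlinked_file
puts "__FILE__: #{probe[:file]}"
puts "__dir__:  #{probe[:dir]}"
```

On the Ruby versions discussed here I would expect __FILE__ to be the symlink path and __dir__ to be the a directory, mirroring the second line of output above.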

Updated by Eregon (Benoit Daloze) 5 months ago

vo.x (Vit Ondruch) wrote in #note-7:

In Fedora, to avoid duplication, we have the openssl gem extracted into an independent package, which links back to the StdLib to keep the Ruby functionality.

What does it look like on the filesystem?

Even if the package is split, the standard location could be used with no need for symlinks, couldn't it?

Updated by vo.x (Vit Ondruch) 5 months ago

Eregon (Benoit Daloze) wrote in #note-10:

vo.x (Vit Ondruch) wrote in #note-7:

In Fedora, to avoid duplication, we have the openssl gem extracted into an independent package, which links back to the StdLib to keep the Ruby functionality.

What does it look like on the filesystem?

This is how it is constructed. Here is where the OpenSSL structure is created in traditional gem directories:

https://src.fedoraproject.org/rpms/ruby/blob/master/f/ruby.spec#_761

And here are the links back to the StdLib location:

https://src.fedoraproject.org/rpms/ruby/blob/master/f/ruby.spec#_771

Even if the package is split, the standard location could be used with no need for symlinks, couldn't it?

Not sure what you mean specifically, because of course I'd prefer to have no symlinks. The idea is that rubygem-openssl should be replaceable by a newer version of openssl released on rubygems.org without upgrading Ruby itself; therefore the files have to be located in the gem directory. Of course there are different ways to work around that (e.g. #14737). The only question is how sustainable they are.

However, Fedora is orthogonal and just a distraction from this issue, because I can imagine that if I were a Ruby openssl developer, I'd like to replace the content of the Ruby build with my development version of OpenSSL from a git snapshot via symlinks, e.g.:

$ cd ~
$ git clone https://github.com/ruby/openssl.git
# rm /usr/local/lib/ruby/3.0.0/openssl*
# cd /usr/local/lib/ruby/3.0.0
# ln -s ~/openssl/lib/openssl.rb .
# ln -s ~/openssl/lib/openssl .
$ /usr/local/bin/ruby --disable-gems -e 'require "openssl"; require "openssl/digest"'
/usr/local/lib/ruby/3.0.0/openssl/digest.rb:45: warning: already initialized constant OpenSSL::Digest::MD4
/builddir/openssl/lib/openssl/digest.rb:45: warning: previous definition of MD4 was here
/usr/local/lib/ruby/3.0.0/openssl/digest.rb:45: warning: already initialized constant OpenSSL::Digest::MD5
/builddir/openssl/lib/openssl/digest.rb:45: warning: previous definition of MD5 was here
/usr/local/lib/ruby/3.0.0/openssl/digest.rb:45: warning: already initialized constant OpenSSL::Digest::RIPEMD160
/builddir/openssl/lib/openssl/digest.rb:45: warning: previous definition of RIPEMD160 was here
/usr/local/lib/ruby/3.0.0/openssl/digest.rb:45: warning: already initialized constant OpenSSL::Digest::SHA1
/builddir/openssl/lib/openssl/digest.rb:45: warning: previous definition of SHA1 was here
/usr/local/lib/ruby/3.0.0/openssl/digest.rb:45: warning: already initialized constant OpenSSL::Digest::SHA224
/builddir/openssl/lib/openssl/digest.rb:45: warning: previous definition of SHA224 was here
/usr/local/lib/ruby/3.0.0/openssl/digest.rb:45: warning: already initialized constant OpenSSL::Digest::SHA256
/builddir/openssl/lib/openssl/digest.rb:45: warning: previous definition of SHA256 was here
/usr/local/lib/ruby/3.0.0/openssl/digest.rb:45: warning: already initialized constant OpenSSL::Digest::SHA384
/builddir/openssl/lib/openssl/digest.rb:45: warning: previous definition of SHA384 was here
/usr/local/lib/ruby/3.0.0/openssl/digest.rb:45: warning: already initialized constant OpenSSL::Digest::SHA512
/builddir/openssl/lib/openssl/digest.rb:45: warning: previous definition of SHA512 was here
/usr/local/lib/ruby/3.0.0/openssl/digest.rb:52:in `<class:Digest>': superclass mismatch for class Digest (TypeError)
    from /usr/local/lib/ruby/3.0.0/openssl/digest.rb:16:in `<module:OpenSSL>'
    from /usr/local/lib/ruby/3.0.0/openssl/digest.rb:15:in `<top (required)>'
    from -e:1:in `require'
    from -e:1:in `<main>'

Everybody can try the example above. Is it artificial and should be prohibited? Should it fail?

Updated by Dan0042 (Daniel DeLorme) 5 months ago

I think this is a bug and should be fixed, but IMO the proper fix is to use realpath for __FILE__

So in the example above when you do require "openssl/digest", ruby will find /usr/local/lib/ruby/3.0.0/openssl/digest.rb and then it should check the realpath (~/openssl/lib/openssl/digest.rb) before loading the file.
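
The idea of checking the real path before loading can be sketched with a toy loader. This is not Ruby's require and not a proposed patch; RealpathLoader, require_once, and the global counter are hypothetical names used only to show the bookkeeping:

```ruby
require "set"
require "tmpdir"

# Toy loader: remembers files by real path, so a file reached through a
# symlink is evaluated only once.
class RealpathLoader
  def initialize
    @loaded = Set.new
  end

  # Returns true if the file was loaded now, false if it was seen before.
  def require_once(path)
    real = File.realpath(path)
    return false unless @loaded.add?(real) # add? returns nil when already present

    load real
    true
  end
end

def demo_realpath_loader
  Dir.mktmpdir do |tmp|
    root = File.realpath(tmp)
    Dir.mkdir(File.join(root, "a"))
    File.write(File.join(root, "a", "digest.rb"), "$loads_16978 += 1\n")
    File.symlink("a", File.join(root, "b")) # b is a symlinked view of a

    $loads_16978 = 0
    loader = RealpathLoader.new
    results = [
      loader.require_once(File.join(root, "a", "digest.rb")),
      loader.require_once(File.join(root, "b", "digest.rb"))
    ]
    { results: results, loads: $loads_16978 }
  end
end

p demo_realpath_loader
```

Both paths collapse to the same key, so the second call is a no-op and the file body runs once. Keyed on the path as given, the same file would be evaluated twice, which is what the constant redefinition warnings above suggest is happening.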

Updated by vo.x (Vit Ondruch) 5 months ago

Dan0042 (Daniel DeLorme) wrote in #note-12:

I think this is a bug and should be fixed, but IMO the proper fix is to use realpath for __FILE__

This is a tough one, but thinking about this again, you are probably right. Maybe the biggest concern is that the behavior is inconsistent. If both require and __FILE__ used realpath, I think the example in #16978-11 would work as you said.

Updated by mame (Yusuke Endoh) 4 months ago

  • Related to Bug #10222: require_relative and require should be compatible with each other when symlinks are used added

Updated by ko1 (Koichi Sasada) 3 months ago

  • Assignee set to nobu (Nobuyoshi Nakada)