Project

General

Profile

Actions

Feature #10561

open

Improve function of Thread::Backtrace::Location #path and #absolute_path

Added by sam.saffron (Sam Saffron) over 9 years ago. Updated over 3 years ago.

Status:
Open
Assignee:
-
Target version:
-
[ruby-core:66642]

Description

I was working on this issue in Rails and hit an area where Backtrace Location can be improved

https://github.com/rails/rails/pull/17782

  1. It is undefined in the documentation how #absolute_path should operate when #path is invalid (in case of instance eval)
  2. There are a few conditions where #path and #absolute_path can return nil, this forces extra protection code when parsing paths to check for nil. (for example getting filename)

Suggestions:

  1. Instead of returning Qnil from location_path and location_absolute_path on invalid conditions, return the string "(unknown)" which is easier to parse and sticks out better in a big backtrace. There is precedent here with the string "(eval)"
  2. If path is invalid have absolute_path return "(unknown)", define that in the documentation
  3. (possible) add an additional method on caller_location called #filename so people stop parsing filename from #path and #absolute_path
  4. Evaluate if it makes sense to have #path and #absolute_path in the API as both methods can return full paths so the semantic difference is subtle.

Updated by yorickpeterse (Yorick Peterse) over 9 years ago

I'd like to have some extra clarification on this matter as well. Right now in Rubinius we return just the filename when calling Thread::Backtrace::Location#path as the documentation suggested this was the expected behaviour:

Returns the file name of this frame.

For example, using caller_locations.rb from Thread::Backtrace::Location

loc = c(0..1).first
loc.path #=> caller_locations.rb

Updated by sam.saffron (Sam Saffron) over 9 years ago

I think the name #path should always refer to a #path of sorts using it as #filename is kind of odd. Which is a big reason I find the duality of #path / #absolute_path confusing.

Updated by yorickpeterse (Yorick Peterse) about 9 years ago

Nobu: I talked about this with Koichi today after noticing you added tests for
path/absolute_path after my talk at FOSDEM. Since these tests assert that path
returns the full path I'd like to know which one is correct.

Is it this:

caller_locations[0].path # => "example.rb"

Or is this the correct way:

caller_locations[0].path # => "/home/some-user/example.rb"

If the last example is correct, is it perhaps an idea to add "basename" to
Thread::Location::Backtrace, which would return just the filename, and make
path an alias to absolute_path? So:

caller_locations[0].path          # => "/home/some-user/example.rb"
caller_locations[0].absolute_path # => "/home/some-user/example.rb"
caller_locations[0].basename      # => "example.rb"

Updated by nobu (Nobuyoshi Nakada) about 9 years ago

The result of #path equals to __FILE__ in the script file.
It's expanded in required libraries, but not in main scripts.

Updated by yorickpeterse (Yorick Peterse) about 9 years ago

Nobu: Thanks! I'll update the Rubinius implementation of this to match that behaviour.

Updated by yorickpeterse (Yorick Peterse) about 9 years ago

Also, I'll submit a patch to fix the documentation of path so that it states it's an alias of absolute_path.

Updated by Eregon (Benoit Daloze) about 9 years ago

Nobuyoshi Nakada wrote:

The result of #path equals to __FILE__ in the script file.
It's expanded in required libraries, but not in main scripts.

There is only one main script (the file passed to the ruby executable) in a given execution, right?

Yorick: so #path is #absolute_path except for the main script, so it is similar to the formatting of usual exceptions backtraces:

$ echo "raise" > tmp.rb
$ ruby tmp.rb 
tmp.rb:1:in `<main>': unhandled exception

Here tmp.rb is main script.

$ echo "require_relative 'tmp'" > tmp2.rb
$ ruby tmp2.rb    
/home/eregon/code/tmp.rb:1:in `<top (required)>': unhandled exception
from tmp2.rb:1:in `require_relative'
from tmp2.rb:1:in `<main>'

Here tmp2.rb is the main script but tmp.rb becomes a "required library" so full path is shown.

Updated by yorickpeterse (Yorick Peterse) about 9 years ago

Benoit: correct, I just found this out by messing around more with this. This
behaviour is extremely confusing, especially since the tests both just compare
if path and absolute_path equal __FILE__.

Perhaps the Tempfile code in the tests (https://github.com/ruby/ruby/blob/69be3620302aab57c43014aca08629c9d7b7ce44/test/ruby/test_backtrace.rb#L170)
checks for this, but it too is very confusing to understand. In case of Rubinius
I'll see if we even have a way of detecting if something is called from the main
script, and again adjust our specs and implementation.

Updated by ko1 (Koichi Sasada) about 9 years ago

[MAYBE IT SHOULD BE DIFFERENT TICKET]

How about to unify #path and #absolute_path?
Returning absolute path on both methods.

When I made this API, there are three types path, (1) path of main script (partial path) and (2) paths of required files which is absolute path as described in this ticket and (3) string passed by "eval" methods (or "-e" command line option).

I think we don't need to separate (1) and (2). (1) can be absolute path.

What do you think?

Updated by ko1 (Koichi Sasada) about 9 years ago

I think we don't need to separate (1) and (2). (1) can be absolute path.

I guess the reason is that Matz thought backtrace should be simple for main script.

Updated by cheba (Alexander Mankuta) about 9 years ago

Koichi Sasada wrote:

I guess the reason is that Matz thought backtrace should be simple for main script.

There's little benefit in having simple (i.e. basename only) path for main script (and only main script). I think consistent output is much more useful.

Getting basename out of full path is easier than the other way around.

Updated by yorickpeterse (Yorick Peterse) about 9 years ago

It's fine for the methods to do different things, the naming however is a bit
confusing. Using path doesn't clearly state when it's absolute and when it's
relative. Perhaps script_path would make more sense.

Merging the two is also fine, although it could potentially break code depending
on the current behaviour of path, thus it would have to be done with some
care.

Updated by jeremyevans0 (Jeremy Evans) over 3 years ago

  • Tracker changed from Bug to Feature
  • ruby -v deleted (2.2.0)
  • Backport deleted (2.0.0: UNKNOWN, 2.1: UNKNOWN)

I've improved the documentation for #path and #absolute path in fae135c5b39db173bf97dfa3c3a34eb8fb230276. The current behavior is not a bug, so switching this to a feature request.

Personally, I'm OK with suggestions 1, 2, and 4 and willing to implement the changes if they are desired. I don't think adding #filename (suggestion 3) is worth it. It doesn't seem to be needed in most code, and File.basename can be used in the cases where it is needed (and people that need it can define the method themselves).

Updated by Eregon (Benoit Daloze) over 3 years ago

There is an extra difference that #absolute_path will also try to canonicalize the path and resolve symlinks, so then even if #path is absolute, #absolute_path can be different.
See https://github.com/ruby/spec/blob/e829fb0bcb667013f3527142b9e29883f6904537/core/thread/backtrace/location/absolute_path_spec.rb#L39-L69

So in short #path is the same as __FILE__. #absolute_path is essentially File.realpath(#path) + it seems to work on CRuby even if the file was removed.
For eval(code, binding, file, line) code, #path and #absolute_path just return file and do not expand it, so it can be a relative path if file is relative.

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0