Project

General

Profile

Actions

Feature #17762

open

A simple way to trace object allocation

Added by mame (Yusuke Endoh) about 2 months ago. Updated 3 days ago.

Status:
Assigned
Priority:
Normal
Target version:
-
[ruby-core:103102]

Description

How about having a short hand to ObjectSpace.trace_object_allocations_start, ObjectSpace.allocation_sourcefile and ObjectSpace.allocation_sourceline?

They are a very powerful tool for debugging and code-reading which allows us to identify an allocation site of an object.
Though they are never lightweight, they are the last resort when you try debugging code written by someone else.

However, the names are too long for me to remember and to type. Whenever I want to use them, I have to google, copy and paste the names.

Proposal

To enable trace allocations:

require "objspace/trace" #=> objspace/trace is enabled

To show the allocation site of an object:

p obj #=> #<Object:0x...> @ (file.rb):(lineno)

Example

require "objspace/trace"
require "active_support/all"

p ActiveSupport::VERSION::STRING
  #=> "6.1.3.1" @ /home/mame/work/ruby/local/lib/ruby/gems/3.1.0/gems/activesupport-6.1.3.1/lib/active_support/gem_version.rb:15

Discussion

I've attached a simple patch that is originally authored by ko1 (Koichi Sasada) .

  • Is the message objspace/trace is enabled needed or not?
  • To stop the trace, you need to use ObjectSpace.trace_object_allocations_stop. But, I guess that it is rare that we need to stop it during debugging.
  • Is it too radical to redefine Kernel#p? I think that it is good enough for many cases. When it matters, the original APIs (ObjectSpace.trace_object_allocations_start, ...) can be used.

Files

objspace-trace.patch (631 Bytes) objspace-trace.patch mame (Yusuke Endoh), 03/30/2021 02:49 PM

Related issues

Related to Ruby master - Feature #10932: Enabling allocation tracing as early as possibleOpenko1 (Koichi Sasada)Actions

Updated by shevegen (Robert A. Heiler) about 2 months ago

I think this is a good suggestion.

Please correct me if my assumptions are wrong, but if I understood the general gist of it correctly
then the main point here is that, rather than focusing on specific names, such as
ObjectSpace.trace_object_allocations_start or ObjectSpace.this_aptly_named_method_is_really_not_easy_to_remember,
the main idea is that you can, sort of "activate" a more general debug-centric way to handle
output that may be relevant for getting additional information about different
objects. In the specific example, you tie it to Kernel#p such as the:

p obj

example shows. So basically, the idea is to get more information that is useful for
debugging, without depending on specific API names as such, yes?

In that case I think it's a good idea.

To your three questions, I'll only comment on the Kernel#p last part.

I think it is fine to modify Kernel#p in this case. You could perhaps add
ObjectSpace.enable_tracing, ObjectSpace.strop_tracing or just a flipper
ObjectSpace.flip or ObjectSpace.trace for simpler names. Good API design
is hard. What I like about the idea is that you can sort of get more
information with simple things such as p (or perhaps pp ... could pp
also be used here? tenderlove once wrote that he is a puts debugger,
on his blog; I am a pp debugger! pp all the things).

Updated by byroot (Jean Boussier) about 2 months ago

Whenever I want to use them, I have to google, copy and paste the names.

Seconded. I use this feature almost daily, and somehow I also have to reach to the doc every time.

require "objspace/trace"

That makes sense to me.

p obj #=> #Object:0x... @ (file.rb):(lineno)

I'm not too sure about that one. I suppose it's fine, but I'm not always in a situation where p is usable. Would Object#allocation_source be acceptable?

Actions #3

Updated by Eregon (Benoit Daloze) about 2 months ago

require "objspace/trace" automatically starting tracing seems dangerous, there is a pretty big performance penalty to enable it.
So I think the message on stderr is needed, and maybe it should be more explicit like require "objspace/start_tracing".

There might be different sorts of tracing so ultimately I feel ObjectSpace.trace_object_allocations_start is long but is clear and well-named IMHO.

byroot (Jean Boussier) wrote in #note-2:

I'm not too sure about that one. I suppose it's fine, but I'm not always in a situation where p is usable. Would Object#allocation_source be acceptable?

How about ObjectSpace.allocation_source(obj)?

Object#allocation_source wouldn't work for BasicObject and it would pollute Object/Kernel which seems suboptimal to me.

Updated by tenderlovemaking (Aaron Patterson) about 2 months ago

I submitted #10932, so I would definitely like a feature like this. 😆

Is the message objspace/trace is enabled needed or not?

I don't think it's needed. If you require the file, you know it's enabled.

To stop the trace, you need to use Object.trace_object_allocations_stop. But, I guess that it is rare that we need to stop it during debugging.

This is fine. Nobody will need to stop it (or if they do, they can look up the API).

Is it too radical to redefine Kernel#p? I think that it is good enough for many cases. When it matters, the original APIs (ObjectSpace.trace_object_allocations_start, ...) can be used.

Changing the output from p might be confusing, and if the object is something deeply nested, it might take a long time to print. Could we defined Kernel#o on require "objspace/trace", then o(object) just prints the allocation info? (o is just a suggestion, but something short and easy)

Updated by Eregon (Benoit Daloze) about 2 months ago

tenderlovemaking (Aaron Patterson) wrote in #note-4:

I don't think it's needed. If you require the file, you know it's enabled.

For the person who just wrote it it's clear enough.
But what if that accidentally gets committed and the app/program is N times slower and it's hard to find out why?

Actions #6

Updated by mame (Yusuke Endoh) about 2 months ago

  • Related to Feature #10932: Enabling allocation tracing as early as possible added

Updated by mame (Yusuke Endoh) about 2 months ago

  • Description updated (diff)

Thank you all for the comments.

byroot (Jean Boussier) wrote in #note-2:

p obj #=> #Object:0x... @ (file.rb):(lineno)

I'm not too sure about that one. I suppose it's fine, but I'm not always in a situation where p is usable. Would Object#allocation_source be acceptable?

I think this is also one of the reasonable options.

In fact, my original idea was Object#allocation_site, but ko1 (Koichi Sasada) proposed the current approach (extending Kernel#p).
I like his approach because it is much shorter.

Updated by mame (Yusuke Endoh) about 2 months ago

Eregon (Benoit Daloze) wrote in #note-3:

require "objspace/trace" automatically starting tracing seems dangerous, there is a pretty big performance penalty to enable it.
So I think the message on stderr is needed, and maybe it should be more explicit like require "objspace/start_tracing".

require "objspace/start_tracing" sounds good to me.

byroot (Jean Boussier) wrote in #note-2:

I'm not too sure about that one. I suppose it's fine, but I'm not always in a situation where p is usable. Would Object#allocation_source be acceptable?

How about ObjectSpace.allocation_source(obj)?

This is also nice-to-have.
p ObjectSpace.allocation_source(obj) is much better than

p [ObjectSpace.allocation_sourcefile(obj), ObjectSpace.allocation_sourceline(obj)]

.

But IMO, p ObjectSpace.allocation_source(obj) is too long for quick debugging.
I guess this is the reason why #10932 proposed include ObjectSpace, but it pollutes the namespace needlessly.

Object#allocation_source wouldn't work for BasicObject and it would pollute Object/Kernel which seems suboptimal to me.

I think this is a trade-off between robustness and usability.
If you want general-purpose and robust solution, you can use the current APIs.
The goal of this proposal is "useful and good enough in many cases".

Updated by mame (Yusuke Endoh) about 2 months ago

tenderlovemaking (Aaron Patterson) wrote in #note-4:

I submitted #10932, so I would definitely like a feature like this. 😆

Oops, sorry!
Good news: I talked with ko1 (Koichi Sasada) in person, and now he is positive (or, at least not negative) to this proposal.

Changing the output from p might be confusing, and if the object is something deeply nested, it might take a long time to print.

I admit that it might be confusing, but priting time will not matter. My proposal changes Kernel#p, not #inspect. In other words, it shows only the allocation site of the outermost object.

Updated by mame (Yusuke Endoh) about 2 months ago

Eregon (Benoit Daloze) wrote in #note-5:

tenderlovemaking (Aaron Patterson) wrote in #note-4:

I don't think it's needed. If you require the file, you know it's enabled.

For the person who just wrote it it's clear enough.
But what if that accidentally gets committed and the app/program is N times slower and it's hard to find out why?

Agreed. I'm worried I may commit it inadvertently after debugging is finished.
But I'm unsure if the message is helpful to prevent the accident.
A real-world application tends to print a lot of log messages, so the warning may be overlooked easily.

Updated by byroot (Jean Boussier) about 2 months ago

How about ObjectSpace.allocation_source(obj)?

That would be fine by me, it is a much more reasonable than the [ObjectSpace.allocation_sourcefile(obj), ObjectSpace.allocation_sourceline(obj)] I write almost daily.

Even though a the name is very slightly weird when put in context with Method#source_location, as source have a different meaning in both of these. But allocation_location would be even weirder I think.

require "objspace/start_tracing" sounds good to me.

While we're at it could we also alias the ::trace_object_allocations_* family of methods ?

ObjectSpace.trace_object_allocations_start => ObjectSpace.start_tracing
ObjectSpace.trace_object_allocations_stop => ObjectSpace.stop_tracing
ObjectSpace.trace_object_allocations => ObjectSpace.trace
ObjectSpace.trace_object_allocations_clear => ObjectSpace.clear_traces

Or, I know you'd like to keep this change limited, but maybe have a simpler interface?

ObjectSpace.trace {}
ObjectSpace.tracing_enabled # => true/false
ObjectSpace.tracing_enabled=(true/false)
ObjectSpace.clear_traces

Updated by Eregon (Benoit Daloze) about 2 months ago

byroot (Jean Boussier) wrote in #note-11:

Even though a the name is very slightly weird when put in context with Method#source_location, as source have a different meaning in both of these. But allocation_location would be even weirder I think.

I've thought about this too, but indeed allocation_source_location seems bad.

I guess if we do have a method on Object/Kernel it could simply be Object#source_location for consistency with Proc/Binding/Module(#17749) (they already show where allocated), and to some degree with Method/UnboundMethod (= where to look at for debugging).
We could even have that method always defined potentially, and just return nil when no information is available.

Updated by byroot (Jean Boussier) about 2 months ago

I guess if we do have a method on Object/Kernel it could simply be Object#source_location for consistency with [...] Method/UnboundMethod

Hum, That would actually be a problem. Because I might want to know where a Method on UnboundMethod was allocated, (where method(:name) was called).

In the end a short enough ObjectSpace.something(object) # => [file, lineno] is probably fine. No need to add a method in Object.

Updated by sam.saffron (Sam Saffron) about 1 month ago

I am with Eregon (Benoit Daloze) here:

ObjectSpace.trace_object_allocations_start

as a side effect of a "require" seems like a mighty dangerous weapon. It will get checked into production at some point.

Also not a fan of:

require "objspace/start_tracing"

Cause then ... do we have to add:

require "objspace/stop_tracing"

Redefining p also feels extremely radical.

Overall a much prefer a deliberate API here:

t = ObjectSpace.allocation_tracer
t.start
t.stop
t.debug(x)

Updated by byroot (Jean Boussier) about 1 month ago

It will get checked into production at some point.

I understand the concern. A require always seem more harmless than a method call. But being able to enable tracing with a require allow to enable it without modifying the code through: RUBYOPT='-r objspace/tracing', which is in my opinion very valuable.

Updated by Eregon (Benoit Daloze) about 1 month ago

byroot (Jean Boussier) wrote in #note-15:

I understand the concern. A require always seem more harmless than a method call. But being able to enable tracing with a require allow to enable it without modifying the code through: RUBYOPT='-r objspace/tracing', which is in my opinion very valuable.

It's too bad -e is not allowed in RUBYOPT, otherwise one could do something like this:

RUBYOPT="-robjspace -eObjectSpace.trace_object_allocations_start" ruby ...

(or a shorter method to start it of course)

Maybe that would be useful to allow?

Updated by matz (Yukihiro Matsumoto) about 1 month ago

It's OK to add this feature. The side effects are acceptable (for this is only for debugging anyway). I am OK with any library name (for example with or without _start).

Matz.

Actions #18

Updated by mame (Yusuke Endoh) 3 days ago

  • Status changed from Open to Closed

Applied in changeset git|cf1e1879f12ad547f95fe94ab62b4d960e804eb8.


ext/objspace/lib/objspace/trace.rb: Added

This file, when require'ed, starts tracing the object allocations, and
redefines Kernel#p to show the allocation site.

This commit is experimental; the library name and APIs may change.

[Feature #17762]

Updated by mame (Yusuke Endoh) 3 days ago

  • Assignee set to mame (Yusuke Endoh)
  • Status changed from Closed to Assigned

I realized that the compatibility of this library is not important at all, because this is a tool for human's debugging. Any codebase must not use it. This means that we can change the file name and APIs anytime if needed, even after it is release.

So, tentatively, I've committed my proposal as require "objspace/trace". I've heard ko1 (Koichi Sasada) plans to create lib/debug/run.rb to enable a new debugger feature that he is now developing. require "debug/run" sounds simple and great to me, so I think require "objspace/trace" is also good enough. However, I'm okay to change if many people prefer require "objspace/start_tracing".

I admit redefining Kernel#p may be radical. But I want to see if it brings any actual issue. I'm not against adding allocation_location or something. If anyone wants it too, a pull request is welcome.

Just an idea: it might be good to create a rubocop rule to prevent require "objspace/trace" from being committed accidentally.

Actions

Also available in: Atom PDF