Feature #21539
Facilitate walking native and interpreter (and jit?) stacks from outside of the ruby process
Description
While Ruby does have a great API for getting stack traces from within the Ruby process, as used by profilers like vernier and stackprof, there are some projects that aim to profile Ruby from outside the process.
Some examples include:
- rbspy (https://github.com/rbspy/rbspy/tree/main/ruby-structs/src)
- rbperf (https://github.com/javierhonduco/rbperf)
- OpenTelemetry eBPF profiler (https://github.com/open-telemetry/opentelemetry-ebpf-profiler/blob/main/support/ebpf/ruby_tracer.ebpf.c, https://github.com/open-telemetry/opentelemetry-ebpf-profiler/blob/main/interpreter/ruby/ruby.go)
The first two take the approach of embedding the Ruby headers within the profiler to be able to walk the stack.
OTel's eBPF profiler relied on access to symbols (https://github.com/open-telemetry/opentelemetry-ebpf-profiler/issues/202) that were removed some time ago (https://github.com/ruby/ruby/pull/7459). As a result, it cannot unwind the stacks of newer Rubies and so cannot profile them.
All of these solutions effectively reverse engineer the Ruby process. They depend on internal details that can move around between Ruby versions, which makes them brittle. I think the root cause is that there is no public, stable API for this sort of external profiling.
I'm not exactly sure what the solution should be, but it would be great for Ruby to offer a recommended, supported, and stable way for external profilers (whether they use perf APIs or ptrace) to:
- Obtain the ruby native stack starting point, and walk it while being able to resolve the native symbols
- Obtain a reference to the ruby execution context and walk the ruby interpreter stack
- Have all of this work in a stable and reasonable way regardless of whether a Ruby JIT is enabled, and which one
It would be awesome for some Ruby maintainers / experts to weigh in on what such an external API could look like. Perhaps a public header containing stable (or at least versioned?) public symbols, and perhaps structs, as a starting point? This might require refactoring some fields out of existing structs, and could slow down later work on them (e.g., when members need to be added or removed).
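To make the "versioned public header" idea a bit more concrete, here is a rough sketch of one possible shape: a single exported descriptor that publishes struct offsets, so an external profiler reads offsets instead of embedding Ruby's internal headers. Everything in it is an assumption for discussion only. The names `rb_ext_profile_desc`, `rb_ext_profile_desc_v1`, `RB_EXT_PROFILE_MAGIC` and `RB_EXT_PROFILE_ABI_VERSION` do not exist in Ruby, and the `demo_*` structs are simplified stand-ins, not the real execution context or control frame layouts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* HYPOTHETICAL names throughout: nothing here exists in Ruby today. */
#define RB_EXT_PROFILE_MAGIC "RBXPROF"   /* 7 chars + NUL = 8 bytes */
#define RB_EXT_PROFILE_ABI_VERSION 1u

/* Simplified stand-ins for VM-internal structs. The real layouts are
 * different and are free to change between Ruby versions. */
struct demo_control_frame {
    const void *pc;    /* current instruction pointer */
    const void *iseq;  /* instruction sequence being executed */
};

struct demo_execution_context {
    void *vm_stack;                  /* base of the interpreter stack */
    size_t vm_stack_size;            /* stack size (units are up to the VM) */
    struct demo_control_frame *cfp;  /* current control frame */
};

/* The versioned descriptor. A profiler would locate this one symbol and
 * read offsets from it rather than compiling against internal headers. */
struct rb_ext_profile_desc {
    char     magic[8];
    uint32_t abi_version;
    uint32_t desc_size;      /* lets newer versions append fields */
    uint32_t ec_cfp_offset;
    uint32_t ec_vm_stack_offset;
    uint32_t ec_vm_stack_size_offset;
    uint32_t cfp_size;
    uint32_t cfp_pc_offset;
    uint32_t cfp_iseq_offset;
};

/* What the Ruby side might export (assumed symbol name). */
const struct rb_ext_profile_desc rb_ext_profile_desc_v1 = {
    .magic = RB_EXT_PROFILE_MAGIC,
    .abi_version = RB_EXT_PROFILE_ABI_VERSION,
    .desc_size = (uint32_t)sizeof(struct rb_ext_profile_desc),
    .ec_cfp_offset = (uint32_t)offsetof(struct demo_execution_context, cfp),
    .ec_vm_stack_offset = (uint32_t)offsetof(struct demo_execution_context, vm_stack),
    .ec_vm_stack_size_offset = (uint32_t)offsetof(struct demo_execution_context, vm_stack_size),
    .cfp_size = (uint32_t)sizeof(struct demo_control_frame),
    .cfp_pc_offset = (uint32_t)offsetof(struct demo_control_frame, pc),
    .cfp_iseq_offset = (uint32_t)offsetof(struct demo_control_frame, iseq),
};

/* Profiler side: returns 1 if the descriptor is one we understand. */
int rb_ext_profile_desc_usable(const struct rb_ext_profile_desc *d)
{
    if (d == NULL) return 0;
    if (memcmp(d->magic, RB_EXT_PROFILE_MAGIC, sizeof d->magic) != 0) return 0;
    if (d->abi_version != RB_EXT_PROFILE_ABI_VERSION) return 0;
    return d->desc_size >= sizeof(struct rb_ext_profile_desc);
}

/* Profiler side: read the current control frame pointer out of a copy of
 * the execution context bytes, using only the published offset. Out of
 * process, the bytes would first be copied from the target (for example
 * with ptrace or from a BPF program); here they are read in-process. */
const void *demo_read_cfp(const void *ec_bytes,
                          const struct rb_ext_profile_desc *d)
{
    const void *cfp = NULL;
    assert(d->ec_cfp_offset + sizeof cfp <= sizeof(struct demo_execution_context));
    memcpy(&cfp, (const char *)ec_bytes + d->ec_cfp_offset, sizeof cfp);
    return cfp;
}
```

The `desc_size` and `abi_version` fields are there so that fields can be appended or the layout revised without silently breaking older profilers, which is one way to ease the concern about slowing down work on the internal structs.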
Updated by mame (Yusuke Endoh) 1 day ago
- Related to Feature #19119: Add an interface for out-of-process profiling tools to access Ruby information added