Bug #9629: GC::Profiler.total_time under-reports GC time compared to dtrace GC probe measurement - Ruby - Ruby Issue Tracking System

Bug #9629

closed

GC::Profiler.total_time under-reports GC time compared to dtrace GC probe measurement

Added by benweint (Ben Weintraub) over 11 years ago. Updated over 5 years ago.

Status: Closed
Assignee:
Target version:
ruby -v: ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-darwin13.0]
Backport:

[ruby-core:61446]

Description

I'm trying to square the numbers that I'm getting from GC::Profiler.total_time against those that I'm getting out of instrumentation with the GC dtrace probes embedded in Ruby, and having a hard time getting the two sources to agree.

I'm not sure if this is due to a legitimate bug in Ruby, or a misunderstanding on my part about what the two measurements mean.

You can reproduce this using the scripts in this gist (run standalone.rb first, it will prompt you for what to do next):
https://gist.github.com/benweint/9519384

The high-level summary of what that does is:

1. Call GC::Profiler.enable
2. Save GC::Profiler.total_time
3. Instrument with a dtrace script that tracks mark and sweep start/stop and keeps a running total of GC time
4. Run some code that exercises GC
5. Calculate elapsed GC time with GC::Profiler.total_time - <saved value from step 2>
6. Compare the Ruby-measured total GC time to the dtrace-measured total GC time
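
For readers without the gist at hand, the Ruby side of the procedure above (enable the profiler, save a baseline, exercise GC, take the difference) might look roughly like the sketch below. It is not the contents of standalone.rb: `exercise_gc` and its allocation pattern are hypothetical stand-ins, and the dtrace side is not shown.

```ruby
# Minimal sketch of the Ruby-side measurement only.
# `exercise_gc` is a hypothetical workload, not the code from standalone.rb.
def exercise_gc(iterations = 200_000)
  iterations.times { "x" * 64 }  # allocate many short-lived strings
  GC.start                       # force at least one full collection
end

GC::Profiler.enable
baseline = GC::Profiler.total_time  # seconds of GC time recorded so far

exercise_gc

# GC time attributed to the workload, as seen by GC::Profiler
elapsed_gc = GC::Profiler.total_time - baseline
puts format("GC::Profiler measured %.6f s of GC time", elapsed_gc)
```

The resulting `elapsed_gc` is the number that would then be compared against the total accumulated by the dtrace script.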

It seems that the measurement from GC::Profiler is consistently lower than the dtrace measurement, by a non-trivial margin (15-20% in my testing).

Looking at GC::Profiler.raw_data, the bulk of the difference seems to be in the sweep time measurement (mark times line up pretty closely between the two ways of measuring).
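
As a rough illustration of that kind of inspection, the sketch below sums the per-run times reported by GC::Profiler.raw_data. The workload is made up for the example. The `:GC_MARK_TIME` and `:GC_SWEEP_TIME` keys are, as far as I can tell from gc.c, only present when Ruby is built with GC_PROFILE_MORE_DETAIL, so the code treats them as optional.

```ruby
# Sketch: break down GC::Profiler's own records by phase.
GC::Profiler.enable
GC::Profiler.clear

# Hypothetical workload, just to trigger some collections.
50_000.times { Array.new(16) { "y" * 32 } }
GC.start

records = GC::Profiler.raw_data  # array of hashes, one per recorded GC run

# Total GC time across all records, in seconds.
summed_gc_time = records.inject(0.0) { |acc, r| acc + r[:GC_TIME] }

# Per-phase totals; assumed to exist only in GC_PROFILE_MORE_DETAIL builds.
summed_mark  = records.inject(0.0) { |acc, r| acc + (r[:GC_MARK_TIME]  || 0.0) }
summed_sweep = records.inject(0.0) { |acc, r| acc + (r[:GC_SWEEP_TIME] || 0.0) }

puts format("runs=%d total=%.6f mark=%.6f sweep=%.6f",
            records.size, summed_gc_time, summed_mark, summed_sweep)
```

On a build without the extra detail, the mark and sweep totals simply print as zero.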

Any insight into whether this represents a legitimate bug, an error in my measurement technique, or a misunderstanding of these measurements would be greatly appreciated!

Files

standalone.rb (926 Bytes), benweint (Ben Weintraub), 03/13/2014 12:31 AM
trace-gc-standalone.sh (1.01 KB), benweint (Ben Weintraub), 03/13/2014 12:31 AM

#1 [ruby-core:61447]

Updated by benweint (Ben Weintraub) over 11 years ago

File standalone.rb added
File trace-gc-standalone.sh added

Uploading the two files from that gist, just to keep everything in one place.

#2 [ruby-core:61448]

Updated by benweint (Ben Weintraub) over 11 years ago

Worth noting: a quick read through gc.c suggested that the missing GC sweep time might be due to me not having built with GC_PROFILE_MORE_DETAIL, but even after building with that, I still see the same discrepancy where GC::Profiler is significantly lower.

#3 [ruby-core:61453]

Updated by benweint (Ben Weintraub) over 11 years ago

I realized that these are actually measuring different things: dtrace's timestamps measure wall clock time, whereas GC::Profiler on Mac OS X uses getrusage, which measures user CPU time. It still seems weird that the two would be so divergent though, given that GC mark and sweep should be CPU-bound.
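
To make the wall-clock versus CPU-time distinction concrete, here is a small sketch using Process.clock_gettime (available since Ruby 2.1). It is only an analogy: CLOCK_PROCESS_CPUTIME_ID counts user plus system CPU time for the process, which is not exactly the user-only figure described above, and the `measure` helper is a name made up for this example.

```ruby
# Sketch: the same block timed with a wall clock and a CPU-time clock.
# `measure` is a hypothetical helper, not part of Ruby or of GC::Profiler.
def measure
  wall_start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  cpu_start  = Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID)
  yield
  wall = Process.clock_gettime(Process::CLOCK_MONOTONIC) - wall_start
  cpu  = Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID) - cpu_start
  [wall, cpu]
end

# Sleeping consumes wall-clock time but almost no CPU time,
# so the two clocks diverge sharply here.
wall, cpu = measure { sleep 0.05 }
puts format("wall=%.4f s cpu=%.4f s", wall, cpu)
```

For CPU-bound work such as mark and sweep, the two clocks would be expected to track each other much more closely, which is why the divergence still looked odd.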

#4 [ruby-core:61481]

Updated by benweint (Ben Weintraub) over 11 years ago

I think I've figured out the discrepancy here: the dtrace probes wrap around the getrusage(2) calls that GC::Profiler bases its timings on for Mac OS X. The average lazy sweep time is quite short (single-digit microseconds per lazy sweep). Unfortunately, getrusage itself has an overhead of ~1.5 us per call on average on my Mac OS X box, which adds up to 3 us total per lazy sweep (since we call it once to start the timer and once to stop). That means dtrace sees a measurement for lazy sweeps that's on average 3 us higher than what GC::Profiler is able to measure. Because there are so many lazy sweeps, these 3 us chunks add up to a non-trivial amount of time.
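
The arithmetic behind "adds up" can be sketched as follows. The sweep count, mean sweep time, and mark time below are hypothetical inputs picked only for illustration; only the 3 us per-sweep overhead comes from the measurement described above.

```ruby
# Sketch: how a fixed per-sweep overhead that only the outer (dtrace)
# measurement sees turns into a gap between the two totals.
# All inputs are in microseconds except sweep_count.

# Extra time, in seconds, seen by the outer measurement only.
def overhead_seconds(sweep_count, overhead_us_per_sweep)
  sweep_count * overhead_us_per_sweep / 1_000_000.0
end

# Fraction of the outer total that the inner measurement misses.
def relative_gap(sweep_count, mean_sweep_us, overhead_us_per_sweep, other_gc_us)
  inner_us = sweep_count * mean_sweep_us + other_gc_us
  outer_us = inner_us + sweep_count * overhead_us_per_sweep
  (outer_us - inner_us) / outer_us.to_f
end

# Hypothetical run: 100,000 lazy sweeps of 5 us each, plus 1 s of mark time.
extra = overhead_seconds(100_000, 3)
gap   = relative_gap(100_000, 5, 3, 1_000_000)
puts format("extra=%.3f s gap=%.1f%%", extra, gap * 100)
```

With these made-up inputs the outer measurement comes out 0.3 s larger, a gap of the same order as the 15-20% reported in the description.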

I'm guessing that this is less of an issue on Linux, because clock_gettime will be used there instead of getrusage.

Feel free to close this out.

#5 [ruby-core:61498]

Updated by benweint (Ben Weintraub) over 11 years ago

One minor follow-up: it's actually not that getrusage takes a 'long' time (relative to the cost of each lazy sweep invocation); the extra time comes from the dtrace probes themselves firing. The conclusion remains the same, though: GC::Profiler seems correct.

Updated by jeremyevans0 (Jeremy Evans) over 5 years ago

Status changed from Open to Closed
Backport deleted (1.9.3: UNKNOWN, 2.0.0: UNKNOWN, 2.1: UNKNOWN)
