Bug #7500

Improve GC profiler timings on linux

Added by Aman Gupta over 2 years ago. Updated over 2 years ago.

[ruby-core:50495]
Status:Closed
Priority:Normal
Assignee:Narihiro Nakamura
ruby -v:ruby 2.0.0dev (2012-12-03 trunk 38149) [x86_64-darwin12.2.0] Backport:

Description

On linux kernels, getrusage()'s precision depends on the value of HZ when the kernel was compiled. By default, HZ=250 provides a 4ms granularity.

This patch uses clock_gettime() with CLOCK_PROCESS_CPUTIME_ID when available, which provides a 1ns precision on linux.

gc_timing.patch Magnifier (586 Bytes) Aman Gupta, 12/03/2012 11:32 AM

Associated revisions

Revision 38214
Added by nari over 2 years ago

  • gc.c (getrusage_time): uses clock_gettime() with CLOCK_PROCESS_CPUTIME_ID when available, which provides a 1ns precision on linux. [Bug #7500]

Revision 38214
Added by nari over 2 years ago

  • gc.c (getrusage_time): uses clock_gettime() with CLOCK_PROCESS_CPUTIME_ID when available, which provides a 1ns precision on linux. [Bug #7500]

History

#1 Updated by Narihiro Nakamura over 2 years ago

  • Category set to core
  • Assignee set to Narihiro Nakamura

#2 Updated by Motohiro KOSAKI over 2 years ago

AFAIK, clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts) return time included kernel running. So, clock_gettime() and ru_utime of getrusage() aren't equivalent.

#3 Updated by Motohiro KOSAKI over 2 years ago

And, if i understand the kernel source correctly, getrusage() and get_time(CLOCK_PROCESS_CPUTIME_ID) use the same clock source. So, I doubt this patch improve time accuracy.

Do anyone have a test result?

#4 Updated by Aman Gupta over 2 years ago

tmm1@fe19:~$ uname -a
Linux fe19.rs.github.com 3.5.0-17-generic #28-Ubuntu SMP Tue Oct 9 19:31:23 UTC 2012 x86_64 GNU/Linux

tmm1@fe19:~$ cat timings.c
#include
#include
#include
#include
#include

double getrusage_time() {
struct rusage usage;
struct timeval time;
getrusage(RUSAGE_SELF, &usage);
time = usage.ru_utime;
return time.tv_sec + time.tv_usec * 1e-6;
}

double clock_time() {
struct timespec ts;

  if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts) == 0) {
      return ts.tv_sec + ts.tv_nsec * 1e-9;
  }
  return 0.0;

}

int main() {
int n;

  printf("getrusage() before: %f\n", getrusage_time());
  for (n=0; n<10000; n++) pow(2, 2048);
  printf("getrusage() after: %f\n", getrusage_time());

  printf("clock_gettime() before: %f\n", clock_time());
  for (n=0; n<10000; n++) pow(2, 2048);
  printf("clock_gettime() after: %f\n", clock_time());

}

tmm1@fe19:~$ gcc -o timings timings.c -lrt

tmm1@fe19:~$ ./timings
getrusage() before: 0.000000
getrusage() after: 0.000000
clock_gettime() before: 0.001244
clock_gettime() after: 0.001358

#5 Updated by Aman Gupta over 2 years ago

Increasing the number of iterations shows the 4ms granularity.

tmm1@fe19:~$ grep loops timings.c
int loops = 1000000;
for (n=0; n<loops; n++) pow(2, 2048);
for (n=0; n<loops; n++) pow(2, 2048);

tmm1@fe19:~$ gcc -o timings timings.c -lrt && ./timings
getrusage() before: 0.000000
getrusage() after: 0.004000
clock_gettime() before: 0.006966
clock_gettime() after: 0.010733

#6 Updated by Narihiro Nakamura over 2 years ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r38214.
Aman, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • gc.c (getrusage_time): uses clock_gettime() with CLOCK_PROCESS_CPUTIME_ID when available, which provides a 1ns precision on linux. [Bug #7500]

Also available in: Atom PDF