Bug #7500

Improve GC profiler timings on linux

Added by Aman Gupta over 1 year ago. Updated over 1 year ago.

[ruby-core:50495]
Status:Closed
Priority:Normal
Assignee:Narihiro Nakamura
Category:core
Target version:2.0.0
ruby -v:ruby 2.0.0dev (2012-12-03 trunk 38149) [x86_64-darwin12.2.0] Backport:

Description

On linux kernels, getrusage()'s precision depends on the value of HZ when the kernel was compiled. By default, HZ=250 provides a 4ms granularity.

This patch uses clockgettime() with CLOCKPROCESSCPUTIMEID when available, which provides a 1ns precision on linux.

gc_timing.patch Magnifier (586 Bytes) Aman Gupta, 12/03/2012 11:32 AM

Associated revisions

Revision 38214
Added by nari over 1 year ago

  • gc.c (getrusagetime): uses clockgettime() with CLOCKPROCESSCPUTIME_ID when available, which provides a 1ns precision on linux. [Bug #7500]

History

#1 Updated by Narihiro Nakamura over 1 year ago

  • Category set to core
  • Assignee set to Narihiro Nakamura

#2 Updated by Motohiro KOSAKI over 1 year ago

AFAIK, clockgettime(CLOCKPROCESSCPUTIMEID, &ts) return time included kernel running. So, clockgettime() and ruutime of getrusage() aren't equivalent.

#3 Updated by Motohiro KOSAKI over 1 year ago

And, if i understand the kernel source correctly, getrusage() and gettime(CLOCKPROCESSCPUTIMEID) use the same clock source. So, I doubt this patch improve time accuracy.

Do anyone have a test result?

#4 Updated by Aman Gupta over 1 year ago

tmm1@fe19:~$ uname -a
Linux fe19.rs.github.com 3.5.0-17-generic #28-Ubuntu SMP Tue Oct 9 19:31:23 UTC 2012 x86_64 GNU/Linux

tmm1@fe19:~$ cat timings.c
#include
#include
#include
#include
#include

double getrusagetime() {
struct rusage usage;
struct timeval time;
getrusage(RUSAGE
SELF, &usage);
time = usage.ruutime;
return time.tv
sec + time.tv_usec * 1e-6;
}

double clock_time() {
struct timespec ts;

  if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts) == 0) {
      return ts.tv_sec + ts.tv_nsec * 1e-9;
  }
  return 0.0;

}

int main() {
int n;

  printf("getrusage() before: %f\n", getrusage_time());
  for (n=0; n<10000; n++) pow(2, 2048);
  printf("getrusage() after: %f\n", getrusage_time());

  printf("clock_gettime() before: %f\n", clock_time());
  for (n=0; n<10000; n++) pow(2, 2048);
  printf("clock_gettime() after: %f\n", clock_time());

}

tmm1@fe19:~$ gcc -o timings timings.c -lrt

tmm1@fe19:~$ ./timings
getrusage() before: 0.000000
getrusage() after: 0.000000
clockgettime() before: 0.001244
clock
gettime() after: 0.001358

#5 Updated by Aman Gupta over 1 year ago

Increasing the number of iterations shows the 4ms granularity.

tmm1@fe19:~$ grep loops timings.c
int loops = 1000000;
for (n=0; n<loops; n++) pow(2, 2048);
for (n=0; n<loops; n++) pow(2, 2048);

tmm1@fe19:~$ gcc -o timings timings.c -lrt && ./timings
getrusage() before: 0.000000
getrusage() after: 0.004000
clockgettime() before: 0.006966
clock
gettime() after: 0.010733

#6 Updated by Narihiro Nakamura over 1 year ago

  • Status changed from Open to Closed
  • % Done changed from 0 to 100

This issue was solved with changeset r38214.
Aman, thank you for reporting this issue.
Your contribution to Ruby is greatly appreciated.
May Ruby be with you.


  • gc.c (getrusagetime): uses clockgettime() with CLOCKPROCESSCPUTIME_ID when available, which provides a 1ns precision on linux. [Bug #7500]

Also available in: Atom PDF