Feature #10090

closed

Display of program name in process listing under AIX

Added by nichogl (Geoff Nichols) over 9 years ago. Updated about 9 years ago.

Status:
Third Party's Issue
Target version:
-
[ruby-core:63998]

Description

On AIX, the process listing (displayed with the ps command) for a program using Ruby 2.1.2 (or Ruby 1.9.3) shows only the Ruby interpreter path.

However, on other platforms (Linux, OS X), the process listing (for the same Ruby program) shows the Ruby interpreter path as well as the program name.

The requested default behavior is for the process listing to display the Ruby interpreter path as well as the program name on AIX.

Here's an example of the current behavior (on AIX 7.1):

# /tmp/test_script.rb &
[1] 10420428

# ps -ef | grep 10420428 | grep -v grep
    root 10420428  7799016   0 05:35:10  pts/0  0:00 /usr/bin/ruby

# /usr/bin/ruby -v
ruby 2.1.2p95 (2014-05-08 revision 45877) [powerpc-aix7.1.0.0]

Here's an example of the desired behavior (on CentOS 6.5):

# /tmp/test_script.rb &
[1] 4951

# ps -ef | grep 4951 | grep -v grep
root      4951  4244  0 12:22 pts/1    00:00:00 /usr/bin/ruby /tmp/test_script.rb

# /usr/bin/ruby -v
ruby 2.1.2p95 (2014-05-08 revision 45877) [i686-linux]

Here is the test script used on both platforms:

#!/usr/bin/ruby

loop do
  sleep(1)
end

Updated by nobu (Nobuyoshi Nakada) over 9 years ago

  • Status changed from Open to Feedback

On AIX, SPT_REUSEARGV should be used, the same as on Linux.
How is SPT_TYPE defined in your config.h file?

Updated by nichogl (Geoff Nichols) over 9 years ago

I'm using the config.h created by executing the unmodified configure script from the source tarball.

./configure --disable-install-rdoc

Here's the definition from the resulting config.h:

#define SPT_TYPE SPT_REUSEARGV

Updated by moses (Moses Mendoza) over 9 years ago

+1 on fixing this.

Is it possible AIX is missing some headers supplied on other systems?

Updated by mckern (Ryan McKern) over 9 years ago

+1 for fixing, or at least for some communication about areas we can investigate to help resolve or remediate this.

Updated by nobu (Nobuyoshi Nakada) over 9 years ago

  • Status changed from Feedback to Third Party's Issue

Your code doesn't set $0, so Ruby does nothing here, and there is nothing it can do.
The behavior of ps is part of the OS.
Complain to the vendor, IBM.

Updated by mckern (Ryan McKern) over 9 years ago

If we run this example code under Ruby 1.8.7 on AIX, we're able to see the entire command line that the process was called with (in the same manner as our CentOS 6.5 example). When we run this example code on Ruby 1.9.3 or Ruby 2.1.2, we see the truncated command line. I do not believe this to be a result of setting $0, as there's no need to set that on any other OS or an older version of the Ruby interpreter.
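For readers unfamiliar with the $0 discussion above, here is a minimal sketch of a script setting its own process title. The helper name and the title string are illustrative assumptions, not something from the original report, and whether AIX's ps then displays the new title was not verified in this thread.

```ruby
# Hypothetical helper (not from the report): assign a new process title
# through $0 and return what Ruby now reports as $0.
def set_title(title)
  $0 = title   # assigning $0 asks the interpreter to rewrite the process title
  $0
end

puts set_title("test_script: idle")
```

This is only a workaround on the script side; the complaint in this issue is about the default title when the script never touches $0.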

Updated by mckern (Ryan McKern) over 9 years ago

Has there been any movement on this? It's clearly a regression when compared to the 1.8 series, and it directly corresponds to the rewritten proctitle code introduced in 1.9.

Updated by mckern (Ryan McKern) over 9 years ago

This is absolutely not a 3rd party issue if the setting of the process name worked as expected on AIX under Ruby 1.8.7 (and it does) and no longer works as expected on 1.9.0 and beyond. I don't understand why this has been moved to "vendor issue".

Updated by mckern (Ryan McKern) over 9 years ago

I wanted to update this with our findings. This is definitely a result of the functionality in missing/setproctitle.c, which was imported from OpenSSH starting in the 1.9.x branch. In version 1.9.3p547, on line 118, there's the line:

argv[1] = NULL;

This results in the following behavior with a variation of our infinite loop test script on AIX:

[0] [AIX] root@pe-aix-71-agent:~/ruby-oob/ruby-1.9.3-p547 # /opt/ruby-1.9.3-p547/bin/ruby ../tests/test_proctitle.rb 
proctitle is ../tests/test_proctitle.rb
pid is 5111980
[0] [AIX] root@pe-aix-71-agent:~ # ps auxww | grep 5111980
root      5111980  0.0  1.0 4248 5428  pts/0 A    00:12:47  0:00 /opt/ruby-1.9.3-p547/bin/ruby

Changing this value to initialize a different index in argv results in predictable behavior, where the proctitle is truncated according to where the NULL is encountered in argv:

argv[3] = NULL;
[0] [AIX] root@pe-aix-71-agent:~/ruby-oob/ruby-1.9.3-p547 # /opt/ruby-1.9.3-p547/bin/ruby -r rubygems ../tests/test_proctitle.rb 
proctitle is ../tests/test_proctitle.rb
pid is 5111994
root      5111994  0.0  0.0 4256 3740  pts/0 A    00:18:49  0:00 /opt/ruby-1.9.3-p547/bin/ruby -r rubygems 
[0] [AIX] root@pe-aix-71-agent:~ #

And commenting out or removing this assignment entirely results in behavior that appears to correspond with both expectations and with Ruby 1.8.7:

// argv[1] = NULL;
[0] [AIX] root@pe-aix-71-agent:~/ruby-oob/ruby-1.9.3-p547 # /opt/ruby-1.9.3-p547/bin/ruby -r rubygems ../tests/test_proctitle.rb 
proctitle is ../tests/test_proctitle.rb
pid is 5112008
root      5112008  0.0  0.0 4256 3836  pts/0 A    00:29:42  0:00 /opt/ruby-1.9.3-p547/bin/ruby -r rubygems ../tests/test_proctitle.rb 
[0] [AIX] root@pe-aix-71-agent:~ # 

This appears pretty clearly to be a bug in the code and not a shortcoming of the platform.
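The test_proctitle.rb script itself was not posted. Judging only from the output lines shown above, it may have looked something like the following hypothetical reconstruction; the method name and the "--loop" flag are assumptions added for illustration.

```ruby
# Hypothetical reconstruction of ../tests/test_proctitle.rb; the real script
# is not included in this thread.
def proctitle_report(title = $0, pid = Process.pid)
  ["proctitle is #{title}", "pid is #{pid}"]
end

puts proctitle_report

# Pass "--loop" (an illustrative flag, not from the thread) to keep the
# process alive so that ps can be run against it from another shell.
loop { sleep(1) } if ARGV.include?("--loop")
```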

Updated by ReiOdaira (Rei Odaira) over 9 years ago

I would like to fix this problem. It seems there is no reason this line is necessary (missing/setproctitle.c:compat_init_setproctitle), as Ryan pointed out:

	argv[1] = NULL;

Does anyone know why it is needed?

It is interesting that Linux's ps does not inspect the argv array, while AIX's ps does.
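A toy model may help explain the difference. It assumes, consistent with the observations in this thread but not confirmed against either ps implementation, that one style of ps prints the contiguous argument area while the other walks the argv pointer array until the first NULL. The method names are invented for the sketch.

```ruby
# Toy model only: real ps implementations read process or kernel memory.
# region_style: print the contiguous argument area (the Linux-like behavior
#               observed in this thread).
# vector_style: walk argv until the first NULL (the AIX-like behavior
#               observed in this thread).
def region_style(arg_area)
  arg_area.split("\0").join(" ")
end

def vector_style(argv)
  argv.take_while { |arg| !arg.nil? }.join(" ")
end

args     = ["/usr/bin/ruby", "/tmp/test_script.rb"]
arg_area = args.join("\0") + "\0"   # argument strings laid out back to back
argv     = args.dup                 # stands in for the pointer array

argv[1] = nil                       # models `argv[1] = NULL;`

puts region_style(arg_area)  # still the full command line
puts vector_style(argv)      # interpreter path only
```

Under this model, clearing argv[1] changes nothing for a region-style reader, because the string bytes are untouched, but it truncates the listing for a vector-style reader.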

Updated by ReiOdaira (Rei Odaira) over 9 years ago

Ah, I think I mostly understand now. After Process.setproctitle is called, argv[1], argv[2], etc. are no longer valid, so we must set argv[1] to NULL.

However, they should remain valid until the first call to Process.setproctitle. I think the following patch will work. If there are no concerns, I will commit it.

--- missing/setproctitle.c      (revision 47835)
+++ missing/setproctitle.c      (working copy)
@@ -74,6 +74,7 @@
 static char *argv_start = NULL;
 static size_t argv_env_len = 0;
 static size_t argv_len = 0;
+static char **argv1_addr = NULL;
 #endif
 
 #endif /* HAVE_SETPROCTITLE */
@@ -119,7 +120,9 @@
                        lastenvp = envp[i] + strlen(envp[i]);
        }
 
-       argv[1] = NULL;
+       /* We keep argv[1], argv[2], etc. at this moment,
+          because the ps command of AIX refers to them. */
+       argv1_addr = &argv[1];
        argv_start = argv[0];
        argv_len = lastargv - argv[0];
        argv_env_len = lastenvp - argv[0];
@@ -162,6 +165,8 @@
        argvlen = len > argv_len ? argv_env_len : argv_len;
        for(; len < argvlen; len++)
                argv_start[len] = SPT_PADCHAR;
+       /* argv[1], argv[2], etc. are no longer valid. */
+       argv1_addr = NULL;
 #endif
 
 #endif /* SPT_NONE */

Updated by mckern (Ryan McKern) over 9 years ago

Rei,
thank you for that patch! I will test it on AIX today.

Updated by mckern (Ryan McKern) over 9 years ago

Rei,
I can confirm that this patch works as expected and that ps auxww now reports all of argv, as expected!

Thank you for taking a look at this!

Ryan

Updated by ReiOdaira (Rei Odaira) over 9 years ago

Ryan,

The previous patch was slightly (but critically) wrong, so please apply r47852.

Updated by nichogl (Geoff Nichols) about 9 years ago

Rei,
Thank you for your work on this. The r47852 patch appears to resolve the issue.
