Bug #16787
[patch] allow Dir.home to work for non-login procs when $HOME not set
Status: closed
Description
The 'Dir.home' method in versions of Ruby 2.x through the latest (2.7.1,
released 2020-03-31) is unable to reliably locate the user's home directory
when all three of the following are true at the same time:
1. Ruby is running on a Unix-like OS
2. The $HOME environment variable is not set
3. The process is not a descendant of login(1) (or a work-alike)
When the above conditions are met, the error can be triggered simply:
$ unset HOME
$ ruby -e 'print "home is: #{Dir.home}\n";'
-e:1:in `home': couldn't find login name -- expanding `~' (ArgumentError)
from -e:1:in `<main>'
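The same check can be made from inside a single Ruby process. The sketch below is illustrative only: the helper name home_without_env is hypothetical and not part of Ruby. It removes HOME from the process environment, calls Dir.home, and reports the ArgumentError rather than aborting. Whether the call succeeds or fails depends on the Ruby version and on how the process was launched.

```ruby
# Hypothetical helper (not part of Ruby): call Dir.home with HOME removed
# from the environment and report either the directory or the error text.
def home_without_env
  saved = ENV.delete('HOME')        # returns the old value, or nil if it was unset
  begin
    "home is: #{Dir.home}"
  rescue ArgumentError => e         # the error class shown in the transcript above
    "Dir.home failed: #{e.message}"
  ensure
    ENV['HOME'] = saved if saved    # restore the caller's environment
  end
end

puts home_without_env
```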
The expectation is that Dir.home should be able to obtain the user's default
home directory regardless of whether or not the process is a (grand)child of
login(1). This behavior surfaced when running unit tests on GitHub Actions,
where the driving process did not use a login session. The unit tests failed
due to the different behavior of Dir.home in this scenario, but Dir.home ought
to behave the same either way.
The actual observed behavior is that Dir.home is able to obtain the user's
default home directory only for processes that are (grand)children of
login(1).
This behavior has been confirmed directly on (at least) the following
versions, though it is clear from browsing the code that this is long standing
behavior:
$ ruby --version
ruby 2.5.5p157 (2019-03-15 revision 67260) [x86_64-linux-gnu]
$ ruby --version
ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
$ ruby --version
ruby 2.8.0dev (2020-04-15T07:06:48Z master 69b3e0ac59) [x86_64-linux]
On a Unix-like OS, when the $HOME environment variable is not set, Ruby
attempts to obtain the user's home directory from the password database, as
one would expect. But the mechanism it uses only works for (grand)children of
login(1) (or work-alikes). In particular, it uses getlogin(3) to obtain the
username, with the intent to then obtain the user's password record (and its
'pw_dir' member) by looking it up by name (getpwnam(3)). That getlogin() call
fails, of course, because there is no logged-in user for the process.
The attached patch preserves the basic intent of the existing code, but allows
it to work in the above scenario because the lookup for the user's record in the
password database is done directly by uid (getpwuid_r(3)), which is always
available, regardless of whether or not the process was launched by a
subprocess of login(1).
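The uid-based lookup can be illustrated from Ruby itself with the standard Etc library. This is a minimal sketch of the idea, not the C code in the patch; the helper name home_dir_by_uid is hypothetical. It looks up the password-database record for the real uid of the process and returns its home directory, independent of any login session.

```ruby
require 'etc'

# Hypothetical helper: home directory from the password-database record
# for the given uid (default: the real uid of this process).
# A missing record may surface as nil or as ArgumentError; both are
# handled here and reported as nil.
def home_dir_by_uid(uid = Process.uid)
  entry = Etc.getpwuid(uid)
  entry&.dir
rescue ArgumentError
  nil
end

puts "uid #{Process.uid} -> #{home_dir_by_uid.inspect}"
```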
The patch applies cleanly against the HEAD of both 'master' and 'ruby_2_7',
and was tested against both on Debian GNU/Linux (buster/bullseye mix).
Motivation
This issue surfaced this past week in the heroku/netrc project when CI builds
were first set up for the project using the GitHub Actions service. The process
that runs the unit tests there is not a (grand)child of login(1), so it failed on
unit tests that exercise logic in that library when the $HOME environment
variable is not set (changing its value and/or unsetting it are legitimate
user activities; the tests were exercising that legitimate code path).
How to reproduce
In order to reproduce the issue you need to get some startup daemon process to
launch your ruby program; the issue will not trigger if the ruby process is a
subprocess of any process that is itself a (grand)child of a login process. The
GitHub Actions service happens to run code that way (see above issue link for an
example), but it can be simulated locally fairly easily, too, using atd(8).
A process that is not a (grand)child of login(1) will not have its 'loginuid'
attribute set, so there will be a discrepancy between the values reported by
id(1) and the never-initialized value in '/proc/self/loginuid':
$ /usr/bin/id
uid=1001(runner) gid=115(docker) groups=115(docker)
$ /usr/bin/id --user
1001
$ /usr/bin/getent passwd 1001
runner:x:1001:115:,,,:/home/runner:/bin/bash
$ cat /proc/self/loginuid
4294967295
Note that '4294967295' is the largest unsigned value that will fit in 32 bits,
so its signed-value interpretation is '-1'. A 'loginuid' attribute with that
value is an indication that it has never been set. In a typical configuration,
it would be set as a side effect of the login process by PAM (see
pam_loginuid(8)).
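The arithmetic and the check can be sketched in Ruby. The names as_signed32, read_loginuid, and UNSET_LOGINUID are hypothetical, introduced only for illustration. The sketch reinterprets the 32-bit unsigned value as signed, and reads '/proc/self/loginuid' where that file exists (it is Linux-specific).

```ruby
UNSET_LOGINUID = 0xFFFF_FFFF  # 4294967295, all 32 bits set

# Reinterpret a 32-bit unsigned value as a 32-bit signed value.
def as_signed32(n)
  [n].pack('L').unpack1('l')
end

# Returns the loginuid as an Integer, or nil where the file is missing
# or does not contain a number.
def read_loginuid(path = '/proc/self/loginuid')
  Integer(File.read(path).strip)
rescue SystemCallError, ArgumentError
  nil
end

puts as_signed32(UNSET_LOGINUID)   # prints -1

loginuid = read_loginuid
if loginuid.nil?
  puts 'loginuid not available on this system'
elsif loginuid == UNSET_LOGINUID
  puts 'loginuid was never set for this process'
else
  puts "loginuid=#{loginuid}, uid=#{Process.uid}"
end
```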
The out-of-the-box 'atd(8)' configuration on Debian is also configured to have
PAM account for the 'loginuid' attribute, but for the purpose of testing the
fix for this issue, it can be easily disabled by editing the '/etc/pam.d/atd'
file. Find the line that looks like this:
session required pam_loginuid.so
and comment it out so it looks like this:
#session required pam_loginuid.so
That change will take effect as soon as you save the file; there is no need to
restart any services or anything like that.
To test the before and after behaviors, I simply ran a pristine and a patched
version of the code side-by-side, indirectly via at(1).
$ cat /tmp/algo-doit2
#!/bin/bash -
set -x
my_log_fpath='/tmp/algo-doit2.log'
#RUBY_UNPATCHED='/usr/bin/ruby2.5'
RUBY_UNPATCHED='/tmp/aljunk-ruby-from-git/bin/ruby'
#RUBY_PATCHED='/tmp/aljunk-ruby-from-git-patched/bin/ruby'
RUBY_PATCHED='/tmp/aljunk-ruby-from-git-patched-master/bin/ruby'
(
set -x
/usr/bin/id
/usr/bin/id --user
printf '%s\n' $(cat /proc/self/loginuid)
: DEBUG 1 unpatched: good
"${RUBY_UNPATCHED}" -e 'print "home is: #{Dir.home}\n";'
:
: DEBUG 2 unpatched: now bad
unset HOME
"${RUBY_UNPATCHED}" -e 'print "home is: #{Dir.home}\n";'
) 1>> "${my_log_fpath}" 2>&1
(
set -x
/usr/bin/id
/usr/bin/id --user
printf '%s\n' $(cat /proc/self/loginuid)
: DEBUG 3 patched: good
"${RUBY_PATCHED}" -e 'print "home is: #{Dir.home}\n";'
:
: DEBUG 4 patched: still good
unset HOME
"${RUBY_PATCHED}" -e 'print "home is: #{Dir.home}\n";'
) 1>> "${my_log_fpath}" 2>&1
For best results, run 'tail -F' on the output log in the background in your
shell:
$ tail -F /tmp/algo-doit2.log &
With that setup, now each time you run the at(1) command you'll see the
output (from the log file) right away:
$ at now < /tmp/algo-doit2
warning: commands will be executed using /bin/sh
job 17 at Wed Apr 15 06:04:00 2020
+ set -x
+ /usr/bin/id
uid=1000(someuser) gid=1000(someuser) groups=1000(someuser)
+ /usr/bin/id --user
1000
+ cat /proc/self/loginuid
+ printf %s\n 4294967295
4294967295
+ : DEBUG 1 unpatched: good
+ /tmp/aljunk-ruby-from-git/bin/ruby -e print "home is: #{Dir.home}\n";
home is: /home/someuser
+ :
+ : DEBUG 2 unpatched: now bad
+ unset HOME
+ /tmp/aljunk-ruby-from-git/bin/ruby -e print "home is: #{Dir.home}\n";
-e:1:in `home': couldn't find login name -- expanding `~' (ArgumentError)
from -e:1:in `<main>'
+ set -x
+ /usr/bin/id
uid=1000(someuser) gid=1000(someuser) groups=1000(someuser)
+ /usr/bin/id --user
1000
+ cat /proc/self/loginuid
+ printf %s\n 4294967295
4294967295
+ : DEBUG 3 patched: good
+ /tmp/aljunk-ruby-from-git-patched-master/bin/ruby -e print "home is: #{Dir.home}\n";
home is: /home/someuser
+ :
+ : DEBUG 4 patched: still good
+ unset HOME
+ /tmp/aljunk-ruby-from-git-patched-master/bin/ruby -e print "home is: #{Dir.home}\n";
home is: /home/someuser
After testing, be sure to restore your atd(8) PAM configuration.
Legal
I agree that the code in the attached patch may be distributed and/or modified
under Ruby's License.
Related Bugs
Bug #12226 seems as if it might be related "in spirit", but that bug is
specific to MS Windows, and the current issue (and patch) is specific to
Unix-like systems.
"Dir.home with valid named user raises ArgumentError on Windows"
https://bugs.ruby-lang.org/issues/12226
Updated by salewski (Alan Salewski) over 4 years ago
I created a pull request for this over on GitHub:
https://github.com/ruby/ruby/pull/3034
The automated integration tests there have some complaints. I'll see what I can do about getting those fixed up, and will report back.
Updated by nobu (Nobuyoshi Nakada) over 4 years ago
The reason to prefer getpwnam over getpwuid is that some login names who have different home directories can share the same user id.
I think it looks good as the next fallback when getlogin and/or getpwnam fail.
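Whether a given system actually has several login names sharing one uid can be checked with a short Ruby sketch built on the standard Etc library. The helper names and the sample logins below are hypothetical. The grouping is a pure function over [name, uid] pairs, so it can be tried on made-up data as well as on the local password database.

```ruby
require 'etc'

# Group login names by uid and keep only uids with more than one name.
def names_by_shared_uid(pairs)
  pairs.group_by { |_name, uid| uid }
       .transform_values { |group| group.map(&:first) }
       .select { |_uid, names| names.size > 1 }
end

# Collect [name, uid] pairs from the local password database.
def passwd_pairs
  pairs = []
  Etc.passwd { |entry| pairs << [entry.name, entry.uid] }
  pairs
end

# Illustrative data only: two made-up logins sharing uid 1000.
sample = [['alice', 1000], ['alice-admin', 1000], ['bob', 1001]]
p names_by_shared_uid(sample)        # uid 1000 maps to both alice and alice-admin
p names_by_shared_uid(passwd_pairs)  # empty when no uid is shared locally
```

With such a pair of records, a lookup by uid can only ever return one of the two home directories, which is why the lookup by name is tried first.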
Updated by salewski (Alan Salewski) over 4 years ago
nobu (Nobuyoshi Nakada) wrote in #note-2:
The reason to prefer getpwnam over getpwuid is that some login names who have different home directories can share the same user id.
I think it looks good as the next fallback when getlogin and/or getpwnam fail.
Thanks; I agree. I hadn't considered that possibility. I responded, too, over on GitHub; I'll rework the patch to try name-based lookup first.
Updated by salewski (Alan Salewski) over 4 years ago
- File allow-dir.home-for-non-login-procs-v2.patch added
- ruby -v changed from ruby 2.8.0dev (2020-04-15T07:06:48Z master 69b3e0ac59) [x86_64-linux] to ruby 2.8.0dev (2020-04-15T19:21:47Z ads/b.r-l.o-issue-.. f52915422d) [x86_64-linux]
Just attaching the "v2" version of the patch, which adds functionality to fall back on using getpwuid() when getpwuid_r() is not available at compile time. This patch is already obsolete, as we are discussing doing name-based lookups, and then falling back to uid-based lookups only if the name-based lookups fail. Just recording the patch as a stepping stone on the journey.
Updated by salewski (Alan Salewski) over 4 years ago
Just attaching the "v3" version of the patch. This one is /not/ a commit candidate; needs some beautification.
The functionality is there to start with the password record lookup by username, and only if that fails to then fall back on the lookup by uid. It is maximally portable in the sense that it will use any combination of the getlogin_r(), getlogin(), getpwnam_r(), getpwnam(), getpwuid_r(), and getpwuid() functions that are available, with the compile-time preference for the *_r() variations. But it is a big step back in terms of readability, as I turned it into a sea of cpp conditionals. Just noting it all here to show a pulse.
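The lookup order described above can be sketched in Ruby with the standard Etc library. This is an illustration of the ordering only, not the C code in the patch, and all three helper names are hypothetical. Note also that Etc.getlogin is only an approximation of getlogin(3), since it may fall back to environment variables.

```ruby
require 'etc'

# Home directory from the record for `name`, or nil when there is no
# usable name or no matching record.
def home_dir_by_name(name)
  return nil if name.nil? || name.empty?
  Etc.getpwnam(name)&.dir
rescue ArgumentError        # raised by Etc.getpwnam for an unknown name
  nil
end

# Home directory from the record for `uid`, or nil when there is none.
def home_dir_by_uid(uid)
  Etc.getpwuid(uid)&.dir
rescue ArgumentError
  nil
end

# Name-based lookup first; uid-based lookup only as a fallback.
def default_home_dir(login: Etc.getlogin, uid: Process.uid)
  home_dir_by_name(login) || home_dir_by_uid(uid)
end

puts default_home_dir.inspect
# Simulate a process with no login name: only the uid lookup can succeed.
puts default_home_dir(login: nil).inspect
```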
Unless the build infra turns up something that I need to fix, my next step will be to break apart the changes to suppress the ugly bits.
Updated by salewski (Alan Salewski) over 4 years ago
- File allow-dir.home-for-non-login-procs-v4.patch added
- ruby -v changed from ruby 2.8.0dev (2020-04-15T19:21:47Z ads/b.r-l.o-issue-.. f52915422d) [x86_64-linux] to ruby 2.8.0dev (2020-04-15T20:23:24Z ads/b.r-l.o-issue-.. 1e64386fac) [x86_64-linux]
Slightly cleaned-up patch: v4
[still just a wip; please ignore]
Updated by salewski (Alan Salewski) over 4 years ago
Attaching the "v5" version of the patch. This version corresponds with the changes I just pushed on my ads/b.r-l.o-issue-16787 branch for PR 3034:
This version preserves the capability of the v4 patch, but breaks the functionality out into three new internal helper functions in 'file.c':
VALUE rb_getlogin(void);
/* read as: "get pwd db home dir by username for login" */
VALUE rb_getpwdirnam_for_login(void);
/* read as: "get pwd db home dir by uid" */
VALUE rb_getpwdiruid(void);
This change gets rb_default_home_dir(...) back to being readable.
I've tested these changes with all the various combinations of the six getlogin*(), getpwnam*(), and getpwuid*() functions.
Updated by shyouhei (Shyouhei Urabe) over 4 years ago
- Related to Feature #12695: File.expand_path should resolve ~/ using /etc/passwd when HOME is not set added
Updated by salewski (Alan Salewski) over 4 years ago
Attaching the "v5" version of the patch (for real this time).
Updated by nobu (Nobuyoshi Nakada) over 4 years ago
Thank you, I left some comments at the PR for the details.
IMHO, the new functions may fit more in process.c.
Updated by salewski (Alan Salewski) over 4 years ago
nobu (Nobuyoshi Nakada) wrote in #note-10:
Thank you, I left some comments at the PR for the details.
IMHO, the new functions may fit more in process.c.
Thanks; I'm taking a look now.
Updated by salewski (Alan Salewski) over 4 years ago
- File allow-dir.home-for-non-login-procs-v6.patch added
- ruby -v changed from ruby 2.8.0dev (2020-04-15T20:23:24Z ads/b.r-l.o-issue-.. 1e64386fac) [x86_64-linux] to ruby 2.8.0dev (2020-04-23T10:11:21Z ads/b.r-l.o-issue-.. 5369b67fc8) [x86_64-linux]
Attaching version "v6" of the patch, which is just another WIP milestone.
Please do not spend too much time with this version -- I'm mainly just documenting where I am currently, because I haven't had much time to look at this the last few days.
With one big exception, this variation incorporates most of the feedback provided by @nobu (Nobuyoshi Nakada) on 2020-04-17, including:
- The original error message of rb_default_home_dir(...) is retained, for backward compatibility.
- All four functions touched now end with a return statement in every variation (to keep compilers happy).
- Corrected reference to ENOENT used without a value comparison.
- rb_getlogin(...) now returns the reference to the RString already created, as opposed to unnecessarily creating a new instance from its string content pointer.
- Got rid of excessive #error blocks in rb_getlogin(), rb_getpwdirnam_for_login(), and rb_getpwdiruid().
One piece of feedback was that the core of the new functions might more properly live in process.c. I agree, but am leaving that for the next milestone for two reasons:
- I've not yet sized up the work; it will be cleaner if such a change is done in isolation from the changes in the current patch revision; and
- I suspect that work might lead to changes that cannot be applied cleanly by a single patch on both the 'master' and 'ruby_2_7' branches. If that is the case, it might make sense to keep a series of two patches -- one similar to the current more minimal changes that can be easily cherry-picked for 'ruby_2_7', and a second more intrusive patch that organizes the code in a way more suitable for long term maintainability.
As with prior versions, I've tested these changes with all the various combinations of the six getlogin*(), getpwnam*(), and getpwuid*() functions. I have also pushed this change on my ads/b.r-l.o-issue-16787 branch over on GitHub, mainly to see how the CI machinery likes it.
I'll take a look at moving most of the functionality into process.c next; might be a few days...
Updated by salewski (Alan Salewski) over 4 years ago
- ruby -v changed from ruby 2.8.0dev (2020-04-23T10:11:21Z ads/b.r-l.o-issue-.. 5369b67fc8) [x86_64-linux] to ruby 2.8.0dev (2020-04-24T10:02:57Z ads/b.r-l.o-issue-.. 66fa7717ab) [x86_64-linux]
- File allow-dir.home-for-non-login-procs-v7.patch added
Attaching version "v7" of the patch. This version corresponds with the changes I just pushed on my ads/b.r-l.o-issue-16787 branch for PR 3034:
- https://github.com/salewski/ruby/commit/66fa7717ab8c2d37042866cddf3fcf38d0095f99
- https://github.com/ruby/ruby/commit/66fa7717ab8c2d37042866cddf3fcf38d0095f99
This one is a candidate for further review and/or merging.
This one builds on the earlier version of the patch, and moves the new pwd.h related functions from file.c to process.c.
This patch has been generated off a branch that was rebased on top of the 'master' branch within the last 30 minutes.
@nobu (Nobuyoshi Nakada): This variation incorporates the final outstanding recommendation of the feedback you provided (thanks for that!) on 2020-04-17[0][1], which was to look into moving the new functions into process.c rather than putting them in file.c.
It turns out that file.c was already including functionality from process.c, so this change does not add a new dependency between the files.
It does widen the visibility of the three new functions, as they are now declared in the internal/process.h header:
VALUE rb_getlogin(void);
VALUE rb_getpwdirnam_for_login(VALUE login);
VALUE rb_getpwdiruid(void);
Note that I changed the signature of the new rb_getpwdirnam_for_login(...) to accept the login name as a parameter. The reason is that the single calling location from rb_default_home_dir(...) in file.c has historical behavior of raising an exception with the message:
"couldn't find login name -- expanding `~'"
While it is a corner case, there is one scenario in which it would still be more appropriate for the code to emit that message than the newly introduced message that mentions the uid: if the attempt to find the login name failed (either because the system doesn't have getlogin_r() or getlogin(), or because the process is not a descendant of login) AND the system (for whatever reason) has pwd.h but does not have either getpwuid_r() or getpwuid(). So yeah, a corner case -- but theoretically possible.
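The choice between the two messages can be written down as a small decision function. This Ruby sketch is an illustration of the reasoning only, not the C code in the patch: the helper name and its parameters are hypothetical, and the uid message text is placeholder wording, since the actual text of the new message is not quoted in this thread.

```ruby
# Historical message, as quoted in this issue.
LOGIN_NAME_ERROR = "couldn't find login name -- expanding `~'"

# Hypothetical helper: choose the error text once every lookup has failed.
#   login                - login name found earlier, or nil if none was found
#   uid                  - real uid of the process
#   uid_lookup_available - false when neither getpwuid_r() nor getpwuid() exists
def home_lookup_error(login:, uid:, uid_lookup_available:)
  if login.nil? && !uid_lookup_available
    # No login name and no way to look up by uid: no uid lookup was
    # attempted, so the historical message is the accurate one.
    LOGIN_NAME_ERROR
  else
    # Placeholder wording (assumption), standing in for the new message.
    "couldn't find home directory for uid #{uid}"
  end
end

puts home_lookup_error(login: nil, uid: 1000, uid_lookup_available: false)
puts home_lookup_error(login: nil, uid: 1000, uid_lookup_available: true)
```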
Note that this change cannot be cleanly applied to the ruby_2_7 branch because the internal/process.h file does not exist on that branch (looks like it was introduced more recently). However, the three function definitions can easily be lifted out of internal/process.h from the branch and added to ruby.h (next to the other process.c functions, such as rb_last_status_clear(...)), so it wouldn't be too much work to cherry-pick it with minor modifications.
As with prior versions, I've tested these changes with all the various combinations of the six getlogin*(), getpwnam*(), and getpwuid*() functions. I have also tested it with faked-up scenarios of getlogin_r() and getlogin() returning NULL to verify the backward compat code path mentioned above.
[0] https://bugs.ruby-lang.org/issues/16787#note-10
[1] https://github.com/ruby/ruby/pull/3034#pullrequestreview-395521033
Updated by salewski (Alan Salewski) over 4 years ago
- File allow-dir.home-for-non-login-procs-v7-rebased-2020-05-14.patch added
- ruby -v changed from ruby 2.8.0dev (2020-04-24T10:02:57Z ads/b.r-l.o-issue-.. 66fa7717ab) [x86_64-linux] to ruby 2.8.0dev (2020-05-14T10:58:44Z master d7d0d01401) [x86_64-linux]
Attaching version "v7-rebased-2020-05-14" of the patch. This version corresponds to the rebase-only changes pushed to my ads/b.r-l.o-issue-16787 branch for PR 3034:
- https://github.com/ruby/ruby/commit/d7cf3c96b8a677bb93403fa0525d13e7f8ff7c4e
- https://github.com/salewski/ruby/commit/d7cf3c96b8a677bb93403fa0525d13e7f8ff7c4e
This one is a candidate for further review and/or merging.
There are not any code changes with this patch; it is just a refreshed version of the earlier "v7" patch, rebased on top of the current changes from the 'master' branch as they looked earlier today (2020-05-14).
Updated by salewski (Alan Salewski) over 4 years ago
- Status changed from Open to Closed
Applied in changeset git|c15cddd1d515c5bd8dfe8fb2725e3f723aec63b8.
Allow Dir.home to work for non-login procs when $HOME not set
Allow the 'Dir.home' method to reliably locate the user's home directory when
all three of the following are true at the same time:
1. Ruby is running on a Unix-like OS
2. The $HOME environment variable is not set
3. The process is not a descendant of login(1) (or a work-alike)
The prior behavior was that the lookup could only work for login-descended
processes.
This is accomplished by looking up the user's record in the password database
by uid (getpwuid_r(3)) as a fallback to the lookup by name (getpwnam_r(3))
which is still attempted first (based on the name, if any, returned by
getlogin_r(3)).
If getlogin_r(3), getpwnam_r(3), and/or getpwuid_r(3) is not available at
compile time, will fall back on using their respective non-*_r() variants:
getlogin(3), getpwnam(3), and/or getpwuid(3).
The rationale for attempting to do the lookup by name prior to doing it by uid
is to accommodate the possibility of multiple login names (each with its own
record in the password database, so each with a potentially different home
directory) being mapped to the same uid (as is explicitly allowed for by
POSIX; see getlogin(3posix)).
Preserves the existing behavior for login-descended processes, and adds the
new capability of having Dir.home being able to find the user's home directory
for non-login-descended processes.
Fixes [Bug #16787]
Related discussion:
https://bugs.ruby-lang.org/issues/16787
https://github.com/ruby/ruby/pull/3034
Updated by jeremyevans0 (Jeremy Evans) over 4 years ago
- Backport changed from 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN to 2.5: UNKNOWN, 2.6: REQUIRED, 2.7: REQUIRED
Updated by salewski (Alan Salewski) over 4 years ago
- File ruby-2.6-backport-allow-dir.home-for-non-login-procs.patch added
- File ruby-2.7-backport-allow-dir.home-for-non-login-procs.patch added
Attaching two separate backport patches, one for branch 'ruby_2_6' and one for branch 'ruby_2_7'.
The changes for each patch were tested separately as outlined in the original issue description, and separate pull requests have been created for each over on GitHub.
PR for the 'ruby_2_6' change:
PR for the 'ruby_2_7' change:
There are failures in the automated integration tests on GitHub, but they seem unrelated to the specific changes introduced by these PRs.
Updated by salewski (Alan Salewski) over 4 years ago
Just noting that I rebased (and re-tested) the Ruby 2.7 backport PR (PR 3293) on top of the latest changes in the 'ruby_2_7' branch. The rebasing did not result in any material changes to the 2.7 backport patch that had been previously attached to this issue, though, so that did not need updating here.
The Ruby 2.6 backport PR (PR 3292) did not need rebasing.
Updated by nagachika (Tomoyuki Chikanaga) almost 4 years ago
- Backport changed from 2.5: UNKNOWN, 2.6: REQUIRED, 2.7: REQUIRED to 2.5: UNKNOWN, 2.6: REQUIRED, 2.7: DONE
ruby_2_7 ef1ed1b53afdff80cb217d77f3fbcbe7906c729e merged revision(s) c15cddd1d515c5bd8dfe8fb2725e3f723aec63b8.
Updated by usa (Usaku NAKAMURA) over 3 years ago
- Backport changed from 2.5: UNKNOWN, 2.6: REQUIRED, 2.7: DONE to 2.5: UNKNOWN, 2.6: DONE, 2.7: DONE
backported into ruby_2_6 at r67931