Project

General

Profile

Actions

Feature #17883

closed

Load bundler/setup earlier to make `bundle exec ruby -r` respect Gemfile

Added by mame (Yusuke Endoh) 5 months ago. Updated 5 months ago.

Status:
Rejected
Priority:
Normal
Assignee:
-
Target version:
-
[ruby-core:104004]

Description

To reproduce the issue, prepare a Gemfile and run bundle install --path=vendor/bundle.

$ cat Gemfile
source "https://rubygems.org"
gem "activesupport"

$ bundle install --path=vendor/bundle

Kernel#require respects the Gemfile correctly.

$ bundle exec ruby -e 'require "active_support"'

However, bundle exec ruby -ractive_support -e '' does not.

$ bundle exec ruby -ractive_support -e ''
<internal:/home/mame/work/ruby/local/lib/ruby/3.1.0/rubygems/core_ext/kernel_require.rb>:85:in `require': cannot load such file -- active_support (LoadError)
        from <internal:/home/mame/work/ruby/local/lib/ruby/3.1.0/rubygems/core_ext/kernel_require.rb>:85:in `require'

We can work around the issue by explicitly passing -rbundler/setup before -ractive_support, but this is very confusing to me. The same issue was discussed in StackOverflow: https://stackoverflow.com/questions/59623068/correct-way-to-combine-bundle-exec-and-ruby-r


Here is my analysis. bundle exec sets RUBYOPT=-rbundler/setup which replaces Kernel#require with bundler's own definition. -e 'require "active_support"' correctly triggers bundler's definition. However, -ractive_support is evaluated before RUBYOPT=-rbundler/setup is evaluated, so it triggers rubygems' require definition which does not know vendor/bundle directory.

This is caused by the interpretation order of RUBYOPT and command-line arguments.

$ RUBYOPT=-r./a ruby -r./b -e ''
:b
:a

For compatibility, I don't think that changing the order is a good idea. IMO, it would be good for ruby interpreter to provide Bundler something special, because Bundler is now bundled with Ruby.

My naive idea is to make the interpreter load ENV["BUNDLE_BIN_PATH"] + "../../lib/bundler/setup" before any other -r options if BUNDLE_BIN_PATH is defined. Or another new dedicated environment variable that bundle exec sets may work (for example, RUBY_BUNDLER_SETUP or something).

deivid (David Rodríguez) nobu (Nobuyoshi Nakada) What do you think?

Updated by Eregon (Benoit Daloze) 5 months ago

Personally I find the order confusing, in many/most command-line executables, direct arguments have precedence over environment variables.

For -W, direct arguments actually have precedence, as expected (command line args should always be able to override env vars):

$ RUBYOPT=-W2 ruby -W0 -e 'p $VERBOSE'
nil
$ RUBYOPT=-W0 ruby -W2 -e 'p $VERBOSE'
true

and for -I too, i.e., RUBYOPT is processed first:

$ RUBYOPT=-Ia ruby -Ib -e 'puts $:'
/home/eregon/b
/home/eregon/a
/home/eregon/.rubies/ruby-3.0.1/lib/ruby/site_ruby/3.0.0
...

But for -r, the order is reversed, as you showed:

RUBYOPT=-r./a ruby -r./b -e ''
:b
:a

I think that's a much bigger source of confusion than this special case of -r + bundle exec, and IMHO what should be changed.
It's also inconsistent and very hard to understand, since the order seems to change per flag.

Updated by jeremyevans0 (Jeremy Evans) 5 months ago

Instead of adding BUNDLE_BIN_PATH, which would be specific to bundler, I would prefer to instead add something more generic such as RUBYEARLYOPT or RUBYPREOPT, which is just like RUBYOPT but processed before command line options.

Updated by mame (Yusuke Endoh) 5 months ago

I'm clearly against changing the order because of compatibility. Also, the current order is somewhat reasonable. In principle, explicit command-line arguments are "stronger" than RUBYOPT. For RUBYOPT=-W2 ruby -W0, -W0 wins; -W2 is just ignored. For RUBYOPT=-ra ruby -rb, -rb wins; b is loaded earlier than a; in theory, b should be able to cancel a.

IMO, it seems like overkill to introduce a generic option like RUBYEARLYOPT. I want to allow only bundle exec to use the feature. But I think it is much more acceptable than changing the order.

FYI: BUNDLE_BIN_PATH itself is not new. At the present time, bundle exec adds the environment variable.

$ ruby -e 'p ENV["BUNDLE_BIN_PATH"]'
nil

$ bundle exec ruby -e 'p ENV["BUNDLE_BIN_PATH"]'
"/home/mame/local/lib/ruby/gems/3.0.0/gems/bundler-2.2.15/exe/bundle"

Updated by deivid (David Rodríguez) 5 months ago

Hei!

I see how this is inconsistent, and if it was to be implemented from scratch, I think it would make sense to always process options in RUBYOPT first so that CLI args can always override them "Override" can have multiple meanings here like be able to set the Warn level (-W), get the most priority in the LOAD_PATH (-I), or get to run a ruby file last so it can make modifications on ruby code loaded before (-r). However, I do understand the backwards compatibility concern.

In any case, this is a known issue in bundler we got reported upstream a while ago: https://github.com/rubygems/rubygems/issues/4025. I opened a PR to fix it by special casing bundle exec ruby and adding -rbundler/setup before any other CLI args: https://github.com/rubygems/rubygems/pull/4616.

Updated by mame (Yusuke Endoh) 5 months ago

deivid (David Rodríguez) wrote in #note-4:

special casing bundle exec ruby and adding -rbundler/setup before any other CLI args: https://github.com/rubygems/rubygems/pull/4616.

deivid (David Rodríguez) Thanks for your work, but it looks too ad-hoc and incomplete. It does not work well via a shell script.

$ cat test.sh
#!/bin/sh

ruby -ractive_support -e 'p ActiveSupport::VERSION::STRING'

$ bundle exec ./test.sh
<internal:/home/mame/local/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:85:in `require': cannot load such file -- active_support (LoadError)
        from <internal:/home/mame/local/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:85:in `require'

BTW, my original intention was to add the special handling in ruby core, but it may be implemented in rubygems' custom definition of Kernel#require.

Updated by deivid (David Rodríguez) 5 months ago

Indeed mame (Yusuke Endoh), good catch. It sounds like this can't really be fixed in general inside bundler.

In my opinion, making handling of RUBYOPT consistent as suggested by Eregon (Benoit Daloze) would be the best way to fix this. I can see it could cause issues in some cases, but they should be pretty rare because it needs users to be using -r both in RUBYOPT and as CLI args at the same time, which I believe is not very common. And not only that, it needs to be some usage that doesn't hit this particular issue. So in my opinion, while maybe not worth fixing on a patch level release, I'd say it could go into the next major version with a proper note in the changelog.

If the preferred way to fix it is to add a special case (specifically requiring bundler/setup) to the current special case (-r handling), do we need to check a bundler specific environment variable? Would it be enough to check whether -rbundler/setup is included in RUBYOPT?

Updated by mame (Yusuke Endoh) 5 months ago

deivid (David Rodríguez) wrote in #note-6:

Would it be enough to check whether -rbundler/setup is included in RUBYOPT?

I think it is a possible design choice, though it is very hacky.

But in the first place, I think Bundler no longer has to use the RUBYOPT hack because it is integrated with rubygems. What do you think about this proof-of-concept patch for rubygems?

diff --git a/lib/rubygems/core_ext/kernel_require.rb b/lib/rubygems/core_ext/kernel_require.rb
index 4b867c55e9..a565e09999 100644
--- a/lib/rubygems/core_ext/kernel_require.rb
+++ b/lib/rubygems/core_ext/kernel_require.rb
@@ -33,7 +33,15 @@ module Kernel
   # The normal <tt>require</tt> functionality of returning false if
   # that file has already been loaded is preserved.

+  @@bundler_setup = true
   def require(path)
+    if ENV["BUNDLE_BIN_PATH"] && @@bundler_setup
+      @@bundler_setup = false
+      gem_original_require("bundler/setup")
+      monitor_owned = false
+      return require(path)
+    end
+
     if RUBYGEMS_ACTIVATION_MONITOR.respond_to?(:mon_owned?)
       monitor_owned = RUBYGEMS_ACTIVATION_MONITOR.mon_owned?
     end

Updated by deivid (David Rodríguez) 5 months ago

In my opinion, all approaches not involving fixing -r priority feel hacky.

The BUNDLE_BIN_PATH environment variable is something I had considered to remove in the future since it's only used for monkeypatching rubygems helpers to select the proper bundler executable, and that's something that should just work without it. So I would prefer any approach that doesn't use this variable. To be honest, since this is about fixing the priority of bundler/setup inside RUBYOPT, explicitly detecting bundler/setup inside RUBYOPT feels more explicit and less hacky to me, even if we fix this in rubygems require.

I guess since bundler already has this feature we have to fix this edge case, but I tend to believe this feature should have never existed in the first place. Over the years we have received many issues where bundler putting -rbundler/setup in RUBYOPT and forcing all future ruby processes to use bundler as well leads to unexpected hard to debug issues. And it's also the reason why user scripts and rake tasks out there are full of Bundler.with_original_env blocks before shelling out. The way I see it, people should choose explicitly whether or not to use bundler in their subprocesses by using or not using bundle exec to invoke them. Anyways, I don't think that's something that can be changed now, but I wanted to mention it.

Updated by deivid (David Rodríguez) 5 months ago

I closed the bundler PR in favour of the approach being discussed here. Even if this was fixed in ruby, it would still make sense to add this logic to rubygems require so that this works on all supported rubies.

Updated by mame (Yusuke Endoh) 5 months ago

Sorry for repeating myself again, but I don't think the current behavior of RUBYOPT is wrong.
Command-line arguments are what a user passed explicitly, so it is natural to me to prioritize them over RUBYOPT.
The issue is that RUBYOPT is not what Bundler expected to be. But it does not mean RUBYOPT is wrong.

The BUNDLE_BIN_PATH environment variable is something I had considered to remove in the future

BUNDLE_BIN_PATH is just an example. You can use any other name you like.

BTW, I noticed that my patch could be even much simpler. If you like BUNDLER_SETUP:

diff --git a/lib/rubygems.rb b/lib/rubygems.rb
index 3585defd2d..65f34d8812 100644
--- a/lib/rubygems.rb
+++ b/lib/rubygems.rb
@@ -1366,3 +1366,7 @@ def default_gem_load_paths
 require 'rubygems/core_ext/kernel_warn'

 Gem.use_gemdeps
+
+require "bundler/setup" if ENV["BUNDLER_SETUP"]

I tend to believe this feature should have never existed in the first place.

I agree with this. To speak of extremes, if Bundler had been a built-in feature of Ruby, we would not need bundle exec at all.

(This is off-topic: Nowadays, many people are using Bundler, so it might be a good time to enable Bundler by default? We can start it with require "bundler/setup" unless ENV["NO_BUNDLER"] :-)

Updated by deivid (David Rodríguez) 5 months ago

Command-line arguments are what a user passed explicitly, so it is natural to me to prioritize them over RUBYOPT.

Yes, that's where we disagree. In my opinion, prioritizing -r scripts means letting them run last, so that they can change ruby code in other -r scripts loaded before. Just like prioritizing -I paths means prepending them to the $LOAD_PATH in the last place, so that they get the first position, or prioritizing -W options means processing them last, so that they "win". Anyways, I won't insist further on this.

BUNDLE_BIN_PATH is just an example. You can use any other name you like.

Hehe, well, I'd like to not add yet another bundler-specific environment variable either if we can avoid it. Anyways, if you have a very strong opinion about using BUNDLE_BIN_PATH rather that checking whether -rbbundler/setup is included in RUBYOPT, I won't get in the middle of that.

I agree with this. To speak of extremes, if Bundler had been a built-in feature of Ruby, we would not need bundle exec at all.

(This is off-topic: Nowadays, many people are using Bundler, so it might be a good time to enable Bundler by default? We can start it with require "bundler/setup" unless ENV["NO_BUNDLER"] :-)

I actually started taking some small steps in that direction. There's already a RUBYGEMS_GEMDEPS environment variable to opt into this, but it's not working very well. I'm trying to make it work better here: https://github.com/rubygems/rubygems/pull/4532.

Updated by mrkn (Kenta Murata) 5 months ago

mame (Yusuke Endoh) wrote in #note-10:

(This is off-topic: Nowadays, many people are using Bundler, so it might be a good time to enable Bundler by default? We can start it with require "bundler/setup" unless ENV["NO_BUNDLER"] :-)

I disagree this idea because I often use ruby without bundle exec prefix in my library development to use libraries not listed in Gemfile and gemspec file.
I can accept this idea if bundle exec can work while NO_BUNDLER environment variable is supplied.

Updated by Eregon (Benoit Daloze) 5 months ago

mame (Yusuke Endoh) wrote in #note-3:

In principle, explicit command-line arguments are "stronger" than RUBYOPT. For RUBYOPT=-W2 ruby -W0, -W0 wins; -W2 is just ignored. For RUBYOPT=-ra ruby -rb, -rb wins; b is loaded earlier than a; in theory, b should be able to cancel a.

How can b.rb cancel/override a.rb if it runs before? It seems impossible.
Like deivid (David Rodríguez) says, for me the -r order is just inconsistent.
I really think this is the fundamental bug here.

In other words, RUBYOPT should behave like RUBYOPT=$RUBYOPT ruby $CLI_ARGS -> ruby $RUBYOPT $CLI_ARGS.
That's true for -W and for -I, but it's untrue for -r.

Is there any advantage of the current confusing order for -r in RUBYOPT?

Updated by Eregon (Benoit Daloze) 5 months ago

I think the workaround to solve it for bundle exec ruby -r can be valuable (https://github.com/rubygems/rubygems/pull/4616).

It seems a much smaller issue that a script doing https://bugs.ruby-lang.org/issues/17883#note-5 does not work.
That script can simply use require "active_support"; ..., and only needs to be changed once.

Regarding fixing the -r order, maybe we could warn for a release if there is -r in both RUBYOPT and on the command-line, and then in the next release make it consistent?

Updated by nobu (Nobuyoshi Nakada) 5 months ago

Eregon (Benoit Daloze) wrote in #note-13:

In other words, RUBYOPT should behave like RUBYOPT=$RUBYOPT ruby $CLI_ARGS -> ruby $RUBYOPT $CLI_ARGS.
That's true for -W and for -I, but it's untrue for -r.

Not true for -I at least.
Paths by -I in RUBYOPT (and RUBYLIB) are placed after paths by -I in command-line.

$ RUBYOPT=-I/c ruby -I/a -I/b -e 'p $:[0,3]'
["/a", "/b", "/c"]

Updated by jeremyevans0 (Jeremy Evans) 5 months ago

mame (Yusuke Endoh) wrote in #note-10:

BTW, I noticed that my patch could be even much simpler. If you like BUNDLER_SETUP:

diff --git a/lib/rubygems.rb b/lib/rubygems.rb
index 3585defd2d..65f34d8812 100644
--- a/lib/rubygems.rb
+++ b/lib/rubygems.rb
@@ -1366,3 +1366,7 @@ def default_gem_load_paths
 require 'rubygems/core_ext/kernel_warn'

 Gem.use_gemdeps
+
+require "bundler/setup" if ENV["BUNDLER_SETUP"]

While I still prefer a more generic approach, I'm OK with this idea as it allows the user to opt-in to the use of bundler by setting an environment variable.

(This is off-topic: Nowadays, many people are using Bundler, so it might be a good time to enable Bundler by default? We can start it with require "bundler/setup" unless ENV["NO_BUNDLER"] :-)

I'm against this in the strongest possible terms. There are more ruby scripts not using bundler than there are ruby applications using bundler. We should not force the usage of bundler and require users to opt-out of it. We should use an opt-in approach.

Updated by Eregon (Benoit Daloze) 5 months ago

nobu (Nobuyoshi Nakada) wrote in #note-15:

Not true for -I at least.
Paths by -I in RUBYOPT (and RUBYLIB) are placed after paths by -I in command-line.

$ RUBYOPT=-I/c ruby -I/a -I/b -e 'p $:[0,3]'
["/a", "/b", "/c"]

Mmh, so it's more complex due to unshift'ing into $:, but still:

$ RUBYOPT=-Ia ruby -Ib -e 'puts $:'
/home/eregon/b
/home/eregon/a
/home/eregon/.rubies/ruby-3.0.1/lib/ruby/site_ruby/3.0.0
...

So "a" is added/processed first, then "b".

In your example, it behaves like $:.unshift("/c") (from RUBYOPT) and then $:.unshift("/a", "/b").
Interesting that the order of "/a" and "/b" is preserved like that, but still /a & /b are before /c in $LOAD_PATH, and /c is processed first, so it is expected behavior and consistent with -W.
The important thing in this case is that the -I from command line has priority/"override"/win, and that is the case, /a & /b are before /c in $LOAD_PATH.

Isn't it natural that -r in RUBYOPT is processed before -r on the command-line? I feel the vast majority of users would expect that.

Updated by mame (Yusuke Endoh) 5 months ago

jeremyevans0 (Jeremy Evans) wrote in #note-16:

I'm against this in the strongest possible terms. There are more ruby scripts not using bundler than there are ruby applications using bundler. We should not force the usage of bundler and require users to opt-out of it. We should use an opt-in approach.

Okay, thanks. As far as I know, most Ruby projects that are maintained are using bundler, but I trust you more than myself (I'm not kidding).

In regard to the original issue, I think it can be fixed on the rubygems/bundler side. I'll try to create a pull request to the upstream later. I close this ticket.

Eregon (Benoit Daloze) Please open another ticket about the priority of RUBYOPT and command-line arguments if you need.
jeremyevans0 (Jeremy Evans) I'm still a bit negative to RUBYEARLYOPT unless it is really needed. I'm afraid if introducing it may bring a new hassle in future, such as "I need a way to insert -rmylib before -rbundler/setup!" But if you think you need it, please open another ticket.

Thank you all for the discussion.

Actions #19

Updated by mame (Yusuke Endoh) 5 months ago

  • Status changed from Open to Rejected

Updated by austin (Austin Ziegler) 5 months ago

To reiterate what Jeremy said, I have several projects where I have a
Gemfile, but never run the scripts with bundle exec. The Gemfile is
just to make sure that other developers have the necessary gems
installed.

Updated by deivid (David Rodríguez) 5 months ago

Just to clarify my previous comment.

Unconditionally enabling bundler everytime ruby is used (i.e., from rubygems.rb) is not even on the table. I only made some progress on making the RUBYGEMS_GEMDEPS environment variable optionally enable bundler from rubygems binstubs, so that if you for example run rails -v from within an application with a Gemfile, it will print the Rails version of your app, whereas if you run it outside of a Gemfile context, it will work as it works today and print the highest rails version installed globally on your system.

But this is opt-in only through ENV and does not affect regular ruby scripts (just binstubs) nor non-Gemfile contexts.

Actions

Also available in: Atom PDF