Bug #21096

open

`Process.fork` hangs up on QEMU when called multiple times.

Added by midnight (Sarun R) 24 days ago. Updated 8 days ago.

Status:
Open
Assignee:
-
Target version:
-
ruby -v:
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +PRISM [x86_64-linux]
[ruby-core:120817]

Description

Hello,

We are experiencing issues when using Bootsnap for production container image building, specifically when running bundle exec bootsnap precompile --gemfile on an emulated ARM64 environment on AMD64 hosts.

Here are more details:

  • Bootsnap is a Ruby gem for bytecode caching, which speeds up loading times. It achieves this by calling RubyVM::InstructionSequence.compile_file and RubyVM::InstructionSequence.load_from_binary.
  • When running bundle exec bootsnap precompile --gemfile in the environment described (using QEMU to emulate the AArch64 instruction set), the process can compile and generate some bytecode but eventually hangs.
  • The hang seems random, occurring at different points during the process.
  • According to a Rails GitHub issue, the issue affects both Docker and containerd, as well as native Linux and macOS environments.
  • From a Bootsnap GitHub issue, the problem likely doesn't occur in Ruby 3.2.6 but appears in later versions.
  • So far, the common factors observed are:
    • Emulated ARM64 on AMD64 CPUs (likely using QEMU)
    • Newer Ruby versions (e.g., 3.3 and 3.4)
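For readers unfamiliar with the two calls named above, here is a minimal sketch of a bytecode round trip using only MRI's public `RubyVM::InstructionSequence` API. It is not Bootsnap's actual code: the helper name `roundtrip_bytecode` and the temporary file are made up for illustration, and it assumes the serialized form comes from `to_binary`.

```ruby
require 'tempfile'

# Hypothetical helper (not part of Bootsnap): compile a source string to
# bytecode, serialize it, then load the serialized form and run it.
def roundtrip_bytecode(source)
  Tempfile.create(['sample', '.rb']) do |file|
    file.write(source)
    file.flush

    # Compile the file into an instruction sequence.
    iseq = RubyVM::InstructionSequence.compile_file(file.path)

    # Serialize to a binary String; a cache could write this to disk.
    binary = iseq.to_binary

    # Rebuild the instruction sequence from the binary and evaluate it.
    RubyVM::InstructionSequence.load_from_binary(binary).eval
  end
end

puts roundtrip_bytecode("1 + 2\n") # prints 3
```

Running this under the emulated environment can help tell whether compilation itself is affected, independently of Bootsnap.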

As a user, I don't have the expertise to debug the issue but am willing to gather more information if provided with sufficient instructions.

Updated by byroot (Jean Boussier) 24 days ago

As mentioned on the Rails issue, this is unlikely to be debugged unless a reproducer is provided. It should include a Ruby script or application that causes the hang when compiled, plus the precise host and container versions.

Updated by midnight (Sarun R) 24 days ago

You are one of the most responsive people I have ever encountered.

As a reproducer, iiewad provided a branch of his application where this issue occurs.
(It’s essentially a standard Rails application with something like --verbose added to the Dockerfile for visualization.)

git clone -b debug https://github.com/iiewad/blog.git iiewad-blog
cd iiewad-blog

In my environment, I use the following command:

sudo nerdctl build --platform=arm64 .

(containerd has both rootless and rootful modes; multi-architecture building requires the rootful one.)

I’ve already confirmed that the code from the repository reproduces the issue in my environment.
Screenshot
Since he included --verbose in the Dockerfile, the screenshot shows where bootsnap is hanging.

Software versions:
iiewad-blog: commit-c8f6d15
nerdctl:

Client:
  Version: 2.0.2
  OS/Arch: linux/amd64
  Git commit: unknown
  buildctl:
    Version: 0.18.2

Server:
  containerd:
    Version: v1.7.23
    GitCommit: 57f17b0a6295a39009d861b89e3b3b87b005ca27
  runc:
    Version: 1.2.4
    GitCommit: v1.2.4-0-g6c52b3fc541f

QEMU

sudo systemctl start containerd
sudo nerdctl run --privileged --rm tonistiigi/binfmt:qemu-v8.1.5 --install arm64

Feel free to ask if you run into any issues with the reproduction steps or need more details about the environment I’m using.

Updated by midnight (Sarun R) 8 days ago

  • Subject changed from Ruby hangs up when compiling for bytecode on AArch64 emulated by QEMU to `Process.fork` hangs up on QEMU when called multiple times.

Hello, I have made some progress and isolated the issue.
Here is a minimal reproduction that needs no project, third-party gems, or external dependencies, only MRI on Linux under QEMU.

$stdout.sync = true
pid_list = 4.times.map do
  Process.fork do
    puts 'success!'
    exit!(true)
  end
end

pid_list.each do |pid|
  Process.wait2(pid)
end

On QEMU, only one success! is printed, and then the Ruby interpreter hangs.
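Because the reproduction hangs rather than fails, it can be awkward to run unattended. The sketch below runs the same script in a separate interpreter and gives up after a deadline. The helper name `run_with_deadline`, the 10-second timeout, and the returned symbols are illustrative choices, not anything from the report; only standard `Process` and `IO` calls are used.

```ruby
require 'rbconfig'

# The reproduction script from the report, kept as a string so it can be run
# in a child interpreter.
REPRO = <<~'RUBY'
  $stdout.sync = true
  pid_list = 4.times.map do
    Process.fork do
      puts 'success!'
      exit!(true)
    end
  end
  pid_list.each { |pid| Process.wait2(pid) }
RUBY

# Hypothetical helper: run `script` in a new Ruby process and report
# :ok, :failed, or :hung together with whatever was printed.
def run_with_deadline(script, timeout: 10)
  reader, writer = IO.pipe
  # pgroup: true puts the child (and anything it forks) in its own process group.
  pid = Process.spawn(RbConfig.ruby, '-e', script, out: writer, pgroup: true)
  writer.close

  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  status = nil
  loop do
    # With WNOHANG, wait2 returns nil while the child is still running.
    _, status = Process.wait2(pid, Process::WNOHANG)
    break if status || Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
    sleep 0.05
  end

  unless status
    # A signal name starting with "-" targets the whole process group.
    Process.kill('-KILL', pid)
    Process.wait(pid)
  end

  output = reader.read
  reader.close
  state = if status.nil? then :hung
          elsif status.success? then :ok
          else :failed
          end
  [state, output]
end

state, output = run_with_deadline(REPRO)
puts "#{state}: #{output.scan('success!').size} of 4 children reported"
```

On an unaffected host this should report `ok` with all four children. Under the emulation described above, the expectation from the report would be `hung` with a single `success!`, though that has not been verified here.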

Updated by byroot (Jean Boussier) 8 days ago

Thank you for the repro. I wonder if it is caused by https://gitlab.com/qemu-project/qemu/-/issues/285
