Bug #18952

closed

rb_aligned_free: munmap failed

Added by mdomsch-sz (Matt Domsch) 2 months ago. Updated about 2 months ago.

Status:
Third Party's Issue
Priority:
Normal
Target version:
-
ruby -v:
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux-musl]
[ruby-core:109404]

Description

Over the past several weeks, our Linux ruby 3.1.2 application has failed repeatedly with:

[BUG] rb_aligned_free: munmap failed

Ruby tracebacks show the failure occurring in many different libraries and functions, because the garbage collector runs at its own discretion. The problem is hard to reproduce: this is a low-level Ruby function invoked from within a large web application, not directly by our own code. Our servers have 24 GB of RAM and show no sign of out-of-memory conditions.

Unfortunately, the munmap() call here does not capture the return value and errno, so it cannot report why munmap() failed.

I have attached one example backtrace log.


Files

logs.gz (414 KB), backtrace log, mdomsch-sz (Matt Domsch), 08/01/2022 11:26 PM

Updated by mdomsch-sz (Matt Domsch) 2 months ago

Over the same 2-week period, I've seen two other logs that appear to be related:
[BUG] rb_aligned_malloc: munmap failed for end

Updated by mdomsch-sz (Matt Domsch) 2 months ago

I note that the process has 65531 regions in its memory map, which is essentially at the default runtime limit of vm.max_map_count = 65530. Our runtime environment is AWS Fargate, so it's unlikely we can raise this limit. Other reporters hitting this limit have likewise determined that it can't be changed in Fargate.
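For anyone who wants to check their own processes for the same condition, here is a minimal sketch of comparing a process's mapping count with the kernel limit. It assumes a Linux procfs where each line of /proc/<pid>/maps describes one mapped region and /proc/sys/vm/max_map_count holds the limit. The method names (map_count, max_map_count, map_headroom) are hypothetical helpers written for this illustration and are not part of Ruby.

```ruby
# Sketch only: the helper names are made up for this example.
# Assumes Linux procfs; each line of a maps file is one mapped region.

# Number of memory-mapped regions listed in the given maps file.
def map_count(maps_path = "/proc/self/maps")
  File.foreach(maps_path).count
end

# The kernel's per-process limit on mapped regions (vm.max_map_count).
def max_map_count(limit_path = "/proc/sys/vm/max_map_count")
  Integer(File.read(limit_path).strip)
end

# How many more regions the process can create before hitting the limit.
# A value at or below zero means calls that need a new region
# (including a munmap that splits an existing region) can start failing.
def map_headroom(maps_path: "/proc/self/maps",
                 limit_path: "/proc/sys/vm/max_map_count")
  max_map_count(limit_path) - map_count(maps_path)
end

# Example use from inside the running application:
#   warn "low mmap headroom: #{map_headroom}" if map_headroom < 1000
```

The 1000-region threshold in the usage comment is an arbitrary illustration, not a recommended value.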

Updated by mame (Yusuke Endoh) 2 months ago

  • Assignee set to peterzhu2118 (Peter Zhu)

@peterzhu2118 (Peter Zhu) Can you check it out?

This is just my guess but I think it has something to do with musl's strategy on how to use mmap/munmap. Are you using Alpine Linux? I wonder if it won't happen if you use Ubuntu with glibc.

Updated by mdomsch-sz (Matt Domsch) 2 months ago

We are using the official Ruby Docker image ruby:3.1.2-alpine3.15, which provides musl. We reached the same conclusion today and will try ruby:3.1.2-slim-bullseye, which comes with glibc, together with the environment variable MALLOC_ARENA_MAX=2 (it would otherwise default to 32 in our Fargate configuration). We will try this out on 8/3/22 and report back, though so far it has taken a few days for memory fragmentation to build up to such a high map count. We have not yet taken the next step of using jemalloc, and will likely do so as a further test later in the week.
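As a rough illustration of where a default of 32 could come from, the sketch below estimates the arena cap a process would run with. It assumes 64-bit glibc, which limits arenas to 8 per CPU core when MALLOC_ARENA_MAX is unset, so a 4-core task would give 32. The method effective_arena_max and the constant are hypothetical names for this example. The result is only an estimate and does not query glibc itself.

```ruby
require "etc"

# Assumption: 64-bit glibc caps malloc arenas at 8 per core by default.
ARENAS_PER_CORE_64BIT = 8

# Estimate the arena cap: the MALLOC_ARENA_MAX value if it is set,
# otherwise the assumed glibc default derived from the core count.
# `env` and `cores` are injectable so the logic can be checked offline.
def effective_arena_max(env: ENV, cores: Etc.nprocessors)
  raw = env["MALLOC_ARENA_MAX"]
  return Integer(raw) if raw && !raw.strip.empty?

  ARENAS_PER_CORE_64BIT * cores
end

# Example use at application boot:
#   warn "glibc arena cap (estimated): #{effective_arena_max}"
```

Logging this value at boot is one way to confirm that the environment variable actually reached the Ruby process inside the container.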

Updated by mdomsch-sz (Matt Domsch) 2 months ago

We were able to get our application into production last night with ruby:3.1.2-slim-bullseye and MALLOC_ARENA_MAX=2, and we have observed no failures since. Without the change we would have expected several hundred failures. This appears to be an appropriate solution for us. The number of memory maps per process is now between 600 and 3200, well below the 65530 limit we were frequently reaching before.

Updated by jeremyevans0 (Jeremy Evans) about 2 months ago

  • Status changed from Open to Third Party's Issue