Project

General

Profile

Actions

Bug #20863

open

`zlib.c` calls `rb_str_set_len` and `rb_str_modify_expand`(and others) without holding the GVL.

Added by ioquatix (Samuel Williams) 14 days ago. Updated 11 days ago.

Status:
Open
Assignee:
-
Target version:
-
[ruby-core:119724]

Description

Background

I was working on https://bugs.ruby-lang.org/issues/20876 and was investigating some problems with zlib.c and GVL, and noticed that zstream_run_func is executed without the GVL, but can invoke various rb_ string functions. Those functions in turn can invoke rb_raise and generally look problematic. However, maybe by luck, such code path does not appear to be invoked in typical usage.

However, even so, it is possible to cause zstream_run_func to segfault by a carefully crafted program which causes the internal buffer to be resized while the GVL is released: https://github.com/ruby/zlib/pull/88#issuecomment-2455772054

Proposal

I would like to modify zlib.c to only release the GVL around the CPU intensive compression/decompression operation: https://github.com/ruby/zlib/pull/88

In addition, I identified several more improvements to prevent segfaults and other related failures:

  • Use rb_str_locktemp to prevent the z->buf changing size while in use by the rb_nogvl code.
  • Expand the mutex to protect #deflate and #inflate completely, not just the internal operation.

In order to catch these issues earlier and find other bugs like this, I recommend we introduce additional checks: https://bugs.ruby-lang.org/issues/20877

Actions #1

Updated by ioquatix (Samuel Williams) 14 days ago

  • Description updated (diff)
Actions #2

Updated by ioquatix (Samuel Williams) 12 days ago

  • Description updated (diff)
Actions #3

Updated by ioquatix (Samuel Williams) 12 days ago

  • Description updated (diff)

Updated by byroot (Jean Boussier) 11 days ago

@ko1 (Koichi Sasada) Do we have a proper description of what is safe and what it unsafe to do with the GVL released?

Because obviously it's OK to use ruby_xmalloc / ruby_xfree with the GVL released, so methods which allocate aren't necessarily problematic?\

In this case I'm unclear on why rb_str_set_len / rb_str_modify_expand shouldn't be called with the GVL released, assuming the objects on which they operate aren't visible to any other thread.

I think it would be helpful to have more clear guidelines on these things (unless of course I missed some existing documentation).

Updated by ko1 (Koichi Sasada) 11 days ago

Quoted from rb_thread_call_without_gvl doc:

 * NOTE: You can not execute most of Ruby C API and touch Ruby
 *       objects in `func()' and `ubf()', including raising an
 *       exception, because current thread doesn't acquire GVL
 *       (it causes synchronization problems).  If you need to
 *       call ruby functions either use rb_thread_call_with_gvl()
 *       or read source code of C APIs and confirm safety by
 *       yourself.
 *
 * NOTE: In short, this API is difficult to use safely.  I recommend you
 *       use other ways if you have.  We lack experiences to use this API.
 *       Please report your problem related on it.
 *
 * NOTE: Releasing GVL and re-acquiring GVL may be expensive operations
 *       for a short running `func()'. Be sure to benchmark and use this
 *       mechanism when `func()' consumes enough time.
 *
 * Safe C API:
 * * rb_thread_interrupted() - check interrupt flag
 * * ruby_xmalloc(), ruby_xrealloc(), ruby_xfree() -
 *   they will work without GVL, and may acquire GVL when GC is needed.

Again:

 * NOTE: In short, this API is difficult to use safely.  I recommend you
 *       use other ways if you have.  We lack experiences to use this API.
 *       Please report your problem related on it.

Updated by byroot (Jean Boussier) 11 days ago

@ko1 (Koichi Sasada) Not sure how I didn't think to check that, thank you. So indeed allocations are fine. From what I understand, the issue is mostly exceptions and of course using an object concurrently.

Updated by ioquatix (Samuel Williams) 11 days ago

I think the issue is, those methods from a public interface POV, are not allowed to be called without the GVL.

Even if today the implementation follows a "safe" code path, in the future, it may not.

Adding these annotations will help to clarify that "this method is not safe to call without the GVL" - a form of internal and run-time documentation.

Updated by Eregon (Benoit Daloze) 11 days ago ยท Edited

ioquatix (Samuel Williams) wrote in #note-7:

Even if today the implementation follows a "safe" code path, in the future, it may not.

This is a good point.
I think we should consider all C API functions unsafe to be called without the GVL, except the functions listed in Safe C API.
So I think we should update the docs to remove or read source code of C APIs and confirm safety by yourself. as it's not a good idea as it may change and it's very hard to assess if safe.

Updated by byroot (Jean Boussier) 11 days ago

There would be quite a lot of value in having some nogvl save APIs though. e.g. if database clients could allocate Hash/Array/String to build the response while the GVL is still released, it could really help with throughput of threaded servers like Puma.

Updated by ioquatix (Samuel Williams) 11 days ago

There would be quite a lot of value in having some nogvl save APIs though. e.g. if database clients could allocate Hash/Array/String to build the response while the GVL is still released, it could really help with throughput of threaded servers like Puma.

I think it's a great idea (seriously great), but out of scope for this issue. Do you want to create a new issue to start that discussion?

Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like1Like1Like1Like0Like1Like1Like0