Project

General

Profile

Actions

Bug #17023

closed

How to prevent String memory to be relocated in ruby-ffi

Added by larskanis (Lars Kanis) 9 months ago. Updated 8 months ago.

Status:
Closed
Priority:
Normal
Target version:
-
ruby -v:
ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
[ruby-core:99115]

Description

ruby-ffi allows to pass String objects to C by using the :string argument type. This way the string memory returned by RSTRING_PTR is passed to the C function. The user has to ensure on Ruby level that the string isn't GC'ed - as long as it is used on C level. That's the contract and this worked with all past ruby versions, but ruby-2.7 introduced GC.compact, which can relocate strings to another memory location.

This example shows the situation and that the string is relocated although it is still referenced in ruby code:

File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  def initialize(count)
    proc {} # necessary to trigger relocation
    a = "a" * count
    set(a)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{a})"
  end
end

Foo.new(23)
Foo.new(24)

The output looks like so on ruby-2.7.1:

get(23):  (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)

So using GC.compact while a string parameter is in use, both on Ruby and on C level, can cause invalid memory access. How can this prevented?

A C extension is expected to use rb_gc_mark() in order to pin the VALUE to a memory location. But I couldn't find a way to pin a VALUE at the time the argument is passed to the C function, which is the only point in time ruby-ffi has access to it.


Files

string-relocate.rb (653 Bytes) string-relocate.rb larskanis (Lars Kanis), 07/10/2020 05:27 PM
0001-Only-marked-objects-should-be-considered-movable.patch (1.23 KB) 0001-Only-marked-objects-should-be-considered-movable.patch tenderlovemaking (Aaron Patterson), 07/13/2020 05:55 PM

Updated by chrisseaton (Chris Seaton) 9 months ago

The user has to ensure on Ruby level that the string isn't GC'ed

The contract has always been more than that - the user must also not modify the string, as this may cause the character pointer to be reallocated.

Updated by larskanis (Lars Kanis) 9 months ago

chrisseaton (Chris Seaton) Exactly, the string is considered const char * as mentioned here for the :string type: https://github.com/ffi/ffi/wiki/Types#user-content-types

Updated by duerst (Martin Dürst) 9 months ago

  • Assignee set to tenderlovemaking (Aaron Patterson)
  • Status changed from Open to Assigned

Updated by nobu (Nobuyoshi Nakada) 9 months ago

  • Backport changed from 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN to 2.5: DONTNEED, 2.6: DONTNEED, 2.7: REQUIRED
  • Status changed from Assigned to Closed

As it doesn't seem to occur in the master but only in 2.7, I'll close this and set "Backport to 2.7" required, although I'm not sure if it is a bug.
Aaron, could you tell which commit fixed this?

Updated by tenderlovemaking (Aaron Patterson) 9 months ago

Stack scanning should prevent a from moving, so you shouldn’t need to do anything to pin this. I will also try this against the master branch, and if it’s fixed there I’ll find the commit that fixed it. Regardless objects that are on the stack should be pinned, so I’m not sure off the top of my head why this object would move.

Updated by tenderlovemaking (Aaron Patterson) 9 months ago

This is fixed in 6e7e7c1e577d6c2276e9a8cc85c28c55c46c2618. I tried cherry picking the commit to ruby_2_7, but got a conflict. I've attached the patch with the conflict resolved.

Thanks for reporting this!

Updated by larskanis (Lars Kanis) 9 months ago

Thank you Aaron for looking into this! The patch avoids that the particular string is relocated, but unfortunately the patch is not sufficient to avoid relocation of any strings given to FFI. A slightly modified version of the script fails on ruby-2.7 and on master:

File.write "string-relocate.c", <<-EOC
  static char *g_str;

  void set(char* str) {
    g_str = str;
  }

  char* get() {
    return g_str;
  }
EOC
system "gcc -shared -fPIC string-relocate.c -o string-relocate.so"

require 'ffi'

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  A = ""

  def initialize(count)
    A.replace "a" * count

    set(A)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{A})"
  end
end

Foo.new(23)
Foo.new(24)

The output is something like:

get(23): h8�U (should be: aaaaaaaaaaaaaaaaaaaaaaa)
get(24): aaaaaaaaaaaaaaaaaaaaaaaa (should be: aaaaaaaaaaaaaaaaaaaaaaaa)

In fact we recommend using a constant for this kind of usage since years: https://github.com/ffi/ffi/wiki/Core-Concepts#string-memory-allocation

I don't think there is a generic way to fix this issue other than pinning all strings on RSTRING_PTR() usage. However RSTRING_PTR() would be massive over-pinning, since the pointer is most often used only for a very short time frame and the string is movable again afterwards.

So actually I was looking for something like rb_obj_mark_unmovable() to pin the string. FFI could call this function on all string pointers passed to C. In FFI we don't know how long the string pointer is in use in the C library, so that marking all argument strings as unmovable is over-pinning as well. But for sure this would pin way less strings than RSTRING_PTR().

Updated by tenderlovemaking (Aaron Patterson) 9 months ago

larskanis (Lars Kanis) wrote in #note-7:

So actually I was looking for something like rb_obj_mark_unmovable() to pin the string. FFI could call this function on all string pointers passed to C. In FFI we don't know how long the string pointer is in use in the C library, so that marking all argument strings as unmovable is over-pinning as well. But for sure this would pin way less strings than RSTRING_PTR().

Right, that makes sense. I really need to document this (and I apologize for not doing so already), but rb_gc_register_address will pin your objects. When you know you're done with the reference, you can release it with rb_gc_unregister_address. Of course if you don't call the unregister function, the reference will stay alive forever.

Currently the GC can't know for how long you want the reference to remain unmovable, so de-registering it is required. In the code example you've presented, if the constant were to be removed or redefined, the string buffer reference would go bad without compaction involved, so I'm not sure it's "safe" to do either.

One thought I have is that FFI could ensure T_STRING objects are not embedded. The underlying char buffer won't move, but the object still could.

Another thought: could FFI expose rb_gc_register_address? The code could become something like:

class Foo
  extend FFI::Library
  ffi_lib File.expand_path('string-relocate.so')

  attach_function :set, [:string], :void
  attach_function :get, [], :string

  ffi_register_global(A = "")

  def initialize(count)
    A.replace "a" * count

    set(A)

    GC.verify_compaction_references(toward: :empty, double_heap: true)

    puts "get(#{count}): #{get} (should be: #{A})"
  end
end

I know this would require people to change their code, but then we would be better aware of the lifetime of the object.

Updated by normalperson (Eric Wong) 9 months ago

tenderlove@ruby-lang.org wrote:

Right, that makes sense. I really need to document this (and
I apologize for not doing so already), but
rb_gc_register_address will pin your objects. When you know
you're done with the reference, you can release it with
rb_gc_unregister_address. Of course if you don't call the
unregister function, the reference will stay alive forever.

Btw, does rb_gc_register_mark_object pin? A quick glance at
gc.c tells me it doesn't, and I'll need to revert commit
2a6cb76d5010cb763ef5a2c305728465d15eb7c9 in unicorn:
https://yhbt.net/unicorn-public/20181226050857.6413-1-e@80x24.org/

Anyways, it takes me too long to compile Ruby so I'm back to
running whatever my distro ships. I haven't been able to
test GC.compact at all.

https://bugs.ruby-lang.org/issues/17023#change-86563

Updated by nagachika (Tomoyuki Chikanaga) 9 months ago

  • Backport changed from 2.5: DONTNEED, 2.6: DONTNEED, 2.7: REQUIRED to 2.5: DONTNEED, 2.6: DONTNEED, 2.7: DONE

ruby_2_7 e619178e52250ceda3a0fe32ff5addb16617b58c merged revision(s) 6e7e7c1e577d6c2276e9a8cc85c28c55c46c2618.

Updated by larskanis (Lars Kanis) 9 months ago

Sorry for not responding earlier! Unfortunately this issue still remains. So it is not closed and the backported commit doesn't fix it in any way.

I tried Aarons hint to change all passed strings to non-embedded content. The result is here: https://github.com/ffi/ffi/pull/811

Unfortunately this adds a significant slowdown (up to 70%) and is not suitable for frozen strings. But frozen strings are equally relocated. These both issues makes the PR a no-go.

Regarding rb_gc_register_address(): Enqueuing every string value passed to C would keep them alive forever, but we don't know how long the string is in use by the C library. We'd need some kind of finalizer hook to unregister them. But the finalizer is not called until rb_gc_unregister_address() was called.

Regarding ffi_register_global: This could solve the issue, but establishes a new API which is mostly superfluous and breaks the current contract. Both causes (GC.compact and long string usage in C) are rarely used in combination. But only if they're used together, this new call would be needed. This is quite hard to explain to people.

We would need a function that pins the VALUE for it's further lifetime, but without keeping it alive.

Updated by Hanmac (Hans Mackowiak) 9 months ago

the problem there is that ruby can't know how long the C lib is going to use the String. Depending on the C Function, might even take "ownership" of the string and does free it itself, so when Ruby would free the string too, it causes a Double Free crash.

Updated by larskanis (Lars Kanis) 9 months ago

Hanmac (Hans Mackowiak) The string must be kept as a ruby object in an instance variable or constant, etc. and must not be modified as long as it's in use in the C library. Similar on the C side the string must not be modified and freeing it is not allowed. The details are the same since 10 years and are described here: https://github.com/ffi/ffi/wiki/Core-Concepts#memory-management Now ruby-2.7+ breaks this contract in some cases by moving the string value around.

Raw String usage for :pointer or :string arguments is the fastest way to transfer data buffers from ruby to C. FFI::MemoryPointer is a more flexible alternative, but is way slower.

Updated by tenderlovemaking (Aaron Patterson) 9 months ago

One idea I had is that we could make RSTRING_PTR, StringValueCStr, StringValuePtr, etc ensure that the character buffer it returns is not embedded in the object. So, if it's an embedded string, move it to malloc. We could introduce "unsafe" functions that don't do that, then change MRI internals to use the unsafe versions. This seems like a big change, but I'm struggling to think of something smaller.

Another thought is rather than exposing rb_gc_register_address, just implement an object that has one reference and calls rb_gc_mark on the reference. Again, this would have to be a change for FFI's API, but I think it's worthwhile since anyone could just const_set(:A, nil) and the string would get collected and the saved char* pointer would go bad. (Obviously people probably aren't going to do that const_set thing, but relying on an indirect reference to keep your object alive is asking for trouble)

I'm willing to do the work to introduce "unsafe" versions of the string buffer APIs if we think it would be worthwhile and if we can't find a smaller solution.

Updated by Eregon (Benoit Daloze) 9 months ago

Maybe FFI should discourage cases where the native function saves the RSTRING_PTR() and still use it after the native call returned?
That seems quite dangerous if e.g. the Ruby String is mutable.
Is it commonly used, do we have a real example?

Maybe we need a pin function that doesn't mind if the String is frozen?
After all pinning doesn't change the String contents.

Updated by tenderlovemaking (Aaron Patterson) 9 months ago

Eregon (Benoit Daloze) wrote in #note-15:

Maybe we need a pin function that doesn't mind if the String is frozen?
After all pinning doesn't change the String contents.

I think Lars was having trouble expanding strings that are frozen, not pinning them. I suggested that strings could be expanded such that they cannot be embedded in the Ruby object, and this technique doesn't work on frozen strings.

Updated by Eregon (Benoit Daloze) 9 months ago

tenderlovemaking (Aaron Patterson) wrote in #note-16:

I think Lars was having trouble expanding strings that are frozen, not pinning them. I suggested that strings could be expanded such that they cannot be embedded in the Ruby object, and this technique doesn't work on frozen strings.

Right, what I wanted to mean is "pinning the RSTRING_PTR", so a new function to expand the char* so it's not embedded, and that would also work for frozen Strings.

Updated by larskanis (Lars Kanis) 9 months ago

I think there are several options to solve this issue:

  1. Change RSTRING_PTR() to pin the string to a fixed memory location.
  2. Add a new function kind of rb_obj_mark_unmovable() to ruby's C-API that pins string memory without keeping the string alive.
  3. Expose rb_gc_register_address() and rb_gc_unregister_address() to ruby and require users to explicit mark objects to be in use.
  4. Change the FFI contract, in such a way, that a parameter passed as String might not be accessed after the C call returns. As an alternative, FFI::MemoryPointer can be used, which works already without any modifications.

As mentioned above the issue is a very rare situation. An invalid access only happens in case of a parameter passed as String has been stored by a C function, GC.compact is then called somewhere in the application, the String in question is actually moved and is subsequently accessed on C level. In fact it is that rare, that we didn't receive a single bug report for this situation, but the example above is just constructed. Nevertheless I would like to fix it in a proactive manner, instead of keeping it as an undocumented Heisenbug. Given the rare probability this issue comes up, I don't think it's acceptable to get a measurable slowdown by the solution, however.

To 1: I don't think it's necessary to change RSTRING_PTR(). It is a function used by almost every C extension, but this particular issue is very specific to FFI. Only FFI depends on indirect marking of objects as a core feature. Even Fiddle is much less affected, since it requires the user to explicit allocate and deallocate memory by default and doesn't allow passing ruby strings as parameters. So we don't need a transparent solution, but only something that can be build into FFI and that can be used explicit where necessary. Also if this change slows down RSTRING_PTR() it would slow down all C-extensions.

To 2: This is my favorite: I would like to call a Ruby-API function from FFI's C-extension that marks the passed String object as unmovable without keeping it alive. So the pinning should be for the rest of the lifetime of the String object. Since this function must be called for every String passed to a FFI function, it must be fast. So copying the string to non-embedded memory is no good option, since https://github.com/ffi/ffi/pull/811 showed, that it results in a 70% slowdown for each and every string passed to C - regardless how it will be used afterwards.

To 3: As mentioned earlier I don't like this solution. It doesn't fit to the FFI API - it is ugly. The object must be explicit unregistered in contrast to most other FFI objects. Since the object must be stored in a Ruby variable for calling rb_gc_unregister_address(), the object is always marked twice - once by the the mark list and once by the ruby variable. So both references have to be released to get the object GC'ed. It's hard to describe when and how this function has to be called.

To 4: If 2. isn't possible to implement, I would prefer this one. Since values on the stack are not moved, it should be safe to use String objects by the C function, even if the function releases the GVL and GC.compact is called in the meanwhile. And when the function returns, any access to the pointer is prohibited.

tenderlovemaking (Aaron Patterson) Is it possible to implement something like 2?

Updated by larskanis (Lars Kanis) 8 months ago

tenderlovemaking (Aaron Patterson) The issue in FFI is still unsolved. Could you please have a look at the questions above? Or may I open a new issue for discussion?

Updated by tenderlovemaking (Aaron Patterson) 8 months ago

larskanis (Lars Kanis) wrote in #note-19:

tenderlovemaking (Aaron Patterson) The issue in FFI is still unsolved. Could you please have a look at the questions above? Or may I open a new issue for discussion?

Hi Lars,

Sorry I totally missed your last email. The main issue with solution 2 is that there is no place to store that information. The GC doesn't keep any persistent data about objects between (major) GC cycles. To support 2 we would have to store the data in the string itself, and I'm not sure if we have header bits available to do that.

The second problem with solution 2 is that this is not a problem unique just to strings. If an FFI extension holds a reference to any VALUE (maybe by stuffing it in a void* or something) and doesn't rb_gc_mark the value, the GC will assume it is safe to move that object. Fiddle has the same problem, so I introduced a "pinning reference" here:

https://github.com/ruby/fiddle/pull/44

All it does is hold a reference to the object and call rb_gc_mark on the reference at mark time.

I think adding something like ^ to the FFI gem and asking users to wrap objects they know will hold references is the safest solution. As you mention, this issue only impacts objects that are stored off the stack, so it should be pretty rare for anyone to need this. Another idea I had was to figure out some way to have FFI automatically wrap constants with the above proxy object, but I didn't get that far.

Actions

Also available in: Atom PDF