Project

General

Profile

Bug #16136

String corruption in 2.6.4

Added by deivid (David Rodríguez) 3 months ago. Updated 2 months ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
ruby -v:
ruby 2.6.4p104 (2019-08-28 revision 67798) [x86_64-linux]
[ruby-core:94757]

Description

When trying to upgrade activeadmin's tests to use ruby 2.6.4, I got some very weird failures where some strings would end up containing random content. See https://circleci.com/gh/activeadmin/activeadmin/20329 for an example of those failures. Failures are somewhat random but the test suite seems to consistently fail on 2.6.4 and pass on 2.6.3. I also managed to reproduce the failures locally, sometimes very consistently, but now it's hardly ever happening... :/

I looked at the changes in 2.6.4 and it looks like https://bugs.ruby-lang.org/issues/16105 and/or https://bugs.ruby-lang.org/issues/15946 could be related, because the region of the Rails code where the corrupted string is created seems to be doing something similar to the reported issues: https://github.com/rails/rails/blob/b7e591a55f4de5f1511c3b9255b3b25159b8ba41/actionpack/lib/action_dispatch/routing/mapper.rb#L404-L412.

Sorry I'm not really sure how to create a reproducible example, or how to debug this issue. Just opening this in case it could help noticing something obviously wrong with the mentioned patches.


Related issues

Related to Ruby master - Bug #15916: Memory leak in Regexp literal interpolationClosedActions

Associated revisions

Revision ade1283c
Added by mame (Yusuke Endoh) 3 months ago

Fix a use-after-free bug by avoiding rb_str_new_frozen

str2 = rb_str_new_frozen(str1) seems to make str1 a shared string that
refers to str2, but str2 is not marked as STR_IS_SHARED_M nor
STR_NOFREE.
rb_fstring(str2) frees str2's ptr because it is not marked, and the
free'ed pointer is the same as str1's ptr.
After that, accessing str1 may cause use-after-free memory corruption.

I guess this is a bug of rb_str_new_frozen, but I'm completely unsure
what it should be; the string states and flags are not documented.
So, this is a workaround for [Bug #16136]. I confirmed that rspec of
activeadmin runs gracefully.

Revision a165a066
Added by nagachika (Tomoyuki Chikanaga) 3 months ago

merge revision(s) ade1283ca276f7d589ffd3539fbc7b9817f682d5: [Backport #16136]

    Fix a use-after-free bug by avoiding rb_str_new_frozen

    `str2 = rb_str_new_frozen(str1)` seems to make str1 a shared string that
    refers to str2, but str2 is not marked as STR_IS_SHARED_M nor
    STR_NOFREE.
    `rb_fstring(str2)` frees str2's ptr because it is not marked, and the
    free'ed pointer is the same as str1's ptr.
    After that, accessing str1 may cause use-after-free memory corruption.

    I guess this is a bug of rb_str_new_frozen, but I'm completely unsure
    what it should be; the string states and flags are not documented.
    So, this is a workaround for [Bug #16136].  I confirmed that rspec of
    activeadmin runs gracefully.

git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/branches/ruby_2_6@67803 b2dd03c8-39d4-4d8f-98ff-823fe69b080e

Revision 67803
Added by nagachika (Tomoyuki Chikanaga) 3 months ago

merge revision(s) ade1283ca276f7d589ffd3539fbc7b9817f682d5: [Backport #16136]

Fix a use-after-free bug by avoiding rb_str_new_frozen

`str2 = rb_str_new_frozen(str1)` seems to make str1 a shared string that
refers to str2, but str2 is not marked as STR_IS_SHARED_M nor
STR_NOFREE.
`rb_fstring(str2)` frees str2's ptr because it is not marked, and the
free'ed pointer is the same as str1's ptr.
After that, accessing str1 may cause use-after-free memory corruption.

I guess this is a bug of rb_str_new_frozen, but I'm completely unsure
what it should be; the string states and flags are not documented.
So, this is a workaround for [Bug #16136].  I confirmed that rspec of
activeadmin runs gracefully.

History

Updated by vcsjones (Kevin Jones) 3 months ago

Hello,

We started running in to this issue as well with Ruby 2.6.4. We are seeing consistent but non-deterministic string corruption in one of our applications. As an example, we see something like:

undefined method `"defense_profile_extension_id\x00eq"'

Note the \x00 in the string does not belong there, it should be a _. However these results are not consistent. Another run of the application with zero code changes produced this error:

undefined method `"\x00\x00\x00\x00\x00\x00\x00\x00f_profile_extension_id_eq"

This string is being passed in to public_send elsewhere in another library, hence the "undefined method" error.

Unfortunately, we have yet to be able to create a small reproduction of this issue. We can make it happen consistently in a very large code base though. I did manage to run git bisect on a locally compiled ruby and determined that the change that introduced is https://github.com/ruby/ruby/commit/da36d5700d9e0e66411d93595b6f654c85853fa1.

When we revert that change, the issue can no longer be reproduced.

Updated by alanwu (Alan Wu) 3 months ago

I wonder if these bugs exist on master or are specific to 2.6.4.

#3

Updated by naruse (Yui NARUSE) 3 months ago

  • Related to Bug #15916: Memory leak in Regexp literal interpolation added
#4

Updated by naruse (Yui NARUSE) 3 months ago

  • Backport changed from 2.5: UNKNOWN, 2.6: UNKNOWN to 2.5: DONTNEED, 2.6: REQUIRED

Updated by deivid (David Rodríguez) 3 months ago

vcsjones (Kevin Jones) You're right, my raw guess at the culprit was wrong since I just got the issue again with the ruby I built excluding the commits I mentioned. I'm going to build the same ruby you built now to confirm that the culprit is https://github.com/ruby/ruby/commit/da36d5700d9e0e66411d93595b6f654c85853fa1 from my side too.

If this is the culprit, I'm guessing this is happening on master too, but I'll run my tests against master as well to confirm.

Updated by hsbt (Hiroshi SHIBATA) 3 months ago

I faced the same error maybe.

Our company blog generated by middleman was failed with Ruby 2.6.4 and 2.7.0 built by master branch. This issue was completely reproduced with them. We could build with Ruby 2.5.6.

I will investigate the root causes of this issue.

#7

Updated by mame (Yusuke Endoh) 3 months ago

  • Status changed from Open to Closed

Applied in changeset git|ade1283ca276f7d589ffd3539fbc7b9817f682d5.


Fix a use-after-free bug by avoiding rb_str_new_frozen

str2 = rb_str_new_frozen(str1) seems to make str1 a shared string that
refers to str2, but str2 is not marked as STR_IS_SHARED_M nor
STR_NOFREE.
rb_fstring(str2) frees str2's ptr because it is not marked, and the
free'ed pointer is the same as str1's ptr.
After that, accessing str1 may cause use-after-free memory corruption.

I guess this is a bug of rb_str_new_frozen, but I'm completely unsure
what it should be; the string states and flags are not documented.
So, this is a workaround for [Bug #16136]. I confirmed that rspec of
activeadmin runs gracefully.

Updated by mame (Yusuke Endoh) 3 months ago

I've committed ade1283ca276f7d589ffd3539fbc7b9817f682d5, which is a workaround for the issue. I confirm that activeadmin test runs successfully with this patch.

I think that this issue is related to the fstring mechanism, but I have no idea what function is wrong. The internal state of a String object is too complicated and undocumented, so I cannot see the design, representation, and invariants. The fstring mechanism is too difficult for human (except normalperson (Eric Wong)).

Updated by vcsjones (Kevin Jones) 3 months ago

I applied the patch to ruby 2.6.4 and I cannot reproduce the issue any more. This patch appears to resolve the issue for 2.6.4. Thank you very much for taking the time to investigate the issue.

diff --git a/symbol.c b/symbol.c
index f3506a02dd..a408ee052e 100644
--- a/symbol.c
+++ b/symbol.c
@@ -743,7 +743,8 @@ rb_str_intern(VALUE str)
        enc = ascii;
     }
     else {
-       str = rb_str_new_frozen(str);
+        str = rb_str_dup(str);
+        OBJ_FREEZE(str);
     }
     str = rb_fstring(str);
     type = rb_str_symname_type(str, IDSET_ATTRSET_FOR_INTERN);

Updated by deivid (David Rodríguez) 3 months ago

Wow! Thanks so much for providing this solution in such a timely manner! <3

Updated by mame (Yusuke Endoh) 3 months ago

vcsjones (Kevin Jones) Good to hear! Thank you for confirming. And your investigation by bisect was very helpful.
deivid (David Rodríguez) The most difficult part for me was to read .circleci/config.yml to reproduce the issue on my machine :-)

Updated by deivid (David Rodríguez) 3 months ago

mame (Yusuke Endoh) I'm sorry you had to do extra work to figure that out! I'll post proper reproduction steps next time :)

Updated by graywolf (Gray Wolf) 3 months ago

Out of curiosity, why is this issue closed if it was not yet backported to 2.6 branch?

Updated by mame (Yusuke Endoh) 3 months ago

graywolf (Gray Wolf) wrote:

Out of curiosity, why is this issue closed if it was not yet backported to 2.6 branch?

Don't worry. The branch maintainers, nagachika (Tomoyuki Chikanaga) and usa (Usaku NAKAMURA), are going to check all tickets whose "Backport" field is "REQUIRED", regardless of their status.

Updated by nagachika (Tomoyuki Chikanaga) 3 months ago

  • Backport changed from 2.5: DONTNEED, 2.6: REQUIRED to 2.5: DONTNEED, 2.6: DONE

ruby_2_6 r67803 merged revision(s) ade1283ca276f7d589ffd3539fbc7b9817f682d5.

Updated by mame (Yusuke Endoh) 2 months ago

FYI: Ruby 2.6.5 has been released, which includes the fix for this issue.

https://www.ruby-lang.org/en/news/2019/10/01/ruby-2-6-5-released/

Updated by deivid (David Rodríguez) 2 months ago

Thanks so much! <3

Also available in: Atom PDF