Bug #15937
closedSegmentation fault when String#initialize given same string with capacity field
Description
Reproduction steps:
string buffer corruption:
s = "mystring"
s.__send__(:initialize, s, capacity: 1000)
puts s
segfault:
s = "mystring that can't be embedded because it's too long and therefore must be allocated"
s.__send__(:initialize, s, capacity: 1000)
Thanks for your time :)
        
           Updated by luke-gru (Luke Gruber) over 6 years ago
          Updated by luke-gru (Luke Gruber) over 6 years ago
          
          
        
        
      
      NOTE: I didn't attach a patch or proof of concept fix because I don't know how you want to handle this edge case. The easiest thing would just be to return the original string in rb_str_init and not honor the capacity change here.
        
           Updated by nobu (Nobuyoshi Nakada) over 6 years ago
          Updated by nobu (Nobuyoshi Nakada) over 6 years ago
          
          
        
        
      
      I can't reproduce the buffer corruption nor the segfault, but found the content was cleared on a short string.
Which version did you try?
        
           Updated by nobu (Nobuyoshi Nakada) over 6 years ago
          Updated by nobu (Nobuyoshi Nakada) over 6 years ago
          
          
        
        
      
      - Status changed from Open to Closed
Applied in changeset git|28678997e40869f5591eae60edd9757334426ffb.
Preserve the string content at self-copying
- string.c (rb_str_init): preserve the embedded content when
 self-copying with a capacity. [Bug #15937]
        
           Updated by luke-gru (Luke Gruber) over 6 years ago
          Updated by luke-gru (Luke Gruber) over 6 years ago
          
          
        
        
      
      Sorry, I forgot puts s in second script. I still get the segfault even in trunk with your patch, but the embedded small strings case is fixed.
Thank you.
        
           Updated by luke-gru (Luke Gruber) over 6 years ago
          Updated by luke-gru (Luke Gruber) over 6 years ago
          
          
        
        
      
      I no longer get the segfault if in rb_str_init, return str is added right after existing check of (orig == str), but then capacity of string won't change.
EDIT: I think I figured out what was going on. If str is a shared string, it must be made independent before realloc in rb_str_init.
check_str_modifiable(str);
if (orig == str && FL_TEST(str, STR_SHARED)) {
  str_make_independent(str);
}
        
           Updated by nobu (Nobuyoshi Nakada) over 6 years ago
          Updated by nobu (Nobuyoshi Nakada) over 6 years ago
          
          
        
        
      
      - Backport changed from 2.4: UNKNOWN, 2.5: UNKNOWN, 2.6: UNKNOWN to 2.4: REQUIRED, 2.5: REQUIRED, 2.6: REQUIRED
        
           Updated by nagachika (Tomoyuki Chikanaga) about 6 years ago
          Updated by nagachika (Tomoyuki Chikanaga) about 6 years ago
          
          
        
        
      
      - Backport changed from 2.4: REQUIRED, 2.5: REQUIRED, 2.6: REQUIRED to 2.4: REQUIRED, 2.5: REQUIRED, 2.6: DONE
ruby_2_6 r67738 merged revision(s) 28678997e40869f5591eae60edd9757334426ffb,8797f48373dcfa3ff8e748667732dea8aea4347e.
        
           Updated by usa (Usaku NAKAMURA) about 6 years ago
          Updated by usa (Usaku NAKAMURA) about 6 years ago
          
          
        
        
      
      - Backport changed from 2.4: REQUIRED, 2.5: REQUIRED, 2.6: DONE to 2.4: REQUIRED, 2.5: DONE, 2.6: DONE
ruby_2_5 r67769 merged revision(s) 28678997e40869f5591eae60edd9757334426ffb,8797f48373dcfa3ff8e748667732dea8aea4347e.