Bug #20998
openrb_str_locktmp() changes flags of frozen strings and string literals
Description
# frozen_string_literal: true
# BOILERPLATE START
require 'tmpdir'
require 'rbconfig'
def inline_c_extension(c_code)
Dir.mktmpdir('inline_c_extension') do |dir|
File.write("#{dir}/cext.c", c_code)
File.write("#{dir}/extconf.rb", <<~RUBY)
require 'mkmf'
create_makefile('cext')
RUBY
out = IO.popen([RbConfig.ruby, 'extconf.rb'], chdir: dir, &:read)
raise "ruby extconf.rb failed: #{$?.inspect}\n#{out}" unless $?.success?
out = IO.popen(['make'], chdir: dir, &:read)
raise "make failed: #{$?.inspect}\n#{out}" unless $?.success?
require "#{dir}/cext.#{RbConfig::CONFIG['DLEXT']}"
end
end
inline_c_extension <<~C
#include "ruby.h"
static VALUE foo(VALUE self, VALUE str) {
rb_str_locktmp(str);
return str;
}
void Init_cext(void) {
VALUE c = rb_define_class("Foo", rb_cObject);
rb_define_singleton_method(c, "foo", foo, 1);
}
C
# BOILERPLATE END
a = "str"
Foo.foo(a)
# imagine a million lines of code in between
b = "str"
b << "."
# can't modify string; temporarily locked (RuntimeError)
# What? Who "locked" that immutable frozen string literal?
# It should be: can't modify frozen String: "str" (FrozenError)
Same problem with:
Foo.foo("abc")
# imagine a million lines of code in between
Foo.foo("abc") # temporal locking already locked string (RuntimeError)
Related: https://github.com/oracle/truffleruby/issues/3752
It seems a clear bug to mutate a frozen string (with visible side effects), even more so for shared frozen string literals.
I think rb_str_locktmp() should raise (a FrozenError, as it's effectively attempting to mutate it, same as calling rb_str_modify()
) if called on a frozen string, because it makes little sense, I think "locking a string" only makes sense for mutable strings.
The alternative would be to noop on frozen strings for rb_str_locktmp() and rb_str_unlocktmp(), I think that's susprising, and potentially hiding more bugs, e.g. if one by mistake tries to mutate the RSTRING_PTR() or so, and also rb_str_locktmp() wouldn't imply the flag is set after it returns.
Any attempt to mutate a frozen string should fail, so might as well fail early.
Updated by nobu (Nobuyoshi Nakada) 10 days ago
There isn’t a case that locking a string just to avoid being modified or moved, but not to mutate it?
Updated by Eregon (Benoit Daloze) 10 days ago
I'm not aware of any such case, and .freeze
seems much better for that use case.
Updated by byroot (Jean Boussier) 10 days ago
Actually yes, I think it's not an uncommon use case?
e.g. if you need to write a given string into a socket or something like that, you may want to lock it to ensure a concurrent thread won't modify it.
I think zlic.c
uses it this way?
Updated by Eregon (Benoit Daloze) 10 days ago · Edited
@byroot (Jean Boussier) but that's mutating the string, @nobu (Nobuyoshi Nakada) is asking for a case not mutating the string, which I think there is none.
EDIT: I read too fast, "write a given string into a socket" does not imply mutating the string.
Updated by Eregon (Benoit Daloze) 10 days ago
https://github.com/ruby/ruby/blob/975461bf885b8fb791cf048ad39c79924d28e4e9/ext/zlib/zlib.c#L1056 but I'm not quite sure how that String is used.
It seems clearly wrong that two threads using Zlib with a frozen String literal (which happens to have the same contents/characters for both threads) would fail with temporal locking already locked string (RuntimeError)
though.
So what should we do then when rb_str_locktmp() is called on a frozen string?
Updated by byroot (Jean Boussier) 9 days ago
It seems clearly wrong that two threads using Zlib with a frozen String literal (which happens to have the same contents/characters for both threads) would fail with temporal locking already locked string (RuntimeError) though.
That I totally agree.
Updated by Eregon (Benoit Daloze) 7 days ago
@nobu (Nobuyoshi Nakada) @byroot (Jean Boussier) So do you think rb_str_locktmp() should noop on frozen strings (since they can't change anyway), or it should raise because the usage seems somewhat dubious?
I'm fine with either, I'd just like a decision so we can do the same in TruffleRuby.
(the current status is broken so that's worse than either fix)
Updated by byroot (Jean Boussier) 7 days ago
I prefer the noop solution, as it makes it a much easier to use API. But no strong opinion.