We are facing an issue only when running Ruby on ARM on Amazon Linux. In some cases when we puts a string we receive the above error message. However, when we run the same data through puts on Intel, we do not receive this error. I am not sure whether this is a Ruby issue or maybe an iconv issue, but what would be the best way to capture more data to help from here?
I found some additional insight. On Intel we can run
puts File.read("this-file-contains-utf8") # and there is no crash
On ARM, in some cases, when we run
puts File.read("this-file-contains-utf8") # it crashes with an encoding error ...
Adding encoding: 'UTF-8' does resolve this, but in some cases we have still found that if we receive bytes, say from an HTTP request, and puts them, it crashes on ARM but not on Intel.
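For reference, the encoding: 'UTF-8' workaround mentioned above might look roughly like the sketch below. The helper name read_utf8 and the temporary sample file are made up for illustration and are not from our actual code. The point is only that the keyword labels the file contents as UTF-8 regardless of what Encoding.default_external is on the machine.

```ruby
require "tempfile"

# Hypothetical helper: read a file and label its contents as UTF-8,
# independent of the machine's Encoding.default_external.
def read_utf8(path)
  File.read(path, encoding: "UTF-8")
end

# Write a few raw UTF-8 bytes to a throwaway file, then read them back.
Tempfile.create("utf8-sample") do |file|
  file.binmode
  file.write("caf\xC3\xA9\n".b)
  file.flush

  text = read_utf8(file.path)
  puts "encoding: #{text.encoding}, valid: #{text.valid_encoding?}"
end
```

This does not cover bytes that arrive from elsewhere, such as an HTTP response body, which is where we still see the crash.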
First, if the error says Encoding::UndefinedConversionError, then I think it's not related to iconv, because iconv only gets used when you explicitly say so. Ruby has its own internal character conversion code.
Second, it's very clear that you get a conversion error when you try to convert "\xE2" from ASCII-8BIT to UTF-8. In ASCII-8BIT, "\xE2" is just a binary byte, without any character defined on it. There's no way to convert that to a character in UTF-8.
The "\xE2" byte may be the start of a UTF-8 byte sequence, somewhere between U+2000 (E2 80 80) and U+2FFF (E2 BF BF). But in that case, there would be no need to convert, only a need to label the encoding correctly. Of course, the "\xE2" byte may also be something else.
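The difference between converting and relabeling can be seen in a small sketch. The two helper names are hypothetical, and the sample bytes (E2 80 99, i.e. U+2019) are just one possible sequence starting with "\xE2"; I don't know what your actual data contains.

```ruby
# Three bytes that form U+2019 in UTF-8, deliberately labeled as
# binary (ASCII-8BIT), the way raw bytes from a socket often are.
SAMPLE_BYTES = "\xE2\x80\x99".b

# Hypothetical helper: attempt a real conversion and return the error
# (or nil if the conversion unexpectedly succeeds).
def conversion_error_for(bytes)
  bytes.encode("UTF-8")
  nil
rescue Encoding::UndefinedConversionError => e
  e
end

# Hypothetical helper: keep the bytes untouched and only change the label.
def relabel_as_utf8(bytes)
  bytes.dup.force_encoding("UTF-8")
end

error = conversion_error_for(SAMPLE_BYTES)
puts "encode failed with: #{error.class}" if error

text = relabel_as_utf8(SAMPLE_BYTES)
puts "relabeled: #{text.encoding}, valid: #{text.valid_encoding?}"
```

Relabeling is only correct if the bytes really are UTF-8, so checking valid_encoding? afterwards is a reasonable safeguard.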
My bet would be that the locale is not set properly on the ARM machine. Running locale there probably shows C or POSIX, and many things don't work with that.
You probably need export LANG=en_US.UTF-8 or so.
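To capture more data, something like the following could be run on both the ARM and the Intel machine and the output compared. This is only a sketch: encoding_report is a made-up helper name, and the choice of environment variables to print is my assumption about what is relevant here.

```ruby
# Hypothetical helper: collect the locale-related settings Ruby sees.
# Takes the environment as a parameter so it can be exercised with a plain Hash.
def encoding_report(env = ENV)
  {
    "LANG"             => env["LANG"],
    "LC_ALL"           => env["LC_ALL"],
    "LC_CTYPE"         => env["LC_CTYPE"],
    # Encoding derived from the process locale.
    "locale"           => Encoding.find("locale").name,
    # Default label for data read from files and other IO.
    "default_external" => Encoding.default_external.name,
    # Usually nil unless set explicitly.
    "default_internal" => Encoding.default_internal&.name,
  }
end

encoding_report.each { |key, value| puts "#{key}: #{value.inspect}" }
```

If the ARM machine reports US-ASCII for locale and default_external while the Intel machine reports UTF-8, that would point at the locale rather than at Ruby itself.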
I think CRuby should warn in that case. TruffleRuby already does.