Eregon (Benoit Daloze) wrote in #note-1:
> I have wanted this feature too, how about adding an optional argument to `String#valid_encoding?`?
> ...

I'm partial to this one. Alternatively, it could be nice to have the inverse: `Encoding...
nirvdrum (Kevin Menard)
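For context, here is a minimal sketch of how this kind of check can be written today, by re-tagging a copy of the bytes and asking `valid_encoding?`. The helper name `valid_in_encoding?` is hypothetical and is not the API proposed in the thread.

```ruby
# Hypothetical helper (not an existing or proposed Ruby API): reports whether
# the bytes of +str+ would be valid if interpreted in +encoding+.
def valid_in_encoding?(str, encoding)
  # dup returns an unfrozen copy, so force_encoding never mutates the original.
  str.dup.force_encoding(encoding).valid_encoding?
end

latin1_bytes = "caf\xE9".b # binary string; 0xE9 is "é" in ISO-8859-1

puts valid_in_encoding?(latin1_bytes, Encoding::ISO_8859_1) # true
puts valid_in_encoding?(latin1_bytes, Encoding::UTF_8)      # false
```

The cost of this workaround is the extra copy, which is one reason a built-in way to ask the question could be attractive.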
> But on the other hand, ISO-2022-JP, UTF-16, and UTF-32 definitely have their uses. They are not so much used directly when processing strings (because indeed for these encodings, most string operations don't work, or don't work correct...
nirvdrum (Kevin Menard)
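As a quick illustration, `Encoding#dummy?` shows which of these encodings Ruby treats as dummy (placeholder) encodings. The selection of encodings below is only an example.

```ruby
# Print whether each encoding is flagged as a dummy encoding.
encodings = [
  Encoding::UTF_8,
  Encoding::UTF_16LE,
  Encoding::UTF_16,
  Encoding::UTF_32,
  Encoding::ISO_2022_JP
]

encodings.each do |enc|
  puts "#{enc.name}: dummy=#{enc.dummy?}"
end
# UTF-8: dummy=false
# UTF-16LE: dummy=false
# UTF-16: dummy=true
# UTF-32: dummy=true
# ISO-2022-JP: dummy=true
```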
My own takes on the three options, in no particular order, are:

**Ignore the code point**

The documentation for `lstrip` is "Returns a copy of the receiver with leading whitespace removed." It seems fairly straightforward and...
nirvdrum (Kevin Menard)
When attempting to strip a string, there are three basic options when an invalid code point is encountered:

1) Ignore the code point
2) Strip the code point
3) Raise an exception

For background, Ruby does not consider the string'...
nirvdrum (Kevin Menard)
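The three options can be sketched in plain Ruby. This is an illustration only, not how `String#lstrip` is implemented: the helper `lstrip_with_policy` and its policy names are hypothetical, "ignore" is read here as "treat the invalid bytes as non-whitespace and stop stripping", and the string is assumed to be in an ASCII-compatible encoding.

```ruby
# Hypothetical helper: strips leading whitespace from +str+ and applies
# +policy+ (:ignore, :strip or :raise) when it meets an invalid byte sequence.
def lstrip_with_policy(str, policy)
  # Whitespace set assumed from the String#lstrip docs: NUL, \t, \n, \v, \f, \r, space.
  whitespace = "\0\t\n\v\f\r "
  offset = 0 # number of leading bytes to drop

  str.each_char do |ch|
    if !ch.valid_encoding?
      case policy
      when :ignore then break                  # stop stripping at the invalid bytes
      when :strip  then offset += ch.bytesize  # drop them and keep scanning
      when :raise  then raise ArgumentError, "invalid byte sequence at byte #{offset}"
      else raise ArgumentError, "unknown policy: #{policy.inspect}"
      end
    elsif whitespace.include?(ch)
      offset += ch.bytesize
    else
      break
    end
  end

  str.byteslice(offset, str.bytesize - offset)
end

broken = " \xFF abc" # UTF-8 string with an invalid byte after the first space

p lstrip_with_policy(broken, :ignore) # "\xFF abc"
p lstrip_with_policy(broken, :strip)  # "abc"
```

With `:raise`, the same input raises `ArgumentError` as soon as the invalid byte is reached.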
naruse (Yui NARUSE) wrote in #note-2:
> The encoding of the resulted string depends "ascii only or not" and "ascii compatibility".
> ...

The rules are actually a bit more complicated than that because empty strings get special treatmen...
nirvdrum (Kevin Menard)
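A small sketch of the "ASCII only or not" part of those rules, using string concatenation and assuming the default UTF-8 source encoding. It deliberately leaves out the empty-string special cases mentioned above.

```ruby
ascii_only   = "abc".dup.force_encoding(Encoding::US_ASCII)    # ASCII-only, US-ASCII
utf8         = "é"                                             # non-ASCII, UTF-8
latin1_ascii = "abc".encode(Encoding::ISO_8859_1)              # ASCII-only, ISO-8859-1
latin1       = "\xE9".dup.force_encoding(Encoding::ISO_8859_1) # non-ASCII, ISO-8859-1

# An ASCII-only operand adopts the encoding of the non-ASCII operand.
puts((ascii_only + utf8).encoding)   # UTF-8
puts((utf8 + latin1_ascii).encoding) # UTF-8

# Two non-ASCII operands in different encodings cannot be combined.
begin
  utf8 + latin1
rescue Encoding::CompatibilityError => e
  puts "incompatible: #{e.message}"
end
```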
I generally like the idea, but really from a semantics perspective rather than a memory savings one. It's confusing to implementers and end users alike that Symbols take on a different encoding from Strings if they happen to be ASCI...
nirvdrum (Kevin Menard)
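The difference is easy to observe, assuming the default UTF-8 source encoding:

```ruby
puts "foo".encoding        # UTF-8
puts :foo.encoding         # US-ASCII
puts "foo".to_sym.encoding # US-ASCII
puts :"é".encoding         # UTF-8
```

An ASCII-only Symbol reports US-ASCII even when it was created from a UTF-8 String, while a Symbol containing non-ASCII characters keeps the String's encoding.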
I also tested some older Ruby releases. The issue is also present in `ruby 2.4.4p296 (2018-03-28 revision 63013) [x86_64-linux]` and `ruby 2.5.1p57 (2018-03-29 revision 63029) [x86_64-linux]`.
nirvdrum (Kevin Menard)
It's hard to write code that works properly with dummy encodings, so they should really be avoided altogether. However, I've come across a code path that I think yields inconsistent results when it comes to dummy encodings with a minimum...
nirvdrum (Kevin Menard)
Calling `rb_str_set_len` on a `String` could alter the code range. I think this hasn't been much of an issue because of pure luck rather than anything that was deliberately designed. If called on a string that already has a `CR_UNKNOWN` ...
nirvdrum (Kevin Menard)