Applied in changeset commit:git|a50fbc56a30a0665102781019029e9cf9ddb3576.
----------
Use mbuf instead of bitset for character class for small UTF. Fixes #16145
mjrzasa (Maciek Rząsa)
I've tested it for Polish letters. The bug appears only for `ó`; all the others work OK:
```
pry(main)> ['ą', 'ę', 'ó', 'ś', 'ł', 'ć', 'ź', 'ż', 'ń'].map { [_1, _1.bytes, /[x#{_1}]/i.match?("qwer#{_1.capitalize}")] }
=> [["ą", [196, 133...
```
mjrzasa (Maciek Rząsa)
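
For anyone who wants to repeat the check above outside pry, here is a minimal standalone sketch. The helper name `class_match_ignoring_case?` and the constants `LETTERS`, `RESULTS` and `FAILING` are made up for illustration and are not part of the original report. Which letters end up in `FAILING` depends on the Ruby build, so no expected output is shown.

```ruby
# Lowercase Polish letters with diacritics, as listed in the comment above.
LETTERS = ['ą', 'ę', 'ó', 'ś', 'ł', 'ć', 'ź', 'ż', 'ń'].freeze

# Hypothetical helper: builds a case-insensitive character class that
# contains the lowercase letter and matches it against a string that
# ends with the capitalized form of the same letter.
def class_match_ignoring_case?(letter)
  /[x#{letter}]/i.match?("qwer#{letter.capitalize}")
end

# Map each letter to true/false, then collect the letters that did not match.
RESULTS = LETTERS.to_h { |letter| [letter, class_match_ignoring_case?(letter)] }
FAILING = RESULTS.reject { |_letter, matched| matched }.keys

puts "Ruby #{RUBY_VERSION}"
RESULTS.each do |letter, matched|
  puts "#{letter} #{letter.bytes.inspect} matched=#{matched}"
end
puts "Letters that failed to match: #{FAILING.inspect}"
```

On a build that includes the fix, `FAILING` should presumably be empty; that is an expectation drawn from the report, not something verified here.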
I believe the fix is ready for review: https://github.com/ruby/ruby/pull/12714
Some CI jobs were failing (WebAssembly/Cygwin), but the failures seem unrelated to my changes and they're inconsistent (after rebasing Cygwin passed an...
mjrzasa (Maciek Rząsa)
I reran the tests on 3.5.0, and it's indeed related to transcoding:
```
puts "Hello dev-ruby! #{RUBY_VERSION}"
require 'tempfile'
Tempfile.open() do |f|
  f.write('0123456789')
  f.rewind
  f.ungetc('a') # Character buffer WILL NOT...
```
mjrzasa (Maciek Rząsa)
It works OK with StringIO (unsurprisingly):
```
require 'stringio'

StringIO.open() do |f|
  f.write('0123456789')
  f.rewind
  f.ungetc('a') # Character buffer WILL NOT be cleared
  f.seek(2)
  f.getc
end # => "1"
```
mjrzasa (Maciek Rząsa)