Profile

jneen (Jeanine Adkisson)

  • Login: jneen
  • Registered on: 03/06/2019
  • Last sign in: 03/13/2026

Issues

                 Open  Closed  Total
Assigned issues     0       0      0
Reported issues     1       1      2

Activity

03/13/2026

04:35 PM Ruby Bug #21870: Regexp: Warnings when using slightly overlapping \p{...} classes
I have submitted a patch to Onigmo with the naive strategy (a) as outlined above:
https://github.com/k-takata/Onigmo/pull/175
Contrary to my expectations, it does appear that `/[\p{Word}\p{Alpha}]/` and `/[a-fb-g]/` **do** continue...
jneen (Jeanine Adkisson)

03/10/2026

01:14 AM Ruby Misc #21877: DevMeeting-2026-03-17
* [Bug #21870] Regexp: Warnings when using slightly overlapping `\p{...}` classes (jneen)
  * Warning spam on code that definitely isn't a mistake (`/[\p{Word}\p{S}]/` and other overlapping properties)
  * Noted some possible ways forwar...
jneen (Jeanine Adkisson)

03/05/2026

04:03 AM Ruby Bug #21870: Regexp: Warnings when using slightly overlapping \p{...} classes
If there are no objections, I'll submit a patch with strategy (a) next week. It's straightforward to implement and stays as close as possible to the current behaviour while fixing the issue.
jneen (Jeanine Adkisson)

02/24/2026

05:52 AM Ruby Bug #21870: Regexp: Warnings when using slightly overlapping \p{...} classes
A quick benchmark shows we are within error bars for matching performance:
```ruby
#!/usr/bin/env ruby
require 'benchmark'
NON_REPEAT = Regexp.new("[" + ("a-z" * 1) + "]")
YES_REPEAT = Regexp.new("[" + ("a-z" * 100000) + "]")
...
```
jneen (Jeanine Adkisson)
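
The comparison described in this comment could be reproduced roughly as follows. This is a sketch under stated assumptions, not the truncated script itself: the repeat count is reduced to keep the run short, and `REPEATS`, `ITERATIONS`, `SUBJECT`, and `time_matches` are names assumed here for illustration.

```ruby
require 'benchmark'

# Assumed sizes for illustration; the original comment repeats "a-z" 100000 times.
REPEATS = 1_000
ITERATIONS = 50_000
SUBJECT = 'q'

# One range versus the same range repeated many times in a single class.
SINGLE_RANGE = Regexp.new('[' + ('a-z' * 1) + ']')
REPEATED_RANGE = Regexp.new('[' + ('a-z' * REPEATS) + ']')

# Time ITERATIONS matches of the regex against SUBJECT; returns elapsed seconds.
def time_matches(regex)
  Benchmark.realtime { ITERATIONS.times { regex.match?(SUBJECT) } }
end

puts format('single range:   %.4fs', time_matches(SINGLE_RANGE))
puts format('repeated range: %.4fs', time_matches(REPEATED_RANGE))
```

Timings will vary by machine and Ruby build, so several runs are needed before reading anything into a difference.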
05:41 AM Ruby Bug #21870: Regexp: Warnings when using slightly overlapping \p{...} classes
Having looked through the Onigmo code a bit now, I can think of a few ways forward.
**a) Simply don't warn on overlapping ctype classes.**
I believe this would only involve removing the check on line 1860 from regparse.c. This woul...
jneen (Jeanine Adkisson)

02/17/2026

08:27 PM Ruby Bug #21870: Regexp: Warnings when using slightly overlapping \p{...} classes
This isn't even possible to work around by targeting RUBY_VERSION, as Ruby warns even in unreachable cases:
```ruby
regex = if RUBY_VERSION < '4'
  /[\p{Word}\p{Cf}]/
else
  /[\p{Word}]/
end
```
still warns on Ruby 4+, even t...
jneen (Jeanine Adkisson)
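
One possible way to keep the version-specific pattern away from the regex compiler on newer Rubies is to build it from a string at runtime. A regex literal is compiled when the file is parsed, whereas `Regexp.new` only runs when its branch is reached. A minimal sketch, assuming the warning is emitted at regex compile time; `WORD_OR_FORMAT` is a name made up for this example:

```ruby
# Build the pattern from a string so that only the branch actually taken
# is ever handed to the regex compiler.
WORD_OR_FORMAT =
  if RUBY_VERSION < '4'
    Regexp.new('[\p{Word}\p{Cf}]')
  else
    Regexp.new('[\p{Word}]')
  end

puts WORD_OR_FORMAT.match?('a')
```

The trade-off is that the pattern is no longer checked at parse time, so a typo in the string only surfaces when that branch runs.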

02/10/2026

03:32 PM Ruby Bug #21870: Regexp: Warnings when using slightly overlapping \p{...} classes
Some benchmarks:
```console
$ ruby --version
ruby 4.0.1 (2026-01-13 revision e04267a14b) +PRISM [arm64-darwin25]
```
```ruby
require 'benchmark'
LENGTH = 1000000
REPEAT = 100
TEST_STR = 'a' * LENGTH
Benchmark.bm do |bm|
  bm.report "...
```
jneen (Jeanine Adkisson)
01:15 PM Ruby Bug #21870: Regexp: Warnings when using slightly overlapping \p{...} classes
That's a very interesting find!
I do think it makes sense to warn if an explicitly written character repeats in a character class, or if the class begins and ends with a colon. But for overlapping Unicode properties, there doesn't see...
jneen (Jeanine Adkisson)
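
The two cases where a warning plausibly points at a real mistake can be illustrated as follows. The patterns and sample strings are made up for this example, and the constant names are assumptions:

```ruby
# A POSIX bracket expression written without the outer brackets is parsed as
# an ordinary class containing the characters ':', 'a', 'l', 'p', 'h'.
MISTAKEN_POSIX = Regexp.new('[:alpha:]')
INTENDED_POSIX = Regexp.new('[[:alpha:]]')

puts MISTAKEN_POSIX.match?(':')  # the colon itself is a member of the class
puts MISTAKEN_POSIX.match?('z')  # 'z' is not one of the listed characters
puts INTENDED_POSIX.match?('z')  # any alphabetic character is accepted

# An explicitly repeated character adds nothing to the class.
REPEATED_CHAR = Regexp.new('[aab]')
puts REPEATED_CHAR.match?('a')
```

In both patterns the duplication is written out by hand, which is what separates them from overlapping property classes.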

02/09/2026

05:42 PM Ruby Bug #21870: Regexp: Warnings when using slightly overlapping \p{...} classes
trinistr (Alexander Bulancov) wrote in #note-11:
> > Using `/(\p{Word}|\p{S})/` is kind of a workaround, but it is slower.
> ...
This is what I actually tested. Still much slower.
mame (Yusuke Endoh) wrote in #note-9:
> jneen (Jeanine A...
jneen (Jeanine Adkisson)
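
The alternation workaround quoted above can be timed against the character class with a script along these lines. This is a sketch only: the input size, the run count, and the `count_matches` helper are assumptions, not the measurement used in the thread.

```ruby
require 'benchmark'

# Assumed input size and run count, chosen only to keep the run short.
INPUT = 'a' * 100_000
RUNS = 5

# The class form is the kind of pattern the issue reports a warning for;
# the alternation form is the workaround quoted in the thread.
UNION_CLASS = /[\p{Word}\p{S}]/
ALTERNATION = /(\p{Word}|\p{S})/

# Count how many matches the regex finds in the text.
def count_matches(regex, text)
  text.scan(regex).size
end

Benchmark.bm(12) do |bm|
  bm.report('class')       { RUNS.times { count_matches(UNION_CLASS, INPUT) } }
  bm.report('alternation') { RUNS.times { count_matches(ALTERNATION, INPUT) } }
end
```

Both patterns should find the same matches, so any timing gap reflects how the engine executes each form rather than a difference in results.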
05:54 AM Ruby Bug #21870: Regexp: Warnings when using slightly overlapping \p{...} classes
That specific case also appears to have changed, e.g. on 3.4.1:
```ruby
[2] pry(main)> (0..0x10ffff).select{(s=[it].pack('U'); s=~/\p{Word}/&&s=~/\p{Cf}/) rescue false}.map{it.to_s 16}
=> []
```
Maybe for preset classes like `\p...
jneen (Jeanine Adkisson)
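
The same check can be run as a standalone script on Ruby versions that lack the `it` block parameter. A minimal sketch, with `property_overlap` as a hypothetical helper name; the result depends on the Unicode tables shipped with the Ruby being used:

```ruby
# Return the code points matched by both Unicode properties.
def property_overlap(first, second)
  first_re = Regexp.new("\\p{#{first}}")
  second_re = Regexp.new("\\p{#{second}}")

  (0..0x10FFFF).select do |codepoint|
    char = [codepoint].pack('U')
    # Surrogates do not form valid UTF-8 strings, so skip them.
    char.valid_encoding? && first_re.match?(char) && second_re.match?(char)
  end
end

overlap = property_overlap('Word', 'Cf')
puts overlap.map { |codepoint| codepoint.to_s(16) }.inspect
```

Running it under two Ruby versions and comparing the output shows whether the overlap between the two properties has changed.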
