Empty matches in Python’s `re` module

https://blog.narf.ssji.net/2025/04/30/empty-matches-in-pythons-re-module/

Python’s `re.sub` function has a weird, though documented, behaviour.

A pattern that can match the empty string, such as `/.*/`, produces two matches on a non-empty string: one for the whole string and one for the empty string left at its end. The replacement is therefore applied twice.

A simple fix is to make sure the pattern is not empty-matching, e.g. `/.+/`.
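
A minimal sketch of the behaviour described above, assuming Python 3.7 or later (the version in which `re.sub` started replacing an empty match adjacent to a previous non-empty match). The sample string and the replacement `X` are illustrative, not from the original post.

```python
import re

text = "abc"  # illustrative non-empty input

# `.*` first consumes the whole string, then matches the empty
# string that remains at the end: two matches in total.
all_matches = re.findall(r".*", text)

# Both matches are replaced, so the replacement appears twice
# (on Python 3.7+).
replaced_twice = re.sub(r".*", "X", text)

# `.+` needs at least one character, so the trailing empty match
# disappears and the replacement is applied once.
replaced_once = re.sub(r".+", "X", text)

print(all_matches, replaced_twice, replaced_once)
```

Checking a pattern with `re.findall` first is a quick way to see how many matches `re.sub` will act on.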

#debugging #Python #regexps #sed


Best answer on regexps ever? 🧐 😎

- What is the plural form of regex?

- If you've used more than one of them, you'll know that the plural of "regex" is "regrets." – cjs (Mar 14, 2022 at 23:46)

#regex #regexps #programmer #programming #devops #regularexpression

Extended #regexps are great for place names like "Wellesley" where you cannot remember if it's Welesley, Welleseley, Wellesselley, Welleslley, or Welesslley to save your dadburn life. ;)

/wel+es+e?l+ey/ !!!
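
A quick way to check that pattern against the spellings listed above, sketched in Python. The `re.IGNORECASE` flag is an assumption added here so the capitalised names match the lowercase pattern, and "Wesley" is a made-up negative case.

```python
import re

# The pattern from the post; IGNORECASE is an assumption so that
# the capital "W" in the place names is accepted.
PLACE = re.compile(r"wel+es+e?l+ey", re.IGNORECASE)

spellings = [
    "Wellesley",
    "Welesley",
    "Welleseley",
    "Wellesselley",
    "Welleslley",
    "Welesslley",
]

# Every spelling from the post should match the whole pattern.
matched = [s for s in spellings if PLACE.fullmatch(s)]

# A hypothetical near miss: no "l" after "we", so it is rejected.
near_miss = PLACE.fullmatch("Wesley")

print(matched, near_miss)
```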

cc: @amin

I just noticed: #regexps are code. They are a text-matching and text-cutting language. When you write a regexp, you are writing a function. And as such, of course, you have to test them.

I had a problem. I used #regexps... and I tested them. Thoroughly. I cut them into pieces. And tested the pieces first...

And now I don't have a problem!

Test! And don't forget to also test your regexps!
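
One way to test the pieces first, as a rough sketch in Python. The date pattern and the names `YEAR`, `MONTH`, `DAY`, `DATE` and `fully_matches` are hypothetical, chosen only to illustrate the idea.

```python
import re

# Hypothetical pieces of a YYYY-MM-DD pattern, each testable alone.
YEAR = r"\d{4}"
MONTH = r"(?:0[1-9]|1[0-2])"
DAY = r"(?:0[1-9]|[12]\d|3[01])"

# The full pattern is assembled from the tested pieces.
DATE = rf"{YEAR}-{MONTH}-{DAY}"


def fully_matches(pattern: str, text: str) -> bool:
    """Return True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# Test the pieces first...
assert fully_matches(MONTH, "01") and fully_matches(MONTH, "12")
assert not fully_matches(MONTH, "00") and not fully_matches(MONTH, "13")
assert fully_matches(DAY, "31") and not fully_matches(DAY, "32")

# ...then the assembled pattern.
assert fully_matches(DATE, "2025-04-30")
assert not fully_matches(DATE, "2025-13-01")
```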

#til

* Base64 is not idempotent: `b64(x)` != `b64(b64(x))`. This is because it represents each 6-bit value (0-63) with an ASCII character, namely 65-90 (`A-Z`), 97-122 (`a-z`), 48-57 (`0-9`), 43 (`+`) and 47 (`/`), plus 61 (`=`) for padding, so the output of one pass is ordinary input that gets encoded again on the next.
* `grep` not only has options to support more complex #regexps (`-E, --extended-regexp`, `-P, --perl-regexp`), it also has `-F, --fixed-strings` to treat the pattern as just a string. This not only makes it slightly faster, it's also easier to write when the pattern would otherwise need a lot of escaping.
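
The Base64 point can be checked with a small sketch using Python's standard `base64` module. The input `b"hi"` is an arbitrary example.

```python
import base64

data = b"hi"  # arbitrary example input

# One pass maps the raw bytes onto the Base64 alphabet.
once = base64.b64encode(data)

# A second pass treats those alphabet characters as plain bytes
# and encodes them again, so the result differs and is longer.
twice = base64.b64encode(once)

# Decoding has to be applied the same number of times.
round_trip = base64.b64decode(base64.b64decode(twice))

print(once, twice, round_trip)
```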

@neustradamus #PCRE continues to be a misnomer; it’s a modified subset of #Perl #RegularExpressions with dozens of differences: https://pcre.org/current/doc/html/pcre2compat.html

It's not "(C)ompatible." Accept no substitutes: https://perldoc.perl.org/perlre

#PCRE2 #PerlIncompatibleRegularExpressions #RegularExpression #RegExes #RegExps #regex #regexp


@hschne I think a lot of folks know about it, but there aren't too many use cases in my own code where it isn't easier just to assign capture groups to variables post facto. Plus, I'm one of those people who think #regex is often abused when simple matches followed by code are better than really complicated #regexps.

Plus, "shiny and cool" isn't always better than readable, and named capture groups just hurt my eyes. 😎
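
For comparison, here are the two styles side by side, sketched in Python rather than the language of the original thread. The log line and the group names are made up for illustration.

```python
import re

line = "2025-04-30 ERROR disk full"  # made-up sample input

# Named capture groups: the names live inside the pattern.
named = re.match(r"(?P<date>\S+) (?P<level>\S+) (?P<message>.*)", line)
named_level = named.group("level")

# Plain groups assigned to variables after the match.
positional = re.match(r"(\S+) (\S+) (.*)", line)
date, level, message = positional.groups()

print(named_level, level, message)
```

Both forms extract the same values; the difference is whether the names sit in the pattern or in the surrounding code.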

Greg Donald :ruby: :whyfox: (@[email protected])

Q. What do we want? A. Raku-compatible Regular Expressions (RCRE™) Q. When do we want it? A. 25 years ago? 😂 #Perl #Raku https://docs.raku.org/language/regexes

Solved a #bug today in the #matomo import script.
Took me two weeks to find what went wrong.
Turns out that if the host variable in the import script has no "http" in it, the host is not recorded in the database.
You gotta love #regexps