Alright nerds, who can guess what this RegEx matches?

https://sopuli.xyz/post/42387212

At first glance it looks like an IP address or URL embedded in HTML. Whatever it is, it’s a doozy. I wonder what its performance is like.
It works out as O(regex^n)
Nothing. It’s invalid.
Check out Regulex! It doesn’t support mode modifiers and lacks some other features, but I really like how its graphs look.
Nice. Is there terminal/native running software with something similar?
Other than just running the HTML+JS/TS project in a container.
Hold on, let me draw up the NFA
That’s John Gruber’s regex pattern for matching URLs (⌐■_■).
truly a sunglasses moment indeed
Whatever this is supposed to match, I bet the bycatch is bigger than tuna fishing.
Looks like a URL matcher of some sort, one that isn’t limited to HTTP.
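For a feel of what a scheme-agnostic matcher drags in, here's a toy sketch. The pattern `NAIVE_URL` and the helper `find_urls` are made up for illustration and are nowhere near the regex in the post; they only show how "any scheme, then anything that isn't whitespace" also picks up trailing punctuation.

```python
import re

# Hypothetical, deliberately naive pattern (NOT the one from the post):
# a scheme, then "://", then any run of non-whitespace characters.
NAIVE_URL = re.compile(r"\b[a-z][a-z0-9+.-]*://\S+")


def find_urls(text: str) -> list[str]:
    """Return every substring the naive pattern considers a URL."""
    return NAIVE_URL.findall(text)


sample = "See https://example.com/a, or grab ftp://files.example.org/x."
matches = find_urls(sample)
print(matches)  # both matches keep the trailing "," or "." as bycatch
```

Most of the bulk in real-world URL patterns seems to go into trimming exactly this kind of bycatch.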
URLs can have newlines too
!unlearn

It seems most browsers basically ignore them:

lemire.me/…/you-can-use-newline-characters-in-url…

So probably not worth remembering anyway.

You can use newline characters in URLs

We locate web content using special addresses called URLs. We are all familiar with addresses like https://google.com. Sometimes, URLs can get long and they can become difficult to read. Thus, we might be tempted to format them like so in HTML using newline and tab characters, like so: <a href="https://lemire.me/blog/2026/02/21/ how-fast-do-browsers-correct-utf-16-strings/">my blog post</a> It will …

Daniel Lemire's blog
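As far as I understand it, the "ignoring" comes from the WHATWG URL parsing rules, which drop every ASCII tab and newline from the input before parsing. Here's a rough sketch of just that one step; `strip_tab_and_newline` and the example.com address are my own placeholders, not anything a browser actually exposes.

```python
# Translation table that deletes ASCII tab, line feed and carriage return.
TAB_OR_NEWLINE = str.maketrans("", "", "\t\n\r")


def strip_tab_and_newline(raw: str) -> str:
    """Mimic one pre-parse step: remove all ASCII tabs and newlines.

    This is only a sketch of that single step, not a URL parser.
    Spaces inside the URL are left untouched.
    """
    return raw.translate(TAB_OR_NEWLINE)


# Hypothetical href value, wrapped with a newline and a tab for readability.
href = "https://example.com/blog/\n\tsome-long-post-title/"
cleaned = strip_tab_and_newline(href)
print(cleaned)  # https://example.com/blog/some-long-post-title/
```

So a regex that wants to match what browsers accept would have to tolerate those characters in the middle of a URL, which is probably why nobody bothers.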
Also no encoded basic auth or raw IP addresses (not that a useful website would likely use raw IPv4 or IPv6, since that causes huge CORS and sometimes even DNS issues…)
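If you actually need to spot those two cases, a parser is probably less painful than more regex. A minimal sketch using Python's standard library; the helper name `describe_host` and the sample URLs are made up for illustration.

```python
import ipaddress
from urllib.parse import urlsplit


def describe_host(url: str) -> dict:
    """Report whether a URL carries userinfo and whether its host is a raw IP."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    try:
        # ip_address() accepts both IPv4 and IPv6 literals.
        ip_version = ipaddress.ip_address(host).version
    except ValueError:
        ip_version = None  # a regular hostname (or no host at all)
    return {
        "has_userinfo": parts.username is not None,
        "host": host,
        "ip_version": ip_version,
    }


print(describe_host("https://user:secret@192.168.0.1:8080/x"))
print(describe_host("http://[2001:db8::1]/"))
print(describe_host("https://example.com/"))
```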
What. The. Fuck.
This is an example of the old adage that “When you use a regex to solve a problem, you end up with two problems.”
Probably documents from HP’s atrocious support site
URLs in an HTML document that aren’t namespaces or otherwise enclosed?
Looks like the hacking mini game in Fallout 4.