things that are much harder to describe accurately with a regex than you'd think, an incomplete list:

* floating point numbers
* IPv6 addresses
* IPv*4* addresses (depending on how you define them and how picky you are about the numeric ranges)
* ...

@zwol

Email addresses.
URLs.
Phone numbers.

@elithebearded @zwol

Dates and times (and how)
URLs and URIs (they contain some of the things already mentioned)

@elithebearded @zwol Oops, URLs was a duplicate.
@ancoghlan @elithebearded @zwol at least there is official guidance on the latter now, so people have something to move *to* instead of ad-hoc implementations (many of which will be sub-par regexes): https://www.unicode.org/reports/tr58/ (UTS #58: Unicode Link Detection and Formatting: URLs and Email Addresses)

@zwol Extra difficulty: recognize Fortran output of a floating point number written with a less-than-perfect edit descriptor.

@pancomputans Day job brain is SCREAMING

(day job involves several file formats designed by Fortran programmers in the 1970s and possibly even earlier, y'see)
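
For the Fortran case, here is a rough Python sketch of what "recognizing" such output can involve. It assumes three quirks of formatted output: a D exponent letter, an exponent whose letter is dropped when it needs three digits (as in 0.1234-100), and a field filled with asterisks when the value did not fit. The name parse_fortran_float is made up for illustration, and this is nowhere near a full treatment of edit descriptors.

```python
import re

# One numeric field: a mantissa, then an optional exponent that is either
# introduced by E/D or written as a bare signed integer.
_FIELD = re.compile(
    r"""
    (?P<mant>[+-]?(?:[0-9]+\.?[0-9]*|\.[0-9]+))   # mantissa, e.g. 0.1234 or -1.
    (?:
        [EeDd](?P<lettered>[+-]?[0-9]+)           # E+03 or D+03
      | (?P<bare>[+-][0-9]+)                      # -100: exponent letter dropped
    )?
    """,
    re.VERBOSE,
)


def parse_fortran_float(field):
    """Return the value in one field, or None if the field is all asterisks."""
    stripped = field.strip()
    if stripped and set(stripped) == {"*"}:
        return None  # the value did not fit in the field width
    match = _FIELD.fullmatch(stripped)
    if match is None:
        raise ValueError(f"not a recognizable number: {field!r}")
    exponent = match.group("lettered") or match.group("bare") or "0"
    # Hand the normalized text to float() instead of doing the arithmetic here.
    return float(f"{match.group('mant')}e{exponent}")
```

The regex only decides the shape of the field; turning it into a value is left to float(), and the asterisk case is handled before the regex is even consulted.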

@zwol yeah, we used to use IPv4 addresses as an example when teaching regexes, because it seems like it'll be easy enough until you start actually doing it, the point of the lesson being "use regexes for what regexes are good at and then do the rest in some other language."
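
A minimal Python sketch of that lesson, assuming the strictest definition of an IPv4 address (exactly four decimal parts, no leading zeros, each 0 to 255). The function name is_strict_ipv4 is hypothetical. The regex checks only the shape, and the range check is ordinary code.

```python
import re

# Shape only: four groups of one to three ASCII digits separated by dots.
_DOTTED_QUAD = re.compile(r"([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})")


def is_strict_ipv4(text):
    """True if text is a dotted quad with every part in 0..255."""
    match = _DOTTED_QUAD.fullmatch(text)
    if match is None:
        return False
    for part in match.groups():
        # Reject leading zeros: some parsers would read "010" as octal 8.
        if len(part) > 1 and part[0] == "0":
            return False
        if int(part) > 255:
            return False
    return True
```

Encoding the 0 to 255 range in the regex itself is possible, but the pattern gets long and hard to review, which is the point of the exercise.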
@nickzoic @zwol loopback? that's easy: 0x7f000001

@zwol

curse you for making me type test cases like this into a REPL

>>> import socket
>>> socket.gethostbyname("10.010.0x10")
'10.8.0.16'

but heck even IBM is guilty of grossly oversimplifying things https://www.ibm.com/docs/en/ts4500-tape-library?topic=functionality-ipv4-ipv6-address-formats

Though it appears that the exact syntax of numeric IPv4 addresses is "whatever inet_aton does" and has never been well specified in a published RFC? https://datatracker.ietf.org/doc/html/draft-main-ipaddr-text-rep-02
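
To make "whatever inet_aton does" concrete, here is a pure-Python sketch of the classic loose parsing rules as I understand them: one to four dot-separated parts, each decimal, octal (leading 0), or hex (leading 0x), with the last part filling all remaining bytes. The names loose_ipv4 and _parse_part are made up, and real C library implementations may differ in corner cases, so treat this as an approximation and not a specification.

```python
_DIGITS = {8: "01234567", 10: "0123456789", 16: "0123456789abcdefABCDEF"}


def _parse_part(token):
    """Parse one part as hex (0x...), octal (0...), or decimal."""
    if token[:2].lower() == "0x":
        base, digits = 16, token[2:]
    elif len(token) > 1 and token.startswith("0"):
        base, digits = 8, token[1:]
    else:
        base, digits = 10, token
    # Validate by hand: int() alone would accept signs, spaces, and underscores.
    if not digits or any(ch not in _DIGITS[base] for ch in digits):
        raise ValueError(f"bad address part: {token!r}")
    return int(digits, base)


def loose_ipv4(text):
    """Return the canonical dotted quad for a loosely written IPv4 address."""
    parts = text.split(".")
    if not 1 <= len(parts) <= 4:
        raise ValueError(f"wrong number of parts: {text!r}")
    values = [_parse_part(part) for part in parts]
    # Every part except the last is a single byte.
    if any(value > 255 for value in values[:-1]):
        raise ValueError(f"part out of range: {text!r}")
    # The last part fills however many bytes are left.
    tail_bytes = 4 - (len(values) - 1)
    if values[-1] >= 1 << (8 * tail_bytes):
        raise ValueError(f"last part out of range: {text!r}")
    number = 0
    for value in values[:-1]:
        number = (number << 8) | value
    number = (number << (8 * tail_bytes)) | values[-1]
    return ".".join(str((number >> shift) & 0xFF) for shift in (24, 16, 8, 0))
```

With these rules, loose_ipv4("10.010.0x10") gives '10.8.0.16' and loose_ipv4("0x7f000001") gives '127.0.0.1', matching the examples in the thread. Note that almost none of the work is pattern matching.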

@stylus I wonder why that draft stalled out. The BNFs in there look quite sensible.
Just gonna leave this regexp here

How to handle emoji: Where other methods are not available, you can use the following regex (for Unicode 11.0 emoji). For clarity, it escapes all characters that can be invisible or are non-spacing -- otherwise you see some odd constructions like ([♀♂])?+ that are really (\x{200D}[♀♂]\x{FE0F})?+. ...