Things that are much harder to describe accurately with a regex than you'd think (an incomplete list):

* floating point numbers
* IPv6 addresses
* IPv*4* addresses (depending on how you define them and how picky you are about the numeric ranges)
* ...
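
To make the first and third bullets concrete, here is a rough Python sketch comparing the "obvious" patterns with pickier ones. The names (`NAIVE_IPV4`, `STRICT_IPV4`, `NAIVE_FLOAT`, `FULLER_FLOAT`) and the exact definitions are my own assumptions: "strict" IPv4 here means dotted-quad, each octet 0-255, no leading zeros, and the "fuller" float still ignores things like `inf`, `nan`, and hex floats.

```python
import re

# Naive IPv4: four groups of 1-3 digits. Happily accepts 999.999.999.999.
NAIVE_IPV4 = re.compile(r"[0-9]{1,3}(?:\.[0-9]{1,3}){3}")

# Pickier IPv4 (assumed definition): each octet is 0-255 with no leading zeros.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"
STRICT_IPV4 = re.compile(rf"{OCTET}(?:\.{OCTET}){{3}}")

# Naive float: digits, a dot, digits. Misses ".5", "-3.", "1e10", ...
NAIVE_FLOAT = re.compile(r"[0-9]+\.[0-9]+")

# Fuller float (still incomplete): optional sign, digits with optional
# fraction OR a bare fraction, then an optional exponent.
FULLER_FLOAT = re.compile(
    r"[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)(?:[eE][+-]?[0-9]+)?"
)


def matches(pattern: re.Pattern, text: str) -> bool:
    """True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["192.168.0.1", "999.1.1.1", "01.2.3.4"]:
        print(candidate, matches(NAIVE_IPV4, candidate), matches(STRICT_IPV4, candidate))
    for candidate in ["3.14", ".5", "-3.", "1e10", ".", "1e"]:
        print(candidate, matches(NAIVE_FLOAT, candidate), matches(FULLER_FLOAT, candidate))
```

Even the pickier IPv4 pattern only covers one definition of an address; whether forms like `010.0.0.1` or `127.1` should count depends on who you ask, which is rather the point.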

Just gonna leave this regexp here

How to handle emoji: Where other methods are not available, you can use the following regex (for Unicode 11.0 emoji). For clarity, it escapes all characters that can be invisible or are non-spacing -- otherwise you see some odd constructions like ([♀♂])?+ that are really (\x{200D}[♀♂]\x{FE0F})?+. ...
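
To see what the quote means by invisible characters, here is a small Python sketch. It is not the quoted regex (which is elided above); it only illustrates the one fragment mentioned, using a runner emoji as an assumed example base character. Python's `re` does not accept the `\x{...}` syntax, so the code points are written as `\u200D` and `\uFE0F`, and a plain `?` stands in for the possessive `?+`.

```python
import re

ZWJ = "\u200D"    # ZERO WIDTH JOINER (invisible)
VS16 = "\uFE0F"   # VARIATION SELECTOR-16 (invisible)
RUNNER = "\U0001F3C3"
FEMALE = "\u2640"  # ♀
MALE = "\u2642"    # ♂

# "Woman running" is four code points, two of which render as nothing.
woman_running = RUNNER + ZWJ + FEMALE + VS16


def code_points(text: str) -> list[str]:
    """List the code points of a string in U+XXXX form."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Illustrative pattern: a runner, optionally followed by ZWJ + gender sign + VS16.
# Written with explicit escapes so the invisible characters are visible in source.
RUNNER_PATTERN = re.compile(
    "\U0001F3C3(?:\u200D[\u2640\u2642]\uFE0F)?"
)

if __name__ == "__main__":
    print(code_points(woman_running))
    print(RUNNER_PATTERN.fullmatch(woman_running) is not None)
    print(RUNNER_PATTERN.fullmatch(RUNNER) is not None)
```

If the pattern were typed with the literal characters instead of escapes, the optional group would display as `([♀♂])?`, which is the odd-looking construction the quote warns about.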