me: i need to validate some email addresses, so i am going to write a quick regex. how hard can it be?

*4 hours later*
me: i now have 2 problems and one of them is that I've accidentally summoned an ancient daemon. what the actual fuck

@nixCraft, trust the process 
@nixCraft Just take a predefined one from someone else. Creating such complex regex is a waste of time and never totally right 🫠.
@clemensprill @nixCraft Shhhhh! 🤫 Let the normies think we write them like wizards
@nixCraft Email is easy. I don't know what your problem is. Just take this quiz and you'll see how easy it is.
https://e-mail.wtf/

@paco @nixCraft

email is the bumblebee of protocols. any protocol expert would tell you that most of these email addrs could never work. ;)

@paco @nixCraft

I knew it was bad but I never imagined it was this bad, it makes javascript moments look like a pleasant field trip

@meeper JavaScript you say? The same person covered that too. This one made me want to punch my computer.

https://jsdate.wtf/

@nixCraft
@nixCraft give them slivovice, they'll sleep for 11 hours and you can fix your problem
@nixCraft I thought it's just me and I was embarrassed about it. Still haven't figured out how to send him back. That one is from validating https urls.
@nixCraft Hmm, looks like you forgot to protect your workspace with a circle of salt.
@nixCraft ah, yeah, been there, done that. That demon still owes me money.
@nixCraft RegEx, almost always the wrong choice
@zephyrxero @nixCraft
"I should have used logic, but this keeps the complexity rating down."
@nixCraft One thing I've learned over the years is never try to validate an email address with regex!

@nixCraft

Copy a relatively good existing one, then just prompt for "are you sure" if it fails to match.
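A minimal sketch of that "warn, don't reject" idea in Python. The pattern here is a deliberately loose placeholder, not a recommended regex, and `LOOSE_EMAIL` and `check_address` are hypothetical names. Swap in whichever existing pattern you copied.

```python
import re

# Placeholder pattern (an assumption, not a vetted regex):
# some non-space text, one "@", some more non-space text.
LOOSE_EMAIL = re.compile(r"[^@\s]+@[^@\s]+")


def check_address(address: str) -> str:
    """Return "ok" if the pattern matches, otherwise "confirm".

    Nothing is ever rejected outright: a failed match only means
    the user should be asked "are you sure?" before continuing.
    """
    if LOOSE_EMAIL.fullmatch(address):
        return "ok"
    return "confirm"
```

An unusual but legal address, such as a quoted local part containing a space, comes back as "confirm" rather than being refused, which is the whole point.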

@nixCraft I think this is the right page from the Necronomicon: (?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
@StealthyTango and it needs to be typed in a demonic font, or it won't work.
@nixCraft Ah, did the reghex by mistake huh?!
@nixCraft not to be that guy, but I've always wondered if this could be turned into a product or something
@nixCraft in the email sending world we literally just use *@* or %@% as the case may be. I kid you not, don't go any further than that.

@nixCraft If we ever get a Ten Commandments for programming, one of them should be:

"Thou shalt not write thy own email regex."

@mahryekuh @nixCraft I thought it was reserved as a karmic punishment,
you shall be cast into the darkness with the regex nought to return until the email be attuned
type stuff
Or was that a nightmare. . .?
@nixCraft
At your point it might be easier to just search haveibeenpwned for the address
@nixCraft Unwanted Summonings in Cyberspace
@nixCraft
I gave up when I realised emojis were valid characters in email addresses.
@nixCraft The plural of "regex" is "regrets" :D

@nixCraft

Did you find the needle in the haystack yet? /s

@nixCraft Either you use something simple like .*@.*\..* (at least an at-sign and a dot after it) and send an e-mail with a link, or you will summon a debate war over whether the complete e-mail standards should be followed or not.

And, if you choose to follow the standards, I have to warn you (and I guess some people had warned you already) that most e-mail servers do not follow the standard (like Cloudflare).

On the other hand, I just sent a message from "example+';DROP/**/TABLE/**/users;#"@gmail.com to example@some-domain-of-mine and it arrived. I hope you never need to deal with addresses like those.
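For reference, a small Python sketch of the simple pattern quoted above. `looks_plausible` is a hypothetical helper name, and sending the confirmation link is left out because it depends entirely on your mail setup.

```python
import re

# The pattern from the post: anything, an at-sign, anything, a dot, anything.
SIMPLE = re.compile(r".*@.*\..*")


def looks_plausible(address: str) -> bool:
    """True if the whole string has an "@" followed somewhere by a dot."""
    return SIMPLE.fullmatch(address) is not None
```

Since every `.*` can match nothing at all, a string as short as `@.` passes. That is fine for a typo filter: the e-mail with the link is what actually verifies the address.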

@qgustavor @nixCraft you have successfully summoned a debate war by forgetting that user@localhost is a valid email, so is user@::1, and any other locally resolved name and IPv6 address. Just containing an @ is more than enough if you are going to try sending an email to it anyway. ;)

Seriously though, enforcing a dot in the domain is probably reasonable for most publicly accessible email servers.
@kawazoe @nixCraft I've got my share of self-hosted things that don't allow using localhost as the domain (e.g. Pocketbase). I guess ::1 is invalid; it should be user@[::1]. I have a sending email address that works (sometimes) with four at-signs.
@nixCraft My experience tells me that if you have a problem and you solve it with regular expressions, you have two problems.
@nixCraft
This reads like @cstross parody fiction and I mean that as a compliment.

@nixCraft

At least the ancient demon is bound to serve you, and not eat you until it provides you with a perfect email parsing library, right?

@nixCraft https://regex101.com/ is your friend, your very very good friend
@nixCraft Can the daemon validate things for you?
@nixCraft now toss moving between BSD/macOS and GNU into the mix.

@nixCraft Every time someone uses a regex to validate an email, god kills truckloads of kittens

Split on last @ ensuring the previous character is not itself an @

Check if the domain part resolves, make sure you allow for international domain names.

You can check if the localpart is well formed. IIRC this is not possible with regexes. You need a parser.

With a valid domain, sending an email is the only real way to verify if the address is actually valid. Well formed doesn't mean accepted.
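The first two steps above could be sketched in Python roughly like this, using only the standard library. `split_address` and `domain_resolves` are hypothetical names. Two assumptions to flag: the lookup checks ordinary address records rather than MX records (which would need a DNS library), and Python's built-in "idna" codec implements the older IDNA 2003 rules. Domain literals such as [::1] are not handled.

```python
import socket


def split_address(address: str) -> tuple[str, str]:
    """Split on the last "@" and return (localpart, domain)."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("expected text on both sides of an '@'")
    if local.endswith("@"):
        # The character just before the last "@" is itself an "@".
        raise ValueError("doubled '@' before the domain")
    return local, domain


def domain_resolves(domain: str) -> bool:
    """Best-effort check that the domain name resolves (needs network)."""
    try:
        # Convert an international domain name to its ASCII form first.
        ascii_domain = domain.encode("idna").decode("ascii")
        socket.getaddrinfo(ascii_domain, None)
    except (UnicodeError, socket.gaierror):
        return False
    return True
```

Even when both checks pass, the address is only well formed with a live domain. Whether mail is accepted still has to be found out by sending one.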

@nixCraft Reminded of this old gem: