protip: ALWAYS use regular expression literals in JavaScript and TypeScript, and in any other language that supports them, instead of writing your regex out in a string. I cannot count how many critical security bugs I have found over the years from someone writing a regex like "^en\.wikipedia\.org$". That is incorrect because the \. is treated as a *string* escape sequence (not a recognized one, so it just produces .), which results in the regex being "^en.wikipedia.org$", which matches "enowikipedia.org".
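
A minimal sketch of that failure in TypeScript, if it helps to see it run. The variable names here are made up for illustration; the pattern and hostnames are the ones from the post:

```typescript
// The backslashes are consumed by the *string* literal, not by the regex.
// "\." is not a recognized string escape, so it simply becomes ".".
const pattern = "^en\.wikipedia\.org$";
console.log(pattern); // ^en.wikipedia.org$

// By the time RegExp sees the pattern, each dot is an unescaped wildcard.
const broken = new RegExp(pattern);
console.log(broken.test("en.wikipedia.org")); // true
console.log(broken.test("enowikipedia.org")); // true: "." matches the "o"
```

Nothing here throws or warns at runtime, which is part of why this kind of bug tends to survive review.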

@gsuberland

Are you saying it should be...

"^en\\.wikipedia\\.org$",

... Instead?

@xdydx @gsuberland Depends on the language, which is another reason this is such a thorny problem. In many languages, yes, the expression should be as you wrote it.

In some languages (PowerShell, for example, where it is a backtick), the escape character for strings is something other than backslash, so the expression as you wrote it would be incorrect: both backslashes reach the regex engine, which then looks for a literal backslash followed by any character. In most of these cases, the expression wouldn’t match a real domain, which would be noticed in an allowlist entry but probably not in a blocklist entry.

To write correct expressions, you need to know implementation details like that, and most vendors hate giving those out.

@xdydx if you put it in a string, yes. but you should use regex literals like /^en\.wikipedia\.org$/ if your language supports them, so you don't need to double escape. so in TS/JS:

const reg = /^en\.wikipedia\.org$/;
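
For comparison, a rough sketch showing that the literal and the double-escaped string end up as the same regex. The names `literal` and `fromString` are just for illustration:

```typescript
// Regex literal: the backslashes go straight to the regex engine.
const literal = /^en\.wikipedia\.org$/;

// String form: each backslash has to be doubled to survive string parsing.
const fromString = new RegExp("^en\\.wikipedia\\.org$");

console.log(literal.source === fromString.source); // true
console.log(literal.test("en.wikipedia.org"));     // true
console.log(literal.test("enowikipedia.org"));     // false
console.log(fromString.test("enowikipedia.org"));  // false
```

Both work, but the literal has only one layer of escaping to get wrong.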

@gsuberland
Ah. Ok. I follow.

Thanks for the clear example!