Recently a developer spotted a regex in our codebase that detects whether a string is a date, and noticed it was wrong. It was something like:
^20[0-9][0-9]-[0-9]+-[0-9]+$
which means dates like "2000-0000000-0000000" would be accepted, which is useless. I also mentioned how the unit tests that run against the function with this regex follow the happy path, but never really check the bad paths (including this one).
He then went to fix the regex, but then started to take into account things like days ending in 30/31, February ending in day 28, etc. TBH, it was very much overkill for what was needed (this was to effectively check if the string matched the "shape" of a date) but I was like "sure, why not".
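For the record, the shape-only check I had in mind is roughly the following. This is a minimal sketch in Python, not our actual code: the `DATE_SHAPE` pattern and the `looks_like_date` name are made up for illustration, and it deliberately ignores calendar validity. The asserts are the kind of bad-path tests we were missing.

```python
import re

# Hypothetical shape-only pattern: a year starting with 20, then a
# two-digit month and a two-digit day. No calendar checks at all.
DATE_SHAPE = re.compile(r"20[0-9]{2}-[0-9]{2}-[0-9]{2}")


def looks_like_date(value: str) -> bool:
    """Return True if value has the YYYY-MM-DD shape (shape only)."""
    # fullmatch anchors both ends, so trailing junk or newlines are rejected.
    return DATE_SHAPE.fullmatch(value) is not None


# Happy path
assert looks_like_date("2024-02-29")

# Bad paths, including the one the original regex let through
assert not looks_like_date("2000-0000000-0000000")
assert not looks_like_date("2024-2-9")
assert not looks_like_date("2024-02-29\n")
assert not looks_like_date("MMXXIV-II-XXIX")

# Shape only: an impossible date still passes, by design
assert looks_like_date("2024-99-99")
```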
He then turns around and says he has factored in leap years. At this point, I'm getting a little bit worried as he seems to be getting side-tracked. But whatever, it is done, he has got a little webpage that shows how the regex works, I say if you can export that as an image and annotate it, that would be great for future devs.
So I think that is pretty much the end of it, right? No more.
He then comes back later in the day and tells me he wants the regex to consider ROMAN FUCKING NUMERALS.
I-
SERIOUSLY.
WHO PUTS DATES INTO COMPUTERS IN THE FUCKING ROMAN NUMERAL FORMAT?!?!?!?
I try my best to kindly state that you can do that for fun and personal reasons, but I doubt our system will accept or even present dates in roman numerals.
He seems incredibly sad about this.
I then ask him has he updated the unit tests to test for all these cases.
"What unit tests?"
@rootfs Oh gosh.
Don't get me wrong, I truly love regexes, they are one of my favorite programming tools in existence... But that one is probably a nightmare to understand, not to mention the performance...
May I ask about the reason why a regex was considered there and implemented in the first place? I'm assuming there were some constraints I don't know about, because it's not really the most effective way to do that kind of validation...
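To show what I mean by a more effective way: if the goal ever becomes real calendar validation, most languages can just try to parse the string. A rough sketch in Python, assuming a YYYY-MM-DD format (I don't know what your system actually uses, and `is_real_date` is a name I made up):

```python
from datetime import datetime


def is_real_date(value: str) -> bool:
    """Return True if value parses as an actual calendar date (YYYY-MM-DD)."""
    try:
        # strptime handles month lengths and leap years for us.
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


assert is_real_date("2024-02-29")        # leap year
assert not is_real_date("2023-02-29")    # not a leap year
assert not is_real_date("2024-04-31")    # April has 30 days
assert not is_real_date("2000-0000000-0000000")
```

One caveat: strptime is lenient about zero padding, so it may be worth pairing it with a simple shape check if the exact format matters.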
@rootfs Yeah, that was my first thought as well, but I know that old legacy systems often have their own rules that might prevent such things from being implemented, hence my question...
Anyways, good luck!
@rootfs
...
I'd rather not ask. May future you not be too mad at future past (present?) you. Though a validator that only checks against regexes is... odd? I mean... as a RegexValidator subclass of an abstract Validator class (or one implementing a Validator interface or protocol, depending on the language and version), maybe I can understand?
Edit: ugh. Sorry, I actually asked.
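To be clear about the kind of setup I was picturing, here is a purely hypothetical sketch in Python. The `Validator` and `RegexValidator` classes are my own guess at a layout, not anything from your codebase:

```python
import re
from abc import ABC, abstractmethod


class Validator(ABC):
    """Hypothetical base class: every validator answers one question."""

    @abstractmethod
    def is_valid(self, value: str) -> bool:
        ...


class RegexValidator(Validator):
    """Validator that accepts a value only if the whole string matches."""

    def __init__(self, pattern: str) -> None:
        self._pattern = re.compile(pattern)

    def is_valid(self, value: str) -> bool:
        return self._pattern.fullmatch(value) is not None


# Example: a shape-only date validator (no calendar checks).
date_shape = RegexValidator(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")
assert date_shape.is_valid("2024-02-29")
assert not date_shape.is_valid("2000-0000000-0000000")
```

With that layout, a validator that does real date parsing could sit next to the regex one behind the same interface.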
@rootfs Haha, no problem. Sorry if I offended you in any way. I understand very well the complexity of managing an ever-increasing backlog, and of course if there are other, more critical issues to deal with, it makes sense to focus on them.
Anyways, whatever version of you takes care of this, good luck to them. Thanks for the discussion ^^