Recently a developer spotted a regex in our code base that detects whether a string is a date, and noticed it was wrong. It was something like:

^20[0-9][0-9]-[0-9]+-[0-9]+$

which means dates like "2000-0000000-0000000" would be accepted, which is useless. I also mentioned that our unit tests for the function using this regex follow the happy path, but never really check the bad paths (including this one).
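For anyone curious, here is a minimal sketch of the problem in Java. It assumes the intended shape is yyyy-MM-dd; the class name DateShapeCheck and the tightened SHAPE pattern are made up for illustration and are not our actual code. It only checks the "shape", so it knows nothing about calendar rules.

```java
import java.util.regex.Pattern;

// Hypothetical illustration class, not the real validator.
final class DateShapeCheck {
    // The pattern quoted above: '+' lets month and day be any number of digits.
    static final Pattern LOOSE = Pattern.compile("^20[0-9][0-9]-[0-9]+-[0-9]+$");

    // Assumed fix: exactly two digits each for month and day (shape only).
    static final Pattern SHAPE = Pattern.compile("^20[0-9]{2}-[0-9]{2}-[0-9]{2}$");

    static boolean looksLikeDate(String s) {
        return s != null && SHAPE.matcher(s).matches();
    }

    public static void main(String[] args) {
        // The bad path the unit tests never covered:
        System.out.println(LOOSE.matcher("2000-0000000-0000000").matches()); // true
        System.out.println(looksLikeDate("2000-0000000-0000000"));           // false
        System.out.println(looksLikeDate("2024-02-29"));                     // true
        // Shape only: an impossible month/day still passes.
        System.out.println(looksLikeDate("2024-99-99"));                     // true
    }
}
```

The last line is the trade-off: a shape check accepts "2024-99-99", which is fine if all you need is the shape.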

He then went to fix the regex, but then started to take into account things like days ending in 30/31, February ending in day 28, etc. TBH, it was very much overkill for what was needed (this was to effectively check if the string matched the "shape" of a date) but I was like "sure, why not".

He then turns around and says he has factored in leap years. At this point, I'm getting a little bit worried as he seems to be getting side-tracked. But whatever, it is done, he has got a little webpage that shows how the regex works, I say if you can export that as an image and annotate it, that would be great for future devs.

So I think that is pretty much the end of it, right? No more.

He then comes back later in the day and tells me he wants the regex to consider ROMAN FUCKING NUMERALS.

I-

SERIOUSLY.

WHO PUTS DATES INTO COMPUTERS IN THE FUCKING ROMAN NUMERAL FORMAT?!?!?!?

I try my best to kindly state that you can do that for fun and personal reasons, but I doubt our system will accept or even present dates in Roman numerals.

He seems incredibly sad about this.

I then ask him whether he has updated the unit tests to test for all these cases.

"What unit tests?"

@rootfs Oh gosh.

Don't get me wrong, I truly love regexes, they are one of my favorite programming tools in existence... But that one is probably a nightmare to understand, not to mention performance...

May I ask about the reason why a regex was considered there and implemented in the first place? I'm assuming there were some constraints I don't know about, because it's not really the most effective way to do that kind of validation...

@zeolith I honestly have no idea. This was legacy code written a few years ago. I think if one of us stared at the code hard enough, we would all go "wtf".

Even writing it out loud I was like "... why not use something like a LocalDateTime/Instant/Date object to try and parse it, and if it throws an exception, just throw it out?" but I'm also sure the base system has some weird oddities with dates.
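Roughly what I had in mind, as a sketch only. It assumes the strings are ISO-style yyyy-MM-dd, which is what LocalDate.parse expects by default; the class and method names (IsoDateCheck, isIsoDate) are made up, and whatever oddities the base system has with dates are not accounted for.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Hypothetical helper, not part of the real system.
final class IsoDateCheck {
    // Returns true only if the string parses as a real ISO calendar date.
    static boolean isIsoDate(String s) {
        if (s == null) {
            return false; // LocalDate.parse would throw NullPointerException
        }
        try {
            LocalDate.parse(s); // default format is ISO_LOCAL_DATE, resolved strictly
            return true;
        } catch (DateTimeParseException e) {
            return false; // wrong shape or impossible date: throw it out
        }
    }

    public static void main(String[] args) {
        System.out.println(isIsoDate("2024-02-29"));           // true (leap year)
        System.out.println(isIsoDate("2023-02-29"));           // false
        System.out.println(isIsoDate("2000-0000000-0000000")); // false
    }
}
```

Month lengths and leap years come for free this way, with no regex involved.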

@rootfs Yeah, that was my first thought as well, but I know that old legacy systems often have their own rules that might prevent such things from being implemented, hence my question...

Anyways, good luck!

@zeolith I just found out through a code review - it's because the validator part of the system seems to mainly accept regexes as validation.

I'm sure there is a way to beat it into submission but that's a problem for future me 😃

@rootfs

...

I'd rather not ask. May future you not be too mad at future past (present?) you. Though a validator that only checks on regexes is... Odd? I mean... As a RegexValidator subclass of an abstract Validator class (or implementing a Validator interface or protocol, depending on the language and version), maybe I can understand?

Edit: ugh. Sorry, I actually asked.

@zeolith 🤣 It's fine

So I see your point, and after a bit of looking I can see there is a Validator interface which one of my team can look into implementing, so that dates are verified using something like LocalDate.parse, catching the exception on failure and returning a bool.
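Something along these lines, very much a sketch. I am assuming the interface boils down to a single "is this string valid" method; the real signature may well differ, and every name here (Validator.isValid, RegexValidator, LocalDateValidator, ValidatorSketch) is a stand-in for illustration rather than our actual classes.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.regex.Pattern;

// Hypothetical stand-in for the system's Validator interface.
interface Validator {
    boolean isValid(String input);
}

// What the system mostly does today: validate by regex.
final class RegexValidator implements Validator {
    private final Pattern pattern;

    RegexValidator(String regex) {
        this.pattern = Pattern.compile(regex);
    }

    @Override
    public boolean isValid(String input) {
        return input != null && pattern.matcher(input).matches();
    }
}

// The alternative: let java.time decide, and turn the exception into a bool.
final class LocalDateValidator implements Validator {
    @Override
    public boolean isValid(String input) {
        if (input == null) {
            return false;
        }
        try {
            LocalDate.parse(input); // assumes ISO yyyy-MM-dd input
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }
}

final class ValidatorSketch {
    public static void main(String[] args) {
        Validator shape = new RegexValidator("^20[0-9]{2}-[0-9]{2}-[0-9]{2}$");
        Validator date = new LocalDateValidator();
        System.out.println(shape.isValid("2024-99-99")); // true (shape only)
        System.out.println(date.isValid("2024-99-99"));  // false
    }
}
```

Callers would only see the interface, so swapping the regex for the parser should not ripple any further, assuming the real interface is about this simple.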

However, our team have a massive JIRA backlog of much more pressing issues to deal with (especially me). Quite frankly, I find it baffling that someone decided to spend about a day on this, but it's not up to me to pass judgement.

And this regex issue concerns quite a small part of our system. Given the original code was written 3 years ago and nobody has found faults with it (compared to other aspects of our system that are much more trouble-prone and need attention, but whose complexity is quite baffling), I think future me wouldn't be mad at past me for not focussing on fixing the regex.

@rootfs Haha, no problems. Sorry if I offended you in any way. I understand very well the complexity of managing an ever-increasing backlog and of course if there are other more critical issues to deal with, it makes sense to focus on them.

Anyways, whatever version of you takes care of this, good luck to them. Thanks for the discussion ^^

@zeolith No worries - I did give a very small window into my world, and there wasn't an awful lot of context to be obtained from it. You do raise some very good points, and if it wasn't for you, I never would've taken a look and dug deeper into the validation class, so thank you for that!