Spent ages today trying to figure out why the dates in my series weren’t being handled correctly by the #PythonPandas to_datetime function.
By default, the errors parameter is 'raise', so invalid parsing raises an exception. Two other values can be set.
The documentation says:
- If 'coerce', then invalid parsing will be set as NaT.
- If 'ignore', then invalid parsing will return the input.
I took ‘ignore’ to mean that an invalid value within a series would be skipped.
It turns out that if any value is invalid, the *entire input series* is skipped, and the original, unmodified series is returned.
But the *opposite* is true for ‘coerce’, which returns NaT for the invalid value in the series, and processes the valid values.
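
Here's a rough sketch of the difference, in case it helps anyone else. The sample dates and variable names are made up for illustration. Instead of passing errors='ignore' directly (newer pandas releases deprecate it), the second half mimics the all-or-nothing result by catching the default exception and falling back to the input.

```python
import pandas as pd

# Hypothetical sample data: two valid dates and one invalid entry.
dates = pd.Series(["2021-01-15", "not a date", "2021-03-20"])

# errors="coerce": valid values are parsed, the invalid one becomes NaT.
coerced = pd.to_datetime(dates, format="%Y-%m-%d", errors="coerce")

# Default (errors="raise"): a single invalid value raises ValueError.
# Falling back to the input mimics the all-or-nothing 'ignore' result
# described above: one bad value and nothing gets converted.
try:
    all_or_nothing = pd.to_datetime(dates, format="%Y-%m-%d")
except ValueError:
    all_or_nothing = dates

print(coerced)         # datetime64 series with NaT in the middle
print(all_or_nothing)  # the original strings, untouched
```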
It felt unintuitive, to say the least (maybe just me?). And I find it odd that there’s not a ‘skip’ option?
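
For what it's worth, a 'skip'-like result can be approximated by hand. This is only a sketch of one possible workaround, not a pandas feature: the names are hypothetical, and the result is an object-dtype series mixing Timestamps and the original strings, which is probably part of why no such option exists.

```python
import pandas as pd

# Hypothetical sample data: two valid dates and one invalid entry.
dates = pd.Series(["2021-01-15", "not a date", "2021-03-20"])

# Parse what can be parsed; invalid entries become NaT.
coerced = pd.to_datetime(dates, format="%Y-%m-%d", errors="coerce")

# Keep the parsed Timestamp where parsing worked, the raw value otherwise.
skipped = pd.Series(
    [parsed if pd.notna(parsed) else raw for parsed, raw in zip(coerced, dates)],
    index=dates.index,
    dtype=object,
)

print(skipped)
```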
Re-reading the docs with the knowledge of what the function actually does, it kinda makes sense? But doesn’t feel sufficiently clear.