A lot of discussion at #tpdl2023 about OCR, Tesseract, and the post-processing steps