New blog post, in which I review and test some options for extracting unformatted text from #EPUB files in Python, using #Apache #Tika (via #Tika-python), #Textract and #EbookLib.

Includes link to Git repo with demo scripts.

https://www.bitsgalore.org/2023/03/09/extracting-text-from-epub-files-in-python

Extracting text from EPUB files in Python