MarkItDown: Python Tool for Converting Files and Office Documents to Markdown
https://daringfireball.net/linked/2024/12/13/markitdown
@daringfireball You might know it already, but there is a command-line tool called pandoc that can convert to and from Markdown.
It is really versatile and allows us to use Markdown for most of our needs, even when the customer wants Word or PDF.
The PDF generation can produce front matter, headers and footers with page numbers, a table of contents, and numbered headings. I think it is great because it lets us keep the source files very to the point.
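For anyone who wants to script this, here is a minimal sketch of the kind of pandoc call described above. The helper name `pandoc_command` and the file names are made up for illustration; `-o`, `--toc` and `--number-sections` are standard pandoc options. It assumes pandoc is on the PATH, and PDF output additionally needs a PDF engine (LaTeX by default) installed.

```python
import subprocess


def pandoc_command(source, target, toc=True, numbered=True):
    """Build a pandoc argument list (hypothetical helper).

    pandoc infers the output format from the extension of `target`,
    e.g. .docx for Word or .pdf for PDF.
    """
    cmd = ["pandoc", source, "-o", target]
    if toc:
        cmd.append("--toc")  # add a table of contents
    if numbered:
        cmd.append("--number-sections")  # numbered headings
    return cmd


def convert(source, target):
    """Run pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_command(source, target), check=True)
```

Calling `convert("report.md", "report.docx")` or `convert("report.md", "report.pdf")` would then produce the Word or PDF version from the same Markdown source.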
I build a lot of RAG apps that are dependent on Azure Document Intelligence for this purpose. Azure DI can get expensive, so it's interesting to see MS put out this library which could potentially reduce the need for it.
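A rough sketch of how that could look in a RAG ingestion step, assuming the `markitdown` package is installed. `MarkItDown().convert(...)` and `.text_content` are taken from the library's README; the heading-based chunking and both function names are my own illustration, not part of the library. It is also naive: a `#` line inside a code fence would be treated as a heading. Whether the output quality is good enough to replace Azure DI is something you would have to test on your own documents.

```python
import re


def file_to_markdown(path):
    """Convert a document (docx, xlsx, pdf, ...) to Markdown text."""
    # Imported lazily so the chunking helper works without the package.
    from markitdown import MarkItDown

    return MarkItDown().convert(path).text_content


def split_by_heading(markdown):
    """Split Markdown into chunks, starting a new chunk at each ATX heading."""
    chunks, current = [], []
    for line in markdown.splitlines():
        # A heading line closes the chunk collected so far.
        if re.match(r"#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]
```

The chunks from `split_by_heading(file_to_markdown("contract.docx"))` (file name hypothetical) could then be embedded and indexed like any other text.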