MarkItDown: Python Tool for Converting Files and Office Documents to Markdown
https://daringfireball.net/linked/2024/12/13/markitdown

Link to: https://github.com/microsoft/markitdown

Daring Fireball
@daringfireball 🤌🏾🤌🏾🤌🏾
@daringfireball This is timely! I just searched for something like this earlier today. I am putting together some documentation and the Word doc is getting kinda long. Was thinking it would work better as a series of linked Markdown files in a Git repo.

@daringfireball You might know it already but there is a command line tool called pandoc that can convert Markdown both ways.

It is really versatile and allows us to use Markdown for most of our needs, even when the customer wants Word or PDF.

The PDF generation can produce front matter, headers and footers with page numbers, a table of contents, and numbered headings. I think it is great because it lets us keep a very to-the-point source file.
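
For anyone who wants to try the pandoc workflow described above, here is a minimal Python sketch that builds a pandoc command line with a table of contents and numbered headings. The `-o`, `--toc`, and `--number-sections` options are standard pandoc flags; the function name `pandoc_pdf_command` and the file names are hypothetical, and actually running the command assumes pandoc (plus a PDF engine such as LaTeX) is installed.

```python
def pandoc_pdf_command(source, output, toc=True, numbered=True):
    """Build the argument list for a pandoc conversion from Markdown.

    `source` and `output` are file paths chosen by the caller (hypothetical
    here). pandoc infers the output format from the output file extension.
    """
    cmd = ["pandoc", source, "-o", output]
    if toc:
        cmd.append("--toc")  # generate a table of contents
    if numbered:
        cmd.append("--number-sections")  # number the headings
    return cmd


# Example: the command for a PDF with a table of contents and numbered headings.
print(" ".join(pandoc_pdf_command("report.md", "report.pdf")))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would perform the conversion, and changing the output name to `report.docx` should produce a Word file instead.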

@daringfireball

I build a lot of RAG apps that are dependent on Azure Document Intelligence for this purpose. Azure DI can get expensive, so it’s interesting to see MS put out this library which could potentially reduce the need for it.
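
As a rough illustration of how MarkItDown might slot into a RAG pipeline like the one mentioned above, the sketch below converts a file to Markdown and then splits the result into chunks at each heading. The `MarkItDown().convert(path).text_content` call follows the usage shown in the project's README; the helper names `convert_to_markdown` and `chunk_by_heading`, and the heading-based chunking strategy itself, are assumptions made for illustration and not something the library or the comment prescribes.

```python
import re


def convert_to_markdown(path):
    """Convert a document to Markdown text with MarkItDown.

    Assumes the package is installed (pip install markitdown). The import is
    done lazily so the chunking helper below works without it.
    """
    from markitdown import MarkItDown

    result = MarkItDown().convert(path)
    return result.text_content


def chunk_by_heading(markdown_text):
    """Split Markdown into chunks, starting a new chunk at each '#' heading.

    This is a deliberately naive splitter: it does not skip '#' lines that
    appear inside fenced code blocks.
    """
    chunks = []
    current = []
    for line in markdown_text.splitlines():
        # A line starting with 1-6 '#' characters and whitespace opens a new chunk.
        if re.match(r"#{1,6}\s", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [chunk for chunk in chunks if chunk]


# Example with inline text; in practice the input could come from
# convert_to_markdown("manual.docx") (hypothetical file name).
sample = "# Intro\nalpha\n\n## Details\nbeta\n"
for chunk in chunk_by_heading(sample):
    print(repr(chunk))
```

Whether this could replace a hosted service depends on the documents involved, so it is worth comparing the converted output against the originals before relying on it.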