In the past few years, we’ve seen a Cambrian explosion of new columnar formats challenging the hegemony of Parquet. The premise is that the design of yore is not going to cut it moving forward. I spent some time trying to better understand what actually changed.

https://sympathetic.ink/2025/12/11/Column-Storage-for-the-AI-era.html

Column Storage for the AI Era

In the past few years, we’ve seen a Cambrian explosion of new columnar formats challenging the hegemony of Parquet: Lance, FastLanes, Nimble, Vortex, AnyBlox, and F3 (File Format for the Future). The thinking is that the context has changed so much that the design of yore (the previous decade) is not going to cut it moving forward. This intrigued me, especially since the main contribution of Parquet has been to provide a standard for columnar storage. Parquet is not simply a file format. As an open-source project hosted by the ASF, it acts as a consensus-building machine for the industry. Creating six new formats is not going to help with interoperability. I spent some time trying to better understand what actually changed and how Parquet needs to adapt to meet the demands of this new era. In this post I’ll discuss my findings.

The Sympathetic Ink Blog

(illustration hand generated in 1958) “Column Storage for the AI era” © 2025 by Julien Le Dem is licensed under CC BY-NC-SA 4.0. To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-sa/4.0/

M.C. Escher, Be...
