Munquet 0.2.1 just landed on Flathub πŸš€

Fixed a small race condition when canceling a conversion β€” turns out the process could finish right before you clicked β€œYes” πŸ˜…

Two lines later… all good.

https://flathub.org/en/apps/io.gitlab.zulfian1732.munquet

#Flatpak #GTK4 #OpenSource #Parquet #DataScience #Linux #Python #PyArrow

Munquet is now officially on Flathub πŸŽ‰

A native Linux app that converts datasets into Apache Parquet using the PyArrow backend. Perfect for data science workflows, analytics, and anyone who needs fast local conversions.

Get it here: https://flathub.org/en/apps/io.gitlab.zulfian1732.munquet

@gnome @xfce @kde @GTK @linux @flathub

#apache #pyarrow #datascience #parquet #csv #OpenSource #Python #GNOME #GTK4 #Adwaita

πŸš€ Munquet β€” Convert, merge, rename & validate tabular data into Parquet, fully offline & batch-ready.

GitLab: https://gitlab.com/zulfian1732/munquet

Featured in: @severo 's Awesome Parquet: https://github.com/severo/awesome-parquet πŸ™

#Parquet #OpenSource #Python #GNOME #GTK4 #Adwaita #PyArrow

πŸš€ Sneak peek at Munquet!
Convert, merge, rename, and validate tabular data safely into Parquet. Works offline, with batch processing and progress feedback.

GitLab repo:

https://gitlab.com/zulfian1732/munquet

Flathub release coming soon!

#Python #GTK4 #GNOME #PyArrow #Parquet #DataScience #Libadwaita

Released scrapy-contrib-bigexporters 1.0.0 (https://codeberg.org/ZuInnoTe/scrapy-contrib-bigexporters) - additional export formats for the web scraping framework Scrapy.

Migrated the Parquet export from fastparquet to pyarrow, as fastparquet is deprecated (https://docs.dask.org/en/stable/changelog.html#fastparquet-engine-deprecated)

Migrated the ORC export from pyorc to pyarrow to reduce the number of dependencies

#scrapy #crawling #python #parquet #orc #pyarrow #webcrawling #scraping

If the purpose of a library is to "process and transport large data sets" but the code base contains an error message like "array cannot contain more than 2147483646 bytes", then there must be a big misunderstanding somewhere. #pyarrow
Easily obtain OSM and OMF data: #Python and CLI tools #QuackOSM and #OvertureMaestro offer easier access to data from #OpenStreetMap (#OSM) and the Overture Maps Foundation (#OMF) through #PyArrow, #GeoParquet, or #DuckDB. These tools can simplify large-scale geospatial data tasks for seamless data engineering and analysis.
https://spatialists.ch/posts/2025/05/23-easily-obtain-osm-and-omf-data/ #GIS #GISchat #geospatial #SwissGIS
The PyArrow Revolution

Pandas is at the core of virtually all data science done in Python, which is to say, virtually all data science. Since its beginning, Pandas has been based on NumPy, but changes are afoot to update those internals, and you can now optionally use PyArrow. PyArrow comes with a ton of benefits, including a columnar format that makes answering analytical questions faster, support for a range of high-performance file formats, inter-machine data streaming, faster file I/O, and more. Reuven Lerner is here to give us the low-down on the PyArrow revolution.

Currently taking a look at refreshing some of the #ApacheArrow and #PyArrow docs, so if you use Arrow in #rstats or Python and there are any areas you'd like to understand better, give me a shout, and we'll see what we can do!