Stuff I would like to read when I have time: [1] notebook examples about the query optimization, [2] the slide deck going into more technical details, [3] other links from the homepage

[1] https://fireducks-dev.github.io/posts/sourav_cse_demo_20240701/
[2] https://fireducks-dev.github.io/files/20241003_PyConZA.pdf
[3] https://fireducks-dev.github.io/

#toread #pandas #fireducks #needforspeed

Have you ever thought of speeding up your data analysis in pandas with a compiler?

In general, a data scientist spends significant effort transforming raw data into a more digestible format before training an AI model or creating visualizations. Traditional tools such as pandas have long been the linchpin in this process, offering powerful capabilities but not without limitations. Because of its single-core implementation and inefficient data structures, pandas often runs into performance issues on relatively large data. Its performance also depends heavily on the choice of APIs, their parameters, and the order in which operations are executed.
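To make the point about execution order concrete, here is a minimal sketch in plain pandas. The column names and toy data are made up for illustration. Both pipelines return the same rows, but the second sorts a smaller table, which is the kind of reordering a query optimizer could in principle do for you. How FireDucks actually rewrites such pipelines is covered in the linked material, not shown here.

```python
import pandas as pd

# Toy data; the column names are hypothetical.
df = pd.DataFrame({
    "ticker": ["A", "B", "A", "C", "B", "A"],
    "price": [10.0, 20.0, 12.0, 5.0, 22.0, 11.0],
})

# Order 1: sort the whole table, then keep only ticker "A".
sorted_df = df.sort_values("price")
sorted_then_filtered = sorted_df[sorted_df["ticker"] == "A"]

# Order 2: keep only ticker "A" first, then sort the smaller table.
filtered_then_sorted = df[df["ticker"] == "A"].sort_values("price")

# Same rows in the same order; only the amount of data sorted differs.
same = sorted_then_filtered.equals(filtered_then_sorted)
```

On six rows the difference is invisible. It only starts to matter when the filter discards most of a large table.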

Just learned there's a drop-in replacement for pandas that can be 16x faster. It's called FireDucks. You use it via `import fireducks.pandas as pd`, and it's built by some supercomputing folks (NEC) specifically with parallelization and automatic query optimization in mind. https://hwisnu.bearblog.dev/fireducks-pandas-but-100x-faster/
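A minimal sketch of what "drop-in" would mean in practice, assuming the `fireducks` package is installed. The fallback to plain pandas, the `backend` variable, and the toy columns are my own additions for illustration, not something the FireDucks docs prescribe.

```python
# Prefer FireDucks when it is installed; otherwise fall back to plain pandas.
try:
    import fireducks.pandas as pd
    backend = "fireducks"
except ImportError:
    import pandas as pd
    backend = "pandas"

# Everything below is ordinary pandas code and does not change with the import.
df = pd.DataFrame({"desk": ["x", "y", "x"], "pnl": [1.0, 2.0, 3.0]})
total_by_desk = df.groupby("desk")["pnl"].sum()
result = total_by_desk.to_dict()
```

The idea is that only the import line differs and the rest of the script stays untouched.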

Disclaimer: I haven't tried it yet. Also: tragically I really dislike the name ;__;

#pandas #fireducks #needforspeed #datascience

FireDucks : Pandas but 100x faster

My main background is a hedge fund professional, so I deal with finance data all the time and so far the Pandas library has been an indispensable tool in my...

*ฅ^•ﻌ•^ฅ* ✨✨  HWisnu's blog  ✨✨ о ฅ^•ﻌ•^ฅ
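Faster just by changing the import!? Testing FireDucks, a pandas-compatible library - Qiita

1. Introduction: FireDucks, a pandas-compatible library that promises a speedup just by rewriting the import, has been released, so in this article I would like to test its execution speed, memory consumption, and so on! https://fireducks-dev.github.io/ja…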

Qiita

NEC begins offering "FireDucks", software that accelerates data analysis in Python, free of charge (October 19, 2023): Press release | NEC
https://jpn.nec.com/press/202310/20231019_01.html

#python #pandas #data #analytics #datascience #fireducks

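NEC begins offering "FireDucks", software that accelerates data analysis in Python, free of charge

NEC has developed "FireDucks", software that accelerates "pandas", the table-data analysis library used as the standard for data analysis in the programming language "Python".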

NEC