MS SQL Arrow

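The mssql-python driver now supports Apache Arrow structures directly, so data from SQL Server can be loaded quickly and memory-efficiently into Arrow-native libraries such as Polars, Pandas, and DuckDB. The feature reduces Python object creation and garbage-collection overhead, which gives large performance gains, especially for temporal types such as DATETIMEOFFSET. It is compatible with the existing fetch API and can process data in batches or as a stream, which makes it suitable for large datasets. Work to improve NVARCHAR performance on Linux is currently in progress.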

https://devblogs.microsoft.com/python/introducing-apache-arrow-support-in-mssql-python/

#mssqlpython #apachearrow #python #sqlserver #dataframe

Introducing Apache Arrow Support in mssql-python - Microsoft for Python Developers Blog

Efficient Data Fetching from SQL Server via Apache Arrow

Microsoft for Python Developers Blog

GitHub - ModernRelay/omnigraph: Lakehouse-native graph engine with git-style workflows

Lakehouse-native graph engine with git-style workflows - ModernRelay/omnigraph

GitHub

Show HN: Bundlebase – Docker for Data

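Bundlebase is a tool that provides versioned, self-describing data containers that can be accessed from Python, SQL, the CLI, and BI tools without a server or separate infrastructure. It shares datasets together with their schema, transformation history, and provenance, and embeds data-cleaning rules in the bundle to automate repetitive work. It builds on modern technologies such as Apache Arrow, DataFusion, and Parquet to process large datasets efficiently, and is also suited to letting LLM agents maintain state. It is an innovative data management solution that simplifies data pipelines and collaboration.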

https://nvoxland.github.io/bundlebase/

#dataengineering #python #sql #apachearrow #dataversioning

Bundlebase — Data Packaging - Bundlebase

Bundlebase packages data into versioned, self-describing containers. Attach CSV, Parquet, or JSON from S3, HTTP, or local files. Query with SQL via Python, CLI, or any BI tool. Share via a path. No database required.

Announcing the first ever Apache Arrow and Parquet meetup in Paris, kindly hosted by @datadoghq .

If you’re using Arrow or Parquet, looking for insights, or wanting to meet other community members, this meetup is for you. Please register if you plan to attend!

https://luma.com/6ed1oko1

#apachearrow #apacheparquet

Apache Arrow / Parquet - June 2026 meetup in Paris · Luma

Details We’re excited to announce the first ever Apache Arrow and Parquet meetup in Paris! This meetup will be hosted on June 18th by Datadog, in their…

We're excited to announce the release of {arrow} 24.0.0 🏹📦

Here's a roundup of the new features and changes in a 🧵

Full details can be found at https://arrow.apache.org/docs/r/news/

#rstats #apachearrow

Changelog

"Arrow has the intricacy of a fine Swiss watch." The co-creator of Apache Arrow on why AI agents cannot replicate decade-long infrastructure design.

#ApacheArrow #DataRenegades

Wes McKinney built pandas in a mouse-infested NYC apartment on founder hours. Now he runs parallel Claude Code sessions and says AI is forcing "radical accountability" on every software vendor shipping mediocre products. Full conversation: https://youtu.be/Uso8-yaERkE

#DataRenegades #pandas #ApacheArrow

What happens after you outgrow your memory limits? 🤔
Creator of pandas and Apache Arrow, Wes McKinney, takes the stage at #positconf 2026 to discuss the next frontier of analytical computing and agentic software engineering. 🏗️
Don't just use the tools—meet the person building the foundation of the modern data stack.
👉 Grab your spot: pos.it/conf
#positconf #DataScience #ApacheArrow #Ibis #Python

So Apache Arrow is a wrapper around mmap with a data format? #ApacheArrow

R Consortium webinar: Scale up data analysis in R with Arrow—fast, memory-efficient analytics without a DB or cluster.

With Dr Nic Crane (Arrow R maintainer, Apache Arrow PMC).

Register: https://r-consortium.org/webinars/scaling-up-data-analysis-in-r-with-arrow.html

#rstats #apachearrow