A few weeks ago, I needed to explain why it's difficult to say the number of CPU cores needed for #Flink to process 100k #Kafka events/sec (in the absence of any info about the data, the environment, the nature of the processing, etc.)

Two dozen graphs and one Doctor Who meme later, that turned into a blog post 😀

https://dalelane.co.uk/blog/?p=6001

"How many Kafka events will Flink process per second?"

I'm often asked this. The specific question varies, but it's typically some variation of asking how quickly a single CPU of Flink processes events from a Kafka topic. Why "per CPU"? Maybe because enterprise software is typically charged per CPU? Maybe because I tend to talk to people who run ever

dale lane

Volga: движок обработки real-time данных для AI/ML — аналог Spark и Flink на Rust (Arrow + DataFusion)

Volga — open-source движок обработки данных, созданный как альтернатива Apache Spark и Apache Flink и ориентированный на требования real-time AI/ML систем: консистентное вычисление фичей между online и offline режимами, point-in-time корректные агрегации, длинные скользящие окна, а также ML-ориентированные функции, такие как top- и категориальные агрегации. В статье рассматриваются мотивация и история разработки, архитектура системы и её ключевые компоненты, а также проводится сравнение с ML-ориентированными решениями (Chronon, OpenMLDB) и универсальными стриминговыми движками (Apache Flink, Apache Spark, Arroyo).

https://habr.com/ru/articles/1021290/

#rust #spark #flink #ai #ml #mlops #kubernetes #sql #python #streaming

Volga: движок обработки real-time данных для AI/ML — аналог Spark и Flink на Rust (Arrow + DataFusion)

TL;DR Volga — open-source движок обработки данных, созданный как альтернатива Apache Spark и Apache Flink и ориентированный на требования real-time AI/ML систем: консистентное вычисление фичей между...

Хабр

Stop answering today’s questions with yesterday’s data! 📰

Join Sandon Jacobs at #ArcOfAI to learn a streams-first approach for low-latency RAG using Kafka & Flink, ensuring your AI always has fresh context.

https://www.arcofai.com/speaker/520a73e3ba72494192f204d90ab8b7a9

🎟️ https://arcofai.com

#AI #Kafka #Flink #RAG #MachineLearning #AIEngineering

I've written up some #Flink thoughts : event-stream processing project don't have to be accessible SQL *or* powerful Java. UDFs let you work in SQL, and write Java to extend what SQL can do when you need.

I've come up with six examples of when it helps to extend Flink SQL with Java functions.

https://dalelane.co.uk/blog/?p=5920

Extending Flink SQL

In this post, I’ll share examples of how writing user-defined functions (UDFs) extends what is possible using built-in Flink SQL functions alone. I'll share examples of how UDFs can: make complex SQL readable and maintainable add algorithmic logic that SQL cannot express reshape complex events in

dale lane

I presented at our annual online #IBMTechCon event yesterday about Deploying an Apache #Flink job into production.

I gave a walk-through of what is involved taking an event processing flow, and deploying it into Kubernetes.

I tried to fit a lot in the hour!

https://dalelane.co.uk/blog/?p=5906

Deploying Apache Flink jobs into Kubernetes

IBM TechCon is an annual online technical event for engineers, creators, and integration specialists. One of our sessions for this year was Deploying an Apache Flink job into production: You’ve maybe seen the low-code canvas in Event Processing or the simple expressiveness of Flink SQL, and how

dale lane

creating #Kafka and #Flink demos for TechCon 2026 - our virtual event later this month

TechCon is about live and in-depth technical sessions on integration, automation, messaging, and event driven architectures - we can't get away with just slides! ;-)

https://www.ibm.com/events/reg/flow/ibm/l2ma0vmb/landing/page/landing

IBM Event Registration

#Flink has largely replaced #Kafka Streams for me.

I do miss Streams... maybe I'm getting old and prone to nostalgia, or maybe I'm sad to move on because of the amount of time I spent learning Streams' quirks!

This is a better intro to how they compare than my gut gives :)

https://xebia.com/blog/making-right-choice-flink-or-kafka-streams/

Making The Right Choice: Flink Or Kafka Streams? | Xebia

Flink vs Kafka Streams: Which real-time stream processing engine is best for your data pipeline? Explore key differences in architecture, state management, performance, and scalability.

Xebia

🚀 Big Data meets AI—powered by Iceberg, Spark & LLMs

At #ArcOfAI, Pratik Patel shows how to build a real architecture that lets users query massive datasets with natural language—no dashboards, no SQL, just questions & insights.

https://www.arcofai.com/speaker/1c241471d7f04018a0da70efffd35b32

🎟️ Get tickets: https://arcofai.com

#ArtificialIntelligence #BigData #DataArchitecture #ApacheSpark #ApacheIceberg #LLM #GenAI #EventStreaming #Kafka #Flink #AIEngineering #TechLeadership

Open Table Formats — Iceberg vs Paimon — практика использования

Привет, Хабр. Меня зовут Василий Мельник, я product owner решения для потоковой обработки данных Data Ocean SDI в компании Data Sapience. Наша команда приобрела большой практический опыт работы с Apache Iceberg в задачах на стыке традиционной пакетной обработки и near real-time и конкретно с использованием технологий на базе Flink, поэтому мы не могли пройти мимо нового открытого табличного формата (OTF) Paimon от разработчиков Apache Flink. В этой статье я опишу наш опыт и те практические выводы, которые мы сделали на промышленных средах, в виде репрезентативного тестирования, на котором проиллюстрирую ключевые практические сценарии.

https://habr.com/ru/companies/datasapience/articles/988308/

#iceberg #Paimon #Open_Table_Format #тесты_производительности #flink #spark

Open Table Formats — Iceberg vs Paimon — практика использования

Привет, Хабр. Меня зовут Василий Мельник, я product owner решения для потоковой обработки данных Data Ocean SDI в компании Data Sapience. Наша команда приобрела большой практический опыт работы с...

Хабр

I've started using (generated) click tracking events as a data source when introducing #FlinkSQL - it's simple, yet still rich enough to let me explain a variety of nice #Flink features.

I've written up a few examples at https://dalelane.co.uk/blog/?p=5806

Flink SQL examples with click tracking events

In this post, I introduce a few core Flink SQL functions using worked examples of processing a stream of click tracking events from a retail website. I find that a practical, real-world (ish) example can help to explain how to use Flink SQL in a way that abstract descriptions, such as processing co

dale lane