#Pinterest launched a next-gen CDC-based ingestion framework.

Using #ApacheKafka, #ApacheFlink, #ApacheSpark & #ApacheIceberg, they achieved:
• Latency cut from 24+ hours to 15 minutes
• Processing of only changed records
• Support for incremental updates & deletions
• Petabyte-scale data across 1,000+ pipelines

Win: optimized cost & efficiency!

Read the architectural deep dive on InfoQ 👉 https://bit.ly/4rMJB2H

#SoftwareArchitecture #ChangeDataCapture
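
To make "only changed records" and "incremental updates & deletions" concrete, here is a minimal sketch of applying a batch of change events to a keyed table. It is an illustration only, not Pinterest's implementation: the event shape (`op`, `key`, `row`) and the function name `apply_changes` are hypothetical, and an in-memory dict stands in for the target table.

```python
from typing import Any

Row = dict[str, Any]


def apply_changes(table: dict[str, Row], events: list[dict[str, Any]]) -> dict[str, Row]:
    """Apply CDC-style events in order; only the touched keys are rewritten."""
    for event in events:
        key = event["key"]
        if event["op"] == "delete":
            # Deleting a key that is already gone is a no-op.
            table.pop(key, None)
        elif event["op"] == "upsert":
            # Merge changed columns into the existing row, or insert a new one.
            table[key] = {**table.get(key, {}), **event["row"]}
        else:
            raise ValueError(f"unknown op: {event['op']!r}")
    return table


table = {"a": {"v": 1}, "b": {"v": 2}}
events = [
    {"op": "upsert", "key": "a", "row": {"v": 10}},
    {"op": "delete", "key": "b"},
    {"op": "upsert", "key": "c", "row": {"v": 3}},
]
result = apply_changes(table, events)
```

Rows that receive no event are never read or rewritten, which is where the cost saving over a full reload comes from.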

Mirroring for SQL Server in Microsoft Fabric (Generally Available) | Microsoft Fabric Blog

In today’s AI-driven world, analytics platforms are only as good as their data. With the ever-increasing amount of data being collected in various applications, databases, and data warehouses in an enterprise, managing and ingesting data into a central platform for purposes of analytics and AI is a cumbersome and costly process. Databases and data …

Continue reading: https://blog.fabric.microsoft.com/en-us/blog/mirroring-for-sql-server-in-microsoft-fabric-generally-available/

Announcing Copy Job Activity in Data Factory Pipeline (Generally Available) | Microsoft Fabric Blog

This milestone marks a major step forward in unifying and simplifying data movement experiences across Data Factory. With Copy Job Activity, users can now enjoy the simplicity and speed of Copy Job while leveraging the orchestration power and flexibility of Data Factory pipelines. What is the Copy Job Activity? Copy Job Activity allows you to …

Continue reading: https://blog.fabric.microsoft.com/en-us/blog/announcing-copy-job-activity-now-general-available-in-data-factory-pipeline/

Debezium 3.4 Final: A Feature-Packed Release for Modern Data Pipelines

Debezium 3.4.0.Final arrives with Kafka 4.1.1 support, Quarkus DevServices, geometry transformations, enhanced Oracle metrics, and memory protection features for enterprise-scale CDC deployments.

TechLife
GitHub - debezium/debezium: Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.

GitHub

Legacy Systems = tightly coupled architectures hard to scale, change & maintain.

National Grid tackled this with 4 paradigms:
1️⃣ #DomainDrivenDesign
2️⃣ #TeamTopologies
3️⃣ #EventDrivenArchitecture
4️⃣ #ChangeDataCapture

Details in the #InfoQ article ⇨ https://bit.ly/44UplDs

#SoftwareArchitecture #LegacyCode

Legacy Modernization: Architecting Real-Time Systems Around a Mainframe

The transformation journey is about breaking dependencies. Many enterprises face similar challenges with legacy systems: tightly coupled architectures that are difficult to scale, change, or maintain.

InfoQ
Using watermarks for change data capture in Postgres

Explore the challenges of building a change data capture (CDC) pipeline for Postgres. And see how Sequin used a watermark design inspired by Netflix's DBLog.

Sequin blog
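
The watermark idea from DBLog can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Sequin's or Netflix's code: a low watermark is written to the database, a chunk of rows is selected, then a high watermark is written. While replaying the change log, any primary key that changed between the two watermarks is dropped from the chunk, so a stale snapshot row never overwrites a newer change. The function name `reconcile_chunk` and the tuple-based event shapes are hypothetical.

```python
from typing import Any

Event = tuple  # ("watermark", id) or ("change", pk, row); hypothetical shapes


def reconcile_chunk(
    chunk_rows: dict[Any, Any], log_events: list[Event], low: str, high: str
) -> list[Event]:
    """Interleave a snapshot chunk with log events, DBLog-style."""
    output: list[Event] = []
    remaining = dict(chunk_rows)  # pk -> row read by the chunk SELECT
    in_window = False
    for event in log_events:
        if event[0] == "watermark":
            if event[1] == low:
                in_window = True
            elif event[1] == high:
                in_window = False
                # Rows untouched inside the window are safe to emit as reads.
                output.extend(("read", pk, row) for pk, row in remaining.items())
                remaining = {}
            continue
        _, pk, row = event
        if in_window:
            # The log already carries a newer version of this key.
            remaining.pop(pk, None)
        output.append(("change", pk, row))
    return output


chunk = {1: "a-snap", 2: "b-snap"}
log = [
    ("change", 3, "c1"),
    ("watermark", "lw"),
    ("change", 1, "a2"),
    ("watermark", "hw"),
    ("change", 2, "b2"),
]
merged = reconcile_chunk(chunk, log, low="lw", high="hw")
```

In this run the snapshot copy of key 1 is discarded because it changed inside the window, while key 2 is emitted as a read before its later change.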

The #OneBillionRowChallenge (#1BRC) went viral in the Java community earlier this year.

In this #InfoQ talk, Gunnar Morling dives into some of the tricks employed by the fastest solutions for processing the challenge’s 13 GB input file in less than two seconds.

Expect insights into:
• Parallelization and efficient memory access
• Optimized parsing routines with SIMD/SWAR
• Custom map implementations

He also shares personal stories and key takeaways from leading this challenge for and with the community.

A must-watch video: https://bit.ly/3Yl7y3v

#transcript included

#Java #EventStreaming #ChangeDataCapture #SoftwareArchitecture #DataEngineering

1BRC–Nerd Sniping the Java Community

Gunnar Morling discusses some of the tricks employed by the fastest solutions for processing a 13 GB input file in less than two seconds through parallelization and efficient memory access.

InfoQ
PeerDB Streams - Simple, Native Postgres Change Data Capture

We spent the past 7 months building a solid experience to replicate data from Postgres to data warehouses such as Snowflake, BigQuery, ClickHouse and Postgres. Now, we want to expand and bring a similar experience to queues. In that spirit, we are...

PeerDB Blog

Uncover the secrets of CacheFront – Uber’s innovative #caching solution for its in-house distributed database, Docstore!

Learn how it achieves over 40M reads per second & significant latency reductions: https://bit.ly/3T0l7mf

#InfoQ #DistributedCache #Database #SQL #DistributedData #ChangeDataCapture

Uber's CacheFront: Powering 40M Reads per Second with Significantly Reduced Latency

Uber developed an innovative caching solution, CacheFront, for its in-house distributed database, Docstore. CacheFront enables over 40M reads per second from online storage and achieves substantial performance improvements, including a 75% reduction in P75 latency and over 67% reduction in P99.9 latency, demonstrating its effectiveness in enhancing system efficiency and scalability.

InfoQ