A data review flagged 99.999% row-count variance. The PR was two lines.

Base was five years of production history. Current was a one-hour CI build. Neither was wrong. They were built for different jobs.

False alarms like this train reviewers to scroll past real variance. That is the damage.
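
To see how a figure like that arises, here is a minimal sketch of the arithmetic. The row counts and the `row_count_variance` helper are hypothetical, chosen only to illustrate a long production history compared against a short CI build; they are not taken from the review in question.

```python
def row_count_variance(base_rows: int, current_rows: int) -> float:
    """Relative row-count change from base to current, as a percentage."""
    if base_rows == 0:
        raise ValueError("base has no rows; relative variance is undefined")
    return abs(current_rows - base_rows) / base_rows * 100


# Hypothetical counts: years of production history vs. a one-hour CI build.
prod_history_rows = 100_000_000
ci_build_rows = 1_000
mismatched = row_count_variance(prod_history_rows, ci_build_rows)

# Hypothetical base built over the same one-hour window as the CI build.
same_window_base_rows = 1_010
matched = row_count_variance(same_window_base_rows, ci_build_rows)

print(f"mismatched environments: {mismatched:.3f}%")
print(f"matched environments:    {matched:.3f}%")
```

With a base built the same way as current, the remaining variance reflects the change under review rather than the difference in build windows.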

https://blog.reccehq.com/session-base-per-pr-why-data-reviews-lie
#dbt #DataEngineering #AnalyticsEngineering

Session Base per PR: Why Data Reviews Lie

Data PR review breaks when the base and current environments are built differently. Here is why, and how session base per PR fixes the false alarms.

Auto-diff every model on every PR? Tempting.
But you’ll get ⚠️ dozens of alerts, most irrelevant.

CI without context = alert spam.

Real-world data work needs more than diffs: what changed, why, and what to do.

Human judgment matters.
Recce helps automate with opinions.

👉 https://datarecce.io/blog/more-than-data-diff/

#dataengineering #datadiff #analyticsengineering #datavalidation

Data diff tells you something changed, but NOT:

❓ Was it expected?
⛏️ Worth investigating?
🔗 What depends on it?

Let’s stop mistaking output for insight.
See how 👉 https://datarecce.io/blog/more-than-data-diff/

#dataengineering #datadiff #analyticsengineering #datavalidation

"The value didn't justify the effort."
A Sr. Data Engineer at Swedish MediaTech after their Datafold PoC.

❌ Heavy setup → Noisy results → Alert fatigue → No way to start small
Comparison: https://datarecce.io/blog/recce-vs-datafold/

#DataEngineering #DataValidation #analyticsengineering

Don’t start with what changed. Start with what SHOULD change!

Because not every diff is a problem, and not every problem shows up as a diff.

👉 https://datarecce.io/blog/more-than-data-diff/

#dataengineering #datadiff #analyticsengineering #datavalidation

Want to better understand how your data models work, and what might break when they change?

Here are 5 types of column transformations in #dbt models:
1. Pass-through
2. Renamed
3. Derived
4. Source
5. Unknown

Each one helps you assess impact and trace data through your pipeline more clearly

We use these types in Recce to power column-level lineage and breaking change analysis
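
As a rough illustration of the five types, here is a toy classifier. The definitions encoded below are assumptions for the sake of the example (pass-through: selected unchanged; renamed: selected under an alias; derived: built from an expression; source: a literal with no upstream column; unknown: nothing usable to inspect). It works on plain strings with regular expressions and is not Recce's implementation; the `classify` function and the sample columns are hypothetical.

```python
import re
from typing import Optional

# A bare column reference such as `user_id` (no functions or operators).
IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
# A simple literal: number, single-quoted string, NULL, TRUE, or FALSE.
LITERAL = re.compile(r"^(\d+(\.\d+)?|'[^']*'|null|true|false)$", re.IGNORECASE)


def classify(output_name: str, expression: Optional[str]) -> str:
    """Classify one SELECT column from its output name and its expression."""
    if expression is None or not expression.strip():
        return "unknown"  # nothing to inspect
    expr = expression.strip()
    if LITERAL.match(expr):
        return "source"  # assumed: no dependency on an upstream column
    if IDENT.match(expr):
        same_name = expr.lower() == output_name.lower()
        return "pass-through" if same_name else "renamed"
    return "derived"  # any function call, operator, or CASE expression


# Hypothetical columns of a model: (output name, expression).
columns = [
    ("user_id", "user_id"),
    ("customer_id", "id"),
    ("total", "price * quantity"),
    ("origin", "'web'"),
    ("mystery", None),
]
result = {name: classify(name, expr) for name, expr in columns}
for name, kind in result.items():
    print(f"{name}: {kind}")
```

A string-based rule like this breaks on qualified names, quoting, and nested queries, which is why a real implementation would work from a parsed SQL tree.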

Read the deep dive
https://datarecce.io/blog/column-level-lineage-internals/

#SQL #Data #OpenSource #DataEngineering #AnalyticsEngineering

How Recce Performs Column-Level Lineage: Our Approach to SQL Transformations

A technical deep dive into how Recce constructs column-level lineage using SQLGlot. We break down scope traversal, AST analysis, transformation classification, and the challenges involved in building reliable lineage across complex SQL models.

Recce

Stop duplicating dashboards to preview the impact of dbt model changes!

With Recce, you can:
✅ Diff models
✅ Record data impact
✅ Share results instantly: no dashboards, no SQL, no screenshots

One click. Instant clarity:

https://medium.com/inthepipeline/stop-duplicating-dashboards-just-to-preview-dbt-changes-21ff0ba8dd21

#dbt #DataOps #SQL #data #analyticsengineering #DataEngineering

Stop duplicating dashboards just to preview dbt changes

If you’re building temporary dashboards to validate changes to dbt models, you’re not alone. This is a common practice by data teams when sharing the impact of new modeling changes. The issue with…

In the Pipeline

GenAI just got real for analytics engineers. See how dbt Copilot is changing workflows in our latest blog 👇

https://dataroots.ghost.io/genai-is-taking-over-the-modern-data-stack-and-dbt-copilot-is-leading-the-way-in-analytics-engineering-2/

#GenAI #dbt #AnalyticsEngineering #DataStack

Dataroots

Thoughts, stories and ideas.

Column-level lineage is now available in Recce 0.57

Add it to your dbt data validation workflow:

1. Lineage Diff - Focus on impacted models
2. Breaking Change Analysis - Eliminate irrelevant changes
3. Column-Level Lineage - Track column evolution

#dbt #DataEngineering #Data #SQL #Analytics #AnalyticsEngineering

❌ No more manually cross-referencing dbt docs from dev and prod

❌ No more manually checking schemas in your data warehouse

❌ No more manually comparing row counts on models

See a trend?

Read how an experienced data professional validates zero regression on a #dbt PR:

https://www.linkedin.com/posts/abdelm_recce-dbt-activity-7300436694808358914-aKiF?rcm=ACoAAAVLBgkB6r61wCsOcKgFDrtf_EEEe3UjdXs

#DataEngineering #dbt #Data #Analytics #AnalyticsEngineering #SQL #BigQuery #OpenSource

Abdel. M. on LinkedIn: #recce #dbt

🚀 Speeding up data validation with Recce. Context: I have 2 envs, preprod and prod, with an Airflow that runs hundreds of dbt models and generates…