Recce - Trust, Verify, Ship

@DataRecce
33 Followers
318 Following
203 Posts

Helping data teams preview, validate, and ship data changes with confidence.

https://datarecce.io

Website: https://datarecce.io
LinkedIn: https://www.linkedin.com/company/datarecce
Bluesky: https://bsky.app/profile/datarecce.bsky.social

AI coding tools generate plausible but wrong SQL constantly. The fix isn't waiting for a smarter model.

AI skills are markdown files that encode domain knowledge into coding tools. No framework, just structured text in a repo.

The loop: code → review → handoff → skills update. Every session makes the next one smarter. One aggregation bug became a permanent rule enforced automatically.
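The post doesn't show an actual skill file, but the idea of "structured text in a repo" can be sketched. A hypothetical skill file for the aggregation-bug rule mentioned above might look like this (the name, sections, and wording are illustrative assumptions, not Recce's actual skill format):

```markdown
# Skill: aggregation safety (hypothetical example)

## When to apply
Any SQL that aggregates a column after a join.

## Rule
Never SUM or COUNT a column after a one-to-many join without first
deduplicating or pre-aggregating the many side; the join fans out
rows and inflates the totals.

## Learned from
A review session where joined rows double-counted a metric. Encoding
the fix here makes the rule permanent: every future session loads it.
```

Because the file lives in the repo, the review → handoff → update loop is just a commit: each session that catches a new class of bug appends a rule the next session enforces automatically.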

Dori Wilson broke it all down at Data Debug SF. Full writeup:
https://blog.reccehq.com/ai-skills-for-analytics-eng

#DataEngineering #AI

A Practical Guide to AI Skills for Analytics Engineering

I built a self-improving AI skill system for analytics engineering at Recce. Here's the framework, a real bug it caught, and how we scaled it.

Subagents return summaries to orchestrator, not raw payloads. Built with Claude Agent SDK and MCP.

https://blog.reccehq.com/designing-reliable-ai-agents-for-dbt-data-reviews

#dbt #DataEngineering #AI #MCP

Designing Reliable AI Agents for dbt Data Reviews

Code changes have AI review tools. Data changes don't - until now. Here's how we went from a single prompt to an AI agent that performs the first pass of data validation on every PR.

Our own Kent Chen wrote up the multi-agent architecture the team built for Recce's AI Data Review.

Single agent kept forgetting findings as PRs got complex. Fix: orchestrator + two specialists, each with its own 200k context window.

One subagent fetches full PR context via a single GitHub GraphQL MCP call (replaced 5-10 gh CLI round-trips). The other explores data through 6 Recce MCP tools: lineage_diff, schema_diff, row_count_diff, custom queries.
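The orchestrator-plus-specialists pattern above can be sketched in a few lines. This is an illustrative mock, not the real system: the actual implementation uses the Claude Agent SDK with GitHub GraphQL and Recce MCP tools, while the class and function names here (ContextAgent, DataAgent, orchestrate) are assumptions for the sketch. The key property it demonstrates is the one the post names: subagents return compact summaries to the orchestrator, never raw payloads.

```python
from dataclasses import dataclass

@dataclass
class Summary:
    """What crosses the subagent -> orchestrator boundary: a compact
    summary, so the orchestrator's context window stays small."""
    agent: str
    findings: list[str]

class ContextAgent:
    """Specialist 1: fetches full PR context. In the real system this is
    a single GitHub GraphQL MCP call replacing 5-10 gh CLI round-trips."""
    def run(self, pr_number: int) -> Summary:
        # Raw payload (diff, comments, review threads) stays inside this
        # agent's own context window and is never forwarded wholesale.
        raw_pr = {"number": pr_number, "files": ["models/orders.sql"]}
        return Summary(
            "context",
            [f"PR #{raw_pr['number']} touches {len(raw_pr['files'])} model(s)"],
        )

class DataAgent:
    """Specialist 2: explores data impact. In the real system this uses
    Recce MCP tools such as lineage_diff, schema_diff, row_count_diff."""
    def run(self, models: list[str]) -> Summary:
        checks = ["lineage_diff", "schema_diff", "row_count_diff"]
        return Summary("data", [f"{c}: no regressions" for c in checks])

def orchestrate(pr_number: int) -> list[str]:
    """The orchestrator sees only the two summaries and composes the
    review; each specialist burns its own (in the real system, 200k)
    context window on raw data."""
    ctx = ContextAgent().run(pr_number)
    data = DataAgent().run(["models/orders.sql"])
    return ctx.findings + data.findings

print(orchestrate(42))
```

The design point is the boundary: a single agent that pulls raw diffs and query results into one context eventually evicts its own earlier findings, which is the "forgetting" failure the post describes; summaries keep the orchestrator's view small no matter how large the PR gets.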

"Pandas in 2011 was essentially book-driven development, quite literally." Wes wrote features because he needed them for the book chapter.

#pandasPython #DataRenegades

"If I had taken three years longer to do things the right way, it would have been too late." -- Wes McKinney on why pandas shipped imperfect and won.

#pandasPython #DataRenegades

"This is bad. Why haven't you fixed this yet? I would have already fixed this today with Claude Code." -- Wes McKinney on radical accountability for software vendors.

#AIcodingagent #DataRenegades

Starting soon: Bauplan x Recce live session on safe AI automation from branch to production. https://luma.com/mm3gsalo?tk=S6sKK7

#DataEngineering #AI

Trusting AI with Your Data: Safe Automation from Branch to Production · Zoom · Luma

AI coding assistants already help you ship software faster. But when it comes to data, the stakes are different. One bad pipeline change can corrupt production…

"I would just wake up and write Python code." Wes McKinney on the founder hours that created pandas.

#DataRenegades #pandasPython

Tomorrow 9 AM PT | Bauplan and Recce walk through a branch-to-production workflow where AI-generated pipeline changes run on isolated branches and get reviewed automatically before hitting production. https://luma.com/mm3gsalo?tk=S6sKK7

Scott Breitenother jumped into every Slack thread at Brooklyn Data. It made the company fast and him the bottleneck. His fix: subscribe to replies, don't comment, check back in 3 hours.

#Leadership #DataTeams