Recce - Trust, Verify, Ship

@DataRecce
31 Followers
318 Following
212 Posts

Helping data teams preview, validate, and ship data changes with confidence.

Website: https://datarecce.io
LinkedIn: https://www.linkedin.com/company/datarecce
Bluesky: https://bsky.app/profile/datarecce.bsky.social

A data review flagged 99.999% row-count variance. The PR was two lines.

Base was five years of production history. Current was a one-hour CI build. Neither was wrong. They were built for different jobs.

False alarms like this train reviewers to scroll past real variance. That is the damage.

https://blog.reccehq.com/session-base-per-pr-why-data-reviews-lie
#dbt #DataEngineering #AnalyticsEngineering

Session Base per PR: Why Data Reviews Lie

Data PR review breaks when the base and current environments are built differently. Here is why, and how session base per PR fixes the false alarms.
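
To make the row-count false alarm above concrete, here is a minimal sketch of how a relative row-count variance could be computed. The function name and both row counts are hypothetical, chosen only for illustration; this is not Recce's implementation and the numbers are not taken from the original PR.

```python
def row_count_variance(base_rows: int, current_rows: int) -> float:
    """Return the absolute row-count difference as a percentage of the base."""
    if base_rows == 0:
        raise ValueError("base_rows must be non-zero")
    return abs(current_rows - base_rows) / base_rows * 100.0


# Hypothetical counts: a long-lived production table vs. a small CI build.
base_rows = 100_000_000
current_rows = 1_000

variance = row_count_variance(base_rows, current_rows)
print(f"{variance:.3f}% row-count variance")
```

With counts like these, the variance is close to 100% even if the PR changes no logic at all, because the two environments hold different amounts of data. Comparing a base and a current build that were produced the same way removes that source of noise.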

Data Renegades | Ep. #11, Contrarian Bets and AI Skepticism with Michael Stonebraker | Heavybit

On episode 11 of Data Renegades, CL Kao sits down with Michael Stonebraker, legendary database pioneer and creator of Ingres and Postgres.

Benchmarks Lie: What a Turing Award Winner Found When He Tested Text-to-SQL on Real Data

Text-to-SQL benchmarks show 80% accuracy. A Turing Award winner tested the same models on a real 1,400-table warehouse and got 10%. Here is why.

The demos show new code being generated from scratch. The hard, high-value problem is navigating systems that have been migrated, patched, and extended by dozens of people over 30 years.

That's the opportunity. It's just not the one getting the press.

90% of enterprise programmers spend their time on maintenance, not greenfield development.

Michael Stonebraker's take on where AI actually earns its keep inverts the marketing narrative completely.

Michael Stonebraker was right about CODASYL. Right about NoSQL. Now he's run text-to-SQL on a real enterprise warehouse and got 10% accuracy against an 80% benchmark.

The pattern is hard to ignore.

https://blog.reccehq.com/benchmarks-lie-what-a-turing-award-winner-found-when-he-tested-text-to-sql-on-real-data

#DataEngineering #TextToSQL #AI

Before you let agents touch your codebase, build these gates.

Not because you don't trust the agent, but because you wouldn't trust anyone without them. Including yourself.

https://blog.reccehq.com/before-you-let-agents-touch-your-codebase-build-these-gates #ClaudeCode #AIAgents #DevWorkflow

A wiki is something you look at. A shared AI system is something you work through.

When knowledge lives inside the workflow, it stays current. Every time someone runs a skill, outdated entries get noticed. Gaps get filled.

https://blog.reccehq.com/we-didnt-set-out-to-build-a-team-ai-plugin

#ClaudeCode #DataEngineering

We Didn't Set Out to Build a Team AI Plugin

How Recce built a Claude Code plugin to share team knowledge, voice, product context, and workflows across every AI session.

"Arrow has the intricacy of a fine Swiss watch." The co-creator of Apache Arrow on why AI agents cannot replicate decade-long infrastructure design.

#ApacheArrow #DataRenegades

AI coding tools generate plausible but wrong SQL constantly. The fix isn't waiting for a smarter model.

AI skills are markdown files that encode domain knowledge into coding tools. No framework, just structured text in a repo.

The loop: code → review → handoff → skills update. Every session makes the next one smarter. One aggregation bug became a permanent rule enforced automatically.

Dori Wilson broke it all down at Data Debug SF. Full writeup:
https://blog.reccehq.com/ai-skills-for-analytics-eng

#DataEngineering #AI

A Practical Guide to AI Skills for Analytics Engineering

I built a self-improving AI skill system for analytics engineering at Recce. Here's the framework, a real bug it caught, and how we scaled it.