Recce - Trust, Verify, Ship

@DataRecce
33 Followers
318 Following
203 Posts

Helping data teams preview, validate, and ship data changes with confidence.

https://datarecce.io

Website: https://datarecce.io
LinkedIn: https://www.linkedin.com/company/datarecce
Bluesky: https://bsky.app/profile/datarecce.bsky.social

AI coding tools generate plausible but wrong SQL constantly. The fix isn't waiting for a smarter model.

AI skills are markdown files that encode domain knowledge into coding tools. No framework, just structured text in a repo.

The loop: code → review → handoff → skills update. Every session makes the next one smarter. One aggregation bug became a permanent rule enforced automatically.
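The post doesn't show an actual skill file, but the idea of "structured text in a repo" can be sketched. A hypothetical skill file for the aggregation-bug rule mentioned above might look like this (the name, sections, and wording are illustrative assumptions, not Recce's actual skill format):

```markdown
# Skill: aggregation safety (hypothetical example)

## When to apply
Any SQL that aggregates a column after a join.

## Rule
Never SUM or COUNT a column after a one-to-many join without first
deduplicating or pre-aggregating the many side; the join fans out
rows and inflates the totals.

## Learned from
A review session where joined rows double-counted a metric. Encoding
the fix here makes the rule permanent: every future session loads it.
```

Because the file lives in the repo, the review → handoff → update loop is just a commit: each session that catches a new class of bug appends a rule the next session enforces automatically.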

Dori Wilson broke it all down at Data Debug SF. Full writeup:
https://blog.reccehq.com/ai-skills-for-analytics-eng

#DataEngineering #AI

A Practical Guide to AI Skills for Analytics Engineering

I built a self-improving AI skill system for analytics engineering at Recce. Here's the framework, a real bug it caught, and how we scaled it.

Subagents return summaries to orchestrator, not raw payloads. Built with Claude Agent SDK and MCP.

https://blog.reccehq.com/designing-reliable-ai-agents-for-dbt-data-reviews

#dbt #DataEngineering #AI #MCP

Designing Reliable AI Agents for dbt Data Reviews

Code changes have AI review tools. Data changes don't - until now. Here's how we went from a single prompt to an AI agent that performs the first pass of data validation on every PR.

Our own Kent Chen wrote up the multi-agent architecture the team built for Recce's AI Data Review.

Single agent kept forgetting findings as PRs got complex. Fix: orchestrator + two specialists, each with its own 200k context window.

One subagent fetches full PR context via a single GitHub GraphQL MCP call (replaced 5-10 gh CLI round-trips). The other explores data through 6 Recce MCP tools: lineage_diff, schema_diff, row_count_diff, custom queries.
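The orchestrator-plus-specialists pattern above can be sketched in a few lines. This is an illustrative mock, not the real system: the actual implementation uses the Claude Agent SDK with GitHub GraphQL and Recce MCP tools, while the class and function names here (ContextAgent, DataAgent, orchestrate) are assumptions for the sketch. The key property it demonstrates is the one the post names: subagents return compact summaries to the orchestrator, never raw payloads.

```python
from dataclasses import dataclass

@dataclass
class Summary:
    """What crosses the subagent -> orchestrator boundary: a compact
    summary, so the orchestrator's context window stays small."""
    agent: str
    findings: list[str]

class ContextAgent:
    """Specialist 1: fetches full PR context. In the real system this is
    a single GitHub GraphQL MCP call replacing 5-10 gh CLI round-trips."""
    def run(self, pr_number: int) -> Summary:
        # Raw payload (diff, comments, review threads) stays inside this
        # agent's own context window and is never forwarded wholesale.
        raw_pr = {"number": pr_number, "files": ["models/orders.sql"]}
        return Summary(
            "context",
            [f"PR #{raw_pr['number']} touches {len(raw_pr['files'])} model(s)"],
        )

class DataAgent:
    """Specialist 2: explores data impact. In the real system this uses
    Recce MCP tools such as lineage_diff, schema_diff, row_count_diff."""
    def run(self, models: list[str]) -> Summary:
        checks = ["lineage_diff", "schema_diff", "row_count_diff"]
        return Summary("data", [f"{c}: no regressions" for c in checks])

def orchestrate(pr_number: int) -> list[str]:
    """The orchestrator sees only the two summaries and composes the
    review; each specialist burns its own (in the real system, 200k)
    context window on raw data."""
    ctx = ContextAgent().run(pr_number)
    data = DataAgent().run(["models/orders.sql"])
    return ctx.findings + data.findings

print(orchestrate(42))
```

The design point is the boundary: a single agent that pulls raw diffs and query results into one context eventually evicts its own earlier findings, which is the "forgetting" failure the post describes; summaries keep the orchestrator's view small no matter how large the PR gets.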

"Pandas in 2011 was essentially book-driven development, quite literally." Wes wrote features because he needed them for the book chapter.

#pandasPython #DataRenegades

"If I had taken three years longer to do things the right way, it would have been too late." -- Wes McKinney on why pandas shipped imperfect and won.

#pandasPython #DataRenegades

"This is bad. Why haven't you fixed this yet? I would have already fixed this today with Claude Code." -- Wes McKinney on radical accountability for software vendors.

#AIcodingagent #DataRenegades

Starting soon: Bauplan x Recce live session on safe AI automation from branch to production. https://luma.com/mm3gsalo?tk=S6sKK7

#DataEngineering #AI

Trusting AI with Your Data: Safe Automation from Branch to Production · Zoom · Luma

AI coding assistants already help you ship software faster. But when it comes to data, the stakes are different. One bad pipeline change can corrupt production…

"I would just wake up and write Python code." Wes McKinney on the founder hours that created pandas.

#DataRenegades #pandasPython

Tomorrow 9 AM PT | Bauplan and Recce walk through a branch-to-production workflow where AI-generated pipeline changes run on isolated branches and get reviewed automatically before hitting production. https://luma.com/mm3gsalo?tk=S6sKK7

Scott Breitenother jumped into every Slack thread at Brooklyn Data. It made the company fast and him the bottleneck. His fix: subscribe to replies, don't comment, check back in 3 hours.

#Leadership #DataTeams