Leanstral: Open-source agent for trustworthy coding and formal proof engineering

Lean 4 paper (2021): https://dl.acm.org/doi/10.1007/978-3-030-79876-5_37

https://mistral.ai/news/leanstral

It’s great to see this pattern of people realising that agents can specify the desired behavior then write code to conform to the specs.

TDD, verification, whatever your tool; verification suites of all sorts accrue over time into a very detailed repository of documentation of how things are supposed to work that, being executable, puts zero tokens in the context when the code is correct.

It’s more powerful than reams upon reams of markdown specs. That’s because it encodes details, not intent. Your intent is helpful at the leading edge of the process, but the codified result needs shoring up to prevent regression. That’s the area software engineering has always ignored because we have gotten by on letting teams hold context in their heads and docs.

As software gets more complex we need better solutions than “go ask Jim about that, bloke’s been in the code for years”.

AI is the reality that TDD never before had the opportunity to live up to

Not just TDD. Amazon, for instance, is heading towards something between TDD and lightweight formal methods.

They are embracing property-based specifications and testing à la Haskell's QuickCheck: https://kiro.dev

Then, already in formal methods territory, refinement types (e.g. Dafny, Liquid Haskell) are great and less complex than dependent types (e.g. Lean, Agda).

Kiro: Agentic AI development from prototype to production

Kiro helps you do your best work by bringing structure to AI coding with spec-driven development.

Kiro is such garbage though
If you add why you think so we might learn something.
The same prompt in the same project gives different results/slightly worse results compared to Claude Code, both using Opus model.