


StarkWare's S-two prover provides an efficient means for establishing, on blockchain, that a program written in the Cairo virtual machine language runs to completion. The latter claim is encoded by an algebraic intermediate representation (AIR) that captures the semantics of the Cairo language. The AIR asserts the existence of tables of values from a finite field satisfying certain algebraic constraints. A cryptographic interactive proof system, circle STARK, provides an efficiently-checked certificate that the AIR is satisfied. We describe our verification, using the Lean 4 proof assistant, that the AIR encoding is sound, which is to say, the satisfiability of the AIR implies the computational claim.

While Large Language Models (LLMs) have shown strong performance in generating formal proofs, their outputs often remain less readable, modular, maintainable, and reusable than proofs in mature formal mathematics libraries. We argue that this gap stems in part from the compile-first objective implicit in most proof-generation pipelines, which encourages monolithic or ad hoc proof scripts rather than library-quality artifacts. Existing approaches to proof-quality improvement often rely on explicit, computable optimization objectives. In practice, however, the most tractable and experimentally validated objectives are largely length-based, while higher-level qualities such as readability, modularity, maintainability, and reusability are difficult to reduce to reliable automatic metrics. Instead of optimizing proof improvement against a single proxy metric, we take a process-guided approach inspired by human proof-refactoring workflows. We propose an agentic framework $\textbf{Proof-Refactor}$ that decomposes proof refactoring into four phases: extracting candidate proof fragments, designing helper declarations, formally proving the extracted and designed components, and repairing the original proof using the verified components. On generated Lean proofs from PutnamBench and Putnam2025, Proof-Refactor improves rubric-based refactoring scores over a strong Claude Code refactoring baseline, with the largest gains in signature quality and human readability. These results suggest that process-guided refactoring can improve proof structure without treating proof length as the primary objective.
𝗟𝗲𝗮𝗻 𝗣𝗼𝗼𝗹
"Lean Pool is a repo for preserving substantial sorry-free formalizations that are valuable but do not fit naturally into mathlib's scope."
Read the rest of the post by Vasily Ilin on LinkedIn:
https://www.linkedin.com/posts/vasilyilin_lean-mathlib-lean4-share-7467787563861266432-CEpZ/

Today I am releasing Lean Pool - the result of 100+ human hours and 1000+ agent hours. Lean Pool is a repo for preserving substantial sorry-free formalizations that are valuable but do not fit naturally into mathlib’s scope. At the moment we live in the paradigm of "Mathlib or it didn't happen" (to quote Oliver Nash and Kevin Buzzard), which leads to Mathlib getting overwhelmed. It does not have to be this way. Informal mathematics has arXiv, and I believe that #Lean should have something similar. Lean Pool currently contains 60 imported projects, all licensed with Apache-2 and MIT, all compiling under v4.31-rc1, with a total of 400k LOC, with docs built on each PR. The CI consists of deterministic linters that are stricter than mathlib, LLM review for PRs, and profiling for compilation speed. The philosophy is to ensure as much code quality as possible without spending the valuable time of mathlib maintainers. I hope that Lean Pool gives a place to pool together large one-off formalizations, including AI formalizations, which will likely never get merged in mathlib. I also hope that the Lean Pool encourages code reuse among non-mathlib projects, with some new formalizations using Lean Pool as a dependency. If you want to contribute your formalization project, simply tell your agent to make a PR and get it to pass CI; in my experience this takes the agent 1-5 hours for a ~10k LOC project. At the moment this is a personal project. In particular, I am paying for the LLM reviews and agents myself. If anyone is interested in a collaboration, I am happy to talk. Lean Pool was started by myself and Justin Asher as part of the UW Lean Hackathon hosted by the UW Math AI lab. Any feedback is welcome! Big thanks to Leonardo de Moura, the #mathlib maintainers and the broader OSS Lean community! Let's pool our Lean together! Repo: https://lnkd.in/g6TFNAXF Docs: https://lnkd.in/gyWc2QTY #Lean4 #mathlib4 #formalization | 10 comments on LinkedIn

We present Lean-GAP (Lean-Graduate Agebra Problems), 430 formalized graduate-level algebra problems from the textbook Abstract Algebra by Dummit and Foote. We develop a scalable pipeline consisting of PDF-to-LaTeX preprocessing, autoformalization into Lean 4, and verification of informal-formal correspondence. While the preprocessing and autoformalization stages can be largely automated, we find that verification remains the most subtle and labor-intensive component, requiring careful human oversight. Our contributions include (i) the construction of a structured dataset of formalized exercises, (ii) a systematic methodology for formalizing textbook mathematics, and (iii) an analysis of recurring challenges in the formalization process. We also compare the performance of different autoformalization models and highlight key bottlenecks in translating informal statements into formal language.

El reto de esta semana consiste en demostrar con Lean4 que existen infinitos números primos. Para ello, completar la siguiente teoría de Lean4: import Mathlib.Tactic import Mathlib.Data.Nat.Prime.Defs example (n : ℕ) : ∃ p, n ≤ p ∧ Nat.Prime p := by sorry El enunciado se encuentra en Lean 4 Web. Las soluciones pueden publicarse hasta las 21:00 del domingo 7 de junio de 2026. #RetoLean4 #31May26