Mastodawn

Formal proof verification (#Lean, #Coq integration with #LLMs) was nascent in 2023. #AlphaProof-class systems crossing IMO gold in 2024 accelerated this. By 2029, novel theorem discovery is AI-led. #Schmidt points to this as the domain where #ASI first produces genuinely superhuman output.
Is this true?

davidak Jun 6

DeepMind’s New AI Found A Strange New Way To Think

https://www.youtube.com/watch?v=Dkqzqw8rxXI

The discussed paper:

https://arxiv.org/html/2605.22763v1
https://github.com/google-deepmind/alphaproof-nexus-results

#AI #LLM #DeepMind #AlphaProof #Math #FormalProof #LeanProver #ErdosProblems #Erdős

Winbuzzer May 26

https://winbuzzer.com/2026/05/26/google-deepmind-says-alphaproof-nexus-is-still-not-agi-xcxwbn/

DeepMind’s AlphaProof Nexus solves nine Erdős problems using Lean-checked proofs, signaling a new phase in AI math after OpenAI’s geometry claim.

#AI #AlphaProofNexus #AlphaProof #GoogleDeepMind #Google #GoogleGemini #AIResearch #Erdős #Math

José A. Alonso Apr 5

Readings shared April 4, 2026. https://jaalonso.github.io/vestigium/posts/2026/04/04-readings_shared_04-04-26 #AI #AI4Math #ATP #Agda #AlphaProof #Autoformalization #CategoryTheory #CoqProver #FunctionalProgramming #ITP #IsabelleHOL #LLMs #LambdaCalculus #LeanProver #Lisp #Logic #LogicProgramming #Math #Physics #Programming #Prolog #Racket #RocqProver #Vampire

Readings shared April 4, 2026

The readings shared in Bluesky on 4 April 2026 are: Why Lean?. ~ Leonardo de Moura. #LeanProver #ITP A formalization of the Gelfond-Schneider theorem. ~ Michail Karatarakis, Freek Wiedijk. #LeanProve

Vestigium

José A. Alonso Apr 3

Review of "How we achieved an IMO medal, one year before any other AI system". https://jaalonso.github.io/vestigium/posts/2025/11/14-how-we-achieved-an-imo-medal-one-year-before-any-other-ai-system/ #AI #Math #ITP #LeanProver #AlphaProof

Review of "How we achieved an IMO medal, one year before any other AI

The article "How we achieved an IMO medal, one year before any other AI system" explains how AlphaProof reached silver-medal level at the International Mathematical Olympiad. This milestone

Vestigium

José A. Alonso Nov 24, 2025

Readings shared November 23, 2025. https://jaalonso.github.io/vestigium/posts/2025/11/24-readings_shared_11-23-25 #AI #Agda #AlphaProof #FunctionalProgramming #ITP #IsabelleHOL #LLMs #LeanProver #Math #OCaml #Rocq

Readings shared November 23, 2025

The readings shared in Bluesky on 23 November 2025 are: DeepMind’s latest: An AI for handling mathematical proofs. ~ Jacek Krywko. #AI #Math #LLMs #ITP #LeanProver #AlphaProof Verified certification

Vestigium

José A. Alonso Nov 21, 2025

Review of "DeepMind’s latest: An AI for handling mathematical proofs". https://jaalonso.github.io/vestigium/posts/2025/11/21-deepminds-latest-an-ai-for-handling-mathematical-proofs/ #AI #Math #ITP #LeanProver #AlphaProof

Review of "DeepMind’s latest - An AI for handling mathematical proofs"

The article "DeepMind’s latest: An AI for handling mathematical proofs" presents AlphaProof, an artificial intelligence system capable of reasoning and carrying out mathematical proofs compl

Vestigium

José A. Alonso Nov 21, 2025

DeepMind’s latest: An AI for handling mathematical proofs. ~ Jacek Krywko. https://arstechnica.com/ai/2025/11/deepminds-latest-an-ai-for-handling-mathematical-proofs/ #AI #Math #LLMs #ITP #LeanProver #AlphaProof

DeepMind’s latest: An AI for handling mathematical proofs

AlphaProof can handle math challenges but needs a bit of help right now.

Ars Technica

Agnieszka Serafinowicz Nov 20, 2025

AlphaProof from DeepMind: AI won a silver medal at the Mathematical Olympiad. It comes at a price

Computers are great at calculating but weak at reasoning. Now the Google DeepMind team has announced a breakthrough: AlphaProof, a new AI system, matched the silver medalists of the 2024 International Mathematical Olympiad (IMO).

DeepMind's creation scored 28 points, becoming the seventh entity (alongside six humans) to solve the hardest problem.

As Ars Technica notes, this is a huge success. Until now, AI models could not cope with mathematical proofs because they relied on statistically predicting what "sounds" correct rather than on understanding the structure of mathematics.

TTRL: learning like a human

DeepMind used the architecture known from AlphaZero (the one for games: Go, chess) but added a third, unique element: Test-Time Reinforcement Learning (TTRL). This component mimics the way a human approaches hard problems.

When AlphaProof cannot solve a problem, it creates hundreds of variations of it: simplified, generalized, or loosely related. It then learns by trying to solve these easier versions, to get practice and gain "hands-on experience" while still working on the task.

The price of a silver medal: days and hundreds of TPUs

This success comes at an enormous price, however, which is a key element of any critical assessment. First, time. Humans taking part in the International Mathematical Olympiad had two sessions of four and a half hours each to solve six problems. AlphaProof wrestled with the problems for... several days, using many TPUs (Tensor Processing Units) at once.

As a result, the whole system needed hundreds of TPU-days per problem. As DeepMind admits, the computational requirements are "most likely too costly for most research groups".

On top of that, AlphaProof did not operate fully autonomously. It needed humans to translate the problems into the formal language Lean, and it had to call a second, specialized AI (AlphaGeometry 2) to solve the geometry problem.
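
To give a sense of what translating a problem into Lean involves, here is a minimal sketch in Lean 4. It is not one of the IMO problems and not DeepMind's formalization; the theorem name `even_add_even` and the statement itself are chosen purely for illustration. Informally it says that the sum of two even natural numbers is even.

```lean
-- Illustrative only: a toy statement, far simpler than any IMO problem.
-- Informal claim: if a and b are even natural numbers, then a + b is even.
-- "Even" is expressed here as "remainder 0 when divided by 2".
theorem even_add_even (a b : Nat)
    (ha : a % 2 = 0) (hb : b % 2 = 0) : (a + b) % 2 = 0 := by
  -- `omega` is a built-in decision procedure for linear integer arithmetic;
  -- Lean's kernel then checks the proof it produces.
  omega
```

The part that had to be done by hand in the case described above is the first half: writing the statement (everything before `:= by`) so that it faithfully matches the informal problem. Finding the proof after that is the step the AI system works on.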

In summary, AlphaProof is proof that AI has reached a level of understanding logic, but to replace a human mathematician it still lacks... speed, elegance, and money. So when someone says that AI is much faster than a human, this case shows that we still have the edge.

#aiWMatematyce #alphaproof #deepmind #lean #miedzynarodowaOlimpiadaMatematyczna #nature #news #ttrl

José A. Alonso Nov 15, 2025

Readings shared November 14, 2025. https://jaalonso.github.io/vestigium/posts/2025/11/15-readings_shared_11-14-25 #AI #Agda #AlphaProof #CoqProver #FunctionalProgramming #Haskell #ITP #LeanProver #Math #OCaml #Rocq

Readings shared November 14, 2025

The readings shared in Bluesky on 14 November 2025 are: An introduction to formal real analysis (Lecture 18: Rearrangements). ~ Alex Kontorovich. #ITP #LeanProver #Math Choice trees: Representing and

Vestigium