[New Blog Post] State of Knuckledragger III: Kernel Changes, Symbolic Union, AI, and more https://www.philipzucker.com/state_o_knuck_3/ #python #logic #theoremproving

Maybe a good idea to take stock of how Knuckledragger https://github.com/philzook58/knuckledragger is going for myself and whoever might be interested.

Mehtaab Sawhney (@mehtaab_sawhney)

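A proof of the math problem 'Erdos #846' generated by an internal OpenAI model has been published as a paper (link included). The result can also be derived from the existing literature, but the case is notable because the internal model produced the proof itself. The author says he was impressed by the model's proof.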

https://x.com/mehtaab_sawhney/status/2026716221933343147

#openai #theoremproving #airesearch #mathematics

Mehtaab Sawhney (@mehtaab_sawhney) on X

We just posted a paper solving Erdos #846, which was solved by an internal model at OpenAI (https://t.co/TXz7cPCQRH). While the problem can also be derived from an earlier paper in the literature, the proof by the internal model was one of the first instances where I smiled

Your terminal can now help you with formal proofs and theorem provers 🤯

📐 **lean-tui** — A TUI for visualizing Lean programs and proofs

💯 Live proof trees, data/effect flow views & real-time updates from your editor

🦀 Written in Rust & built with @ratatui_rs

⭐ Source: https://codeberg.org/wvhulle/lean-tui

#rustlang #ratatui #tui #lean #theoremproving #cli #devtools #terminal

Lean 4 is apparently the new secret sauce of #AI dominance, because who knew that theorem proving could be so *riveting*? 🤔✨ But don't worry, before you can learn how to take over the world with math, you'll need to pass the Vercel Security Checkpoint IQ test, where only the chosen ones with #JavaScript enabled may proceed. 🛂🔒
https://venturebeat.com/ai/lean4-how-the-theorem-prover-works-and-why-its-the-new-competitive-edge-in #Lean4 #TheoremProving #VercelSecurity #HackerNews #ngated

Kimon Fountoulakis (@kfountou)

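The author does not discount the model's ability to correctly combine a short chain of existing arguments, and says this is impressive and will change how mathematicians work. He notes, however, that it is uncertain whether these results fully validate the view that the problems are inherently much harder, as humans had believed.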

https://x.com/kfountou/status/2022671762173854080

#ai #theoremproving #automatedreasoning #research

Kimon Fountoulakis (@kfountou) on X

@harshit_sikchi I am not discounting the model's ability to correctly combine a small sequence of existing arguments. This is impressive and will change how mathematicians work. All I am saying is that it remains unclear whether humans were right to believe these problems required significantly

Jakub Pachocki (@merettm)

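The author expresses high expectations for the 'First Proof' challenge and stresses that novel frontier research is important for evaluating the capabilities of next-generation AI models. He says his team ran its internal model on the ten proposed problems with limited human supervision, which suggests an important experiment in evaluating AI's mathematical proving ability and autonomy.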

https://x.com/merettm/status/2022517085193277874

#firstproof #ai #theoremproving #research #ml

Jakub Pachocki (@merettm) on X

Very excited about the "First Proof" challenge. I believe novel frontier research is perhaps the most important way to evaluate capabilities of the next generation of AI models. We have run our internal model with limited human supervision on the ten proposed problems. The

Kimon Fountoulakis (@kfountou)

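A forecast that autonomous agents will become able to generate and prove mathematical conjectures on their own. The post warns that PhD programs and the role of researchers may change with the arrival of AI-based automation tools, and points to the spread of automated theorem proving and research assistance.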

https://x.com/kfountou/status/2022756585131429994

#aiagents #theoremproving #automatedresearch #ai

Kimon Fountoulakis (@kfountou) on X

@Bayesprof @SebastienBubeck @boazbaraktcs Soon we will also have agents that autonomously generate conjectures and prove them. So PhDs will need to get better at this too.

First Proof (#1stProof): We ran an AI-only workflow (no human mathematical input) and published a writeup + outputs.
Report: https://althofer.de/first-proof-competition/first-proof-report.html
Official: https://1stproof.org/
I’d appreciate critique, especially rigor/correctness checks and suggestions for better verification.
#1stProof #Mathematics #TheoremProving #AI
Team Wolz & Althofer

Quoc Le (@quocleix)

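New research released, 'Semi-Autonomous Mathematics Discovery with Gemini': the authors used Gemini to systematically evaluate 700 open conjectures in the Erdős Problems database, addressed 13 problems, and report finding novel autonomous solutions for 5 of them. An important example of research on autonomous mathematical discovery.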

https://x.com/quocleix/status/2018402933193539735

#gemini #theoremproving #airesearch #mathdiscovery

Quoc Le (@quocleix) on X

Excited to share our latest work: "Semi-Autonomous Mathematics Discovery with Gemini." We used Gemini to systematically evaluate 700 "open" conjectures in the Erdős Problems database. The result? We addressed 13 problems marked as open—finding 5 novel autonomous solutions and
