As a crude first approximation, the problem-solving component of mathematical research (which, one should stress, is not the *only* aspect of such research) can be decomposed into three subcomponents:

1. Proof generation (finding a solution to a given problem);
2. Proof verification (checking that a proposed solution actually works); and
3. Proof digestion (understanding the essence of a solution, placing it in context with previous literature, summarizing and explaining it effectively, and gaining insights on other related problems and topics).

Until recently, all three of these subtasks were rather difficult and time-intensive to perform; but a human mathematician (or a collaboration between several mathematicians) who had invested the effort to both generate a proof and verify it usually gained enough understanding of the structure of that proof that they could also digest it effectively. Because of this, our community has generally been content to emphasize the proof generation and verification aspects of problem solving, as the proof digestion tended to emerge naturally as an organic byproduct of the first two. This was also convenient for assessing proof efforts, as the generation and verification tasks had well-defined objectives, whilst proof digestion was a more subjective and open-ended process. (Though "the ability to present the result at a research conference and take questions" is a rough first approximation of a metric for whether a proof has been digested.) (1/3)

However, recent advances in both AI and proof formalization have begun to vastly accelerate and automate the first two components of this process. This is leading to a new type of "impedance mismatch": problems for which solutions can be rapidly generated and verified in a mostly automated process, but for which no human author has understood the arguments well enough to initiate the (much slower) digestion process.

In fact, with the current cultural incentives that reward the first authors to "solve" the problem, rather than the later authors who "digest" the solution, one may end up with the perverse situation in which an AI-generated (and formally verified) solution to a problem, presented to the community without any significant digestion, may actually *inhibit* the progress of the field that the problem lies in, by discouraging any further attempts to work on the problem, simplify and explain the proof, and extract broader insights. (2/3)

Perhaps having more refined terminology to describe solutions would help. For instance, a proof that is proposed but unverified should be viewed as 1/3 of a solution at best, while a proof that is both proposed and verified remains only 2/3 of a solution until a proper, well-digested presentation of the results has been made.

As a concrete example: I would currently rate the recent progress on Erdos problem #1169 (EDIT: this should be #1196), which has created some buzz on social media, as constituting 2/3 of a solution in this framework: a proof has been generated and verified, but the broader impacts of the argument on the rest of the field are still being assessed. Eventually, I expect a fully satisfactory solution to this problem to be obtained, but this may take more time than one is now accustomed to in this era of rapid proof generation and proof verification. (3/3)

@tao https://bcantrill.dtrace.org/2026/04/12/the-peril-of-laziness-lost/ this went around the other day as a line of thought that I think applies here too: tooling makes awkward, messy solutions possible, but can lead us into engineering dead ends, so to speak.

Human laziness searches for simpler solutions, because otherwise we just struggle to move forward. Losing laziness as a force will mean fewer easy-to-use foundations… maybe.


@tao Do you mean problem # 1196?
@domotorp @tao Also wondering if #1196 is meant. If so, how are you thinking about the note at https://www.ulam.ai/research/erdos1196-note.pdf relative to the 2/3 framing? It's dated 15 April and does a lot of what step 3 asks for: history back to Erdős 1935, discussion of potential applications, etc., and it also credits the forum discussion, including @tao. For broader field impact, we'll have to wait and see, the same as for a slowly written and developed paper with a human-generated proof.

@krystalguo It's definitely 1196; that's the one that was solved with ChatGPT, and whose proof was simplified and much improved by people digesting it.


@krystalguo @domotorp Yes, I meant #1196; this is now corrected.

This AI-generated summary has the outward form of proof digestion, but is incomplete, as it is drawn mostly from the web forum discussion at a snapshot in time and omits several prior references, as well as subsequent developments (such as the further simplification of the argument using flow networks). These shortcomings are not readily apparent from the internal content of the summary itself, but instead require expert comparison with the relevant literature and discussion. So these sorts of AI-generated documents do not truly solve the digestion problem, and treating them as such may in fact leave readers inadequately informed about the full relationship between a given argument and the rest of the literature.

As I mentioned in the original post, my rule of thumb for whether proof digestion has truly occurred is whether a human can present the proof at a research talk and give intelligent answers to followup questions about that talk. Having an AI-generated summary such as this may help such a human gain a superficial understanding of a problem, its solution, and its broader context, but may not necessarily provide a deep enough understanding to answer such followup questions adequately.

@tao I created my mathstodon account today, April 26, 2026, to follow the interesting threads where Terence posts about mathematics and AI.

@tao

I see your argument, but... given the number of proofs produced every year, assessing each of them against "the rest of the knowledge" doesn't scale that well. So get prepared: AI will enter the game hard, and more and more, every year.

@tao we are at the start of a New Enlightenment. LLMs make it easy to get empirical results quickly, form theories, then verify them with predictive tests.
The issue is we can verify theories before we understand them. Personally: I have an unpublished formal theory for my own public work on agent reliability, thanks to AI. But it is at the edge of my understanding.
I'm handling this by posting one concept at a time. Each one builds my understanding. Once done, they will make up the paper I will publish.
@tao, when are you going to announce sair 01
Mathematics Distillation Challenge - Equational Theories - Stage 1?
@tao Definitely, colleague. However, sooner or later regulation will have its essential effect, with the most probable change coming either from politics or from corporations. We, as universal defenders of freedom, have to demonstrate that AI is our horse, and that as our horse it needs to be used for motives that are socially, morally, ethically, and scientifically right. A greeting. 🤵👍🍻🍻🍻