As a crude first approximation, the problem-solving component of mathematical research (which, one should stress, is not the *only* aspect of such research) can be decomposed into three subcomponents:

1. Proof generation (finding a solution to a given problem);
2. Proof verification (checking that a proposed solution actually works); and
3. Proof digestion (understanding the essence of a solution, placing it in context with previous literature, summarizing and explaining it effectively, and gaining insights on other related problems and topics).

Until recently, all three of these subtasks were rather difficult and time-intensive to perform; but a human mathematician (or a collaboration between several mathematicians) who had invested the effort to both generate a proof and verify it usually gained enough understanding of the structure of that proof that they could also digest it effectively. Because of this, our community has been generally content to emphasize the proof generation and verification aspects of problem solving, as the proof digestion tended to emerge naturally as an organic byproduct of the first two. This was also convenient for assessing proof efforts, as the generation and verification tasks had well-defined objectives, whilst proof digestion was a more subjective and open-ended process. [Though "the ability to present the result at a research conference and take questions" is a rough first approximation of a metric for whether a proof has been digested.] (1/3)

However, recent advances in both AI and proof formalization have begun to vastly accelerate and automate the first two components of this process. This is leading to a new type of "impedance mismatch": problems for which solutions can be rapidly generated and verified in a mostly automated process, but for which no human author has understood the arguments well enough to initiate the (much slower) digestion process.

In fact, with the current cultural incentives that reward the first authors to "solve" the problem, rather than the later authors who "digest" the solution, one may end up with the perverse situation in which an AI-generated (and formally verified) solution to a problem, presented to the community without any significant digestion, may actually *inhibit* the progress of the field in which the problem lies, by discouraging any further attempts to work on the problem, simplify and explain the proof, or extract broader insights. (2/3)

Perhaps having more refined terminology to describe solutions would help. For instance, a proof that is proposed but unverified should be viewed as 1/3 of a solution at best, while a proof that is both proposed and verified remains only 2/3 of a solution until a proper, well-digested presentation of the results has been made.

As a concrete example: I would rate the recent progress on Erdos problem #1169 (EDIT: this should be #1196), which has created some buzz on social media, as currently constituting 2/3 of a solution in this framework: a proof has been generated and verified, but the broader impacts of the argument on the rest of the field are still being assessed. Eventually, I expect a fully satisfactory solution to this problem to be obtained, but this may take more time than one is now accustomed to in this era of rapid proof generation and proof verification. (3/3)

@tao

I see your argument, but... given the number of proofs produced every year, assessing them all against "the rest of the knowledge" doesn't scale that well. So be prepared: AI will enter the game hard, more and more, every year.