"A matrix with zeros on the diagonal and positive numbers elsewhere is called a zero-diagonal matrix. Zero-diagonal matrices are not invertible. This means that there is no matrix that can be multiplied by the zero-diagonal matrix to produce the identity matrix." -- Google's AI
@ProfKinyon Good grief, how does it churn out this nonsense? I really doubt there was anything quite so horribly wrong in its training data, but I guess it must be taking a correct discussion of the non-invertibility of diagonalised matrices with at least one zero on the diagonal and garbling it, while blending the result into something smoothly grammatical, utterly confident, and totally wrong.
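For what it's worth, a two-line check confirms the quoted claim is false: the permutation matrix [[0, 1], [1, 0]] has zeros on the diagonal and positive entries elsewhere, yet it is its own inverse. A minimal sketch in Python with NumPy:

```python
import numpy as np

# Counterexample to the quoted claim: zeros on the diagonal,
# positive entries everywhere else, yet clearly invertible.
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])

det = np.linalg.det(A)    # nonzero determinant => A is invertible
A_inv = np.linalg.inv(A)  # here A happens to be its own inverse

print(det)                                  # -1.0
print(np.allclose(A @ A_inv, np.eye(2)))    # True
```

(The correct statement, which the model seems to have garbled, is that a *diagonal* matrix with a zero on the diagonal is not invertible.)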
@gregeganSF @ProfKinyon I think this kind of language model is simply incapable of capturing the logical connections involved in doing real mathematics. Wrong arguments can sound as plausible as correct ones, to the level of plausibility it's capable of achieving.

@mattmcirvin @ProfKinyon

Absolutely! But the boosters are in denial about that, and churn out endless excuses for the LLMs: “Humans make mistakes too!” “You gave it the wrong prompt!” “This will all be fixed in the next iteration!”

@gregeganSF @mattmcirvin @ProfKinyon They call it "hallucinations" which makes it sound like a glitch that just shows up now and then, rather than an LLM's core function. Their one job is "If a response to this prompt appeared in your training data, guess what it would be". It's the same algorithm that produces correct answers and hallucinations.

And (with the caveat that introspection does not tell us how the brain really works): that's not how I think when I understand something. That's how I think when I'm bluffing, when I have to write about something I don't understand.

@robinadams @gregeganSF @ProfKinyon It's possible that you could train this kind of neural network to do better, but it wouldn't be via the "LLM" route of just letting it loose on a giant corpus of data. You might have to actually teach it, like a human--give it some kind of lived experience.

I am not recommending that anyone try this, mind you. But I don't think there will be a lot of effort put into it by the people who are funding this stuff, anyway, because it makes the whole process labor-intensive and obviously unprofitable. We have enough trouble educating humans.

And maybe that wouldn't even work, because it seems like LLMs only get as good as they are by being exposed to a larger corpus than a human being ever encounters. Which implies to me that they're not inherently as good at learning as we are (probably not a surprise, we have an evolutionary head start of many millions of years).

@mattmcirvin @robinadams @gregeganSF @ProfKinyon
That sounds like reinforcement learning with human feedback.
DeepMind trained a similar type of model with only correct proofs in Euclidean geometry in the training data (which they generated randomly), and it became quite good at suggesting auxiliary constructions in geometry.
https://www.nature.com/articles/s41586-023-06747-5
This is not entirely relevant, as it doesn't do full mathematical reasoning, but I think it shows that "all the text on the internet" might not be the best data source for all uses.
Solving olympiad geometry without human demonstrations - Nature: "A new neuro-symbolic theorem prover for Euclidean plane geometry trained from scratch on millions of synthesized theorems and proofs outperforms the previous best method and reaches the performance of an olympiad gold medallist."
@Pashhur @robinadams @gregeganSF @ProfKinyon I do think a lot of the bad phenomena we're seeing come from this sort of automatic "throw a ChatGPT at it" approach to LLMs. The existing ones are out there, they seem to be able to generate output about anything, and the temptation to just use them instead of training something specialized is intense.