"A matrix with zeros on the diagonal and positive numbers elsewhere is called a zero-diagonal matrix. Zero-diagonal matrices are not invertible. This means that there is no matrix that can be multiplied by the zero-diagonal matrix to produce the identity matrix." -- Google's AI
@ProfKinyon Good grief, how does it churn out this nonsense? I really doubt there was anything quite so horribly wrong in its training data, but I guess it must be taking a correct discussion of the non-invertibility of diagonalised matrices with at least one zero on the diagonal and garbling it, while blending the result into something smoothly grammatical, utterly confident, and totally wrong.
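For what it's worth, a two-line numpy check is enough to refute the quoted claim; the example matrix below is my own, not from any post here. It has zeros on the diagonal and positive numbers elsewhere, yet it is plainly invertible:

```python
import numpy as np

# "Zero-diagonal" in the quoted sense: zeros on the diagonal,
# positive numbers everywhere else -- yet clearly invertible.
A = np.array([[0.0, 1.0],
              [2.0, 0.0]])

print(np.linalg.det(A))  # -2.0: nonzero, so an inverse exists
print(np.linalg.inv(A))  # [[0.  0.5]
                         #  [1.  0. ]]
```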
@ProfKinyon I was curious as to how committed Gemini was to this, or whether it might just be random noise, but without knowing the original prompt I still got the same claim about non-invertibility:
@gregeganSF As you said, the training data is probably fine, but it's like trying to cook a gourmet meal by taking every good ingredient in your fridge, throwing it in your blender and seeing what happens.

@ProfKinyon

If you push it, it gives a masterclass in how to be right, and then wrong, and then right for the wrong reasons …

I especially like this bit:

“However, there's a special case for matrices with a determinant of 0. If the off-diagonal elements are all equal to 1, the inverse still exists and is simply the original matrix itself.”
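That "special case" is easy to test directly; here is a minimal numpy sketch (the test matrices are mine, not from the thread). For n = 2 the all-ones-off-diagonal matrix really is its own inverse, but for n = 3 it is invertible and not its own inverse, and its determinant is 2 rather than 0 -- and of course a matrix with determinant 0 never has an inverse in the first place.

```python
import numpy as np

# Gemini's "special case": all off-diagonal entries equal to 1.
# (These test matrices are mine, not from the thread.)
for n in (2, 3):
    A = np.ones((n, n)) - np.eye(n)
    det = np.linalg.det(A)
    self_inverse = np.allclose(np.linalg.inv(A), A)
    print(n, round(det, 6), self_inverse)
# n=2: det = -1, self-inverse True (the one case the claim gets right)
# n=3: det =  2, self-inverse False (and the determinant is not 0)
```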

@gregeganSF @ProfKinyon I think this kind of language model is simply incapable of capturing the logical connections involved in doing real mathematics. Wrong arguments can sound as plausible as correct ones, to the level of plausibility it's capable of achieving.

@mattmcirvin @ProfKinyon

Absolutely! But the boosters are in denial about that, and churn out endless excuses for the LLMs: “Humans make mistakes too!” “You gave it the wrong prompt!” “This will all be fixed in the next iteration!”

@gregeganSF @ProfKinyon I think people probably will discover some good uses for this technology. But... not this.

The output still sounds like a student who paid enough attention in class to learn the terminology but didn't do any of the homework.

@mattmcirvin @gregeganSF @ProfKinyon
A useful exercise when asking LLMs any fact-based question is to reply with something like "That is wrong. Please give the correct answer." (Saying "please" may be more correlated with training data that complies with the request, while using rude language may generate an argument.) Quite often the LLM will "admit" or "apologize" for the wrong (or right!) answer and give a different one.
@mattmcirvin @gregeganSF @ProfKinyon my friends and I have settled on: large language models are roughly "a student that's really good at cheating"
@shapr @gregeganSF @ProfKinyon And it's not as if this sort of thing is beyond a machine--computer algebra systems that "understand" the rules of linear algebra in the sense of modeling them internally have existed for half a century. But they don't work this way.
@mattmcirvin @gregeganSF @ProfKinyon I think this is currently the perfect sort of example of why you can't trust any output of an LLM to be right -- they're good at right-sounding answers, but not at actually determining whether they're right or have made it up completely.

@gregeganSF @mattmcirvin @ProfKinyon They call it "hallucinations", which makes it sound like a glitch that just shows up now and then, rather than an LLM's core function. Their one job is "If a response to this prompt appeared in your training data, guess what it would be". It's the same algorithm that produces correct answers and hallucinations.

And (with the caveat that introspection does not tell us how the brain really works): that's not how I think when I understand something. That's how I think when I'm bluffing, when I have to write about something I don't understand.

@robinadams @gregeganSF @ProfKinyon It's possible that you could train this kind of neural network to do better, but it wouldn't be via the "LLM" route of just letting it loose on a giant corpus of data. You might have to actually teach it, like a human--give it some kind of lived experience.

I am not recommending that anyone try this, mind you. But I don't think there will be a lot of effort put into it by the people who are funding this stuff, anyway, because it makes the whole process labor-intensive and obviously unprofitable. We have enough trouble educating humans.

And maybe that wouldn't even work, because it seems like LLMs only get as good as they are by being exposed to a larger corpus than a human being ever encounters. Which implies to me that they're not inherently as good at learning as we are (probably not a surprise, we have an evolutionary head start of many millions of years).

That sounds like reinforcement learning with human feedback.
@mattmcirvin @robinadams @gregeganSF @ProfKinyon
DeepMind trained a similar type of model with only correct proofs in Euclidean geometry in the training data (which they generated randomly), and it became quite good at suggesting auxiliary constructions in geometry.
https://www.nature.com/articles/s41586-023-06747-5
This is not entirely relevant, as it doesn't do full mathematical reasoning, but I think it shows that "all the text on the internet" might not be the best data source for all uses.
@Pashhur @robinadams @gregeganSF @ProfKinyon I do think a lot of the bad phenomena we're seeing come from this sort of automatic "throw a ChatGPT at it" approach to LLMs. The existing ones are out there, seem to be able to generate output about anything, and the temptation to just use them instead of training on something specialized is intense.

@gregeganSF @ProfKinyon Asking ChatGPT to multiply two three-digit numbers and show its working is a fun one. It will very often get the answer right; the first and last couple of digits are easy to do via pattern learning. But the working along the way will be nonsense that could never have produced the correct answer.

Useful as a way to show people that just because LLMs can pretend to show how they got to an answer doesn't mean that's actually what they did.
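For contrast, here is what working that could actually produce the answer looks like -- a long-multiplication sketch by partial products (the specific numbers are my own example):

```python
# Long multiplication via partial products: each printed line genuinely
# contributes to the final answer. (Numbers are my own example.)
def long_multiply(a: int, b: int) -> int:
    total = 0
    for i, digit in enumerate(reversed(str(b))):
        partial = a * int(digit) * 10**i
        print(f"{a} x {int(digit) * 10**i} = {partial}")
        total += partial
    print(f"total = {total}")
    return total

assert long_multiply(387, 624) == 387 * 624  # 241488
```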

I think this property is true for upper or lower triangular matrices, because there the diagonal entries are the eigenvalues and the determinant is their product, if my memory serves me correctly.
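That memory serves correctly: for a triangular matrix the eigenvalues are exactly the diagonal entries, so a single zero there forces the determinant to zero. A quick numpy check, with an example matrix of my own choosing:

```python
import numpy as np

# Upper triangular with one zero on the diagonal (my own example):
# det = product of the diagonal entries = 2 * 0 * 5 = 0, hence singular.
T = np.array([[2.0, 4.0, 1.0],
              [0.0, 0.0, 3.0],
              [0.0, 0.0, 5.0]])

print(np.linalg.det(T))      # 0.0
print(np.prod(np.diag(T)))   # 0.0 -- the same product
print(np.linalg.eigvals(T))  # the eigenvalues are the diagonal: 2, 0, 5
```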


@ProfKinyon
It's basically a Markov process on tokens. The theory is from Charles Dodgson: take care of the sounds, and the sense will take care of itself. Suitable for giving financial advice, for example.

Shannon wrote about this in 1948, and gave examples. Lacking a large database, or the computer power to work with it, he randomly took words out of books on his shelves to get the right transition probabilities.

The LLM will play chess too. Not legally, but according to its own lights.
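Shannon's experiment is easy to recreate; here is a toy order-1 word-level Markov generator (the corpus and random seed are stand-ins of my own, not anything Shannon used):

```python
import random

# Toy Shannon-style generator: learn bigram transitions from a corpus,
# then emit whatever "sounds right" locally. No model of meaning anywhere.
corpus = ("take care of the sounds and the sense will take care "
          "of itself so take care of the sounds").split()

transitions = {}
for prev, nxt in zip(corpus, corpus[1:]):
    transitions.setdefault(prev, []).append(nxt)

random.seed(1)
word, output = "take", ["take"]
for _ in range(12):
    word = random.choice(transitions.get(word, corpus))
    output.append(word)
print(" ".join(output))  # locally plausible, globally senseless
```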

@ProfKinyon
I think AI will be good in domains where some small error is tolerable. Math is not one of those domains, unless one is dealing with just numerical approximations in computations.
@ProfKinyon
"You are false data."
-- the bomb to the crew, in Dark Star
@arfisk "Talk to the bomb. You have to talk to it, Doolittle. Teach it phenomenology!" -- Commander Powell
@ProfKinyon
Maybe no "other" matrix, that is.
@ProfKinyon @mattmcirvin Compared to a human it's pretty good. I don't know if you ever read the comments on mathematics on forums with a diverse audience...
@dpiponi @ProfKinyon @mattmcirvin Pretty low bar, though.
@RobJLow @ProfKinyon @mattmcirvin No, it's a high bar. It's an extremely high bar. But people get complacent quickly.
@ProfKinyon Well, that puts a different spin on things.
@ProfKinyon @fay59 And they say that this AI is going to be able to improve itself