This comes as a bit of a surprise to Christopher Kyba himself, who somehow has a lot of memories of being underground at the SNO site 🤔

#AISlop

My daughter just came up with a great exercise: challenge your students to find the title of your PhD using ONLY LLMs (no Google allowed). If any of them manage, they get gummy bears 😃

I asked five different models, and got five different answers, all five of which were completely wrong 😂

#AI #ChatGPT #AISlop #LLM #LLMFail #Education #HigherEducation #AcademicChatter

@skyglowberlin ChatGPT answered absolutely correctly about me and my dissertation. It found the answer on the web, of course.

@Arta Interesting. What does it give you for the prompt I used ("Where did Christopher Kyba get his PhD, and what was the title?")?

Both of these answers, returned to me just now, are wrong.

@skyglowberlin I asked (in Latvian): do you know where Arta Snipe got her PhD and what her thesis was about? At first it said it did not have personal info, but when I answered that this is publicly known, it came back with an accurate answer.

@Arta Sorry, I wasn't clear - I was curious what would happen if you asked about my name - whether maybe the model you are using is doing a better job of finding additional data or something.
@skyglowberlin Finally remembered to ask 😅
How far from the truth is it? 😃

@Arta Thanks for sharing. That is accurate - you would get gummy bears 🙂

Did you have to tell it to look online, or did it do that automatically?

@skyglowberlin I just told it to do the same as it did for me, but for your name. So 40:60 it was the previous prompts saying that this is public information (I did not specifically ask it to do an online search).
@skyglowberlin temperature = 0, avoids hallucination (pixtral-12b)?

@fusion I'm not quite sure what you mean by "avoids hallucination"? I mean, reduced variability from the default model, sure, but unless they directly reproduce training data, all text output from LLMs is made up.

But it's a great example, because GFZ isn't a degree-granting institution. That's a nice bonus demonstration of how LLMs don't actually "know" anything.

@skyglowberlin “Hallucination” means deviation from the training data in the statistical generation of the answers.
BTW: an LLM is not designed as a “reference book” and is imho therefore the wrong tool for the job.

@fusion Any text that is not directly reproduced from the training set is, according to that definition, a "hallucination", which means that nearly everything they produce is a "hallucination". That's why I don't think it's a useful term. In general parlance, people use the term "hallucination" when an LLM says something that is not truthful. But (except when reproducing training data directly) every sentence is literally made up. It's just that in a lot of cases, the made-up text happens to be true.

In the text you posted, even with temperature set to zero, it produced an incorrect answer, which surely does not appear in any training set (because it's not true). That's why I didn't understand what you meant by "avoids hallucination".

I completely agree with you that an LLM is the wrong tool for this job. That is the point of the exercise.
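
To show concretely what "temperature = 0" changes (and what it doesn't), here is a minimal sketch using Hugging Face transformers. It uses "gpt2" purely as a stand-in model - pixtral-12b is a multimodal model with its own loading code - so treat the model choice and the prompt as illustrative assumptions, not a reproduction of the test above.

```python
# Minimal sketch: "temperature = 0" corresponds to greedy decoding.
# Assumptions: Hugging Face transformers is installed, and "gpt2" is a
# small stand-in model (not the pixtral-12b mentioned in the thread).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model, for illustration only
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Where did Christopher Kyba get his PhD, and what was the title?"
inputs = tokenizer(prompt, return_tensors="pt")

# do_sample=False disables sampling entirely: the single most probable
# next token is chosen at every step, so repeated runs give identical text.
# The answer is still generated token by token from the model's learned
# distribution - nothing is looked up, so it can be deterministic AND wrong.
output_ids = model.generate(**inputs, max_new_tokens=60, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

In other words, setting the temperature to zero removes the run-to-run variability, not the making-things-up: a confidently wrong thesis title comes out just as fluently as a correct one.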