This comes as a bit of a surprise to Christopher Kyba himself, who somehow has a lot of memories of being underground at the SNO site 🤔

#AISlop

My daughter just came up with a great exercise: challenge your students to find the title of your PhD using ONLY LLMs (no Google allowed). If any of them manage, they get gummy bears 😃

I asked five different models, and got five different answers, all five of which were completely wrong 😂

#AI #ChatGPT #AISlop #LLM #LLMFail #Education #HigherEducation #AcademicChatter

@skyglowberlin "(no Google allowed)" That makes this a pretty foolish exercise to suggest IMO. The bot's guess will be no different than anyone else who's never heard your bio.
@Jay42 And what do you suppose I hope the students might learn from such an exercise?
@skyglowberlin They'll learn their teacher gives them arbitrary tasks that have no bearing on reality.
@Jay42 Over the years I have taught, there has always been a subset of students who have difficulty understanding that the overt task you set is merely a vehicle for a much more valuable growth/learning lesson. When you lift weights in the gym, is the point of that exercise just to arbitrarily get those weights higher off the floor? Or are the weights just a simple, representative example to help you get stronger, so that you can lift something for real when you have to?
@autovectis We see in the current day that some people can use trial and error and still not learn (example: https://www.theguardian.com/technology/2025/jun/18/whatsapp-ai-helper-mistakenly-shares-users-number ), which further reinforces that it's a foolish exercise. With your analogy, would you teach them the incorrect way to lift weights? It's the equivalent of letting them hurt themselves lifting incorrectly to "teach them a lesson." Are you saying it's a smart exercise? It's a fool's errand.
@Jay42 - That's not a good example, because it carries real health risks. Getting people to look something up online using a deliberately bad strategy carries no risk. But I suspect you're making a couple of assumptions here: first, that this is the only strategy being used (rather than one strand out of a bunch), and second, that the reason behind the task has not been explained to the students. Is this approach something that frustrates you as a student?
@autovectis "That's not a good example because it carries real health risks." I was correcting your bad analogy. Seems your ego couldn't take it.
@autovectis I like the weightlifting analogy. We're watching @skyglowberlin tell his students to lift with their backs so that some might learn it's incorrect form. If any of his students are naive enough to believe him, he's also potentially taught them nothing.
@Jay42 I wish you good luck with your studies and look forward to buying burgers off you in the future.
@autovectis Hilarious ad hominem, you really won the argument with that one. Yet I'm the one being called rude and thick.

@Jay42 @skyglowberlin

Welcome, reply guy.

Muting, as I don't have patience (too old), but with best wishes for your personal journey.

@glc@mastodon.online @skyglowberlin Then why say anything? You didn't have to be a weirdo.
@Jay42 @skyglowberlin "it's an excellent FPS game as long as you don't try to walk through the walls" (because it has no wall collision. you can just walk through.)
@skyglowberlin I think it is a useful exercise! Some useful takeaways would be: 1) even the best LLMs (without a search tool turned on, and sometimes even with one available) are usually willing to guess answers to extremely obscure questions like yours, and 2) they are very likely to get them wrong.
@Jay42 it should give "no result" instead of fantasizing. Wrong is worse than none. @skyglowberlin
@Dodo_sipping @skyglowberlin Many do when you stop roleplaying fantasies with the bot.

@Dodo_sipping What I hope they would learn is that LLMs don't actually know anything, so they can't know when not to give an answer. All of the text is made-up fantasy; it's just that for some topics the fantasy happens to be close to reality, or even true.

But you can never tell unless you do your own research.

@skyglowberlin

The chatbot is better. But you used the API or the model directly, right?

@tinoeberl U of A is still wrong. I did a bachelor's degree there. Try telling it I didn't get a degree in Canada. When I tried that, it said I got my degree in Heidelberg; when I told it my degree was in the USA, it said my PhD was from Berkeley; and when I said no, it was in the eastern US, it said my PhD was from Brown.

I was using the 4.1 model.

I have seen "better" results in the past, meaning the probabilistically generated text was closer to the truth, but it's never actually been correct. And every time I have tried the models have always gotten wrong who my collaborators from that time were, despite about a dozen papers where we're listed together. If anything, they seem to be doing worse than they manged 6-9 months ago.

@skyglowberlin ChatGPT answered absolutely correctly about me and my dissertation. Found on the web, of course.

@Arta Interesting. What does it give you for the prompt I used ("Where did Christopher Kyba get his PhD, and what was the title?")?

Both of the answers it returned to me just now are wrong.

@skyglowberlin I asked (in Latvian): do you know where Arta Snipe got her PhD, and what was her thesis about? At first it said it does not have personal info, but when I answered that this is publicly known stuff, it came back with an accurate answer.
@Arta Sorry, I wasn't clear - I was curious what would happen if you asked about my name - whether maybe the model you are using is doing a better job of finding additional data or something.

@skyglowberlin

Wait until you're famous. Then all LLMs will know you. ;)

@skyglowberlin

This exercise won't work for my students - at least with Brave AI:

Brave AI was pretty good with mine. It took only one extra specification to find the correct title.

@skyglowberlin temperature = 0 avoids hallucination (pixtral-12b)?
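
For what it's worth, here's a minimal sketch of what setting temperature = 0 looks like, using the openai Python client against an OpenAI-compatible endpoint. The base URL is a hypothetical placeholder, and it's only an assumption that pixtral-12b is served behind such an endpoint for you:

```python
# Minimal sketch: greedy (temperature = 0) decoding via an OpenAI-compatible
# chat-completions API. The base_url is a hypothetical placeholder; the model
# name is taken from the thread and assumed to be available on your endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://example-inference-host/v1",  # hypothetical endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="pixtral-12b",  # model name as mentioned in the thread
    messages=[
        {
            "role": "user",
            "content": "Where did Christopher Kyba get his PhD, "
                       "and what was the title?",
        }
    ],
    temperature=0,  # greedy decoding: deterministic, but not more factual
)

print(response.choices[0].message.content)
```

Worth noting: temperature 0 only makes decoding (near-)deterministic by always picking the most likely next token. It removes randomness, not ignorance - a model that doesn't know the answer will reproduce the same confident guess on every run rather than decline.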