A jargon-free explanation of how AI large language models work

Want to really understand large language models? Here’s a gentle primer.

https://arstechnica.com/science/2023/07/a-jargon-free-explanation-of-how-ai-large-language-models-work/?utm_brand=arstechnica&utm_social-type=owned&utm_source=mastodon&utm_medium=social

Ars Technica
@arstechnica it's autocorrect, but more trained.
It doesn't "know", it doesn't "understand", it doesn't "think".
It's just using probability (i.e., statistics) to string sentences together, the way autocorrect suggests the next word as you type.
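That "autocorrect with probabilities" idea can be sketched in a few lines. This is a toy illustration, not how any real LLM is implemented: the vocabulary, the two-word context window, and the probabilities below are all made up for the example.

```python
import random

# Toy next-word table: probabilities are invented for illustration,
# not taken from any real model.
next_word_probs = {
    ("the", "cat"): {"sat": 0.6, "ran": 0.3, "slept": 0.1},
    ("cat", "sat"): {"on": 0.8, "down": 0.2},
}

def pick_next(context, greedy=True):
    """Choose the next word given the previous two words."""
    probs = next_word_probs[context]
    if greedy:
        # Always take the single most likely continuation.
        return max(probs, key=probs.get)
    # Or sample in proportion to probability, so the output varies.
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights)[0]

print(pick_next(("the", "cat")))  # "sat"
```

Real LLMs do the same kind of "pick the next word from a probability distribution" step, just with a distribution computed by a neural network over a vocabulary of tens of thousands of tokens rather than a hand-written lookup table.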
@rebeccafinn @arstechnica several times the article says that researchers don't understand how the LLM "does" something. I find that very strange. How could they not know what it is doing? Is this just the author not differentiating between outside researchers that don't have access to proprietary information and internal researchers working to build the LLM?
@cetan @rebeccafinn @arstechnica It's mathemagics created by neural networks, probably. They are known to create code which baffles humans, but somehow works :)
@arstechnica
This article about how Large Language models work is well worth reading.
@arstechnica @dangillmor I would respectfully suggest that if an article uses “vector”, it’s not jargon-free (at least for most audiences).
@arstechnica
The article is 90% usable if one is trying to understand LLMs. But by the end, when it starts to discuss GPT-3's performance, the author radically pivots to examples of users anthropomorphizing the model. One example even compares GPT-3 to 3-to-7-year-old kids. Kinda killed the whole thing for me.
A logical explanation of LLM structures suddenly morphs into witchcraft.