derpyunicorn

0 Followers
9 Following
12 Posts
I work in ML model validation, and I'm particularly interested in any kind of model monitoring.

@crude2refined

I don't disagree, but what he leaves out is the data. LLMs are trained on all the data they can find, not just factually correct statements. Any model that relies on vast amounts of data has this annotation problem; it doesn't matter whether it's autoregressive or not.

@remixtures

Good article, but I don't understand his hedging in your quote ("Do they understand in this constructive sense? Probably not").

There is absolutely no math behind transformers that maps onto generating understanding; they are just word generators. I think it's dangerous to interview non-ML people about what they think LLMs do: they simply don't have the background, and it won't help the discussion.

@joytrek @quotesofnote @mdekstrand

Yeah, agreed, that's why I personally don't have any smart devices...

@joytrek @quotesofnote @mdekstrand

But wouldn't that mostly be actors? Or do you mean based on Alexa/Google Assistant/Siri? I think for the latter they would have to significantly expand their data recording; right now they only save ~30-second snippets. Scary thought.

Of course all living beings are also born with sets of behavioral patterns, so they are "pre-trained" in that sense. But it doesn't seem like LeCun is talking about those.

@Riedl

LeCun's arguments about why LLMs are limited seem fine to me. However, his last slide is just plain wrong. He claims that "almost nothing is learned through supervision or imitation". Babies/toddlers learn almost everything by imitation, and later of course comes a lot of supervision (schools). Animals also learn through imitation and some light supervision.

https://drive.google.com/file/d/1BU5bV3X5w65DwSMapKcsr0ZvrMRU_Nbi/view?usp=drivesdk

@Techronic9876

I highly doubt it. Eventually the web-scraping-based datasets will be so degenerate that they lead to new innovations, probably as one step further towards AGI (i.e. training without huge datasets, as is the case for all living creatures).

@quuux

Ah no, it's worse than that! Structured text with decent grammar will quickly become suspect. It used to be a sign that somebody was putting effort into their writing; now it will increasingly make me think the writer was lazy.

@quotesofnote @mdekstrand

I also wonder what it will mean for the development of better language models/bots. I actually think this could lead to the next breakthrough in AI: if you can no longer train on webcrawl-based data (because it will mostly have been created by AI), people will need to invent language-generation methods other than simply relying on huge datasets.

@PopTarts @mdekstrand

Sounds like another instance of Doctorow's enshittification theory (https://pluralistic.net/2023/01/21/potemkin-ai/#hey-guys).

I am already mostly reading a curated list of webpages that I trust, I expect this will be even more necessary in the future.
