@danielpunkass My degree is in automation. I've read all the literature. Gradient descent is not pixie dust.
Large Language Models are calculators you need to negotiate with to approximate a correct answer. It is not a question of scale or resources. It is a question of state -- a thing they are, by design, not capable of.
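To make the statelessness point concrete, here's a minimal sketch in Python. `generate` is a hypothetical stand-in, not any vendor's real API; the point is that the model is effectively a pure function from a token sequence to a continuation, so any "memory" exists only because the caller re-sends the transcript:

```python
# Minimal sketch of why "state" lives outside the model. `generate` is a
# hypothetical stand-in for a forward pass, not a real API call.

def generate(prompt_tokens: list[str]) -> str:
    # Weights are frozen at inference time; nothing computed here survives
    # this call. (A toy rule stands in for the actual model.)
    return "yes" if prompt_tokens and prompt_tokens[-1].endswith("?") else "ok"

transcript: list[str] = []          # the *caller* owns all the state
for user_turn in ["hello", "remember me?"]:
    transcript.append(user_turn)
    reply = generate(transcript)    # full history re-sent every turn
    transcript.append(reply)

# Drop `transcript` and the model "forgets" everything -- by construction.
```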
Unless you're discussing neuromorphic computing involving materials far beyond what current technology can produce, there is no AI technology that resembles the one your argument describes.
I hate to be the fly in this ointment but...
I get what Daniel is saying in an abstract sense, but in addition to the public largely not understanding GAI/LLMs, they also don't understand their limitations, weaknesses, and inherent flaws. Largely because there's a bill of sale for vested interests that requires continual selling to the public, and those interests certainly try to keep people from hearing the 'what if this is as good as it gets' arguments. OpenAI and Anthropic need you to believe a panacea is right around the corner, otherwise they can't get more funding. Some of it's snake oil, some of it is legit gain, and some of it is like perpetual motion: just around the corner, we're almost there, just need some more money. Many in the industry, myself among them, are expecting the bottom to fall out of that... soon. ish.
LLMs are not the future (although they're certainly not going away until something can replace them). At their core, they're built on the same concept as T9 on your old dumb phone, just with a ton more branching and grunt, plus things like ToT (tree-of-thought) instead of just CoT (chain-of-thought) that modern hardware makes possible. (Aside: I was gonna use an emdash here, but then someone would accuse me of using GAI to write this.) They're a horse, not the airplane that is SI or the rocket that is AGI. All have similarities, and similar minds working on them, but they are all very different beasts, to mix my metaphors.
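For the T9 analogy, here's a toy sketch in Python. The corpus and the bigram table are invented for illustration; a real LLM swaps the lookup table for a learned function over a much longer context, but the decode loop is the same shape:

```python
# Toy version of the T9 idea: predict the most likely next word given the
# previous one. A bigram table is the dumbest possible next-token predictor.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate".split()
bigrams: defaultdict[str, Counter] = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict(prev: str) -> str:
    # Greedy decode: take the single most frequent successor, T9-style.
    return bigrams[prev].most_common(1)[0][0]

word = "the"
for _ in range(4):
    word = predict(word)
    print(word)   # cat -> sat -> on -> the
```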
We're dumping more and more energy into a technology that is barely improving and which even companies like OpenAI admit will likely be flawed forever (e.g., hallucinations, unreliable results). As someone else pointed out, silver bullets like gradient descent are not paying off in this space. The increased energy cost just moves the needle a little; it doesn't get us to some new pinnacle. We don't even have anything at this point that has persistent memory and the ability to learn from it. Not one of them. And there is a stochastic element to every response. They're all dumb and deaf and forget you when they start again.
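On the stochastic element, a minimal sketch: decoding samples from a probability distribution over next tokens, so the same prompt can yield different outputs run to run. The distribution below is invented; a real model produces one like it over its whole vocabulary at every token:

```python
# Why the same prompt gives different answers: sampling, not lookup.
# The numbers here are made up for illustration.
import random

next_token_probs = {"Paris": 0.7, "France": 0.2, "Lyon": 0.1}

def sample(probs: dict[str, float], temperature: float = 1.0) -> str:
    # Temperature reshapes the distribution (lower = more peaked), but the
    # draw is still random unless you force greedy (argmax) decoding.
    weights = [p ** (1.0 / temperature) for p in probs.values()]
    return random.choices(list(probs), weights=weights, k=1)[0]

print([sample(next_token_probs) for _ in range(5)])  # varies between runs
```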
1/2
@hotdogsladies @danielpunkass
Or to put it another way: even the successor to that Commodore 64 is never going to be able to scan your photos. An entirely new thing will be required to do that, and we don't yet know whether we can make it, or how.
Addendum: Yes, some cool things exist and more will. Generative tools and whatnot, but then we're into the whole semantics of 'what is AI versus ML', and neither of you has the will or the good sense to listen to me get up on that soapbox. 😄
@danielpunkass the problem is that most people conflate interacting with LLMs with "AI". because the LLM is a trained model, it gives reasonable-sounding answers, and the people interacting with it generally assume it is absorbing new info from the conversation. but it doesn't.
meanwhile there is all the controversy over the information used to train the model.
these two problems have to be solved: the first to make them generally useful, the second for ethics. pointing out ethical issues is not the same as complaining that a Commodore 64 couldn't scan photos.
meanwhile... I will also point out that in 1991 or so the Amiga was really really good at visual tracing and generating video. so advances come in fits and starts.
@danielpunkass hmm, I dunno. I wasn’t alive to witness it (though I did use a C64 in the early 90s), but I don’t think personal computing of the 80s had the same feel of a financial bubble. If anything, there was underinvestment at the time because computers were “weird”.
The dot-com bubble was perhaps more similar: the Web was extremely useful and hadn’t even reached its peak, but many companies were impatient or didn’t quite understand it.