The climate cost of the AI revolution

On the energy cost of Large Language Models, what their widespread adoption could mean for global CO₂ emissions, and what could be done about it.

Wim Vanderbauwhede
@csepp Thanks for the article (and consultation link). That article appears to use the cost of training current LLMs, and scales that up to 100 concurrently active models, each with that training cost. This paper discusses an extrapolated increase in the energy needed to train models to improved performance going forward: https://limits.pubpub.org/pub/wm1lwjce/release/1 I'm curious whether accounting for that increase would affect the conclusion that training LLMs isn't the problem.

@tbsp @csepp if the #LLMs lack a semantic layer – and my understanding is that all of the current ones lack one – then they're just toys, capable only of producing credible-seeming nonsense. They cannot know whether their answers are true or false, since they do not map either your prompt or their answer onto meaning, nor that meaning onto reality.

A semantic layer will certainly come – it isn't rocket science – but until it does there's nothing to see here.

#AI
#StochasticParrots

@simon_brooke @tbsp That doesn't fix the issue that's being discussed: environmental destruction.
@simon_brooke @tbsp I'm also not sure it's accurate that semantics isn't rocket science. Google did recently demonstrate that they could use an LLM to optimize code, but it required many prompts (I think they quoted O(10^6) API calls). It also just used testing and scoring, not a true formal definition of correctness.
I haven't seen demonstrations of truly semantically correct output being generated, like, you can't ask an LLM to prove a theorem in Coq.
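To make the "testing and scoring" point concrete, the loop in such systems is roughly the following. This is a hypothetical sketch: `CandidateGenerator` and `Evaluator` are stand-ins I've invented for the LLM calls and the test harness, not names from the Google work.

```java
public class GuidedSearchSketch {
    // Hypothetical stand-in for the LLM: proposes a modified version of the code.
    interface CandidateGenerator { String proposeVariant(String current); }

    // Hypothetical stand-in for the test harness: returns a score if the
    // candidate passes the test suite, or Double.NaN if it fails.
    interface Evaluator { double scoreIfTestsPass(String code); }

    static String optimize(String seed, CandidateGenerator gen, Evaluator eval, int budget) {
        String best = seed;
        double bestScore = eval.scoreIfTestsPass(seed);
        for (int i = 0; i < budget; i++) {            // budget ~ number of API calls
            String candidate = gen.proposeVariant(best);
            double score = eval.scoreIfTestsPass(candidate);
            // Keep a candidate only if it passes the tests and scores better;
            // nothing here establishes correctness beyond the tests themselves.
            if (!Double.isNaN(score) && score > bestScore) {
                best = candidate;
                bestScore = score;
            }
        }
        return best;
    }
}
```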
@simon_brooke @tbsp But even if such a model were created, it's unlikely that it would be small enough for anyone but a megacorp to train and operate, so it would only serve their interests.
And if it's anything like that Google project, it would still be orders of magnitude more environmentally destructive than current models.

@csepp @tbsp Why do you say that? The #OpenNLP English Parts of Speech model (linked below) is 1.1Mb and runs happily on my laptop. It's less than perfect; but it's not hugely less than perfect.

#LLMs are huge, granted; but they also won't solve this problem. The software which will solve this problem is probably not huge.

https://www.apache.org/dyn/closer.cgi/opennlp/models/ud-models-1.0/opennlp-en-ud-ewt-pos-1.0-1.9.3.bin
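
For concreteness, this is roughly what running that linked model looks like with the standard OpenNLP Java API. A minimal sketch: the model file name is taken from the download URL above, and the example sentence is mine, not from the thread.

```java
import java.io.FileInputStream;
import java.io.InputStream;

import opennlp.tools.postag.POSModel;
import opennlp.tools.postag.POSTaggerME;
import opennlp.tools.tokenize.SimpleTokenizer;

public class TinyTaggerDemo {
    public static void main(String[] args) throws Exception {
        // Load the ~1 MB pre-trained English POS model linked above.
        try (InputStream in = new FileInputStream("opennlp-en-ud-ewt-pos-1.0-1.9.3.bin")) {
            POSModel model = new POSModel(in);
            POSTaggerME tagger = new POSTaggerME(model);

            // Simple whitespace/punctuation tokenisation is enough for a demo.
            String[] tokens = SimpleTokenizer.INSTANCE.tokenize(
                    "Language models are huge, but part-of-speech taggers are tiny.");
            String[] tags = tagger.tag(tokens);

            for (int i = 0; i < tokens.length; i++) {
                System.out.println(tokens[i] + "\t" + tags[i]);
            }
        }
    }
}
```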

@simon_brooke @tbsp It's true that the code optimization "paper" by Google claims that performance is pretty good with smaller models. Maybe I'm wrong and an "llm" Coq tactic that runs on a mid-range desktop PC is just a few years away. 🤷

@csepp @tbsp it won't be an #LLM, I think. I really think that #LLMs are an evolutionary dead end: for all their remarkable ability to produce results that look plausible, they don't advance us towards machines which can display actual intelligence.

By which I mean, machines which can make generally good decisions in the face of uncertain and incomplete data.

@simon_brooke @tbsp Yeah, in this case it was basically brute forcing, although a well-guided brute-force code optimizer is not something I want to dismiss outright. If used in small and targeted ways, ML models that can optimize or reverse engineer code can be a net positive. There is a Mandiant report about their use of an internally trained large-ish language model (it runs on two beefy GPGPUs) that they say made analysing malware samples easier. I want that, but for pmOS drivers.
@simon_brooke @tbsp The dark side of this is that this tech will 100% be used to find vulnerabilities and to generate exploits. But I don't think we'll see Coding Machines realized any time soon.
https://www.teamten.com/lawrence/writings/coding-machines/
via https://suricrasia.online/iceberg/