The climate cost of the AI revolution

On the energy cost of Large Language Models, what their widespread adoption could mean for global CO₂ emissions, and what could be done about it.

Wim Vanderbauwhede
@csepp Thanks for the article (and consultation link). That article appears to use the cost of training current LLMs, and scales that up to 100 concurrently active models, each with that training cost. This paper discusses an extrapolated increase in the energy needed to train models to improved performance going forward: https://limits.pubpub.org/pub/wm1lwjce/release/1 I'm curious whether accounting for that increase would affect the conclusion that training LLMs isn't the problem.

@tbsp @csepp if the #LLMs lack a semantic layer – and my understanding is that all of the current ones lack one – then they're just toys, capable only of producing credible-seeming nonsense. They cannot know whether their answers are true or false, since they do not map either your prompt or their answer onto meaning, nor that meaning onto reality.

A semantic layer will certainly come – it isn't rocket science – but until it does there's nothing to see here.

#AI
#StochasticParrots

@simon_brooke @tbsp That doesn't fix the issue that's being discussed: environmental destruction.
@simon_brooke @tbsp I'm also not sure it's accurate that semantics isn't rocket science. Google did recently demonstrate that they could use an LLM to optimize code, but it required many prompts (I think they quoted O(10^6) API calls). It also just used testing and scoring, not a true formal definition of correctness.
I haven't seen demonstrations of truly semantically correct output being generated, like, you can't ask an LLM to prove a theorem in Coq.
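To make the "testing and scoring" point concrete, the loop in such systems is roughly the following. This is a hypothetical sketch: `CandidateGenerator` and `Evaluator` are stand-ins I've invented for the LLM calls and the test harness, not names from the Google work.

```java
public class GuidedSearchSketch {
    // Hypothetical stand-in for the LLM: proposes a modified version of the code.
    interface CandidateGenerator { String proposeVariant(String current); }

    // Hypothetical stand-in for the test harness: returns a score if the
    // candidate passes the test suite, or Double.NaN if it fails.
    interface Evaluator { double scoreIfTestsPass(String code); }

    static String optimize(String seed, CandidateGenerator gen, Evaluator eval, int budget) {
        String best = seed;
        double bestScore = eval.scoreIfTestsPass(seed);
        for (int i = 0; i < budget; i++) {            // budget ~ number of API calls
            String candidate = gen.proposeVariant(best);
            double score = eval.scoreIfTestsPass(candidate);
            // Keep a candidate only if it passes the tests and scores better;
            // nothing here establishes correctness beyond the tests themselves.
            if (!Double.isNaN(score) && score > bestScore) {
                best = candidate;
                bestScore = score;
            }
        }
        return best;
    }
}
```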
@simon_brooke @tbsp But even if such a model were created, it's unlikely that it would be small enough for anyone but a megacorp to train and operate, so it would only serve their interests.
And if it's anything like that Google project, it would still be orders of magnitude more environmentally destructive than current models.

@csepp @tbsp Why do you say that? The #OpenNLP English Parts of Speech model (linked below) is 1.1Mb and runs happily on my laptop. It's less than perfect; but it's not hugely less than perfect.

#LLMs are huge, granted; but they also won't solve this problem. The software which will solve this problem is probably not huge.

https://www.apache.org/dyn/closer.cgi/opennlp/models/ud-models-1.0/opennlp-en-ud-ewt-pos-1.0-1.9.3.bin
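
For concreteness, this is roughly what running that linked model looks like with the standard OpenNLP Java API. A minimal sketch: the model file name is taken from the download URL above, and the example sentence is mine, not from the thread.

```java
import java.io.FileInputStream;
import java.io.InputStream;

import opennlp.tools.postag.POSModel;
import opennlp.tools.postag.POSTaggerME;
import opennlp.tools.tokenize.SimpleTokenizer;

public class TinyTaggerDemo {
    public static void main(String[] args) throws Exception {
        // Load the ~1 MB pre-trained English POS model linked above.
        try (InputStream in = new FileInputStream("opennlp-en-ud-ewt-pos-1.0-1.9.3.bin")) {
            POSModel model = new POSModel(in);
            POSTaggerME tagger = new POSTaggerME(model);

            // Simple whitespace/punctuation tokenisation is enough for a demo.
            String[] tokens = SimpleTokenizer.INSTANCE.tokenize(
                    "Language models are huge, but part-of-speech taggers are tiny.");
            String[] tags = tagger.tag(tokens);

            for (int i = 0; i < tokens.length; i++) {
                System.out.println(tokens[i] + "\t" + tags[i]);
            }
        }
    }
}
```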

@simon_brooke @tbsp It's true that the code optimization "paper" by Google claims that performance is pretty good with smaller models. Maybe I'm wrong and an "llm" Coq tactic that runs on a mid-range desktop PC is just a few years away. 🤷

@csepp @tbsp it won't be an #LLM, I think. I really think that #LLMs are an evolutionary dead end: for all their remarkable ability to produce results that look plausible, they don't advance us towards machines which can display actual intelligence.

By which I mean, machines which can make generally good decisions in the face of uncertain and incomplete data.

@simon_brooke @tbsp Yeah, in this case it was basically brute forcing, although a well-guided brute-force code optimizer is not something I want to dismiss outright. If used in small and targeted ways, ML models that can optimize or reverse engineer code can be a net positive. There is a Mandiant report about their use of an internally trained large-ish language model (it runs on two beefy GPGPUs) that they say made analysing malware samples easier. I want that, but for pmOS drivers.
@simon_brooke @tbsp The dark side of this is that this tech will 100% be used to find vulnerabilities and to generate exploits. But I don't think we'll see Coding Machines realized any time soon.
https://www.teamten.com/lawrence/writings/coding-machines/
via https://suricrasia.online/iceberg/