I offer Cassandra's Complete Class Theorem¹.

All "good use cases for AI" break down into one of four categories:

I. Bad use case.
II. Good use case, but not something AI can do.
III. Good use case, but it's already been done without AI.
IV. Not AI in the LLM-and-GAN sense, but in the sense of machine learning, statistical inference, or other similarly valid techniques that AI boosters use as a shield against critique.

https://wandering.shop/@xgranade/115766140296237983

___
¹Not actually a theorem.

@xgranade the drum I’m banging is that there is no use case that could be valid enough, even if you were wrong, to justify its harms, so this conversation is ultimately irrelevant

@zkat @xgranade

The problem here is the IV cases. For example, I’d be hard pressed to find harms caused by Mozilla’s language-translation models:

  • The models are small.
  • They have done a lot of work to make sure you can reproduce the training on a single, moderately powerful machine.
  • The training data is public and curated for the specific purpose of training machine translation systems, not harvested from a load of sources without permission.

And, in the current marketing environment, things like that are also branded ‘AI’.

When the models can be trained ethically, machine learning typically does well in places where a wrong answer does no harm but a correct one has significant value. CPU branch predictors now often use neural networks. If they give a wrong answer, the CPU does a small amount of work that it throws away; if they give a correct answer, the CPU does useful work instead of idling. Getting the right answer 95% of the time gives a 10x or better speedup relative to not predicting at all. This is a great place to use machine learning. But a lot of the places where it's proposed have significant negative real-world consequences from wrong answers.
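
(For the curious: the networks involved are tiny. Below is a rough software sketch of a perceptron-style predictor in the spirit of Jiménez and Lin's design; the table size, history length, threshold, and the branch address in the check are illustrative values I've picked, not any shipping CPU's parameters.)

  import random

  HISTORY_LEN = 16                          # bits of global branch history
  THRESHOLD = int(1.93 * HISTORY_LEN + 14)  # common training-threshold heuristic

  class Perceptron:
      def __init__(self):
          # weights[0] is a bias; the rest weigh recent branch outcomes
          self.weights = [0] * (HISTORY_LEN + 1)

      def output(self, history):
          # history holds +1 (taken) / -1 (not taken) for recent branches
          return self.weights[0] + sum(
              w * h for w, h in zip(self.weights[1:], history))

      def train(self, history, taken, y):
          t = 1 if taken else -1
          # update only on a misprediction or a low-confidence correct guess
          if (y >= 0) != taken or abs(y) <= THRESHOLD:
              self.weights[0] += t
              for i, h in enumerate(history):
                  self.weights[1 + i] += t * h

  table = [Perceptron() for _ in range(1024)]  # one perceptron per hashed branch PC
  history = [-1] * HISTORY_LEN                 # shared global history register

  def predict(pc, actually_taken):
      p = table[pc % len(table)]
      y = p.output(history)
      prediction = y >= 0      # a wrong guess only costs a pipeline flush
      p.train(history, actually_taken, y)
      history.pop(0)
      history.append(1 if actually_taken else -1)
      return prediction

  # quick check: a loop-closing branch that is taken 95% of the time
  random.seed(0)
  hits = 0
  for _ in range(10_000):
      outcome = random.random() < 0.95
      hits += predict(0x400123, outcome) == outcome
  print(f"accuracy: {hits / 10_000:.1%}")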

@david_chisnall @zkat @xgranade That is the point of IV, and exactly the move I've seen Mozilla employees make here. Someone complains about Firefox adding chatbots and whatever AI window nonsense they're talking about doing, and instead of justifying those features they steer the conversation to "but you like translations, right? Checkmate".

@alex @david_chisnall @zkat @xgranade (This is in good faith. I am not a Nazi lover or AI pusher. Please do not eviscerate me in the comments. If you are going to tell me I am wrong, and I may be, please do so nicely. I just want to make sure we are playing with the same set of facts.)

Aren't translation models literally LLMs though?

@trashpanda @david_chisnall @zkat @xgranade I'm not personally a fan of LLM-based machine translation; I think it has a lot of the same issues as "summarization". But small, local, specially trained models do sidestep a lot of the worst parts, so a lot of people don't mind them as much.

The point of the discussion here is that wrapping everything under the umbrella of "AI" lets large orgs dance around valid criticism.

@david_chisnall @zkat @xgranade The actual translators beg to differ: https://linuxiac.com/ai-controversy-forces-end-of-mozilla-japanese-sumo-community/

That said, it could technically be implemented correctly. That's hard to do in the current marketing environment, which is pushing artificial idiocy everywhere under high pressure.

@david_chisnall @zkat @xgranade Regarding the branch predictors: you could avoid them completely if OS vendors did not pretend they are writing code for a PDP-11, and CPU vendors did not have to cover up the fact that their CPUs have a pipeline.

Branch delay slots are well-known technology, and compilers are perfectly capable of unrolling loops and filling the slots with useful instructions.
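
(A toy sketch of the semantics, assuming a hypothetical MIPS-like machine; the mini-interpreter and its three opcodes are invented purely to illustrate how the instruction after a branch always executes, and how a compiler can put real work there instead of a NOP.)

  # Toy machine with one branch delay slot: the instruction right after
  # a branch ALWAYS executes. Purely illustrative, not a real ISA.
  def run(program, regs):
      pc, pending = 0, None          # pending = branch target awaiting the slot
      while pc < len(program):
          op, *args = program[pc]
          next_pc, branch_to = pc + 1, None
          if op == "addi":
              rd, rs, imm = args
              regs[rd] = regs[rs] + imm
          elif op == "add":
              rd, rs, rt = args
              regs[rd] = regs[rs] + regs[rt]
          elif op == "bne":          # branch if not equal (delayed)
              rs, rt, target = args
              if regs[rs] != regs[rt]:
                  branch_to = target
          if pending is not None:    # we just executed the delay slot
              next_pc, pending = pending, None
          if branch_to is not None:
              pending = branch_to    # takes effect after the next instruction
          pc = next_pc
      return regs

  # sum = 10 + 9 + ... + 1, with the counter decrement hoisted into the
  # delay slot of the backwards branch, so no cycle is wasted on a NOP:
  prog = [
      ("add",  "sum", "sum", "i"),   # 0: sum += i
      ("bne",  "i", "one", 0),       # 1: loop while i != 1 ...
      ("addi", "i", "i", -1),        # 2: ...delay slot: i -= 1 (always runs)
  ]
  print(run(prog, {"sum": 0, "i": 10, "one": 1})["sum"])   # -> 55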

Exposing the fact that the CPU has a pipeline removes the need for a statistical branch predictor, and it avoids multiple forms of Spectre vulnerability as well.

There: a solid solution without any use of AI. Unfortunately, for the reasons outlined in https://archive.org/details/lca2020-What_UNIX_Cost_Us, solving problems is not viable; only papering over them is. The reasons boil down to entrenched industry inertia.

@david_chisnall @zkat @xgranade Branch-predictor NNs are like two-layer 2b CNNs and have nothing to do with the point being made here.