@zkat @xgranade
The problem here is the IV cases. For example, I’d be hard pressed to find harms caused by Mozilla’s language-translation models:
- The models are small.
- Mozilla has done a lot of work to make sure you can reproduce the training on a single moderately powerful machine.
- The training data is public and curated for the specific purpose of training machine translation systems, not harvested from a load of sources without permission.
And, in the current marketing environment, things like that are also branded ‘AI’.
When the models can be trained ethically, machine learning typically does well in places where there is no harm in a wrong answer but significant value in a correct one. CPU branch predictors now often use neural networks. If they give a wrong answer, the CPU does a small amount of speculative work that it throws away; if they give a correct answer, the CPU does useful work instead of idling. Getting the right answer 95% of the time gives a 10x or better speed-up relative to not predicting at all. This is a great place to use machine learning. But a lot of the places where it's proposed have significant negative real-world consequences from wrong answers.
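
Perceptron-style predictors are one of the neural approaches that have actually shipped in CPUs, and they show the 'cheap to be wrong, valuable to be right' trade-off nicely. Here's a minimal sketch of the idea in C; the table size, history length, threshold, and the toy 95%-taken branch in `main` are illustrative assumptions, not any particular CPU's design:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define HISTORY_LEN 16    /* bits of global branch history to consider */
#define TABLE_SIZE  1024  /* number of perceptrons, indexed by branch address */
#define THRESHOLD   32    /* train only when the output is weak or wrong */

static int weights[TABLE_SIZE][HISTORY_LEN + 1]; /* [0] is the bias weight */
static int history[HISTORY_LEN];                 /* +1 = taken, -1 = not taken */

/* Predict: a dot product of the weights with recent branch outcomes. */
bool predict(uint64_t pc, int *y_out)
{
    int *w = weights[pc % TABLE_SIZE];
    int y = w[0];
    for (int i = 0; i < HISTORY_LEN; i++)
        y += w[i + 1] * history[i];
    *y_out = y;
    return y >= 0; /* predict taken if the sum is non-negative */
}

/* Train on the real outcome. A wrong guess only costs the speculative
 * work the CPU throws away; a right guess keeps the pipeline full. */
void update(uint64_t pc, int y, bool taken)
{
    int t = taken ? 1 : -1;
    int *w = weights[pc % TABLE_SIZE];
    /* Perceptron learning rule: adjust only on a misprediction or a
     * weak (low-magnitude) output. Real designs saturate the weights. */
    if ((y >= 0) != taken || (y < THRESHOLD && y > -THRESHOLD)) {
        w[0] += t;
        for (int i = 0; i < HISTORY_LEN; i++)
            w[i + 1] += t * history[i];
    }
    /* Shift the outcome into the global history register. */
    for (int i = HISTORY_LEN - 1; i > 0; i--)
        history[i] = history[i - 1];
    history[0] = t;
}

int main(void)
{
    /* Toy loop: a branch taken ~95% of the time, as in the example above.
       The PC value is made up purely for illustration. */
    int correct = 0;
    for (int n = 0; n < 10000; n++) {
        bool taken = (n % 20) != 0; /* taken 19 out of every 20 iterations */
        int y;
        bool guess = predict(0x400123, &y);
        if (guess == taken)
            correct++;
        update(0x400123, y, taken);
    }
    printf("predicted %d of 10000 branches correctly\n", correct);
    return 0;
}
```

The point of the structure is that both `predict` and `update` are a handful of adds per branch, so the model is cheap enough to consult on every branch, and the only cost of a miss is the discarded speculative work.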