@qgustavor @wikiyu That's what I'd argue, too, but this very basic theory and reality, especially the reality of actually available implementations, might diverge there.
Thing is that @baldur is actually someone from the field, so his word does weigh heavy to me, even if it doesn't reflect my own experience with translation quality.
(EDIT: way->weigh. Human in-mind translations are not perfect, either :D)
So, AFAICT, LLMs are in general sensitive to the size of the training data set, and only a few languages have a collection of machine-readable text big enough for these models.
IIRC, in the pre-LLM days they used to compensate for this with language-specific approaches.
Once everybody began migrating to approaches that require large data sets, performance on all of those tasks (translation, summarisation, correction) began to suffer, especially in smaller languages.
Though, it should be noted that in a lot of third-party, neutral testing, specialised models outperform LLMs on many language tasks such as summarisation, even in English. And even where they underperform, they're at least in the same ballpark, while costing orders of magnitude less.