Given that they can only be produced by exploiting workers from the global majority, no system that includes any of the big LLMs in its production can ever be called "fair".

"Fair LLMs" of the size required to do the tasks people want LLMs to do (badly) do not exist.

@tante

... and not needed IMO. There are enough people more than willing to do all those meaningful tasks ... !

@quincy that is a different conversation.
@quincy @tante but those people seem to think they need to be paid for their work and have their rights respected. you can't run a business like that

@tante
I think it is possible to create small (and fair) LLMs which are capable of doing the tasks people want them to do (but only within a certain problem scope).

With Papa Reo, Maori researchers developed speech recognition and natural language processing capabilities for Indigenous language communities, ensuring that sovereignty over the data remains with them and that the benefits derived from these technologies go directly to their communities.
https://papareo.nz/

@realn2s @tante small large language models aren't large language models. They're just language models.

@realn2s @tante

By and large, both worshipers and detractors of "AI" dance to the tune set by the techbros. They have set the narrative and the agenda, and it is full of deceitful marketing. Their greed is transparent and their objectives double down on the existing dystopia.

We'll only know the intrinsic potential of this class of algorithms when/if people with different incentives and objectives examine (and further develop) them in new directions and in tandem with other human-centric tools.

@tante well, capitalism is based on the exploitation of labour, so nothing that is produced can be called fair, if fair means free of exploitation. Small specialized models working in sets will outperform large models, which are mostly required for The Grand Vizier use case pushed by big tech: the same old theme of inefficient centralization in the interest of monopoly and profit.
@tante How do you see approaches like Olmo, with their open and ethical approach, playing into this? https://allenai.org/olmo

@djh it is also based on Common Crawl, so it includes all kinds of unlicensed data, included against the authors' explicit will.
@tante huh, interesting detail I was not aware of! 🙌
@tante I haven’t evaluated its performance but Apertus seems to be trying https://www.swiss-ai.org/apertus
@UlrikeHahn performance ain't great.
@tante thanks! any particular aspects?
@UlrikeHahn I found its rate of very clear falsehood generation quite bad, and it works even worse than other models in the contexts people use those systems in (think coding assistant, rephrasing text, etc.)
@UlrikeHahn on the other hand: I test these systems and have colleagues using them but I don't use any LLMs for any of my work or activities.