@davidsonsr @david_chisnall @EUCommission "Using AI for content moderation" doesn't mean anything to me.
To "increase content output" and "enhance CRM data" sounds like a deluge of slop, not increased performance. (As a personal anecdote, I was considering using Duolingo myself when I heard they were adding LLM slop to their app, so I lost all interest. I want to learn languages, not consume "content output".)
I'm not qualified to judge the experimental setup of "Generative AI and labour productivity: a field experiment on coding", but some things stood out to me:
- They looked at ~1200 programmers from one company (Ant Group) over a period of 6 weeks.
- 335 of them had access to a specific (internal) LLM.
- The junior programmers with LLM access produced about 50% more code (by line count); the senior programmers didn't.
That's it. The only thing they measured was the number of lines of code produced, not quality or correctness or anything else. And this covers only the short-term effects (under two months); there's nothing about the mid- or long-term consequences of mandating LLM use for a company's whole workforce.
"Generative AI at Work" is about US customer support (from a call center in the Philippines). The paper is creepy ("AI drives convergence in communication patterns: low-skill agents begin communicating more like high-skill agents", "customers are less likely to question the competence of agents"). Results are mixed: "AI assistance increases worker productivity, resulting in a 14% increase in the number of chats that an agent successfully resolves per hour", but only for less-skilled and inexperienced agents: "we find evidence that AI assistance may decrease the quality of conversations by the most skilled agents". The metrics used are questionable: issue resolutions per hour and "net promoter score" (as a proxy for customer satisfaction) are used to determine both productivity and agent "skill".
(Why are these papers all written by economists?)