Mastodawn

David Chisnall (*Now with 50% more sarcasm!*)2d ago

@EUCommission

I don’t know if this account is actually monitored, or just a publishing place, but you may have noticed that this post has received almost overwhelmingly negative responses.

You could disregard this as Mastodon bias, but keep in mind that the biggest bias on Mastodon is that people who understand and built core parts of the information technology that you use every day are massively over represented. This is probably the only place you will get a lot of replies from people who both understand technology and do not have a financial incentive to hype things to get large amounts of government funding.

EDIT: I should add, I used machine learning during my PhD and there are a lot of problems for which it is a really good fit. But, in the current climate, it’s generally safe to interpret ‘AI’ as meaning ‘machine learning applied to a problem where machine learning is the wrong solution’. It isn’t a technology, it’s a branding term, and it’s a branding term used almost exclusively for things that have no social benefit.

Davidson

@david_chisnall @EUCommission The EU is tasked with the difficult challenge of balancing democratic values with maintaining economic parity with undemocratic superpowers. Initiatives like these are usually aimed at ensuring that the EU doesn't fall behind. What are you proposing? No AI infrastructure with data sovereignty for the EU while other superpowers use AI to optimize every facet of digital infrastructure? What is the incentive for the EU to risk sitting out a technological leap?

GeraintLlanfrancheta 22h ago

@davidsonsr @david_chisnall @EUCommission also true.

Michal Valasek 22h ago

@davidsonsr @david_chisnall @EUCommission
“…optimize every facet of digital infrastructure…”
Like what for example?

@fuji @david_chisnall @EUCommission

Davidson 22h ago

Organizations have been implementing AI for years by identifying what human tasks that can be safely done by AI at no risk to the company. More or less every single modern organization of a moderately large size that relies to a large degree on digital infrastructure does this now, either directly or indirectly through the tools and services that they use. And if they don't, then their suppliers and vendors do.

David Chisnall (*Now with 50% more sarcasm!*)22h ago

@davidsonsr @fuji @EUCommission

This is true only if you conflate 'AI' with 'automation'. Companies trying to sell 'AI' like it when you do this, but if 'AI' includes anything that a computer does then it's a meaningless term.

Davidson 22h ago

I'm talking about LLMs/reasoning models enabling software to make decisions based on natural language instead of programmatic instructions. I'd say that this is what's commonly understood to be the meaning of the term "AI" in business contexts. Isn't that what we're talking about?

David Chisnall (*Now with 50% more sarcasm!*)21h ago

@davidsonsr @fuji @EUCommission

So what are these use cases? Replacing customer support with a chatbot that makes up policies, can't answer questions, and drives away customers? Meeting summary systems that invert the conclusion of the meeting? Note taking for doctors that fabricates conditions and cancels essential prescriptions?

Machine-learning systems work really nicely in situations where either the result can be checked instantly and cheaply, or where the cost of a wrong answer is vastly lower than the benefit of a correct answer. Very few natural-language processing tasks have this property.

LLMs have had hundreds of billions of dollars spent on them, and are not yet profitable. No company can offer them to customers at a price that customers are willing to pay and which covers the costs. And, even with that level of subsidy, it has made zero measurable impact on the GDP of the USA.

If a technology has failed to deliver anything of value to the economy after sinking a hundred billion, the rational thing to do is not say 'we must also throw money down this hole'. It is to say 'other countries, please keep wasting your economic potential! We will invest in things that actually deliver!' (Or, at least, in things that haven't yet been shown to not deliver).

We're discussing a diffuse economic impact, so you're not going to see many concentrated labor displacements or sweeping gains. Companies are reporting being able to conduct AI improvements at scale with fine-grained tasks, but the ratio at which it displaces, pressures or complements labor differs depending on the context. What's important to acknowledge is that this ratio is changing as companies are continuously optimizing for AI implementation.

There are some quantifiable indicators like direct labor displacements in professions like freelancing, language and content work, what seems to be declines in junior and entry-level hiring in AI-exposed companies and industry surveys and labor data indicating that AI is significantly commoditizing skills in some professions. We're possibly going to see more of this as the Service-as-a-software model is emerging.

For anecdotal real-world examples of how AI is being used, there's invoice and document processing (ingestion and scanning), document drafting (legal, corporate and technical writing), content reviewing (code, contracts or otherwise), monitoring (credit underwriting, fraud detection), technical inspection (defect detection, report processing) and so on. Companies see marginal to significant improvements related to many of these AI implementations.

There's no reliable data on what amount of AI processing that is dedicated to work rather than waste, but some reviews seem to indicate that the number might sit at 50%. So the question is, if unethical superpowers continuously self-optimize work-related AI processing to the point where they see a significant economic or even military impact, then what is this going to mean for a EU that decided to opt out of having even a regulated, basic AI infrastructure?

@david_chisnall @davidsonsr @fuji @EUCommission

Violet Madder 14h ago

The only time something is this aggressively useless, but gets massive investments anyway, is when it's a weapon.

@violetmadder @david_chisnall @fuji @EUCommission

Davidson 9h ago

AI has introduced a shift in how humans can interact with computers. All IT infrastructure was built with the restriction that computers could only be interfaced with through predefined rules, whereas AI can now allow us to give computers instructions through natural language. It's true that emerging technologies tend to see significant investments and at times economic bubbles, but that doesn't negate the effectiveness of AI as a technology.

David Chisnall (*Now with 50% more sarcasm!*)9h ago

@davidsonsr @violetmadder @fuji @EUCommission

Natural language interfaces are not new. They've been around in various forms for decades. Some ML techniques allow higher accuracy but they come with the same limitations as any attempt at this technology. First, the set of things that can be done is still defined by programming. The difference with LLM-based approaches is that, rather than failing when they are asked to do something that they can't do, they do something else. This is much worse, because it means that the systems are not reliable.

Natural language interfaces pop up periodically but typically go away because natural language is ambiguous. Computer languages are intentionally not like natural language because their requirement is to unambiguously convert programmer intent into a sequence of instructions for the computer. As soon as you introduce natural language, you introduce a requirement for interpretation and that both removes agency from the user (now they aren't the one providing this - 'agentic AI' systems are ones that aim to remove agency from the user) and introduces a large space of failure modes that the user cannot reason about.

@david_chisnall @violetmadder @fuji @EUCommission

The user doesn't need to be the one providing context. Software can instruct AI to reason about a piece of unknown information and then provide context for the program to consume. The AI becomes a cog that eliminates unknowns and converts it to instructions that the program can understand.

As far as I know this has never been possible before, and I don't know of any proto-solutions that could do this through natural language instructions.

David Chisnall (*Now with 50% more sarcasm!*)7h ago

@davidsonsr @violetmadder @fuji @EUCommission

Okay, it's clear that you really, really don't understand how LLMs (or other machine-learning algorithms) work. At all.

@davidsonsr @david_chisnall @fuji @EUCommission

Violet Madder 6h ago

Machines are not capable of "reasoning". Unknowns aren't "eliminated" but filled in with arbitrary BS, context defined by the people who wrote the thing (in the service of technofascist oligarchs out to destroy the usefulness of the internet).

The technology (machine learning) CAN be very good and very useful-- but not when it is implemented like this.

This is a bullshit generator.

I'm just going to keep on saying it: It's a weapon.

@violetmadder @david_chisnall @fuji @EUCommission

A programmer can instruct an AI to handle an unknown piece of information according to an instruction given in natural language, and the AI has the capacity to follow that instruction with a high level of accuracy. Introducing additional AI for reviewing can increase the accuracy further, often to the point where the error margin is negligible. This is being done across companies today, with tasks that were previously done by humans.

David Chisnall (*Now with 50% more sarcasm!*)22h ago

@davidsonsr @EUCommission

The EU has to prioritise investment. It needs to pick things that are likely to give a good return, both financially and in building the kind of society that EU members wish to belong to. To date, AI has not materially contributed in either. There has been no measured impact on economic productivity from AI adoption, in any industry. The systems are built on top of large-scale plagiarism that undermine the creative industry.

If the USA and China wish to sabotage their economise by throwing vast amounts of money at things that deliver negligible benefits (and often the reverse), then the EU should encourage them to do so, while investing in things that actually deliver a return.

@david_chisnall @EUCommission

Davidson 19h ago

AI being hard to isolate in aggregate statistics isn't the same as it having no measured impact. While AI has displaced some labor, the clearest evidence of productivity gains appears in field studies and task-level performance measurements, which there's an abundance of.

I'd rather see a hopefully more ethical, more productive and more energy efficient EU AI infrastructure with EU data sovereignty than the EU relying on other superpowers' AI implementations.

@davidsonsr @david_chisnall @EUCommission

19h ago

the clearest evidence of productivity gains appears in field studies and task-level performance measurements, which there's an abundance of.

Where?

@barubary @david_chisnall @EUCommission

Davidson 17h ago

There are examples like TikTok, Meta and other social platforms using AI for content moderation, Duolingo using AI to significantly increase their content output and HubSpot using AI to enhance customer CRM data. There are also papers like "Generative AI and labour productivity: a field experiment on coding" and "Generative AI at Work" which indicate productivity gains for junior workers. There are many instances of applied AI working as intended.

9h ago

@davidsonsr @david_chisnall @EUCommission "Using AI for content moderation" doesn't mean anything to me.

To "increase content output" and "enhance CRM data" sounds like a deluge of slop, not increased performance. (As a personal anecdote, I was considering using Duolingo myself when I heard they were adding LLM slop to their app, so I lost all interest. I want to learn languages, not consume "content output".)

I'm not qualified to judge the experimental setup of "Generative AI and labour productivity: a field experiment on coding", but some things stood out to me:

They looked at ~1200 programmers from one company (Ant Group) over a period of 6 weeks.
335 of them had access to a specific (internal) LLM.
The junior programmers with LLM access produced 50% more verbose code, the senior programmers didn't.

That's it. The only thing they measured was the number of lines of code produced, not quality or correctness or anything. And this was only the short-term effects (less than two months); there's nothing there about the mid- or long-term consequences of mandating LLM use to a company's whole workforce.

"Generative AI at Work" is about US customer support (from a call center in the Philippines). The paper is creepy ("AI drives convergence in communication patterns: low-skill agents begin
communicating more like high-skill agents", "customers are less likely to question the competence of agents"). Results are mixed: "AI assistance increases worker productivity, resulting in a 14% increase in the number of chats that an agent successfully resolves per hour", but only for less-skilled and inexperienced agents: "we find evidence that AI assistance may decrease the quality of conversations by the most skilled agents". The metrics used are questionable: Issue resolutions per hour and "net promoter score" (as a proxy for customer satisfaction) are used to determine both productivity and agent "skill".

(Why are these papers all written by economists?)

David Chisnall (*Now with 50% more sarcasm!*)9h ago

@barubary @davidsonsr @EUCommission

You'll find this in pretty much all papers that show an improvement in productivity from 'AI'.

Most of them use an invalid metric: self-reported feelings of productivity (a thing that's been shown previously to have a weak inverse correlation with actual productivity), lines of code (known since the '60s to be a terrible metric), or tickets resolved (who marks them as resolved? I can get 100% on this by just claiming everything is resolved, but if the outcome is that the customer gives up and goes to a competitor, that isn't actually a win).

Content moderation is similar. Using 'AI' is not there to improve efficiency, it's there to shift blame. TikTok and Meta moved to having an automated system moderate content so that they could claim compliance with rules about harm, without actually bothering to do the work. It does not increase the quality of the moderation decisions. Note specifically for the @EUCommission : this is a technology that is being used to attempt to bypass regulations that you have passed for the benefit of your citizens. Is that really what you want to be funding.

Developers aren't being evaluated by or paid for KLOCs anymore, so it's not invalid to view an increase in code throughput as an indicator of increased productivity during experimental evaluations, especially in delivery-focused teams. In the same vein, the paper regarding support agents showing an increased usage of unmodified AI response suggestions in combination with increased delivery velocity is also a valid indicator.

Reports and papers on generative content in knowledge work-related contexts seem to indicate that somewhere around a third to a half of proposed AI suggestions that are reviewed by humans are deemed acceptable, and that this in turn frees up personnel hours.

But more importantly, even if you wish to disregard that then there's still more than enough examples of applied AI being used by companies in both internal and customer-facing contexts to show that it's able to replace human tasks. You can easily confirm first hand what many of these software products are capable of doing in terms of using AI to reduce time spent on tasks. AI is ubiquitous now, and it has been rolled out for a long time.

David Chisnall (*Now with 50% more sarcasm!*)7h ago

You seem to have created an account entirely for this thread, to jump in and make vague and unverifiable claims, and to cite papers that have poor methodology.

You are demonstrating one use for LLMs: they can easily replace your kind of engagement. And this is why they're so popular with troll farms and scammers.

The point being discussed is whether AI is capable of replacing human work, and I think I've shown pretty clearly that it is. You can easily verify this yourself.

Register a HubSpot account, pretend that you're a salesperson, add a hundred empty contacts to the CRM, then ask yourself if you'd prefer to refine your contacts manually or whether you'd like an AI to do it for you.

That's AI eating work.

David Chisnall (*Now with 50% more sarcasm!*)7h ago

Okay, that's two logical fallacies in one post, so I can only assume you either are an LLM or you've been using one for so long that the well-documented cognitive impairment that this leads to has hit you.

First, let's look at the metric you're using. Does using an LLM require less work than not using one? If your objective function does not include any notion of quality, sure. There are a load of ways of doing work faster if you don't care about quality.

And that's the other issue: you're assuming that the two choices that exist in the world are 'a human does a task 100% manually' and 'an LLM does it'. And yet, until a few years ago, none of the automation that people were deploying was using LLMs. And a lot of companies are selling exactly the same kind of automation now but branding it 'AI' because that's what the current hype wave is for (fewer now, because there's such a consumer backlash against 'AI' that it's increasingly a toxic term and you get more sales if you don't use it).

To give a very concrete example of this: LinkedIn now has an 'AI' filtering thing for submitted CVs. It's on by default, so I didn't realise I was using it when I posted a job for a compiler engineer there. 70% of developers with prior LLVM experience, for a job working on LLVM were filtered out by it. I had more good CVs in the filtered-out pile than in the left-in pile.

Not only was this bad, but some traditional keyword filtering would have given me a much better first pass (at least for prioritising: simply searching all CVs for 'LLVM' would have given me a better high-priority-to-read list than the LLM did), but that option wasn't available because LinkedIn is all-in on AI.

Oh, and it's got even worse since then. LLMs are being used to automatically craft applications tailored to job ads. Hiring has become much harder. Yes, by one measure of productivity, LLMs have made things better: it's now much easier to apply for a job. You can apply for a hundred jobs in a day easily! But then the hiring manager has a thousand CVs for a job where only ten are qualified. And LLMs are really bad at filtering them (they're full of biases, but also don't understand the job requirements).

Yes, AI can require significantly less work than not using it and can deliver results faster and in parity with the level of quality that a human being would deliver. AI is now ubiquitous in digital products and in cases where the AI delivers high value at low risk and with a low error impact, it's performing well.

Using the aforementioned HubSpot as an example, as a salesperson you can integrate it with prospecting tools that crawl company websites, extract key data using AI, sends it back, and then HubSpot refines it for you using AI and presents you with a list of companies to call. This would've been an expensive multi-person effort that can now be done in an hour with the help of AI. Errors are minimal and when errors do occur, the impact is negligible.

And going back to the support agent use case, letting an AI scan incoming mails, search the knowledge base and then draft up a prepared response for you to review and either edit or send is again a high value feature with a low error margin (though arguably a higher error impact, which is why you have human reviewers).

So I think that we can establish that AI has the capacity to replace human work with an acceptable level of quality. Now whether that justifies the way that AI is being marketed by tech companies is a different discussion.

David Chisnall (*Now with 50% more sarcasm!*)7h ago

And going back to the support agent use case, letting an AI scan incoming mails,

Cannot be done safely, because there is no way to prevent prompt injection. Computing learned from telecoms that in-band signalling is a bad idea. Separation of control and data is essential for security. LLMs have no mechanism for doing this.

search the knowledge base and then draft up a prepared response for you to review and either edit or send is again a high value feature with a low error margin (though arguably a higher error impact, which is why you have human reviewers).

LLMs are really good at accidentally inverting the meaning of text when they summarise it, so this flow requires you to carefully read the message that it's a reply to, and the reply. If type very slowly, I can imagine that might be a time saving. But I can definitely imagine it is perceived as a time saving because people tend to report time reading as shorter than time writing.

I know people do this, because I've exchanged emails with some people who do. And it's frustrating because now I need multiple round trips with them to get them to actually give the required response instead of a statistically plausible reply. Eventually it's often easier to have a call with them.

If you're a good salesperson (I've worked with some, they do exist), then you know that your biggest value is in building relationships with customers and establishing trust. LLMs undermine this.

The AI would only have access to reading knowledge base data and drafting messages in plaintext, which are low-risk operations, and would need to pass through monitoring points that treat the draft as untrusted content both before reviewing and after sending. The biggest risk is a prompt injection generating sketchy content, the support agent somehow accidentally pressing the send and confirm buttons, and detection tools missing all of this again.

At this point you're talking about a margin of error that is comparable to just about anything else that is customer-facing in the organization.

As for salespeople relying on enriched data, it goes a long way when cold calling prospective customers. Not knowing anything about the company that you're calling vs. being presented with a fully-enriched dashboard containing everything from decision makers to company history does a lot for the success rate.

David Chisnall (*Now with 50% more sarcasm!*)7h ago

Is this why my work inbox is full of approaches from salespeople at companies who have no understanding of what my company does or what it needs, but feel the need to send me personalised emails?

Because that's not a net win for the economy. It's wasting my time.

And it's not a net win for the companies in question, because they get added to a list of companies I will never buy from.

That can happen when salespeople rely on integrations with public databases containing vague industry codes and third party websites with unreliable data sourcing. It's a problem that AI-based data enrichment specifically solves.

6h ago

@davidsonsr @david_chisnall @EUCommission You sound like an ad. Are you all marketing? 😃

@barubary @david_chisnall @EUCommission

No, I'm just trying to explain that AI is already successfully doing what people think it's incapable of doing, so that people can base their opinions on a correct understanding of what's happening.

6h ago

@davidsonsr @david_chisnall @EUCommission "AI" isn't a thing. It's a marketing term.

@barubary @david_chisnall @EUCommission

It's a colloquial term for products based on LLM/reasoning models.

David Chisnall (*Now with 50% more sarcasm!*)7h ago

Developers aren't being evaluated by or paid for KLOCs anymore, so it's not invalid to view an increase in code throughput as an indicator of increased productivity during experimental evaluations

What? Developers aren't paid for pissing on the floor, therefore it's not invalid to view an increase in the amount of piss on the floor as an indicator of increased productivity during experimental evaluations.

That made exactly as much sense as what you said.

You are completely ignoring the reason why developers are not paid per line of code anymore: because it does not correlate with useful productivity.

New code added is technical debt. The ideal change set is one that reduces the total amount of code and does not harm functionality. The very best changes I've reviewed are ones that simplify the code and simultaneously improve performance.

It's very easy to add code more quickly. If I wanted to, I could easily increase my productivity measured in lines of code changed by a large factor, simply by copying and pasting more code instead of building clean abstractions, but throwing in buggy implementations rather than thinking through the corner cases, and so on. And LLMs make it very easy to do all of these things. That's not productivity, that's just creating future problems (and, potentially, liability if you're in an industry where you can't just disclaim all liability, such as, oh, I don't know, The EU, where the CRA explicitly prohibits this).

It can be an indicator of increased productivity when evaluating the rate at which AI unblocks productivity. It's not a comment on the merits of viewing code bloat as an intrinsically valuable metric.

Michael Ormsby 4h ago

@davidsonsr @david_chisnall @barubary @EUCommission Before I retired I was a software developer for a little more than 3 decades. You’re right, LOC was an awful metric for programmer productivity. And there were many other crackpot schemes for enhancing or evaluating programmers or even redefining the role of computer folk.

But replacing 1 failed methodology with another isn’t the way forward. From the era of vacuum tubes onward we have failed to understand the potential of digital technology.

sidereal 4h ago

@davidsonsr @david_chisnall @EUCommission Hi, AI is demonstrably worse than useless (eats up more money in resources than it brings in, proven by multiple studies), and anyone who avoids it is setting themselves up for success.