This "careful" "AI Safety" company that just accidentally leaked its entire source code to the world is the one that African governments are entering into agreements with to include in infrastructures from health care to god knows what.

These are the products people have to use to make sure that they don't get dinged in their performance reviews for "not using AI."

These are the products teachers have to use in schools so that "students aren't left behind."

https://www.theguardian.com/technology/2026/apr/01/anthropic-claudes-code-leaks-ai

Claude’s code: Anthropic leaks source code for AI software engineering tool

Nearly 2,000 internal files were briefly leaked after ‘human error’, raising fresh security questions at the AI company

The Guardian

I appreciated this article by @mttaggart@infosec.exchange.

I get the temptation especially in this world we're all living in where you have to produce something super fast all the time.

But my question is, what are people's arguments for how functioning software can be created with these tools?

What about new architectures, new ways of thinking, new programming languages, etc? Who will create those?

https://taggart-tech.com/reckoning/

I used AI. It worked. I hated it.

I used Claude Code to build a tool I needed. It worked great, but I was miserable. I need to reckon with what it means.

I'm not even talking about the data stealing, exploitation, environmental pillaging, pollution, environmental racism etc.

I'm talking about the way people use the tools. Like what do advocates of using these tools say will happen to software engineering in the future? That it just won't need to exist because everyone will be able to create software using these tools?

That it will just take a different form, which is fine?

@timnitGebru Yes. To a large degree, I think it's fine.

And the old forms will still be there in a lot of cases and contexts. And, if we build the future well, we won't put hard barriers to digging in and finding out what's going on. If we build it poorly and let platform rentiership win, that's a big problem looming.

@aredridel
I worked in software preservation for a few years and i think "the old forms will still be there in a lot of cases" is massively optimistic about what software and coding knowledge will be preserved. We've already lost a lot of knowledge from previous generations before gen AI joined the party. Digital forms of knowledge can't continue to exist without intentional preservation interventions, which is not currently happening.
@timnitGebru
@aredridel
Show me one case in mass-market computing history where building the future went well for the commoner.
@timnitGebru
@ozzelot the personal computer, the pc revolution, arguably the iPhone and cell phone both.
@aredridel
All of those are by and large corporate controlled. I can for example only install an alternative OS at the mercy of my motherboard (and CPU - thanks, AMD PSP and Intel ME) vendor.
Cellphones, whilst allowing for communication across arbitrary distances, still depend on networks operated by cartel-like structures and usable for consumer surveillance by unsavory authoritarian entities. Not much good in that.
@aredridel
The iPhone? Are you kidding me? It was a shiny, but useless slab of materials at first, and the App Store introduction only forced consumers to get into a walled garden and enjoy it. Along with, well, being a cellphone.

@ozzelot Wow your worldview is dismal.

Is something good only if it's _purely_ good?

@aredridel
I find some parts of personal computing good - but largely, it's been "let's put computers where they don't belong, make a bunch of solutions in search of problems, and make someone a bunch of money." What I enjoy about computing (that is, some parts of FOSS/OSHW) exists largely despite the dreams of those who mass-market computers, not because of them. If I look at the so-called PC revolution as a whole... Meh. At best, meh.
@ozzelot @aredridel PCs are way more open than phones. ME and PSP and similar are new inventions, and you can use practical computers predating that, and MiSTer is completely open. You also have qemu etc.
@pavel Yes this is true, but the question was whether any of these technologies have been good for ‘the commoner’ (I'ma ignore the elitism in that construction). The original phrasing was ‘building the future’ which I'm not sure we agree on the meaning of, but c’est l’communique.
@pavel
I know, and I do keep such puters around. But does my uncle? Do schools which make massive deals with Microsoft and Google? Once again, computing done in good and pleasing ways exists despite market needs.
@aredridel

@ozzelot @pavel Yes, but the question is “good for the 'commoner’" — and that means looking at the actual _good things_ that this has enabled for people. The actual systems they use, and what they actually do with it.

(I'm not so much an optimist or pessimist as an “it's complicated"-ist here, but I think that seeking a pure good is folly and often causes a lot of external harms)

@aredridel
Yeah, I guess in the way I said it, you got me (or I got myself; and apologies for my word choice as well.)
A Spotify subscriber, for example, may see the good in having access to ~infinite music for very little, while not privy to the fact that Ek prefers funding war to paying artists. Do we wish to spoil the user's joy, that is the question.

I would also like to say the war Ek funds isn't good for anyone. Out of pacifism I shan't be tooted.
@pavel

@ozzelot Yeah. That's actually a really good case study. What it replaced was better, and the rentiership is nasty.

But: the thing before is _also_ a part of that history of computing, and the sheer explosion of that era was gorgeous.

@ozzelot @aredridel Well, poisoning kids with Microsoft sw is a huge problem. Nonfree firmware is a problem, too, but not in the same ballpark.

@timnitGebru I think this is relevant to these questions, albeit handles them on a different level:
https://freakonometrics.hypotheses.org/89367

> Someone still has to reread, compare, test, contextualize, and sometimes rewrite. And if no one seriously takes on that work, the cost does not disappear. It reappears later in the form of errors, urgent fixes, loss of trust, and eventually litigation. What is presented as a productivity gain is often just an accounting displacement.

If No One Pays for Proof, Everyone Will Pay for the Loss

This post was initially written in French, Si personne ne paie pour la preuve, tout le monde paiera pour le sinistre Let’s start with a truism. In ordinary life, just as in economic life, we have to make decisions without ever knowing everything. Every decision involves some uncertainty, and therefore some risk. Some risks are … Continue reading If No One Pays for Proof, Everyone Will Pay for the Loss →

Freakonometrics
@rysiek Great article.

@timnitGebru it really is.

And boy does the Claude Code leaked codebase support that assessment. Have you seen @jonny 's thread on this? If not:
https://neuromatch.social/@jonny/116324676116121930

@timnitGebru the whole thing is great, but somewhere down the thread there are truly astonishing gems like:

> So the reason that Claude code is capable of outputting valid json is because if the prompt text suggests it should be JSON then it enters a special loop in the main query engine that just validates it against JSON schema for JSON and then feeds the data with the error message back into itself until it is valid JSON or a retry limit is reached.
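
For anyone who wants the shape of that loop spelled out, here is a minimal sketch of the pattern the quote describes, not the leaked code itself. Everything named here is an assumption for illustration: `generate` is a hypothetical stand-in for the model call, and the "schema" is reduced to a set of required top-level keys because the Python standard library has no JSON Schema validator.

```python
import json
from typing import Callable


def generate_valid_json(
    generate: Callable[[str], str],  # hypothetical stand-in for a model call
    prompt: str,
    required_keys: set[str],         # simplified "schema": required top-level keys
    max_retries: int = 3,
) -> dict:
    """Call `generate` until its output is a JSON object with the required keys."""
    current_prompt = prompt
    last_error = "no attempts made"
    for _ in range(max_retries):
        raw = generate(current_prompt)
        try:
            data = json.loads(raw)
            if not isinstance(data, dict):
                raise ValueError("top-level value is not an object")
            missing = required_keys - set(data)
            if missing:
                raise ValueError(f"missing keys: {sorted(missing)}")
            return data
        except ValueError as err:  # json.JSONDecodeError is a ValueError subclass
            last_error = str(err)
            # Feed the bad output and the error message back in, then try again.
            current_prompt = (
                f"{prompt}\n\nPrevious output:\n{raw}\nError: {last_error}"
            )
    raise RuntimeError(f"no valid JSON after {max_retries} attempts: {last_error}")
```

Each failed attempt costs another full call to `generate`, which is the part of the pattern the thread is objecting to.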

Thousand monkeys, thousand typewriters…

@timnitGebru of course it makes total sense for Claude Code to waste developer tokens like that, since Anthropic charges per token… 🙄
@rysiek Literally the questions of "what if computer science was no longer about figuring out the most efficient way to do X but the brute force way to do X"?

Yeah @jonny's thread is great, really eye-opening.

It's an interesting question. There are a few different arguments that advocates for using these tools make.

  • skilled software engineers are very good at using imperfect tools -- figuring out the scenarios they work well in and how to work around the problems. @mttaggart's article was a great example of how this can work in practice, and @glyph has some thoughtful posts along these lines (not that either of them are advocates of the tools, but they illustrate the point). Static analysis tools (my software engineering claim to fame) are a great example of this general tendency: they can be extremely useful despite high numbers of false positives and false negatives.

  • the tools will radically democratize who can create personal-use software -- stuff that addresses their own (and their friends/family's) problems without being intended for broader use. For a lot of scenarios, attributes like scalability / reliability / security don't necessarily matter that much; so being able to start with a natural language definition and get something "good enough" can potentially be useful.

  • agentic software development is a transformative approach that leverages today's immense computing power so can produce software at least as good as today's hand-crafted software (which to be fair mostly sucks) far more quickly.

Then again, as well as the issues raised in that excellent article @rysiek discusses, advocates in general don't take Gender HCI, Feminist HCI, Post-Colonial Computing, Anti-Oppressive Design, Design Justice, Accessibility, Security, Algorithmic Discrimination, or Design from the Margins into account. Neither do the people creating these tools, and neither does the overwhelming majority of the existing software these tools have been trained on. So software generated by these tools is at best going to replicate the existing problems in these areas -- and more likely magnify them.

So this to me is where the bullet points above break down.

  • Few if any software developers are "skilled" in all of these areas, so don't know how to compensate for imperfect tools (and quite possibly aren't even aware of the tools' imperfections).

  • "Personal use" tools that aren't accessible or designed from the margins, or embed algorithmic discrimination, aren't useful for most people.

  • Generating more software more quickly that magnifies (or even reproduces) today's problems in all these areas magnifies oppressions.

And as you say there's also the data stealing, exploitation, environmental racism, etc, of the current generation of tools -- and let's not forget fascism, eugenics, and cognitive issues!

In theory there are alternate approaches that can avoid these problems; @anildash has talked about using small models trained locally on his own code, and that seems like a potentially-promising direction. In practice though the vast majority of advocates today seem to be using stuff from Anthropic, OpenAI, Meta ... even the ones who acknowledge the ethical issues don't actually address them.

@timnitGebru

Gender HCI, Feminist HCI, Post-Colonial Computing, Anti-Oppressive Design, and Design Justice

Some great insights about how to create software that works better for everybody.

The Nexus Of Privacy

Also I think this question is very related to @tarakiyee's excellent On The Enshittification of Audre Lorde: "The Master's Tools" in Tech Discourse. Of course as Tara points out, Lorde wasn't talking about "tools" in the tech sense.

"[T]he "tools" Lorde was naming were not literal instruments but epistemic ones: the frameworks of thought, the methods of inquiry, the structures of inclusion and exclusion that had been built by and for a particular kind of subject (white, heterosexual, Western) and that continued to operate even within movements ostensibly committed to liberation. The questions she poses make this frame legible: what does it mean to conduct feminist analysis while systematically excluding the voices of poor women, Black women, Third World women, lesbians? What does it mean to theorize liberation using categories that treat those exclusions as incidental rather than structural?"

And heaven knows most of the discussions about "AI" tools in software engineering aren't being done as "feminist analysis"!

Still ... "AI" is indeed an epistemic framework, and the pattern of systematically excluding the voices of poor women, Black women, Third World women, lesbians (and disabled people, etc etc etc) and then treating those exclusions as incidental rather than structural is exactly what's going on here.

@timnitGebru

On The Enshittification of Audre Lorde: "The Master's Tools" in Tech Discourse

🖼️Cover Photo: Train at the Nairobi terminus of the Mombasa–Nairobi Standard Gauge Railway. It runs parallel to the Uganda Railway that was completed in 1901. The first fare-paying passengers boarded the "Madaraka Express" on Madaraka Day (1 June 2017), the 54th anniversary of Kenya's attainment of self-rule from Great

Do Flamingos Know They're Pink

@timnitGebru "efficient" in what way, measured by whom, right?

Wasting developer tokens — tokens these developers or their companies pay for — is very efficient from the perspective of extracting value…

@rysiek @timnitGebru The illusion of progress, indeed! I plan to do my initial experiments with Gemini as it is being massively subsidised at the open API gateway level via Opencode.AI, as opposed to using monthly subscriptions for the now arguably massively discredited Claude Code. That's if I even get around to it. So far just using project-wide find/grep/sed magic is working just fine for me, and traditional clang-tidy abstract syntax tree (AST) based refactoring is closer in grasp.

@rysiek @timnitGebru
I was so baffled to learn how *mandatory* output verification is implemented. Any sane developer would have resorted to a compact loop along the lines of

`do { result = tool_call(…) } while (!is_valid(result));`

Zero overhead besides the wasteful repetitive tool calls in the hope of eventually getting the format right.

Instead, they have complex, expensive instructions for the LLM to do that.
https://neuromatch.social/@jonny/116326861737478342

jonny (good kind) (@jonny@neuromatch.social)

Attached: 3 images OK i can't focus on work and keep looking at this repo. So after every "subagent" runs, claude code creates *another* "agent" to check on whether the first "agent" did the thing it was supposed to. I don't know about you but i smell a bit of a problem, if you can't trust whether one "agent" with a very big fancy model did something, how in the fuck are you supposed to trust another "agent" running on the smallest crappiest model? That's not the funny part, that's obvious and fundamental to the entire show here. HOWEVER RECALL [the above JSON Schema Verification thing](https://neuromatch.social/@jonny/116325123136895805) that is unconditionally added onto the end of every round of LLM calls. the mechanism for adding that hook is... JUST FUCKING ASKING THE MODEL TO CALL THAT TOOL. second pic is registering a hook s.t. "after some stop state happens, if there isn't a message indicating that we have successfully called the JSON validation thing, prompt the model saying "you must call the json validation thing" this shit sucks so bad they can't even ***CALL THEIR OWN CODE FROM INSIDE THEIR OWN CODE.*** Look at the comment on pic 3 - "e.g. agent finished without calling structured output tool" - that's common enough that they have a whole goddamn error category for it, and the way it's handled is by just pretending the job was cancelled and nothing happened.

neurospace.live

@marcel if all you have is a hammer, and if you charge by each hammer stroke, every single problem looks like a nail.

@timnitGebru

@rysiek I guess it’s ok if you don’t plan to maintain the code.