"I used AI. It worked. I hated it." by @mttaggart https://taggart-tech.com/reckoning/

This is a really good blogpost. And I'm sure it'll make some people unhappy to read, whether they're pro or anti genAI. What's good about @mttaggart's blogpost is that he talks honestly about how using Claude Code did actually solve the problem he set out to solve. It needed various guardrails, but they were possible to set up, and the project worked. But the post is also completely clear and honest about how miserable it was:

- It removed the joy from the process
- If you aim to do the right thing and carefully evaluate the output, your job eventually becomes "tapping the Y key"
- The ramifications for people learning things
- Plenty of other ethical analysis
- And the nagging question of whether to use it next time, despite how miserable it is

I think this is important, because it *is* true that these tools are getting to the point where they can accomplish a lot of tasks, but the caveat space is very large (cont'd)

What I think is also good about the piece is that it shows how using this tech eventually funnels people in a particular direction. This is also captured by this exchange on lobste.rs: https://lobste.rs/s/7d8dxv/i_used_ai_it_worked_i_hated_it#c_7jirfk

The story people start with vs. where they end up is very different:

- They're really just assistants for experts; they don't write the code for you
- Okay, they write a lot of the code for me, but I personally don't commit anything without reviewing it
- YOLO mode

Which eventually leads to you becoming the drinking bird pressing the Y key from that Simpsons episode. (Funnily enough, I wrote that in my comment on lobste.rs, in reply to someone else, before I had even reached the point in the post where @mttaggart uses that exact gif.)

And at that point, you're checked out. All that's left is vibes.

And unfortunately, these systems don't survive that point very well. And neither do you, in your skills and abilities.

There are a lot of other concerns, but since a lot of people on the fediverse are opposed to these tools, they might not be very familiar with where the tools currently are, ability-wise. @mttaggart provides a good description: they *are* capable of solving many problems you put in front of them... and that doesn't remove the other problems they generate, or that are involved in their process.

The slop part isn't just the individual outputs, but the accumulation, and the effect on society itself.

Is that moving the goalposts? It may be. I think "slop" used to be easier to dismiss when it came to code because it was obviously bad. Now when it's bad, it's non-obviously bad, which is a problem all of its own. And cognitive debt, deskilling, and so on don't get factored into the quality-of-output assessment.

But unfortunately, the immediate-reward aspects of these things are going to make this hard for society to recognize.

Let me add one more thing to this. It's said implicitly above but let's be explicit. The problem is that this pipeline effectively *undoes itself*.

Part of the reason this worked well for @mttaggart is that he carefully set up guardrails and monitored things.

But the very patterns of usage of these tools mean that people either never develop the skills to provide that level of care, or are demotivated from providing it over time.

Which means the system eventually moves towards a structure that degrades and shakes itself apart through its very patterns of usage.

I don't know how to solve this.

@[email protected] I was looking at a few products on Amazon today. I noticed that, for various products, the product description on the page itself contradicts itself in several places. First it is described as having feature X, then Y, then X again, then Z, and then Y once more. I think it would be a good task for an AI to check such self-contradictory pages and flag any inconsistencies. This is particularly relevant for very large websites such as Amazon, Wikipedia or GrokiPedia. However, proponents of AI seem to prefer letting their software do things for which it is unsuitable, whilst completely ignoring tasks for which it is actually well-suited. @[email protected]
@Life_is
LLMs aren't good at fact-checking, though. What IS the feature set of the product? Only the manufacturer, or maybe people who already bought the object, know what features it has. An LLM can flag "this is inconsistent," but it can't figure out what the truth is. It has no concept of the world.
@cwebber @mttaggart

@dlakelan @Life_is That could change with neurosymbolic programming, which I believe is important, and the next step.

And, as it turns out, leads to dramatic structural improvements AND doesn't resolve any of the problems in @mttaggart's blogpost.

@dlakelan
If Amazon just added a big scare banner ("this product page appears to be misleading") for every obvious contradiction (starting with only the ultra-high-confidence cases), it would probably help a lot. Obviously, the lower the confidence threshold, the more false positives without effective recourse.

Also for Wikipedia this wouldn’t work too well, I think. For instance in medicine there’s something “paradoxical effect”, where eg different dosages lead to literally completely opposite effects. I’ve had cases before where I researched a drug and thought “that cannot both be true” only to find that yes it is – at different dosages. I doubt the LLM would fare much better at this.
@Life_is @cwebber @mttaggart