agentic AI in particular is so fucking funny. i run an absolutely *tiny* indie studio and i still ask people "hey could you run me through how this works?" all the time because knowing how things work is a vital part of creating a quality product

how does that work with AI? "hey could you run me through how this works?" and people just go "idk the AI did it ¯\_(ツ)_/¯"

i feel like the only way businesses fall for this is when they're big enough nobody at any level really knows fully how the product gets made, because it's abundantly clear to anyone who actually knows how products are made that "nobody knows how this works" is the biggest red flag ever

"you can just read the code the agent wrote"

oh fuck off. the whole idea is that agents can churn out code at way higher volumes than people can generate, and the bottleneck when people wrote the code and not "agents" was already code review, because making sense of code is harder than writing it

the only thing you've done is made the code review bottleneck so, so much worse. and this will help you be more productive... how exactly?

so what's the alternative if you can't review the avalanche of slop code? you just don't. and that's basically akin to live coding in the production environment. anyone who attempts this for long enough will be punished for their hubris
@eniko I absolutely cannot understand anyone who would be happy to ship things they don't understand and can't ask a person they trust about. I get the argument that as a sole coder (now) I'm an outlier and teams delegate understanding between themselves all the time, but a team is different to an LLM. A person can earn your trust, an LLM can only delude you into trusting it because it has no real memory or integrity or anything to lose by screwing you
@sinbad @eniko
you're not an outlier
(except if like 50% of the foss world is outliers)
@eniko I have seen the argument that they will just get a newer, smarter AI to fix all the problems generated by the old one, and it's giving “I speak of none but the computer that is to come after me” from Hitchhiker's.
@jamesthomson @eniko It's also the exact same bullshit people have been saying about climate collapse for the past twenty years. "Oh, we will just invent something that takes out all the CO2 afterwards. We don't have to worry about it now! Growth can continue forever!!"
@jamesthomson @eniko
My coworker already uses one agent to write code and another agent to code review it 🫠

@eniko I'm admittedly limited in my coding experience, but I'm less worried by code generation than by error fixing.

I vaguely trust the first draft to at least sorta focus on the original intent. Every iteration where code fails to compile or gives an incorrect output, though, creates a new layer of problems to focus on, and if an LLM will do ANYTHING to look good to you (as seems to be the trend), then who knows what bullshit it'll put in there just to produce something that functions?

@eniko one answer you'll get is that as long as it includes complete code coverage in tests, it should be good.
But here's the thing: the agent wrote those as well - without context of the bigger system - and bugs can and have manifested in code that was 100% covered in tests
@longhairmoto @eniko also perverse incentives: to increase code coverage, we removed this error handling. Number go up!
@longhairmoto yeah, bugs have never existed even in codebases with good test harnesses and they definitely have never happened in codebases with bad to mediocre test harnesses
@eniko @javierg immediate rejection if it looks like vibe code. Low-effort submissions get low-effort reviews.
@eniko also cherry on top: I am convinced that all this focus on code is because it was one of the few datasets on the internet that remained unpolluted by LLM outputs. Polluting it with slop pretty much guarantees that whatever the current models are capable of doing, the future ones will not be able to do anymore.
@eniko what I also find absurd in this situation is that I've always relied on the fact that if you're trying to solve a problem there's a big, big chance that someone already solved it better than how I'd solve it. And that means there's a library or a tool somewhere that does what I need. And finding it and learning it means I need to write very little code, and I don't need to maintain that code later, because somebody else is maintaining it. Deliberately writing a lot of code is dumb.
@eniko "I produced 2 million lines of code so I am clearly doing my job brilliantly and should be the one promoted yes yes"
@eniko Let's not kid ourselves, this AI code will mostly be reviewed by AI tools, because who cares if it breaks, it's not their fault anymore, it's the AI's fault.
@ainmosni this is akin to live coding changes in the production environment and anyone who attempts it for long enough will be punished for their hubris
@ainmosni @eniko this is already how some teams at my company operate
@ainmosni @eniko our agentic pilot projects actively brag about how no human sees the code anymore.
@ratsnakegames @ainmosni they will be punished for their hubris
@eniko @ainmosni and i'm gonna be collateral damage, unfortunately
@ratsnakegames @ainmosni at least your brain will still work? silver linings i guess

@eniko
I need a rubber stamp with this sentence to stamp it on people's faces.

"Writing code is the easy part. Reading code is the hard part."

@eniko exactly this!

I've been trying to explain this to people as well.

But it seems "we have the ai write tests too, that we also don't read" is apparently fine.

@eniko preach! This is the biggest talking point for me. When stuff falls apart, you can absolutely ask a person "Hey, why is your code like this?" And you can get some idea as to what the problem is.

With AI, you can't. It doesn't have context behind its decisions so you not only have to find out what's wrong, you have to figure out why and do so with zero guidance or input.

@eniko I’m retired now, but my career was writing code for about three and a half decades. Absolutely agree that documenting code, making sense of code, and understanding the pattern of interaction between different developers’ code is the hard part. And planning for reuse.

In my experience managers of software development almost never got that. Invariably they saw code as some kind of monolithic, homogeneous product that could be mass produced by the cheapest supplier, measured and sliced.

@eniko and we all know how much developers like doing code reviews, right?

Turn my job into one never-ending code review? I’d rather quit, to be honest.

@eniko a lot of places have started to use AI to do code reviews, and started to shift around who takes fault when the AI generated code goes bad (either the person who did the prompting, or the person who did the review).

So something goes wrong and one of those two gets fired, and the process that incentivizes rubber stamping PRs is never given scrutiny.

@eniko 'We have KPI targets and knowing how stuff works is not there!'
@eniko I desire not the AI peas
@eniko And even if someone doesn’t fully understand what they were doing, they can usually explain up to the point where they made a guess or just copied a solution from somewhere else. An AI often only admits a fault when confronted directly or when it obviously contradicts itself.

@SilverOwl @eniko

The AI only admits a fault where its training data indicates that the most likely next sentence would be admission of a fault.

Given that its training data is the Internet where people will regularly argue well after being shown hard proof that they are wrong, that takes a lot.

@gbargoud @eniko You’re absolutely correct! I should have refined my comment before posting. I always strive to provide thoughtful input in conversations and will take greater care in doing so in the future. 😉
@eniko Sadly the current business climate does not reward quality, and hasn't for a long time. AI is accelerating this attitude in an extreme way, and maybe it'll lead to such an extreme crash that we'll rediscover that quality actually matters. One thing I'm sure about, a crash of some sort is coming, and it's not going to be good.

@ainmosni @eniko True, but I suspect CRA/PLD/... will change this at least for some market segments.

(I'm not a huge fan of the CAIDA; we should treat "AI" in products just like normal products; as a consumer/citizen I don't much care why they don't meet requirements. That just adds to the "oh look how special AI is" crap.)

@eniko It's even narrower than that. I don't know how a Polar Code works on FPGA, but I can read a high-level wikish summary of the *intent* of the design, and I know the basic error correction coding theory to absorb what it implies for my block of the machine.

This kind of indifference can only pass when nobody knows fully how the product gets made and also the way it's made was already full of bullshit performative work-shaped nonsense even before they began asking randstr for the files.

@eniko
^ this

I suspect this is a big part of why it's so appealing to corporate bigwigs : they don't know/understand what their teams are doing anyway, so genAI doesn't change that significantly from their perspective.

@rfnix @eniko

Also boardroom misconceptions that correcting something is much easier and quicker than doing the thing yourself, which just isn't true when there is any skill involved.

@eniko
I work at a medical device company. The factory I work in, of which the company has *a bunch*, employs well over 1000 people.

Enthusiasm for generative LLMs is directly proportional to distance from production. The C_Os are urging people to adopt it. Managers are interested. Office workers are cautiously optimistic. And we on the assembly lines are all dead set against it.

@eniko Well, you see, the C-suite guys have the mental capacity of a snake. Nobody they talk to understands how anything works anyway. But they do know that Agentic AI puts predatory pressure on the labour cost centres and that means it's 'good'.
@eniko I regret that I have but one boost to give this
@eniko nobody ever hired subject matter experts to do the human feedback part of RLHF so what they optimized for instead is surface level plausibility. When the people running the company aren't subject matter experts either, and hence are not equipped to distinguish "plausible at first glance" from "actually correct"...
@eniko at least make the AI document its code properly and confirm that documentation as well. Make it as inconvenient as possible to use AI while you're at it. Demand a full white paper!
@mitsunee @eniko
The AI will happily produce a whitepaper-shaped object, unfortunately
@sabik @eniko sounds like a massive pain to proofread that
edit: could make it even more evil: demand the white paper handwritten, cleanly enough that everyone is easily able to read it
@mitsunee @eniko
I guess LLMs can supercharge Goodhart's law
@eniko Agentic AI is interesting because it allows the LLM-based AI to blatantly cheat by using bits of existing software that are “outside of the AI” or in other words “not AI at all” to cover up the holes that the LLM has demonstrated that it is useless at, thus giving the impression that now an LLM can actually do all the things that it previously totally failed at by using things that aren’t an AI, which we’ll call Agents, and instead of calling this Cheating AI, we’ll call it Agentic AI.
@eniko This is exactly what happened in my non-coding ecommerce job. It's been getting a lot of use for pipeline scripts and I've asked hey how does this work exactly? And gotten exactly that as an answer. No one knows how any of it's built exactly and the result is a batch file that self installs brew and uses python to call some applescript so it can rename files.
@eniko <prompt>Could you run me through how this works?</prompt> 🙃

@eniko recently I've been reviewing code with the most subtle and horrible kinds of mistakes, and then there are folks who defend it "because Claude did it"

It is three times as exhausting to review this kind of crap because the mistakes are irrational, unmotivated and seem to be there solely to make reading the code harder.

@eniko I've tried to ask people questions about their LLM-generated code and they just feed my question to the LLM and paste the output. I don't understand what's going on in their brain that they fail to grasp how offensive that is.

@eniko my new favourite is people who do know how it works answering that you're supposed to prompt the AI now instead of sharing information directly 🤮

so - ok, i see we're literally not allowed to ask questions any more, is that how it fucking is