
BlueMonday1984 Mar 9

Stubsack: weekly thread for sneers not worth an entire post, week ending 15th March 2026


Want to wade into the snowy surf of the abyss? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid.

Welcome to the Stubsack, your first port of call for learning fresh Awful you’ll near-instantly regret.

Any awful.systems sub may be subsneered in this subthread, techtakes or no.

If your sneer seems higher quality than you thought, feel free to cut’n’paste it into its own post; there’s no quota for posting and the bar really isn’t that high.

> The post Xitter web has spawned so many “esoteric” right wing freaks, but there’s no appropriate sneer-space for them. I’m talking redscare-ish, reality challenged “culture critics” who write about everything but understand nothing. I’m talking about reply-guys who make the same 6 tweets about the same 3 subjects. They’re inescapable at this point, yet I don’t see them mocked (as much as they should be)
>
> Like, there was one dude a while back who insisted that women couldn’t be surgeons because they didn’t believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can’t escape them, I would love to sneer at them.

(Credit and/or blame to David Gerard for starting this.)


YourNetworkIsHaunted Mar 10

FT reports from Amazon insiders that they’re investigating the role AI-assisted development has played in a spate of recent issues across both the store and AWS.

FT also links to several previous stories they’ve reported on related issues, and I haven’t had the time to breach the paywalls to read further, but the line that caught my eye was this:

The FT previously reported multiple Amazon engineers said their business units had to deal with a higher number of “Sev2s” — incidents requiring a rapid response to avoid product outages — each day as a result of job cuts.

To be honest, this is why I’m skeptical of the argument that the AI-linked job losses are a complete fabrication. Not because the systems are actually there to directly replace the lost workers, but because the decision-makers at these companies seem to legitimately believe that these new AI tools will let their remaining workforce cover any gaps left by the layoffs they wanted to do anyways. It sounds like Amazon is starting to feel the inverse relationship between efficiency and stability, and I expect it’s only a matter of time before the wider economy starts to feel it too. Whether the owning class recognizes what’s happening is, of course, a different story.

Amazon holds engineering meeting following AI-related outages

Ecommerce giant says there has been a ‘trend of incidents’ linked to ‘Gen-AI assisted changes’

Financial Times


lurker

to follow this one up: there is now a new study about AI agents being dogshit at keeping code working for over 8 months

SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration

Large language model (LLM)-powered agents have demonstrated strong capabilities in automating software engineering tasks such as static bug fixing, as evidenced by benchmarks like SWE-bench. However, in the real world, the development of mature software is typically predicated on complex requirement changes and long-term feature iterations -- a process that static, one-shot repair paradigms fail to capture. To bridge this gap, we propose SWE-CI, the first repository-level benchmark built upon the Continuous Integration loop, aiming to shift the evaluation paradigm for code generation from static, short-term functional correctness toward dynamic, long-term maintainability. The benchmark comprises 100 tasks, each corresponding on average to an evolution history spanning 233 days and 71 consecutive commits in a real-world code repository. SWE-CI requires agents to systematically resolve these tasks through dozens of rounds of analysis and coding iterations. SWE-CI provides valuable insights into how well agents can sustain code quality throughout long-term evolution.

arXiv.org


jaschop Mar 10

Unfortunately the paper structure screams “AI senpai, notice me!”

AI coding agents seem bad at this job so far, but if you optimize for our benchmark…