When I started in security, one of the prevailing attitudes was "The weakest link in the chain will always be the human."

I would like to thank every LLM provider and startup for changing this paradigm by introducing a much weaker link in the chain.

Thank you to everyone saying "it's still the human."

No, it isn't. It's product deployment without any concern for security or impact. This is the equivalent of suggesting every customer catch a falling knife, for their own benefit.

This is nondeterministic, autonomous malicious enablement, and we cannot blame the user as much as I'd like to.

@neurovagrant

I'd say it's still a human. But it's not the user, it's the product deployer.

In my worldview, responsibility always, and only, lands on humans

@jztusk Uh, well, I guess you disagree with the idiom, then, because there are no links in the chain that are not humans
@neurovagrant one of these days I need to sit down and write a blog post about how I have a blade that is cheap as hell, but more safe than any other blade I’ve owned, and how that relates to… everything.
@neurovagrant How is that not still the human? Didn't humans decide to let AI run entire systems without anyone watching?
FFS, Tencent's shares just skyrocketed for saying they're deploying OpenClaw, which is _known_ to be destructive and to have massive security vulnerabilities.

@neurovagrant Why do you surrender agency so readily?

We are and remain masters of our world.

So much of the slopocalypse is shitty CEOs catering to dumb investors who arrogantly yet wrongfully think they know a damn thing about IT. All a very (if deplorably) human thing.

That said, your post is funny and I like it a lot.

@renardboy @neurovagrant no way. Nobody back home is going to believe me when I tell them I saw an actual bus
@neurovagrant
?
They haven't.
@phil @neurovagrant
Most humans don't copy/paste commands from ticket titles into their shells...
@EndlessMason @neurovagrant
Sorry,
who decided to, and then gave these tools access to do so?

Putting a non-deterministic tool with """safeguards""" there has very predictable consequences. If not humans, who exactly is to blame for this mess?

Cause it sure isn't a pile of numbers.
@phil @neurovagrant
Oh I see. In that case we should blame the fundamental forces of the universe for kicking off formation of planets and bootstrapping abiogenesis and evolution.
@EndlessMason @neurovagrant
To my knowledge, the fundamental forces of the universe, just like dead matter (including LLMs), don't have agency of their own.

Humans do.
@phil @EndlessMason "guns don't kill people" hasn't been convincing for decades.
@neurovagrant @EndlessMason
Guns, like any tool, need to be carefully managed by any human owning/ controlling them. LLMs can do a crapload of damage, but they can't be held accountable, just like a computer can't be held accountable for what sysadmins do.

@phil @neurovagrant
I don't.

I'm a stimulus-response machine. I'm governed by the laws of physics exclusively.

@phil by this logic, a human who forgets to update a PHP server is the weakest link in the chain. sure, the human is responsible if the PHP server gets hacked, but the human isn't what got compromised. "the weakest link in the chain will always be the human" is talking about phishing, and phishing LLMs makes phishing grandmas look difficult.
@EndlessMason @neurovagrant As a sidenote, I've seen things you wouldn't believe in the last few months that have me genuinely convinced that it's humans who made LLMs look bad, rather than LLMs being bad intrinsically (aside from the copyright issues, power drain, freshwater use, global warming, financial abuse, privacy issues, deals with governments...).

The math models (locally hosted, fitting on gaming GPUs) can fairly easily be made useful and helpful (a few days of effort after work) in menial tasks that can't be completed deterministically, provided basic oversight. They cost pennies, and they're private.
@phil @neurovagrant
HahhahahahhahhahhahhhahhHhHahjahahhahahahhahahhahahhaHhHHHhz aside from escalating the rate at which we're rendering the hahhhha planet unliveable hahHahahjhaha fucking good point man. Oh boy
@EndlessMason @neurovagrant
Running Qwen3.5 on my 7900xtx eats as much power as running any video game. I have zero issue with running LLMs locally to assist with my journals/ notes. Nothing compared to a data center.
@phil @EndlessMason this has gotten a bit tedious for me, if y'all want to continue, please start a thread between yourselves/untag me, thanks
@phil @neurovagrant @EndlessMason are you cory doctorow with a fake mustache on i think ive heard that slopologist argument before
@zaire@fedi.absturztau.be @EndlessMason
No, I'm not.

I've yet to hear an argument against using a crutch when it helps me function and hurts nobody. Do you have one?
@phil @zaire
At minimum, your metaphor is hurting people.
@EndlessMason @zaire@fedi.absturztau.be
Genuinely not a metaphor. I'll turn this around: your assumptions about others' ability to perform 'simple' everyday tasks are hurting people.

Not everyone has the privilege of being 'normally functioning.'

@phil @zaire
Fuck it, I'm thirsty, so if you'll join me and my neck beard at the "well, actually"

> Metaphor:
> A figure of speech in which a word [] that ordinarily designates one thing is used to designate another, thus making an implicit comparison, as in β€œa sea of troubles” []

It is a metaphor. A computer program is not a stick one uses to support their body weight to supplement the functionality of their legs

@phil @zaire

I'm not sure what kind of note taking you do that requires generative AI, but I bet you could do it better, faster, cozier and in a more usable way in a group of friends than with a slop bot spewing out likely wrong nonsense while running your fans at 100%

@phil @zaire
On top of that, you're talking about "normal functioning" and "shortfall" instead of talking about "accommodations", making me think that your analysis of "privilege" is sketchy at best.
@EndlessMason @zaire@fedi.absturztau.be
Right, because I want people around me to know everything that's screwed up in my head from a life of various kinds of trauma and untreated mental issues.

If it's wrong, I catch it in review, because I actually write all of my notes with my own hands. That's my job as a user. Nobody expects LLMs to be 100% truthful/ accurate.

Here's an example of things I use it for.
- "Analyze the last 9 months of my journal, collect all the entries that may relate to issue X about Y." I'd run this two or three times while doing other things (e.g. working).
- "Analyze my sleep cycle, stress levels, migraine occurrence/ other health issue in relation to [recurring event I just noticed]."
- "Perform sentiment analysis of my journal entries from the last month in relation to my org-agenda tasks", and then I'd cross-reference that with my self-reported values as an additional layer of validation (so it's not just "what I think I felt")
- or simple things, like "cross-reference the last year of journal entries with my technical notes, and tell me what topics frustrated me the most."
- "does this message from X indicate emotion Y"
- "does this wording come across as hostile, given my previous conversations with Z" (which are also recorded in my notes)
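For the curious, the rough shape of the plumbing behind queries like these. This is a hypothetical sketch, not my actual gptel-got setup: the endpoint URL, file paths, and helper names are all made up, assuming a locally hosted OpenAI-compatible server (e.g. llama.cpp or Ollama).

```python
# Hypothetical sketch: assemble a journal query and send it to a local
# OpenAI-compatible endpoint. All names and paths here are illustrative.
import json
import urllib.request
from pathlib import Path


def collect_entries(journal_dir):
    """Concatenate org-mode journal files into one context blob."""
    files = sorted(Path(journal_dir).glob("*.org"))
    return "\n\n".join(f.read_text() for f in files)


def build_prompt(entries, issue, topic):
    """Mirror the first query in the list above."""
    return (
        f"Analyze the following journal entries. Collect all entries "
        f"that may relate to issue {issue} about {topic}.\n\n{entries}"
    )


def query_local_llm(prompt, url="http://localhost:8080/v1/chat/completions"):
    """POST the prompt to a local OpenAI-compatible chat endpoint."""
    payload = json.dumps({
        "model": "local",
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

The point being: the model never writes the notes, it only reads them and answers a question, which is what makes the review step tractable.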

In many ways, it is a disability aid for terrible memory and executive dysfunction.

And let's be honest, it's not like a literal crutch doesn't require effort. The role of an aid like this isn't to make me faster or 'better', it's to help me get these things done in the first place.

So from my view, that's useful.

I can't read people very well, I don't do well with subtlety or what some call common sense.

I hyperfocus and am incredibly distractable at the same time. Give me a technical puzzle and I'll find a solution. Ask me to perform any routine action and I'll forget about it before you finish speaking.

Sure, ADD meds can and do help me function, but that only helps on the focus side, not the actual "start doing things" aspect.

Anyway. I doubt you're amenable to further conversation, since I've had this exact one at least a dozen times now.

Re: privilege. If I could afford an actual person (or software existed which could fulfill the same purpose), I'd 100% do that. I can't. This costs me cents a month in power bills, and gets the job done.

I do feel that having the leeway to avoid disability aids like this is, in fact, a privilege.

@phil @zaire

> - "does this message from X indicate emotion Y"
> - "does this wording come across as hostile, given my previous conversations with Z"

How are you fact checking this, exactly?

@EndlessMason @zaire@fedi.absturztau.be
I'm not, because emotions are subjective, and emotional expression makes up a massive part of the training material pushed into LLMs.

I suck so badly at reading emotions/ intentions that I'll gladly take a chance at a machine doing it for me.

It's not like I'll ask people "what are you feeling when you say that" multiple times a day.

Edit: out of curiosity, I looked it up. LLMs aren't that bad at this, turns out.

https://nhsjs.com/2025/a-case-study-of-sentiment-analysis-on-survey-data-using-llms-versus-dedicated-neural-networks/
A Case Study of Sentiment Analysis on Survey Data Using LLMs versus Dedicated Neural Networks - NHSJS

Abstract Sentiment analysis of open-ended survey responses is a complex but essential task in understanding public opinion. This study compares the performance of three large language models (LLMs)β€”GPT-4o, Llama-3.3-70B-Instruct, and Gemini-2.0-Flashβ€”against dedicated sentiment classification neural networks, specifically Twitter-RoBERTa-base and DeBERTa-v3-base-absa-v1.1. Using survey data from two COVID-19 studies, we evaluated these models based on accuracy, precision, […]


@phil @zaire
This has the winning combo of

- less skill in the field
- delegating to an llm
- no desire to fact check
- actual outcomes for other people too

These are relationships with real people you're rolling the dice with.

You're now smacking people in the shins with your crutch.

@EndlessMason @zaire@fedi.absturztau.be
Propose an alternative that's as effective and available at comparable cost.

So far, the number of complaints about my tact/ blunt affect have dropped to zero since I started experimenting with this method several months ago.

The outcome for other people is that they aren't displeased with me misunderstanding their intentions/ expectations.

I'd say that's a positive. Haven't seen any evidence to the contrary so far, at least.

But please, tell me, out of this 'winning combo' list, how can I:
1. gain more skill in reading what people intend/ feel/ mean, aside from putting in effort to build a theory of mind for each person I interact with?
2. delegate to something else? Are there better tools/ systems/ options?
3. fact-check people's feelings over text? at least 90% of my communication in life is in text.

It seems that you're coming at this with an a-priori conviction that anything ML/AI is intrinsically inferior to human effort.

I know first hand that isn't the case. In just the last few months I've seen acts of stupid that I would never have imagined.

When an LLM makes a mistake, at MINIMUM it's possible to trace back the faulty output to some kind of input, parameter, or other factor.

Sometimes it'll "misunderstand" the task. Sometimes it'll treat part of the input as instructions. Sometimes it'll just mix up the input (wrong dates, names, mixing subject/ object, etc). Sometimes it'll delete everything because it'll notice a mistake in its work and over-react when trying to fix it.

All of these things can be worked with or compensated for.

When people make mistakes, there's no backtrace. There's no reasoning, explanation, or excuse. Imagine this literal scenario: a bank's development team blindly mass-merging a thousand PRs meant to address security findings, the findings generated by an LLM on top of an actual SAST, the PRs themselves generated by an LLM, with no validation aside from other LLMs.

I can't, in good faith, blame a set of mathematical matrices for making stupid decisions. LLMs aren't sentient. They're not aware. They're incapable of understanding the consequences of their actions. They're incapable of imagination or predicting future events based on the present.

Granted, not all people are capable of all of these things either, but we (as a society) tend not to put people who don't meet a certain bar of competence in positions with decision-making authority.

IMO, people are responsible for the tools they use (LLM or otherwise) and should 100% be held accountable for the results these tools produce.
@EndlessMason @zaire@fedi.absturztau.be
Let's call it a disability aid, then.

@phil @neurovagrant @EndlessMason similar experience. humans can drive these models if they have a decent engineering/security understanding. i've got no issue with leveraging it to offload tedious tasks and operational burden.

but to your point on the human factor, there's been a lot of footgunning lately. even with principal staff getting lazy.

running models on a ada4000-20gb works pretty nicely and way less power use than a dc or some 5090 monster i need a new circuit for

@jae @neurovagrant @EndlessMason
I just give the LLM some tools to read my journals, and then type my notes into my note git repo in a separate place.

https://codeberg.org/bajsicki/gptel-got

I've a bunch of re-writes locally, but they're not ready to be out in public until I test more and gain confidence.
gptel-got

Tooling for LLM interactions with org-mode. Requires gptel and org-ql.


@phil @neurovagrant @EndlessMason that's really clever. i had a pile of links from the last 2 years. dedupe + sort + relevance tagging took ~10 minutes which would have taken me a frustrating couple of days.

i like how you're clear on the disclaimer. i've seen others tout their tool as "military-grade secure" and i fall back out of my chair

@phil @neurovagrant @EndlessMason you have to be smart enough to do the job without AI to be able to use the current generation of AI effectively and safely.

But that's not how it's being sold, and that's not how executives see the situation

Which means this whole mess isn't an end user failure (oh, if only the end users were smarter and more attentive, BUT THEY'RE NOT)

It's a management failure (not understanding their workers, and not understanding the tools they are making their workers use).

@neurovagrant while the post is funny, fundamentally, no, they haven't. what they've done is create something that the weakest link in the chain considers a security improvement, when it does the opposite
@neurovagrant Not on their own, but LLMs do allow different humans to compete for weakest link.
@neurovagrant The weakest link is a mediocre statistical approximation of the human.
@neurovagrant We invented tech vulnerable to classic computer viruses, and social engineering too! Best of both worlds!
@neurovagrant i suspect we have two weak links now, great!

@neurovagrant

:sigh: better than humans *again*!

the end is nigh....

(/s)

@neurovagrant To err is human, but to *really* foul things up you need a computer.
@neurovagrant i mean its still humans choosing to delegate tasks to slop machines (who cannot be held accountable)
@neurovagrant The weakest link was always the VCs and the techbros in charge.
@neurovagrant Well we do have humans carelessly accepting AI submits without any review: one could consider them an even weaker link.
@neurovagrant It's still kind of a human's fault for installing that weak link. The weakest link are the c-suite making terrible decisions.
@neurovagrant okay, now the weakest link is the human who decided "I think I'll outsource my work to a dumbass who's wrong about everything."
@neurovagrant now the weakest link is the human who decided to implement AI.
So what's changed?
@neurovagrant it still is the human. They just changed how they break things. Instead of breaking things themselves they trust a machine that does it.

@neurovagrant

Turns out the weakest link was just waiting for a better prompt.

@neurovagrant

It's still a human, it's just shifted to the decision-making ones that mandate use of these systems.