Mastodawn

When I started in security, one of the prevailing attitudes was "The weakest link in the chain will always be the human."

I would like to thank every LLM provider and startup for changing this paradigm by introducing a much weaker link in the chain.

Show thread

Ian Campbell 🏴4d ago

Thank you to everyone saying "it's still the human."

No, it isn't. It's product deployment without any concern for security or impact. This is the equivalent of suggesting every customer catch a falling knife, for their own benefit.

This is nondeterministic, autonomous malicious enablement, and we cannot blame the user as much as I'd like to.

Show thread

jz.tusk 4d ago

@neurovagrant

I'd say it's still a human. But it's not the user, it's the product deployer.

In my worldview, responsibility always, and only, lands on humans

Show thread

some kind of orange shape 4d ago

@jztusk Uh, well, I guess you disagree with the idiom, then, because there are no links in the chain that are not humans

Show thread

Tindra 4d ago

@neurovagrant one of these days I need to sit down and write a blog post about how I have a blade that is cheap as hell, but more safe than any other blade I’ve owned, and how that relates to… everything.

Show thread

➴➴➴Æ🜔Ɲ.Ƈꭚ⍴𝔥єɼ👩🏻‍💻4d ago

@neurovagrant How is that not still the human? Didn't humans decide to let AI run entire systems without anyone watching.
FFS, Tencent's shares just skyrocketed for saying their deploying OpenClaw which is _known_ to be destructive and have massive security vulnerabilities.

Show thread

an actual bus 4d ago

@neurovagrant Why do you surrender agency so readily?

We are and remain masters of our world.

So much of the slopocalypse is shitty CEOs catering to dumb investors who arrogantly yet wrongfully think they know a damn thing about IT. All a very (if deplorably) human thing.

That said, your post is funny and I like it a lot.

Show thread

nieuemma 4d ago

@renardboy @neurovagrant no way. Nobody back home is going to believe me when I tell them I saw an actual bus

Show thread

Phil 4d ago

@neurovagrant
?
They haven't.

Show thread

EndlessMason 4d ago

@phil @neurovagrant
Most humans don't copy/paste commands from ticket titles into their shells...

Show thread

Phil 4d ago

@EndlessMason @neurovagrant
Sorry, who decided to, and then gave these tools access to do so?

Putting a non-deterministic tool with """safeguards""" there has very predictable consequences. If not humans, who exactly is to blame for this mess?

Cause it sure isn't a pile of numbers.

Show thread

EndlessMason 4d ago

@phil @neurovagrant
Oh I see. In that case we should blame the fundamental forces of the universe for kicking off formation of planets and bootstrapping abiogenesis and evolution.

Show thread

Phil 4d ago

@EndlessMason @neurovagrant
To my knowledge, the fundamental forces of the universe, just like dead matter (including LLMs), don't have agency of their own.

Humans do.

Show thread

Ian Campbell 🏴4d ago

@phil @EndlessMason "guns don't kill people" hasn't been convincing for decades.

Show thread

Phil 4d ago

@neurovagrant @EndlessMason
Guns, like any tool, need to be carefully managed by any human owning/ controlling them. LLMs can do a crapload of damage, but they can't be held accountable, just like a computer can't be held accountable for what sysadmins do.

Show thread

EndlessMason 4d ago

@phil @neurovagrant
I don't.

I'm a stimulus-response machine. I'm governed by the laws of physics exclusively.

Show thread

Vincent Sparks 4d ago

@phil by this logic, a human who forgets to update a PHP server is the weakest link in the chain. sure, the human is responsible if the PHP server gets hacked, but the human isn't what got compromised. "the weakest link in the chain will always be the human" is talking about phishing, and phishing LLMs makes phishing grandmas look difficult.

Show thread

Phil 4d ago

@EndlessMason @neurovagrant As a sidenote, I've seen things you wouldn't believe in the last few months that has me genuinely convinced that it's humans that made LLMs look bad, rather than LLMs being bad intrinsically (aside from the copyright issues, power drain, freshwater use, global warming, financial abuse, privacy issues, deals with government...).

The math models (locally hosted, fitting on gaming GPUs) can be fairly easily be made useful and helpful (a few days of effort after work) in menial tasks that can't be completed deterministically, provided basic oversight. They cost pennies, and they're private.

Show thread

EndlessMason 4d ago

@phil @neurovagrant
HahhahahahhahhahhahhhahhHhHahjahahhahahahhahahhahahhaHhHHHhz aside from escalating the rate at which we're rendering the hahhhha planet unliveable hahHahahjhaha fucking good point man. Oh boy

Show thread

Phil 4d ago

@EndlessMason @neurovagrant
Running Qwen3.5 on my 7900xtx eats as much power as running any video game. I have zero issue with running LLMs locally to assist with my journals/ notes. Nothing compared to a data center.

Show thread

Ian Campbell 🏴4d ago

@phil @EndlessMason this has gotten a bit tedious for me, if y'all want to continue, please start a thread between yourselves/untag me, thanks

Show thread

EndlessMason 4d ago

@neurovagrant nah I'm good

Show thread

Ian Campbell 🏴4d ago

@EndlessMason i hear that

Show thread

zivi 4d ago

@phil @neurovagrant @EndlessMason are you cory doctorow with a fake mustache on i think ive heard that slopologist argument before

Show thread

Phil 4d ago

@zaire@fedi.absturztau.be @EndlessMason
No, I'm not.

I've yet to hear an argument against using a crutch when it helps me function and hurts nobody. Do you have one?

Show thread

EndlessMason 4d ago

@phil @zaire
At minimum, your metaphor is hurting people.

Show thread

Phil 4d ago

@EndlessMason @zaire@fedi.absturztau.be
Genuinely not a metaphor. I'll turn this around: your assumptions about other's ability to perform 'simple' everyday tasks is hurting people.

Not everyone has the privilege of being 'normally functioning.'

Show thread

EndlessMason 4d ago

@phil @zaire
Fuck it, I'm thirsty, so if you'll join me and my neck beard at the "well, actually"

> Metaphor:
> A figure of speech in which a word [] that ordinarily designates one thing is used to designate another, thus making an implicit comparison, as in “a sea of troubles” []

It is a metaphor. A computer program is not a stick one uses to support their body weight to supplement the functionality of their legs

Show thread

EndlessMason 4d ago

@phil @zaire

I'm not sure what kind of note taking you do that requires generative AI, but I bet you could do it better, faster, cozier and in a more usable way in a group of friends than with a slop bot spewing out likely wrong nonsense while running your fans at 100%

Show thread

EndlessMason 4d ago

@phil @zaire
On top of that, you're talking about "normal functioning" and "shortfall" instead of talking about "accommodations", making me think that your analysis of "privilege" is sketchy at best.

Show thread

Phil 4d ago

@EndlessMason @zaire@fedi.absturztau.be
Right, because I want people around me to know everything that's screwed up in my head from a life of various kinds of trauma and untreated mental issues.

If it's wrong, I catch it in review, because I actually write all of my notes with my own hands. That's my job as a user. Nobody expects LLMs to be 100% truthful/ accurate.

Here's an example of things I use it for.
- "Analyze the last 9 months of my journal, collect all the entries that may relate to issue X about Y." I'd run this two or three times while doing other things (e.g. working).
- "Analyze my sleep cycle, stress levels, migraine occurrence/ other health issue in relation to [recurring event I just noticed]."
- "Perform sentiment analysis of my journal entries from the last month in relation to my org-agenda tasks", and then I'd cross-reference that with my self-reported values as an additional layer of validation (so it's not just "what I think I felt")
- or simple things, like "cross-reference the last year of journal entries with my technical notes, and tell me what topics frustrated me the most."
- "does this message from X indicates emotion Y"
- "does this wording come across as hostile, given my previous conversations with Z" (which are also recorded in my notes)

In many ways, it is a disability aid for terrible memory and executive dysfunction.

And let's be honest, it's not like a literal crutch doesn't require effort. The role of an aid like this isn't to make me faster or 'better', it's to help me get these things done in the first place.

So from my view, that's useful.

I can't read people very well, I don't do well with subtlety or what some call, common sense.

I hyperfocus and am incredibly distractable at the same time. Give me a technical puzzle and I'll find a solution. Ask me to perform any routine action and I'll forget about it before you finish speaking.

Sure, ADD meds can and do help me function, but that only helps on the focus side, not the actual "start doing things" aspect.

Anyway. I doubt you're amicable to further conversation, since I've had this exact one at least a dozen times now.

Re: privilege. If I could afford an actual person (or software existed which could fulfill the same purpose), I'd 100% do that. I can't. This costs me cents a month in power bills, and gets the job done.

I do feel that having the leeway to avoid disability aids like this is, in fact, a privilege.

Show thread

EndlessMason 4d ago

@phil @zaire

> - "does this message from X indicates emotion Y"
> - "does this wording come across as hostile, given my previous conversations with Z"

How are you fact checking this, exactly?

Show thread

Phil 4d ago

@EndlessMason @zaire@fedi.absturztau.be
I'm not, because emotions are subjective, and a massive part of the training material pushed into LLMs.

I suck so badly at reading emotions/ intentions that I'll gladly take a chance at a machine doing it for me.

It's not like I'll ask people "what are you feeling when you say that" multiple times a day.

Edit: out of curiosity, I looked it up. LLMs aren't that bad at this, turns out.

https://nhsjs.com/2025/a-case-study-of-sentiment-analysis-on-survey-data-using-llms-versus-dedicated-neural-networks/

A Case Study of Sentiment Analysis on Survey Data Using LLMs versus Dedicated Neural Networks - NHSJS

Abstract Sentiment analysis of open-ended survey responses is a complex but essential task in understanding public opinion. This study compares the performance of three large language models (LLMs)—GPT-4o, Llama-3.3-70B-Instruct, and Gemini-2.0-Flash—against dedicated sentiment classification neural networks, specifically Twitter-RoBERTa-base and DeBERTa-v3-base-absa-v1.1. Using survey data from two COVID-19 studies, we evaluated these models based on accuracy, precision, […]

NHSJS

Show thread

EndlessMason 4d ago

@phil @zaire
This has the winning combo of

- less skill in the field
- delegating to an llm
- no desire to fact check
- actual outcomes for other people too

These are relationships with real people you're rolling the dice with.

You're now smacking people in the shins with your crutch.

Show thread

Phil 4d ago

@EndlessMason @zaire@fedi.absturztau.be
Propose an alternative that's as effective and available at comparable cost.

So far, the number of complaints about my tact/ blunt affect have dropped to zero since I started experimenting with this method several months ago.

The outcome for other people is that they aren't displeased with me misunderstanding their intentions/ expectations.

I'd say that's a positive. Haven't seen any evidence to the contrary so far, at least.

But please, tell me, out of this 'winning combo' list, how can I:
1. gain more skill in reading what people intend/ feel/ mean, aside from putting in effort to build a theory of mind for each person I interact with?
2. delegate to something else? Are there better tools/ systems/ options?
3. fact-check people's feelings over text? at least 90% of my communication in life is in text.

It seems that you'r ecoming at this with an a-priori conviction that anything ML/AI is intrinsically inferior to human effort.

I know first hand that isn't the case. In just the last few months I've seen acts of stupid that I would never have imagined.

When an LLM makes a mistake, at MINIMUM it's possible to trace back the faulty output to some kind of input, parameter, or other factor.

Sometimes it'll "misunderstand" the task. Sometimes it'll great the input and take that as instructions. Sometimes it'll just mix up the input (wrong dates, names, mixing subject/ object, etc). Sometimes it'll delete everything because it'll notice a mistake in its work and over-react when trying to fix it.

All of these things can be worked with or compensated for.

When people make mistakes, there's no backtrace. There's no reasoning, explanation, or excuse. Imagine this literal scenario: a bank's development team blindly mass-merging a thousand PR's meant to address security findings generated by an LLM on top of an actual SAST, the PR's being generated by an LLM, with no validation aside from other LLMs.

I can't, in good faith, blame a set of mathematical matrices for making stupid decisions. LLMs aren't sentient. They're not aware. They're incapable of understanding the consequences of their actions. They're incapable of imagination or predicting future events based on the present.

Granted, not all people are capable of all of these things either, but we (as a society) tend not to put people who don't meet a certain bar of competence in positions with decision-making authority.

IMO, people are responsible for the tools they use (LLM or otherwise) and should 100% be held accountable for the results these tools produce.

Show thread

Phil 4d ago

@EndlessMason @zaire@fedi.absturztau.be
Let's call it a disability aid, then.

Show thread

ᴏᴏᴍ-ᴋɪʟʟᴇʀ: 333 4d ago

@phil @neurovagrant @EndlessMason similar experience. humans can drive these models if they have a decent engineering/security understanding. i've got no issue with leveraging it to offload tedious tasks and operational burden.

but to your point on the human factor, there's been a lot of footgunning lately. even with principal staff getting lazy.

running models on a ada4000-20gb works pretty nicely and way less power use than a dc or some 5090 monster i need a new circuit for

Show thread

Phil 4d ago

@jae @neurovagrant @EndlessMason
I just give the LLM some tools to read my journals, and then type my notes into my note git repo in a separate place.

https://codeberg.org/bajsicki/gptel-got

I've a bunch of re-writes locally, but they're not ready to be out in public yet until I test more and gain confidence.

gptel-got

Tooling for LLM interactions with org-mode. Requires gptel and org-ql.

Codeberg.org

Show thread

ᴏᴏᴍ-ᴋɪʟʟᴇʀ: 333 4d ago

@phil @neurovagrant @EndlessMason that's really clever. i had a pile of links from the last 2 years. dedupe + sort + relevance tagging took ~10 minutes which would have taken me a frustrating couple of days.

i like how you're clear on the disclaimer. i've seen others tout their tool as "military-grade secure" and i fall back out of my chair

Show thread

Random Damage 🌻3d ago

@phil @neurovagrant @EndlessMason you have to be smart enough to do the job without AI to be able to use the current generation of AI effectively and safely.

But that's not how it's being sold, and that's not how executives see the situation

Which means this whole mess isn't an end user failure (oh, if only the end users were smarter and more attentive, BUT THEY"RE NOT)

It's a management failure (not understanding their workers, and not understanding the tools they are making their workers use).

Show thread

sarah tonin

4d ago

@neurovagrant while the post is funny, fundamentally, no, they haven't. what they've done is creating something that is considered by the weakest link in the chain to be something that improves the security, when it does the opposite

Show thread

Klara 4d ago

@neurovagrant Not on their own, but Llm's do allow different humans to also compete for weakest link.
A!i

Show thread

Ian Douglas Scott 4d ago

@neurovagrant The weakest link is a mediocre statistical approximation of the human.

Show thread

the_blackwell_ninja [EST]4d ago

@neurovagrant We invented tech vulnerable to classic computer viruses, and social engineering too! Best of both worlds!

Show thread

Ox1de 4d ago

@neurovagrant i suspect we have two weak links now, great!

Show thread

grrl_aex 4d ago

@neurovagrant

:sigh: better than humans *again*!

the end is nigh....

(/s)

massive bong rip

Who decided to deploy the LLMs? It wasn't a computer...

Show thread

Ludwig Vielfrass 4d ago

@cR0w @neurovagrant
Or *was* it? <dramatic music>

Show thread

cR0w

🏴‍☠️4d ago

@lerxst @neurovagrant

Show thread

Andrew Golding 4d ago

@cR0w @neurovagrant "Stop, OpenCaw!"

Show thread

Misuse Case 4d ago

@neurovagrant To err is human, but to *really* agree things up you need a computer.

Show thread

zivi 4d ago

@neurovagrant i mean its still humans choosing to delegate tasks to slop machines (who cannot be held accountable)

Show thread

Zeki Çatav 🤔 ☕ 🕯️🎶4d ago

@neurovagrant

Show thread

Maximilian Overdraft, Esq.4d ago

@neurovagrant The weakest link was always the VCs and the techbros in charge.

Show thread

Loren Kohnfelder 4d ago

@neurovagrant Well we do have humans carelessly accepting AI submits without an review: one could consider them an even weaker chain.

Show thread

StarkRG 4d ago

@neurovagrant It's still kind of a human's fault for installing that weak link. The weakest link are the c-suite making terrible decisions.

Show thread

Mad Engineering 4d ago

@neurovagrant okay, now the weakest link is the human who decided "I think I'll outsource my work to a dumbass who's wrong about everything."

Show thread

Chasalin 4d ago

@neurovagrant now the weakest link is the human who decided to implement AI.
So what's changed?

Show thread

Sarah Savage 4d ago

@neurovagrant it still is the human. They just changed how they break things. Instead of breaking things themselves they trust a machine that does it.

Show thread

tuban_muzuru 4d ago

@neurovagrant

Turns out the weakest link was just waiting for a better prompt.

Show thread

Fennix 4d ago

@neurovagrant

It's still a human, it's just shifted to the decision-making ones that mandate use of these systems.