When I started in security, one of the prevailing attitudes was "The weakest link in the chain will always be the human."

I would like to thank every LLM provider and startup for changing this paradigm by introducing a much weaker link in the chain.

@neurovagrant
They haven't.
@phil @neurovagrant
Most humans don't copy/paste commands from ticket titles into their shells...
@EndlessMason @neurovagrant As a sidenote, I've seen things you wouldn't believe in the last few months that have me genuinely convinced that it's humans that made LLMs look bad, rather than LLMs being bad intrinsically (aside from the copyright issues, power drain, freshwater use, global warming, financial abuse, privacy issues, deals with governments...).

The math models (locally hosted, fitting on gaming GPUs) can fairly easily be made useful and helpful (a few days of effort after work) in menial tasks that can't be completed deterministically, provided basic oversight. They cost pennies, and they're private.
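A minimal sketch of what "locally hosted" can look like in practice, assuming an OpenAI-compatible server such as Ollama or llama.cpp's `llama-server` running on the local machine; the endpoint, port, and model tag below are illustrative placeholders, not a description of the poster's actual setup:

```python
# Minimal local-LLM call: assumes an OpenAI-compatible server (e.g. Ollama or
# llama.cpp's `llama-server`) is already running on this machine. The port and
# model tag are illustrative placeholders.
import requests

def ask_local_llm(prompt: str,
                  base_url: str = "http://localhost:11434/v1",
                  model: str = "qwen2.5:14b") -> str:
    """Send one chat message to the local model and return its reply."""
    resp = requests.post(
        f"{base_url}/chat/completions",
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.2,  # keep menial tasks as repeatable as possible
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

# Example "menial task that can't be completed deterministically":
print(ask_local_llm("Rewrite this note as one tidy sentence: "
                    "gpu arrived, fans loud, undervolted, quiet now."))
```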
@phil @neurovagrant
HahhahahahhahhahhahhhahhHhHahjahahhahahahhahahhahahhaHhHHHhz aside from escalating the rate at which we're rendering the hahhhha planet unliveable hahHahahjhaha fucking good point man. Oh boy
@EndlessMason @neurovagrant
Running Qwen3.5 on my 7900xtx eats as much power as running any video game. I have zero issue with running LLMs locally to assist with my journals/ notes. Nothing compared to a data center.
@phil @neurovagrant @EndlessMason are you cory doctorow with a fake mustache on i think ive heard that slopologist argument before
@zaire@fedi.absturztau.be @EndlessMason
No, I'm not.

I've yet to hear an argument against using a crutch when it helps me function and hurts nobody. Do you have one?
@phil @zaire
At minimum, your metaphor is hurting people.
@EndlessMason @zaire@fedi.absturztau.be
Genuinely not a metaphor. I'll turn this around: your assumptions about others' ability to perform 'simple' everyday tasks are hurting people.

Not everyone has the privilege of being 'normally functioning.'

@phil @zaire
Fuck it, I'm thirsty, so if you'll join me and my neck beard at the "well, actually"

> Metaphor:
> A figure of speech in which a word [] that ordinarily designates one thing is used to designate another, thus making an implicit comparison, as in “a sea of troubles” []

It is a metaphor. A computer program is not a stick one uses to support their body weight to supplement the functionality of their legs.

@phil @zaire

I'm not sure what kind of note taking you do that requires generative AI, but I bet you could do it better, faster, cozier and in a more usable way in a group of friends than with a slop bot spewing out likely wrong nonsense while running your fans at 100%

@phil @zaire
On top of that, you're talking about "normal functioning" and "shortfall" instead of talking about "accommodations", making me think that your analysis of "privilege" is sketchy at best.
@EndlessMason @zaire@fedi.absturztau.be
Right, because I want people around me to know everything that's screwed up in my head from a life of various kinds of trauma and untreated mental issues.

If it's wrong, I catch it in review, because I actually write all of my notes with my own hands. That's my job as a user. Nobody expects LLMs to be 100% truthful/ accurate.

Here's an example of things I use it for (a rough sketch of the plumbing follows the list).
- "Analyze the last 9 months of my journal, collect all the entries that may relate to issue X about Y." I'd run this two or three times while doing other things (e.g. working).
- "Analyze my sleep cycle, stress levels, migraine occurrence/ other health issue in relation to [recurring event I just noticed]."
- "Perform sentiment analysis of my journal entries from the last month in relation to my org-agenda tasks", and then I'd cross-reference that with my self-reported values as an additional layer of validation (so it's not just "what I think I felt")
- or simple things, like "cross-reference the last year of journal entries with my technical notes, and tell me what topics frustrated me the most."
- "does this message from X indicates emotion Y"
- "does this wording come across as hostile, given my previous conversations with Z" (which are also recorded in my notes)

In many ways, it is a disability aid for terrible memory and executive dysfunction.

And let's be honest, it's not like a literal crutch doesn't require effort. The role of an aid like this isn't to make me faster or 'better', it's to help me get these things done in the first place.

So from my view, that's useful.

I can't read people very well; I don't do well with subtlety or what some call common sense.

I hyperfocus and am incredibly distractible at the same time. Give me a technical puzzle and I'll find a solution. Ask me to perform any routine action and I'll forget about it before you finish speaking.

Sure, ADD meds can and do help me function, but that only helps on the focus side, not the actual "start doing things" aspect.

Anyway. I doubt you're amenable to further conversation, since I've had this exact one at least a dozen times now.

Re: privilege. If I could afford an actual person (or software existed which could fulfill the same purpose), I'd 100% do that. I can't. This costs me cents a month in power bills, and gets the job done.

I do feel that having the leeway to avoid disability aids like this is, in fact, a privilege.

@phil @zaire

> - "does this message from X indicates emotion Y"
> - "does this wording come across as hostile, given my previous conversations with Z"

How are you fact checking this, exactly?

@EndlessMason @zaire@fedi.absturztau.be
I'm not, because emotions are subjective, and a massive part of the training material pushed into LLMs.

I suck so badly at reading emotions/ intentions that I'll gladly take a chance at a machine doing it for me.

It's not like I'll ask people "what are you feeling when you say that" multiple times a day.

Edit: out of curiosity, I looked it up. LLMs aren't that bad at this, turns out.

https://nhsjs.com/2025/a-case-study-of-sentiment-analysis-on-survey-data-using-llms-versus-dedicated-neural-networks/
A Case Study of Sentiment Analysis on Survey Data Using LLMs versus Dedicated Neural Networks - NHSJS

@phil @zaire
This has the winning combo of

- less skill in the field
- delegating to an llm
- no desire to fact check
- actual outcomes for other people too

These are relationships with real people you're rolling the dice with.

You're now smacking people in the shins with your crutch.

@EndlessMason @zaire@fedi.absturztau.be
Propose an alternative that's as effective and available at comparable cost.

So far, the number of complaints about my tact/ blunt affect have dropped to zero since I started experimenting with this method several months ago.

The outcome for other people is that they aren't displeased with me misunderstanding their intentions/ expectations.

I'd say that's a positive. Haven't seen any evidence to the contrary so far, at least.

But please, tell me, out of this 'winning combo' list, how can I:
1. gain more skill in reading what people intend/ feel/ mean, aside from putting in effort to build a theory of mind for each person I interact with?
2. delegate to something else? Are there better tools/ systems/ options?
3. fact-check people's feelings over text? At least 90% of my communication in life is in text.

It seems that you're coming at this with an a priori conviction that anything ML/AI is intrinsically inferior to human effort.

I know first hand that isn't the case. In just the last few months I've seen acts of stupid that I would never have imagined.

When an LLM makes a mistake, at MINIMUM it's possible to trace back the faulty output to some kind of input, parameter, or other factor.

Sometimes it'll "misunderstand" the task. Sometimes it'll great the input and take that as instructions. Sometimes it'll just mix up the input (wrong dates, names, mixing subject/ object, etc). Sometimes it'll delete everything because it'll notice a mistake in its work and over-react when trying to fix it.

All of these things can be worked with or compensated for.

When people make mistakes, there's no backtrace. There's no reasoning, explanation, or excuse. Imagine this literal scenario: a bank's development team blindly mass-merging a thousand PRs meant to address security findings generated by an LLM on top of an actual SAST, the PRs themselves generated by an LLM, with no validation aside from other LLMs.

I can't, in good faith, blame a set of mathematical matrices for making stupid decisions. LLMs aren't sentient. They're not aware. They're incapable of understanding the consequences of their actions. They're incapable of imagination or predicting future events based on the present.

Granted, not all people are capable of all of these things either, but we (as a society) tend not to put people who don't meet a certain bar of competence in positions with decision-making authority.

IMO, people are responsible for the tools they use (LLM or otherwise) and should 100% be held accountable for the results these tools produce.
@EndlessMason @zaire@fedi.absturztau.be
Let's call it a disability aid, then.