So managers are starting to spew the whole "well I didn't do anything wrong, it affected everyone else, so we're not liable" bullshit.

Did you allow a third party vendor to have the highest privilege access to all of your systems AND let them run Remote Code Execution on your systems whenever they want?

You didn't have a test environment set up to test each update or patch that is applied to your systems before you push them to prod? No? Just let it auto-update?

Yeah, that "Risk Transference" didn't work so well as your GRC policy seemed to think it would, huh? I know they're a security company and they SHOULD have tested it, but they didn't, did they?

I know everyone else does it, but if everyone else jumped off a bridge, would you?

Just because everyone else fucked up, doesn't mean you didn't fuck up.

There's gonna be a lot of deep discussions in this post-mortem and hopefully orgs will change. Those that don't will just be hit again... and again... and again.

#crowdstrike

Eh, reading the wonderful responses to this thread, who am I kidding?

Just "Risk Acceptance" all of it, budget some money to deal with it when it happens again, and move on with your merry life.

Not like the proles matter at all.

@tinker right in the unknown unknowns budget
@tinker No tech company can be trusted to treat severe adverse impact on ordinary people's lives due to their fuck ups as anything other than a (very small, here's $5 and worse than useless credit monitoring) cost of doing business.

@tinker ". . .hopefully orgs will change."

Many will!

. . .and then they'll lay off their current staff, hire new people who find the old processes tedious, and start all over again.

@tinker "Just let it auto-update?" Most EDRs (including CrowdStrike) don't give us that option. I bet they have that feature release pretty soon.

@Xavier - Indeed. How many folks with budget pushed back before this? How many will now?

A "feature" that everyone requests will be implemented.

@tinker I used to work for an EDR company. We did allow customers to tweak the content updates, and so many customers shot themselves in the foot.

CrowdStrike has always been the macOS of EDRs. Not very flexible, and many features are hidden from the end user. I used to call it EDR for dummies. You didn't need a dedicated team to run it. It was mostly set-and-forget.

CS has come a long way in flexibility, but still has a while to go.

@tinker @Xavier your company chose to run CrowdStrike and Windows. Your company HAD a choice.

@Xavier @tinker this wasn't even a 'product update', it was a 'definitions' update.

This same sort of thing happened many years ago, when an Office binary was deleted by our antivirus, basically stopping business for a day. This seems like sort of a nightmare edition of that 'bad defs' problem that used to happen more often.

I sort of don't want to be in a world where we are testing the day-to-day definition updates for our EDR. I'd rather put this into perspective as one bad day, and maybe add this scenario to our DR procedures.

@DarcMoughty @Xavier @tinker your take is too sane. Not nearly spicy enough.

@Xavier @tinker tbh it was a risk that we identified when we did our assessment. I checked, and we put it down as a once-in-a-decade type of event. It was unlikely, but there was potential. This was more fringe than anything.

Now if it happens again…

Today we found out who had good BCPs and who didn't; hopefully orgs whose BCPs weren't great learn why.

Also you can have a good BCP and still have a bad time. BCP doesn't mean you're going to have fun; you're minimizing downtime. It sucks, but boy am I proud.

@tinker

*shrug* most orgs are gonna see this, ultimately, as a black swan event.

Setting up test envs to -actually evaluate updates- before deployment requires specific expertise, which requires paying for it.

In my experience, they don't want to pay for this; the business process incentives do not align to make that a regular part of operations, due to the increased friction in IT operations that results, etc. etc. etc.

And given the systemic removal of in-org IT ops in favor of contracted MSP shit - who deploys endpoint agents like this? A contractor under contract who is -not in the control structure of the org directly- and as such is not meaningfully a part of the organization and not a part of these conversations.

Yeah.

This one's not fixable by telling people to do the things that have been standard practice in well-run orgs for decades; if they're not doing the 'right' thing, then that's due to some kind of internal organizational dysfunction that cannot be treated with generic advice.

"Happy families are all alike; every unhappy family is unique" and all that - if they're refusing to do the correct workflow, there is some organizational trauma, unique to the org, preventing it.

Ain't no such thing as a "business psychotherapist" to unfuck that - tho, lol, if someone's willing to pay me enough I could take a stab at it.

@munin - business psychotherapist.... That's a vCISO right?

@tinker

No. A vCISO is not capable of debugging the structure of the organization itself.

The problem lies in the org, not in the tech the org uses.

@munin - My joke lies in the idea of vCISOs providing consultant "policies and procedures" via GRC gigs, etc.

Just cause a psychotherapist tells you what to do, doesn't mean you apply that therapy to yourself.

Anyhow. I completely agree with everything you have said.

@tinker

"only one, but the lightbulb has to want to change" lol

@tinker @munin

Just when you thought that there was no such thing as "Organisational Psychotherapy" ...

https://flowchainsensei.wordpress.com/2012/04/29/the-nine-principles-of-organisational-psychotherapy/

@tinker @munin

Honestly, I keep thinking that much of the modern corporate world these days is "certifiably insane."

@JeffGrigg @tinker

Sanity is a societal convention; assuming it has a meaning beyond "conforming with the norms of an organization or context" is prolly not useful.

@munin @tinker

And also, half the time the InfoSec industry is like "patch all your things with all the updates within 5 minutes or you are toast".... and then they are "oh, you don't have robust test, then canary, and full rollback and response plans for every single kind of update to everything you own? tsk tsk"

@munin @tinker

99% of orgs struggle to get stuff patched in anything like a timely manner. Anti-malware is a compensating control for that being slow... so now we do that slowly as well. What is the compensating control for the compensating control for being slow, being slow?

@mmaibaum @tinker

Yes, it requires systemic examination of the organization as a whole in order to determine how to build up a comprehensive approach to understanding the vulnerability surfaces that the company has, and the way in which security controls can be applied to those surfaces relevantly - working with, rather than providing friction to, the workflows that the company requires to do business.

This is what competent blue teaming does, and it requires paying for people who are willing to engage with this problem, paying for their education, and giving them enough political clout within the organization that meaningful change can take place.

@munin @tinker yep, and I have worked with good people like this - tbh this was more a comment on the incredibly naive commentary around in general (not this thread) from people who quite obviously never worked in a large complex org

@mmaibaum @tinker

for context, I've done a lot of work in the past addressing these specific issues - tinker knows my background there with btv and the whole focus on the blue team education pipeline from them lol

@munin @tinker best person I worked with on this persuaded the wider tech function this was all basically a quality issue :)

@mmaibaum @tinker

It is. Security is part of QA and part of Ops generally. The tooling overlaps in both cases, and all three of those departments can - and in my opinion ought to - work synergistically.

@munin @tinker I worked on a project where we engaged a "business psychotherapist" once. It was amazing. He was an organizational behaviorist. Basic message was that most people in the workplace aren't behaving like adults. Really interesting stuff.
@munin @tinker this update can't stop won't stop. Even if you're n-2
@tinker this was pushed to all customers regardless of whether they had auto-update on

@djnick - The core part of allowing a third party remote code execution at highest priv still stands.

But! What are you going to do? Trusting Trust and all that.

@tinker @djnick The core of the problem is the OS.

Namely, an OS with a kernel lacking any meaningful fault isolation.

The research for making OSes not vulnerable to that sort of problem has been completed since at least the 80s.

There really isn't an excuse.

The research for making /performant/ equivalents that do not require special hardware is newer (Singularity OS project is one example from the very same OS publisher, for instance), but was also mostly completed a few decades back.

Trusting Trust is not about the same problem (malice of the component doesn't matter so much if it never has access to anything it shouldn't anyway), but even so David A. Wheeler's paper was published a while ago now.

(Caveat of course being that malicious components can still potentially DoS the system and hardware vulnerabilities can also enable complete compromise despite there being no logic-level flaw in the isolation.)
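
(To make the fault-isolation point concrete: below is a minimal userspace sketch - my own illustration, not code from any OS named above - of the property being described. A supervisor process outlives and restarts a crashing component instead of the whole system going down with it, which is roughly what a kernel with real fault isolation gives you for drivers.)

```c
/* Minimal fault-isolation sketch: the buggy component runs in its own
 * process, so its crash is contained and the supervisor keeps running.
 * Illustrative only; microkernel-style OSes apply this idea to drivers. */
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

static void faulty_component(void) {
    /* Simulate the bug: a NULL dereference, e.g. from parsing a
     * malformed definitions file without validation. */
    volatile int *p = NULL;
    *p = 42;  /* SIGSEGV kills only this process */
}

int main(void) {
    for (int attempt = 1; attempt <= 3; attempt++) {
        pid_t pid = fork();
        if (pid == 0) {              /* child: the isolated component */
            faulty_component();
            _exit(0);
        }
        int status;
        waitpid(pid, &status, 0);
        if (WIFSIGNALED(status))
            printf("component died with signal %d; restarting (%d/3)\n",
                   WTERMSIG(status), attempt);
        else
            break;                   /* clean exit, nothing to restart */
    }
    puts("supervisor still up: the fault was contained");
    return 0;
}
```

(Contrast with a kernel-mode driver, where that same bad dereference takes the whole machine down.)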

@lispi314 @djnick @tinker This, but also: AV shitware vendors *insist on bypassing any isolation that exists* because they deem themselves the most important thing on the machine. On Linux they would demand you load sketchy kernel modules to give them the same backdoors. (The history of how fanotify came to exist was basically trying unsuccessfully to avoid that shit happening.)

"AV" and "security products" need to become widely understood as malware, and rejected.

@djnick @tinker auto-update is just for the driver. Channel updates cannot be stopped.
@tinker We're looking at potentially leveraging this event as a reason to get rid of it. We originally looked at another tool with better features and no remote access. But then a new IT director went behind security's back and signed a three-year contract, because he had it at his last job, and told IT to install it.
@tinker our company had one machine go down...the one testing crowdstrike

@tinker This whole situation just shows how cyclic our industry, and the tech industry in general, is. How many decades ago were Windows updates failing while we preached this? Oh, but then it got better, so nobody does staged upgrades anymore. Likewise for old AV vendors. McAfee comes to mind, but I feel that Symantec was in there as well.

We preach patching, and I'm not immune to that pulpit either, but I agree we need to do better at adding that it must be done safely, in a controlled environment, and not just blindly accepted.

@JohnsNotHere @tinker As an industry, we desperately need to invest in software (and hardware!) diversity.

Software monocultures reduce our resiliency against threats both malicious and benign (including stupid mistakes, like pushing out a malformed update).

We should plan for adequate software and hardware diversity as we design and architect systems and networks. Doing so will increase our security, accessibility, and availability postures.

@tinker Unfortunately management types *like* that cloud MSPs offload liability, and this is a major reason they go with them in the first place. They buy into this crap of letting someone else manage their IT infrastructure so that when it breaks, it's not their own fault. More often than not, these MSPs have dangerous levels of access, but the orgs that use them don't care as long as they can check the box with auditors saying they're compliant. Then when shit goes sideways at the MSP, everyone acts surprised that they're up shit creek without a paddle.

@tinker

> I know everyone else does it, but if everyone else jumped off a bridge, would you?

Would it improve shareholder returns?

@tinker

The best way to avoid problems is to avoid Windows.

@tinker @Taco_lad Have been saying for years, should have stuck with QIC+sneakernet
@tinker
We try our best to keep everything in our control, but it is REALLY difficult with our management constantly being spammed by salespeople trying to push cloud services and SaaS. We dodged this particular bullet because of it, but that just means mgmt won't remember it.
It is also REALLY difficult when using Windows products for desktop users, because Microsoft deliberately bundles and creates so many dependencies on their cloud services, which we can't maintain or control.
@tinker Sound like where I work?
@tinker don't hold your breath. An occasional disaster is not on the books, or it's covered by the cost-of-doing-business risk budget, whereas teams to test and certify updates are committed costs showing up every quarter. So - fat chance 😮 the org will change. The goal is to optimize profits, not to provide reliable (customer) service ...
#clownstrike

@tinker

This is true. It doesn't go far enough.

Discussions on software recently have talked about OS-agnostic apps, and NASA has a policy of triple backups of necessary, critical systems.

Triple backups. NASA doesn't put new equipment into rockets, and when it does get in, another system stands ready to do the job INSTEAD.

Corners are being cut.

@tinker Even companies who used the N-1 and N-2 sensor update policies were hit equally, as were companies who manually incremented the version in their sensor policies. This update didn't go through the normal channels where it would have been caught.
@tinker but let’s throw all our money into AI because securing data and making sure the bugs are all gone isn’t sexy.
@tinker I get what you're saying and I agree, but I feel like these kinds of events are just going to become normalized like data breaches have, and the costs absorbed by the customers.
@tinker Thank you, I feel the same about M$. M$ got rid of testers about 20 years ago. I was at the highest level, 4, and they did it to make more money, pushing the burden onto the developers, because testing your own code is so foolproof! 🤬
CrowdStruck - Ed Zitron's Where's Your Ed At

@tinker Sounds like you have a bit of experience with CrowdStrike. (I only have experience watching exploits get thrown at it as a moderator during a wargame, only to have each and every one get flagged and neutralized immediately...)

Do they even offer an option for staged deployments? I was under the impression that it behaved closer to AV clients, where new definitions were sent globally as soon as they pushed the button.

@Okanogen and @FeralFeminist I feel your pain; I work for a company that's throwing all their money into building SaaS solutions integrated with AI rather than actually fixing our current stack; it's just as annoying. (Well, that and buying up a bunch of companies.)