Also, I'm pretty sure I've said this before, but I'll say it again:

Part of your job as a senior is to tell your juniors about your fuckups. The embarrassing cringe reckless and lazy bullshit that you did when you were new, and the various times you brought down Prod. We ALL did it sometime. And then tell them: the moment you realized you fucked up, I know, the impulse is to try and cover it up, but don't do it. Come to the seniors you trust, and they'll help you unfuck it, and fight management tooth and claw like mamma and pappa bears to defend you from any shitheads in management. Because that's what our seniors did to us.

@JessTheUnstill Not me gaining the trust of administrators in two dozen departments by openly not giving a shit whose fault anything was, or even if people didn't know something they were supposed to, as long as I was informed so I could prevent it from causing a massive headache in 2-6 years

@JessTheUnstill

❤️

And this never goes away, never ends. As a senior you keep fucking up newer and bigger things, because that's part of learning and growing. Tomorrow's seniors will need someone to learn from too.

@SQLAllFather @JessTheUnstill Anyone can break a glass, it takes a master to break a dozen.
@SQLAllFather @JessTheUnstill seriously. Even with decades of experience you sometimes fuck up. Own it, understand it, fix it, learn from it.

@JessTheUnstill

one of the epiphanies i had as i grew in my career was truly understanding just how incredibly patient and supportive my seniors/managers had been, in spite of all the reasons i gave them for losing said patience with me.

i have tried to pay that forward as i've progressed in my own career.

@JessTheUnstill also let them know the adrenaline rush when the terminal takes a split second too long to respond after a command doesn't go away.

@JessTheUnstill I once worked for a CEO where the (very) occasional conversation would go like this:

Him, spotting something he hadn't expected: "What's this then?"
Me: "A cock-up."
Him: "What are you doing about it?"
Me: explain the plan
Him: "Jolly good, carry on."

The clear message was that if I had *not* admitted to the cock-up I'd have been in deep shit.

@TimWardCam @JessTheUnstill right, this is what I told my current junior too, when she was certain she'd be fired for a fuck-up: "no-one will remember *that* you fucked up, because we all do. They *will* remember how you handled it. So, raise the alarm early, calmly try to figure out what happened, and provide a solid plan or at least a reliable risk-assessment."

@danielaKay @JessTheUnstill

Exactly.

When I became a portfolio-holding councillor pretty well the first thing I said to my directors and heads of service was "If there's a cock-up I want to hear about it from you first. If the first I hear of something going wrong is a phone call from a journalist I shall be seriously displeased."

And they did as asked. So when a journo phoned to ask about a cock-up I already knew about it so had something ready to say.

@JessTheUnstill

Working on a developer's desktop in 1999. Oh, they've already got a root terminal, cool.

# pkgadd -d /netsoft/ClearCase-0281-client-int-patch.pkg
# reboot
connection closed by foreign host
srini-desktop01$

. . . heck.

@JessTheUnstill
in 1996, after midnight, dialed in to a friend's linux machine across the country to fix it for something critical the next morning:
# dpkg -l | grep ^ii >/tmp/inst.pkg
# mkdir /debs
# mv / *.deb /debs
# ls -latr
ls: error while loading shared libraries: libc.so: cannot open shared object file: No such file or directory

... oh no. (That was the night I learned that you could use echo and shell globbing as a rudimentary ls, and that you could use ld.so as a preload)
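
(An aside on that stray space, for anyone reading along: the shell splits the command line on whitespace before `mv` ever sees it, so `/` arrives as a source argument of its own. Below is a minimal Python sketch that mimics only the word splitting, not the glob expansion, using `shlex.split`. The helper name `mv_arguments` is made up for illustration.)

```python
import shlex

def mv_arguments(command_line):
    """Return the arguments mv would receive, before any glob expansion."""
    words = shlex.split(command_line)
    return words[1:]  # drop the command name itself

# One extra space turns a single glob into two separate arguments.
typo = mv_arguments("mv / *.deb /debs")
intended = mv_arguments("mv /*.deb /debs")

print(typo)      # ['/', '*.deb', '/debs']
print(intended)  # ['/*.deb', '/debs']
```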

@JessTheUnstill Or, on the sql side, something akin to:

update prod_users SET user_email = "asmith@foo.com" WHERE user_id = 82108;
update prod_users set user_name = "asmith";
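
(One possible guard against the missing WHERE, sketched in Python with an in-memory SQLite database: run the update inside a transaction and commit only if the row count matches what you expected. The `guarded_update` helper and the sample rows are assumptions for illustration; only the table and column names come from the post above.)

```python
import sqlite3

def guarded_update(conn, sql, params, expected_rows):
    """Run an UPDATE, but commit only if it touched the expected number of rows."""
    cur = conn.execute(sql, params)
    if cur.rowcount != expected_rows:
        conn.rollback()  # undo the statement instead of committing it
        return False
    conn.commit()
    return True

# Hypothetical sample data standing in for the real table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prod_users (user_id INTEGER PRIMARY KEY, user_name TEXT)")
conn.executemany(
    "INSERT INTO prod_users (user_id, user_name) VALUES (?, ?)",
    [(1, "alice"), (2, "bob"), (82108, "smith")],
)
conn.commit()

# Missing WHERE clause: touches 3 rows instead of 1, so it is rolled back.
ok_bad = guarded_update(conn, "UPDATE prod_users SET user_name = ?", ("asmith",), 1)

# With the WHERE clause exactly one row changes, so it is committed.
ok_good = guarded_update(
    conn,
    "UPDATE prod_users SET user_name = ? WHERE user_id = ?",
    ("asmith", 82108),
    1,
)

names = [row[0] for row in conn.execute("SELECT user_name FROM prod_users ORDER BY user_id")]
print(ok_bad, ok_good, names)
```

The same idea works by hand in most SQL clients: open a transaction, run the update, read the reported row count, and only then decide between commit and rollback.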

@sekka

I have done almost exactly the same thing, but it was having root and moving around the packages in /lib on some solaris machines

... and it was 1999

thank goodness the whole cluster had each other's filesystems mounted through NFS and I could go move them back from another machine

@JessTheUnstill

@trochee @JessTheUnstill I feel like making mistakes like that really cement one's skillset.
@sekka @JessTheUnstill I did not learn about preload. So I wrote a program to dump the whole statically linked ash shell via echo hexstring into a file on the remote server. Copy pasted that into the still open ssh. I think that shell had some built in file operations which helped me restore the system.
@JessTheUnstill ... and yes, I do tell my colleagues (including and especially the juniors among them) all about them. Gory details and all. I'm not looking good, in hindsight. But as long as they get a fighting chance to do better than me...

Also stories come with beer, so there's that. :D

@JessTheUnstill Yes, unequivocally yes. I've got an extra "fuck ups" folder in my mail client for exactly this purpose. Has a beautiful mail thread where one of our senior build engineers very patiently explains to me that none of our releases build for three days after my change, and how I can reproduce the issue locally to figure out what I've screwed up.

I think this is one of the most important things you can have around as a senior.

@JessTheUnstill One thing I've always loved about working with computers is that if you take reasonable precautions, like having working backups, or taking one-off backups before making changes, that you can un-fuck anything that you fuck-up. You might lose some time, you might lose some face but mistakes can almost always be fixed, and once you realize that it becomes a superpower. Like in "Office Space" where the whole attitude changes once you are no longer afraid of consequences, instead of being paralyzed by fear that you are going to damage the computer, you have confidence that you know how to fix it before you make any change. And for changes where you don't have a working back-out plan you can focus on learning how to make one to make those changes safe, and then you don't have that worry anymore.
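
(A minimal sketch of the "one-off backup before making changes" habit, in Python. The file name, the `.bak` suffix and the `change_with_backup` helper are all assumptions for illustration; a real back-out plan depends on what you are changing.)

```python
import shutil
import tempfile
from pathlib import Path

def change_with_backup(path, change):
    """Copy `path` aside, apply `change(path)`, and restore the copy if it raises."""
    path = Path(path)
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)  # one-off backup taken before touching anything
    try:
        change(path)
    except Exception:
        shutil.copy2(backup, path)  # back-out plan: put the original back
        raise
    return backup

def broken_change(path):
    """A change that fails halfway, leaving the file in a bad state."""
    path.write_text("half-written garbage")
    raise RuntimeError("something went wrong mid-change")

# Hypothetical config file in a scratch directory.
workdir = Path(tempfile.mkdtemp())
config = workdir / "app.conf"
config.write_text("setting = 1\n")

try:
    change_with_backup(config, broken_change)
except RuntimeError:
    pass  # the failure still surfaces, but the file has been restored

restored = config.read_text()
print(repr(restored))
```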
@JessTheUnstill I haven’t brought down prod in at LEAST a week 😂 

@JessTheUnstill I always tell them to let me know immediately if they need help.

I'd rather be bothered by my minions a dozen times for something "dumb", than stumble into a catastrophe a week later.

@JessTheUnstill only one minion ever truly pissed me off.

And they were only really a "minion" because they were working in my environment. Technically, they were my equal from the US, learning our system.

Spent 2 weeks assuring us they were doing fine. No problems.

Meanwhile their counterpart was constantly stumbling in our system, and asking for help hours after they should have.

Then came the day for deploy, except they were on vacation. So it was up to me...

I spent the next 3 days being screamed at by 3 PMs, about 4 of my own projects, plus the minion's project, as I had to delete the entire thing, and rebuild it from scratch.

... That person is probably gonna be in charge of our stuff soon (if not already)... Glad it's not my responsibility anymore.

@JessTheUnstill

I don’t need to tell the juniors the fuckups I made when I was new because they can see the fuckups I make when I’m old in real time.

@JessTheUnstill I agree with this and do it the best I can. (I have a natural tendency to blame, which makes it an uphill battle, even though I totally agree)

I do a lot of security approvals on high risk projects and I have to make tough calls a lot and often. I tell the story of how an entire small team (12 people today) got created off the back of one of my fuck-ups.

I had been in charge of approving high risk projects after vetting their security plans. And I got asked about something we had almost never done before. I did some checking on policies, I asked a few people I trusted, and I looked hard at the security plan (which was sound). I approved.

The first one I approved caught the attention of one of our lawyers who said “you approved what!?” And he calmly explained how we simply don’t do that ever, and here’s why. Having figured out I should not have approved it, I mentioned the 2 others just like it I had also approved that week. 😜 We contacted the teams and figured out what to do. It was fine in the end.

Eventually we spun up a whole team whose job is those specific high-risk approvals and the criteria for security plans to go along with them. But the need for such a team was made clear by a fuck-up from yours truly.

They all showed me grace because I was acting in good faith and just didn’t understand the legal context. I tell that to juniors on my team to emphasise that you don’t get fired (at least around here) for good faith efforts to the best of your ability, even when they go wrong.

I know it’s a vague story. But you can see why. Anyways, I completely agree.

@JessTheUnstill I've broken _aircraft_. Super-expensive aircraft. The safety incident report is written, we have a meeting, I say "this is what I did and why I thought that was OK." They were all relaxed, and explained why I was wrong. Nothing further happened.

I've had *test pilots* try to lie. "No, I never touched that switch." Except that all the switches are monitored, and recorded, so we have _proof_ that you flipped the switch. The Right Stuff it's not.

@JessTheUnstill 100% I had that very conversation today with somebody terrified about their code going into production behind a feature switch no less! “You’re talking to somebody that used Akamai to block every browser using the latest production Chromium build. It’s going to be fine.”

@JessTheUnstill I wish my management had had my back like that. My last FTE tech job (systems engineering) I made a mistake (yes, SQL, but it wasn't a query) that I'd been warning management about for 6 months, that one of us was bound to make this particular mistake, since we were using a workaround for a known bug that was highly risky. I knew one weekend, this bug would get escalated to the only engineer on, who would already be in the middle of some other tricky thing (or two) and get distracted... well that engineer turned out to be me, implementing the workaround while also in the middle of a very complicated, very important, very deadline migration.

I made the mistake in the workaround and a very bad customer-facing thing happened. (But the migration went well!)

I got fired so they could assure the customer. And now I'm a writer.

Who knows if they ever let dev fix that bug. 🤷

@JessTheUnstill Yes! And it applies to senior roles in just about ANY industry or career field!
@JessTheUnstill like the time I ran a killall command on a Solaris machine thinking I was on a Linux machine, you mean? That kinda thing? :)
@JessTheUnstill Are you even a Senior if you never "brought down prod"?!

@JessTheUnstill none of my juniors ever brought down production. *I* brought down production. Even if I was on vacation at the time. At least that was the message to management.

Of course, lessons were learned.

@JessTheUnstill ❤️❤️❤️❤️❤️

@JessTheUnstill It's how you build credibility with your team. If you haven't done it and broken it how do they trust you?

Near the end of a resilience programme and telling the PM about the time I blatted the OS drive of a prod server in the middle of the day. OS stayed running as files locked open were fine. Restored underneath and managed to keep it running till the next safe reboot window. The look of horror on their face.

@JessTheUnstill Standing in the data centre with a rapidly closing change window and the next change having C-suite visibility and no core network, with a very large, upset network engineer who made a copy/paste error on the core network. I didn't know if he was going to pummel me into the floor or break down and cry, or both. Thankfully neither. Calmed everyone down, worked the problem with the team, executed solution. All sat down and took a deep breath after.

We all screw up.

@JessTheUnstill my PhD supervisor was in Canada and needed the results of the program he had left running for 3 weeks in Edinburgh. So I needed to log on and copy the output files to a tape and mail it to him. (This tells you how long ago it was). But instead of

tar cvzf /dev/... Output

I typed

tar xvf /dev/... Output

And promptly extracted the empty tape over the results.

@JessTheUnstill I have lost count of the number of times I've said:
- I've made that mistake before
- That's a good question, I have no idea, let's find out together

Oh, and more than a couple of times I've run interference (creative truth telling) when a junior made a boo-boo so that they can get on with fixing the thing. And yes, the team then made changes to work instructions so that the mistake was much less likely to be repeated.

Seniors are not all-knowing (okay, maybe that's just me...) and I try to normalise the not-knowing-but-knowing-how-to-find-out and my other job is getting others to the point that I'm replaceable.

@JessTheUnstill we used to say, and especially if someone gets criticized for downtime or blunders

”He who does nothing makes no mistakes.”

@JessTheUnstill I’ve been debating talking about my biggest fuckup at a talk i’m giving in a few months. but talking about it also helps illustrate all the fuckups that management and the people who came before me contributed to it.
@JessTheUnstill inspired by your great post, I shared my fuckup and asked others to do the same. Hopefully they will become blog posts too. https://lobste.rs/s/ytefme/when_was_last_time_you_broke_production :)

@JessTheUnstill To be the person with the thousand yard stare, taking a pull from a cigarette, saying "I've seen some things"
@JessTheUnstill This seems to apply to teaching as well. For instance at my high school.
@JessTheUnstill This is exactly my experience, when I was not yet senior - I fucked up majorly a few times, and was given protection and room to fix it myself and learn from it. I at least *try* to do the same now on the other end. Not sure how well I do, but I make it a point to often say "fuckups are normal, don't worry, stay calm, no one is going to die, everything can be fixed."
@JessTheUnstill
And the most important thing here, especially if "formalized" in a Fuckup Night, is that a Fuckup is a Fuckup, and not a "..but then I turned everything around and saved the situation".
Sharing Fuckups is to show that everyone screws up, not to brag how you saved the situation.
*intently staring at you, jon*
@JessTheUnstill As a manager I am forever telling my team about my technical fuckups and telling them “fix the problem; not the blame”.

@JessTheUnstill I did an update on the "users" table to change the password for my account... and forgot the where clause.

800k+ lines updated.

Whoopsie.

@jkb @JessTheUnstill I've done the same, not to the users table, but to a core table of the application. And I've told multiple juniors about it. 😁
@JessTheUnstill I do it because that’s not what my seniors did for me, it feels good to reject that and be better

@JessTheUnstill

Hell, I tell them about the fuckups I do NOW.

@JessTheUnstill @siracusa My best screw-up was to start a delete on a folder I thought I was in but was actually in the root of our Netware server. Luckily it was a test server but we still had 50 users on it doing our testing. As this was in the middle of the night I had to leave a message for the server admin, who was glad I had told her so she could restore the server the next morning. Best to admit your mistakes while they can be fixed 🙂
@JessTheUnstill you can’t graduate to seniority unless you’ve messed up at least one business critical system during your career.
It’s a rite of passage and the stories should be told around every camp fire.
@JessTheUnstill
Like that one time i chmod -R 777 a data dir (yeah, I know) and interrupted some customer jobs ... because somewhere in that tree there was a home dir and sshd didn't like the permissions.
I learned one or three things that day.
@JessTheUnstill 100%. This goes for my own field (law) too. I still remember the sheer terror I felt as a junior the first time I fucked up, and the relief when the partner I worked for rode to my rescue and made it all better, then rather than bollocking me she told me about the time she did something even worse when she was a junior.
@JessTheUnstill
I used to have a list of "stupid stuff I've done that didn't get me fired" to share in 1:1 meetings, everyone enjoyed hearing it.
@JessTheUnstill So much this! It’s why I always participate in the “tell us about your biggest fuckups at work” thingies. Someone might learn a thing or two.
@JessTheUnstill There are lessons here! Like, uncommitted transactions can really screw things up, but sometimes they save your butt! And, well, I took down a rather significant chunk of a (well, at least once) well-known website by creating XML output with an unescaped ampersand (“&” rather than “&amp;”) for a little while. Got a talking to, wiped the egg off my face, learned a lot about validating and sanitizing inputs https://xkcd.com/327/
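
(To make the ampersand problem concrete, here is a small Python sketch using only the standard library: a bare `&` in text content makes the document ill-formed, while `xml.sax.saxutils.escape` turns it into an entity. The `<item>` element and the sample title are hypothetical, not taken from the site in question.)

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def parses(xml_text):
    """Return True if xml_text is well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

title = "Fish & Chips"  # hypothetical value containing a bare ampersand

raw = "<item>" + title + "</item>"              # ampersand written as-is
escaped = "<item>" + escape(title) + "</item>"  # ampersand written as an entity

print(parses(raw))      # False
print(parses(escaped))  # True
print(escaped)          # <item>Fish &amp; Chips</item>
```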
@JessTheUnstill One of the perks of getting older and more senior is being able to fuck up even bigger things...

@JessTheUnstill

Yes, very much that.

And another part of that, which I know I've written before, is to thank and praise people for finding bugs in your own code, deployment plans, etc. and/or fixing them.

@JessTheUnstill My students get regular stories about «Shit Ms Giliell did and still made it, for a given value of "made"».
Mistakes were made. By me. A lot.