Many* years ago (approximately 27 in fact), when I was a Unix sysadmin at BHP-IT in Newcastle, we had one of our periodic "blame the network" events where suddenly all our systems went offline.
Another Unix sysadmin and I bolted to the network team's office to let them know there was a significant outage.
And so did admins from the NT team.
And so did admins from the VMS team.
And so did admins from the Tandem team.
And then, admins from the Mainframe team arrived.
***OH SHIT***
So we all piled downstairs to see the emergency lights on in the datacentre, and a very sheepish-looking $telco engineer being escorted from the next-door operations centre by one of the senior operators.
And that was the moment it dawned on folks that having an unshielded "Emergency Stop" button right next to the "Open Door" button, both the same colour, was perhaps not the best of ideas. Particularly when neither was labelled.
It was also my first lesson that the causes of major outages don't have to be highly esoteric or technical. The simple fact is that so many outages are caused by sheer mundanity. The mundanity of some bean counter refusing to pay for a small perspex cover. The mundanity of leaving a human – subject to human error – unattended in the wrong place. The mundanity of not putting labels on buttons.





