Remember to get super angry today about things that you can't affect and wouldn't know about without a global network of organisations whose sole purpose is to find and promote the most enraging happenings of the previous day curated from the total output of eight billion people.

Great article on some of the problems with incident metrics - https://greatcircle.com/blog/2026/05/26/incident-metrics-mirage/

#sre #devops

Some optimism as a counterweight to, well, everything.

https://terrygodier.com/the-boring-internet

The Boring Internet

The internet you grew up on isn't dying. A commercial veneer glued on top of it is. A visual essay about what actually persists.

Terry Godier
@siracusa What this graph shows more than anything is that after the MS acquisition (although not because of it), a more formalized incident management process was put in place, with dedicated staff hired to manage incidents and develop incident tooling. The entire time I worked at GitHub, availability was typically around 2.5 nines, or roughly 99.5 percent. The graph looks the way it does not because site availability got worse after the acquisition, but because incident management improved.
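For a sense of scale, here is a minimal sketch of what "roughly 99.5 percent" means in hours of downtime per year. The function names are illustrative and not taken from any GitHub tooling, and the year is assumed to be 365 days. It also shows the stricter logarithmic definition of "nines", under which 2.5 nines is closer to 99.68 percent; the post uses the common informal reading where the ".5" stands for a trailing 5.

```python
import math

HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year, 8760 hours


def downtime_hours_per_year(availability: float) -> float:
    """Hours of downtime per year implied by an availability fraction."""
    return (1.0 - availability) * HOURS_PER_YEAR


def nines(availability: float) -> float:
    """Number of nines under the logarithmic definition: -log10(1 - availability)."""
    return -math.log10(1.0 - availability)


def availability_from_nines(n: float) -> float:
    """Inverse of nines(): availability fraction for a given number of nines."""
    return 1.0 - 10.0 ** (-n)


if __name__ == "__main__":
    # 99.5 percent availability works out to about 43.8 hours of downtime a year.
    print(round(downtime_hours_per_year(0.995), 1))
    # Under the logarithmic definition, 99.5 percent is about 2.3 nines.
    print(round(nines(0.995), 2))
    # Under the same definition, 2.5 nines is about 99.68 percent.
    print(round(availability_from_nines(2.5) * 100, 2))
```

Either way, the figure implies tens of hours of downtime per year, not seconds.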
@siracusa There are two reasons why this data is not accurate. The first is simply that when GitHub moved to Statuspage, the existing data wasn't imported, or didn't exist. The second is that GitHub simply didn't post status updates for incidents in a reliable or consistent fashion. Which brings me back to the graph and how to interpret it.
@siracusa The major problem is that GitHub's published availability data is just not accurate before around 2020. For example, I personally caused a multi-hour outage in 2018, but the graph claims the site was only down for seconds for the entire year. And of course we had plenty of ordinary outages caused by unforeseen problems, bugs, provider failures, hardware failures, etc.
@siracusa On this week's ATP episode, I'm guessing you were referring to https://damrnelson.github.io/github-historical-uptime/ during the GitHub discussion. This page is based on GitHub's own data, so in theory it should be hard to argue with, right? Unfortunately, while it may be well-intentioned, it has problems (I'm going to ignore the y-axis issue).
Historical GitHub Uptime Charts

View GitHub's monthly uptime between 2016 and 2026.