Mastodawn

todsacerdoti Mar 31

GitHub's Historic Uptime

https://damrnelson.github.io/github-historical-uptime/

Historical GitHub Uptime Charts

View GitHub's monthly uptime between 2016 and 2026.

Show thread

mholt Mar 31

Even better IMO is this status page: https://mrshu.github.io/github-statuses/

"The Missing GitHub Status Page" with overall aggregate percentages. Currently at 90.84% over the last 90 days. It was at 90.00% a couple days ago.

The Missing GitHub Status Page

Historical GitHub uptime reconstructed from archived status data.

Show thread

fontain

An aggregate number like that doesn’t seem to be a reasonable measure. Should OpenAI models being unavailable in CoPilot because OpenAI has an outage be considered GitHub “downtime”?

Show thread

fwip Mar 31

I think reasonable people can disagree on this.

From the point of view of an individual developer, it may be "fraction of tasks affected by downtime" - which would lie between the average and the aggregate, as many tasks use multiple (but not all) features.

But if you take the point of view of a customer, it might not matter as much 'which' part is broken. To use a bad analogy, if my car is in the shop 10% of the time, it's not much comfort if each individual component is only broken 0.1% of the time.

Show thread

remus Mar 31

> But if you take the point of view of a customer, it might not matter as much 'which' part is broken. To use a bad analogy, if my car is in the shop 10% of the time, it's not much comfort if each individual component is only broken 0.1% of the time.

Not to go too out of my way to defend GH's uptime because it's obviously pretty patchy, but I think this is a bad analogy. Most customers won't have a hard reliability on every user-facing gh feature. Or to put it another way there's only going to be a tiny fraction of users who actually experienced something like the 90% uptime reported by the site. Most people are in practice are probably experienceing something like 97-98%.

Show thread

fwip Mar 31

Sorry, by 'customer' I meant to say something like a large corporate customer - you're buying the whole package, and across your org, you're likely to be a little affected by even minor outages of niche services.

But yeah, totally agree that at the individual level, the observed reliability is between 90% and 99%, and probably toward the upper end of that range.

Show thread

mememememememo Mar 31

Or if your kettle is not working the house is considered not working?

Show thread

Polizeiposaune Mar 31

I've been on a flight that was late leaving the gate because the coffeemaker wasn't working.

Show thread

wang_li Mar 31

A better analogy is if one bulb in the right rear brake light group is burnt out. Technically the car is broken. But realistically you will be able to do all the things you want to do unless the thing you want to do is measure that all the bulbs in your brake lights are working.

Show thread

Dylan16807 Mar 31

That's an awful analogy because "realistically you will be able to do all the things you want to do". If a random GitHub service goes down there's a significant chance it breaks your workflow. It's not always but it's far from zero.

One bulb in the cluster going out is like a single server at GitHub going down, not a whole service.

Show thread

mort96 Mar 31

As long as they brand it as a part of GitHub by calling it "GitHub Copilot" and integrate it into the GitHub UI, I think it's fair game.

Show thread

mememememememo Mar 31

What is Google's uptime (including every single little thing with Google in the name)?

Show thread

mort96 Mar 31

I don't think that's a fair comparison. Google Maps, Google Calendar, Google Drive, Google Search, Google Chrome, Google Ads, etc. are all clearly completely different products which have very little to do each other, they're just made by the same company called Google.

GitHub is a different situation. There's one "thing" users interact with, github.com, and it does a bunch of related things. Git operations, web hooks, the GitHub API (and thus their CLI tool), issues, pull requests, Actions; it's all part of the one product users think of as "GitHub", even if they happen to be implemented as different services which can fail separately.

EDIT: To illustrate the analogy: Google Code, Google Search and Google Drive are to Google what Microsoft GitHub, Microsoft Bing and Microsoft SharePoint are to Microsoft.

Show thread

Kaliboy Mar 31

Completely agree, it makes it worse actually as Github's secondary functions so to speak are things we implicitely rely on.

When I merge to master I expect a deploy to follow. This goes through git, webhooks and actions. Especially the latter two can fail silently if you haven't invested time in observation tools.

If maps is down I notice it and immediately can pivot. No such option with Github.

Show thread

dogma1138 Mar 31

It depends, for example - I would consider Google Drive uptime as part of say Google Docs’ overall uptime because if I can’t access my stored documents or save a document I’ve been working on for the past 3 hours because Drive is down I would be very pissed and wouldn’t care if it’s Drive or Docs that is the problem underneath I still can’t use Google Docs as a service at that point.