relevant to my post last night that I'm going back and fixing up my 23 years of blog archives, and restoring broken links, it is *astounding* how much better a job the personal web does at keeping links alive and content online. Most personal blogs I linked to are either still around or redirect to a place where the content is easy to find. The vast majority of corporate content (including news) has been erased, with only imperfect Internet Archive copies available.
@anildash now i have to see if jack & jill politics is saved
@anildash At TED I’ve spent a good chunk of time dealing with “historic slugs” to make sure no matter which URL you use, you end up at the talk. This is a solved problem if the org still exists. So annoying to see the dead links. It doesn’t have to be this way.

@anildash I've been building business websites since 1997. I always took pride in making sure dead links got redirected.

Recently worked with an expensive firm to migrate my college's site (3000 links), & they regarded my concerns as quaint. 'We think you should abandon all but your top 100 legacy URLs, & that's what we'll agree to redirect for you.' Any more I wanted, I had to do myself.

@FeralRobots @anildash even if I never visit your school's website, your efforts are appreciated!
@anildash The one issue I’ve found with my own 1990s/2000s blogging is that all the image links from photobucket etc, and some video embeds, are dead now. But yes the writing is still there!
@ProgGrrl @anildash services like photobucket have been such a disaster for preservation. i wish the economics of hosting were somehow such that people's digital histories weren't randomly wiped out without their knowledge!
@ProgGrrl @anildash Freaking Photobucket, man. They broke SO MUCH of the pre-2010s web

@anildash would be interested to see your process for accomplishing this. I have two 18-year-old blogs and the updating has been on my mind for a while.

And agreed on the personal web vs corporate.

@anildash Individual people care about the Web as shared culture. But get a group of people together for the purpose of making money and shared culture is pretty far down the list of priorities.
@anildash it’s so frustrating and the time between publish and “weirdly offline” is getting shorter and shorter

@anildash For 12 years now i’ve meticulously, thanklessly maintained a list of 301s in our corporate .htaccess file.

Perhaps the business has accrued no benefit from this, but I can sleep at night.
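A hand-maintained list of 301s like the one described above boils down to a lookup table from legacy paths to current ones. What follows is only a minimal sketch of that idea in Python, not anyone's actual setup: the paths in `REDIRECTS`, the lowercasing rule in `normalize`, and the `resolve` helper are all hypothetical names chosen for illustration.

```python
from typing import Optional

# Hypothetical legacy-to-current mapping; real entries come from a site's own history.
REDIRECTS = {
    "/news/2009/launch.html": "/blog/launch",
    "/blog/launch": "/posts/launch",
    "/about-us": "/about",
}


def normalize(path: str) -> str:
    """Lowercase the path and drop a trailing slash so variants share one key.

    Lowercasing is an assumption; some sites have case-sensitive paths.
    """
    path = path.lower()
    if len(path) > 1 and path.endswith("/"):
        path = path[:-1]
    return path


def resolve(path: str) -> Optional[str]:
    """Return the final target for a legacy path, or None if no redirect applies.

    Chains are followed up front so a visitor gets one 301 instead of several.
    """
    current = normalize(path)
    seen = {current}
    moved = False
    while current in REDIRECTS:
        current = REDIRECTS[current]
        if current in seen:
            raise ValueError(f"redirect loop involving {current}")
        seen.add(current)
        moved = True
    return current if moved else None
```

A server would answer with a 301 and the returned path as the `Location` header whenever `resolve` returns something other than `None`.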

@anildash I started this process as well. All of my old IMDb links are broken because they used to use us.imdb.com and have since removed it from their SSL cert. 🤬
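When a whole hostname has gone away, as with the us.imdb.com links above, the fix can sometimes be a bulk host swap rather than link-by-link editing. A minimal sketch, assuming the paths are unchanged on the replacement host; the `www.imdb.com` target and the `rewrite_host` helper are assumptions for illustration, not something confirmed in the thread.

```python
from urllib.parse import urlsplit, urlunsplit

# Retired host -> assumed replacement host (an illustration, not a verified mapping).
HOST_REWRITES = {"us.imdb.com": "www.imdb.com"}


def rewrite_host(url: str) -> str:
    """Swap a retired hostname for its replacement and force https.

    URLs on other hosts are returned untouched. Any port or userinfo in the
    original URL is dropped, which is fine for ordinary blog links.
    """
    parts = urlsplit(url)
    new_host = HOST_REWRITES.get((parts.hostname or "").lower())
    if new_host is None:
        return url
    return urlunsplit(("https", new_host, parts.path, parts.query, parts.fragment))
```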
@anildash passion projects always win out over paid projects. MySpace deleted everything from its first 12 years of existence during a server migration. https://www.reddit.com/r/techsupport/comments/7uiv8b/myspace_player_wont_play_songs_and_i_want_to/
@anildash Similar situation. My blog content goes back 20+ years, all the links still work, except the comments to the posts died because I made the mistake of using IntenseDebate, a blog comments platform that Automattic bought, ruined, and abandoned (even though that platform appears to still exist—avoid it!). So all the comments are gone. Good thing I had all the source to MovableType, which I still use.
@anildash
Ha! I was checking and fixing broken links too, today. Rewriting some stuff, but not any news articles. I found the same as you, corporate stuff often deleted or moved to a new URL without a redirect. X
Edit: typo fix
@anildash heh, I recently noticed some 2003 era blog images of mine having been offline for a decade or so due to being almost but not quite clever about something. Felt good to fix them, but I think I need better monitoring (self-crawling?) to actually generally keep such things live...
@anildash I print everything into pdfs to preserve.
@anildash I'm afraid to look at links on my older blog posts!
@anildash is it just personal or maybe personal and smaller businesses? My business site has redirects for content over 10 years old.
@anildash Link rot is a real issue. I keep a large database of bookmarks and a sizable portion are dead links, now. I've resorted to making archives of everything via some scripts that run on a schedule. Sad that this has to be the case.

@anildash

How many articles are on your blog? I started mine in 2005 and it has more than 1200 articles since then. That's a LOT of articles to check for broken links.

And, since links keep breaking, you'd need to keep fixing.

I often point to past articles and when I do I check for broken links, both to external websites and to my own past websites, which are now only available on the Internet Archive.

Do you have a strategy for how you're finding and fixing broken links?

@anildash I've taken to doing two things: 1) after a few years I auto-replace all my outbound links with the archive.org version; and 2) any time I make a blog post, it sends an archive request for each outbound link, to make sure they're in there.
@jwz @anildash Thank you! It's such an obvious idea once you've heard it, but here we are. My blog is 21 years old in May and the earliest links are definitely broken. I'll patch what's available on a.o and in the future do an archive request for any other outbound links. @anildash I know you've been cleaning up. Maybe this tip will be useful for your site as well?
@jwz @anildash Those are both really good ideas.
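The two habits described above (pointing old outbound links at archive.org, and requesting a capture of each outbound link at publish time) can be sketched with plain URL building. This is a rough sketch, not @jwz's actual code: the function names and the `own_host` parameter are hypothetical. It relies on the Wayback Machine's public URL patterns, `/web/<timestamp>/<url>` for the capture nearest a date and `/save/<url>` for a capture request, and it only builds the URLs rather than fetching them.

```python
from datetime import date
from typing import List
from urllib.parse import urlsplit

WAYBACK = "https://web.archive.org"


def is_outbound(url: str, own_host: str) -> bool:
    """True for absolute links that leave the blog and are not already archive links."""
    host = (urlsplit(url).hostname or "").lower()
    return host not in ("", own_host.lower(), "web.archive.org")


def archived_url(url: str, published: date) -> str:
    """Wayback URL that resolves to the capture closest to the post's date."""
    return f"{WAYBACK}/web/{published:%Y%m%d}/{url}"


def save_request_url(url: str) -> str:
    """URL that asks the Wayback Machine to capture `url` when requested."""
    return f"{WAYBACK}/save/{url}"


def rewrite_links(links: List[str], own_host: str, published: date) -> List[str]:
    """Replace outbound links with archived versions; leave everything else alone."""
    return [
        archived_url(link, published) if is_outbound(link, own_host) else link
        for link in links
    ]
```

Using the post's publication date as the timestamp is a design choice: it favors a capture of the page as it looked when it was linked, not as it looks today.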
@anildash A few days ago I had to spend several more hours restoring the broken image links to articles I wrote 10-15 years ago and archived on my personal blog, because some corporate jackass moved or deleted the whole platform they were published on years ago and didn't bother with redirects...
@anildash what are you using to find broken links? Clicking on each by hand or have you used a tool to check the links/flag ones that no longer resolve as expected?
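On checking links with a tool rather than by hand, here is a minimal sketch using only the Python standard library. It is an illustration, not what anyone in this thread actually runs: `LinkCollector`, `extract_links`, `classify`, and `check` are hypothetical names, and the categories are one possible way to flag links that "no longer resolve as expected".

```python
import urllib.error
import urllib.request
from html.parser import HTMLParser
from typing import List, Optional


class LinkCollector(HTMLParser):
    """Collects absolute http(s) hrefs from <a> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(("http://", "https://")):
                self.links.append(value)


def extract_links(html: str) -> List[str]:
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


def classify(requested: str, status: Optional[int], final_url: str) -> str:
    """Turn a fetch result into a label worth reviewing by hand."""
    if status is None:
        return "unreachable"  # DNS failure, timeout, refused connection
    if status >= 400:
        return "broken"
    if final_url != requested:
        return "moved"  # redirected; the link could be updated to the new URL
    return "ok"


def check(url: str, timeout: float = 10.0) -> str:
    """Fetch one URL and classify it. Some servers reject HEAD, so a
    'broken' result deserves a second look with a normal GET."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-check-sketch"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify(url, response.status, response.geturl())
    except urllib.error.HTTPError as err:
        return classify(url, err.code, url)
    except OSError:  # covers URLError and socket timeouts
        return classify(url, None, url)
```

Running `check` over the result of `extract_links` for each archived post gives a list to triage. Links labeled "moved" are often the easy wins, since the content is still online at a new address.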
@anildash This makes me think about care. Previously, I’ve worked for websites that have been either negligent or intentionally hostile towards old links. But in both cases, there was a simple element of disregard. Old links had no value, they were seen as disposable. Which is shocking and disappointing! I feel that many digital media companies internalized the skeptical gaze of traditional media, and came to consider their own work as disposable in a way that the personal web never would.
@anildash agree. in the rarest of cases the corporate content doesn't disappear but still moves to a different URL without a redirect left behind, which is just as bad
@anildash that's kind of hopeful from the point of view of future historians and sources?
@anildash we've been compiling a list at https://indieweb.org/site-deaths for a while. Which does make me wonder why you moved your mastodon presence into a corporate silo rather than your own site.
@KevinMarks @anildash I’d be curious about this too, tho I think the decentralized nature makes it easy to migrate. I’ve considered my own instance but I don’t know if I have it in me to do the maintenance tbh.
@film_girl @anildash maintaining your own Mastodon instance does seem to be a lot of wrangling work, from what I've heard from others, yes eg https://lagomor.ph/2022/11/mastodon-is-too-heavy-for-its-own-good/
@KevinMarks @film_girl @anildash, so true, I'm doing it in an even more difficult way, in a home Kubernetes cluster. There has been a lot of learning, many bugs and PRs filed, countless workarounds.
The 6 hours that author spent don't even scratch the surface of this project, but I'm not doing it because it's easy, I'm doing it because it's hard.

@KevinMarks @film_girl @anildash

I’m iterating (off and on, in bursts as I find time and desire) on an improved Docker-Compose.

The article claims you need S3 and you definitely don’t. Especially for a single user instance unless you are very popular.

https://github.com/bplein/mastodon-docker-compose/tree/main

I think the heavy lift here is docker experience. If you have it, this is easy. If you don’t, instead of learning all the mastodon bits up front, learn Docker and Docker Compose (which are useful elsewhere)

@KevinMarks @film_girl @anildash

But I agree with the premise: for someone with little experience running apps that consist of multiple services, biting off Mastodon is a steep learning curve.

But it all is. Getting stuff running is the easy part. Doing it so that it lives on “forever” is another thing entirely. Day 2 and beyond.

@bplein @KevinMarks @anildash right. I’ve managed many a web server in my day, some better than others. I don’t really relish having to have a reminder to run updates and check security every month or whatever. Sometimes that’s fun. But sometimes it feels too much like my actual job.
@bplein @film_girl @KevinMarks yeah, I mean… I spend my day worrying about running tens of millions of apps, I don’t want to go home and have one more to manage.

@film_girl @KevinMarks @anildash the long-term answer here, IMO, is Mastodon-compatible services that let you bring your own domain/identity without the hosting hassle. Ideally, they’d also provide domain registration and light DNS.

Micro.blog isn’t really that, at this point. They’ve chosen to be a separate, but connected network.

Identity and hosting should not be intrinsically linked!

@film_girl @KevinMarks @anildash Would you consider a paid model where a nonprofit entity would run the server and you'd pay a monthly cost? The costs would go toward infrastructure and for paid staff to administer the instances. Perhaps the org could be involved in other fediverse related services.
@ppatel @KevinMarks @anildash absolutely. But I’d also like to pay for my own managed instance that I control the rules for. I just don’t want to have to keep the application/server/DNS/cache updated.
@film_girl @KevinMarks @anildash I'm thinking about multiple levels of service including letting you be the administrator who controls the rules, blocking, etc. but letting the third party handle the labor-intensive, technical bits.
@KevinMarks it was already on a corporate silo, just a worse one. I don’t want to admin a service.
@anildash you might like how https://micro.blog enables you to interact with mastodon through a URL on your own domain
@KevinMarks yeah, it’s great. But didn’t fit my preferred workflow. I evaluated all the options.
@anildash I still have some static Movable Type blogs that I removed the engine from! So some links from 1999 remain in place. I migrated across a couple platforms (Userland? -> Movable Type -> Squarespace) and I believe was able to ensure either a redirect or an exact URI across all those.
@glennf @anildash Still using MT. One reason: the attack bots no longer blitz my server with probing “mt-this” “mt-that” cgi script requests. The hacker weenies are too young to be aware of MT now and so all the bot attacks are WordPress “wpthis” “wpthat” which all get 404s. (Except a few where I created fake files that say “nice try” muhahaha.)
@anildash it boils down to motives. Corporate content is a means to an end: get the hype for sales or an acquisition. Then a new firm acquires the company for being brazen but wants the team to act the part, or sales tactics change. Companies reinvent themselves. Most of the personal web is about expression, and even if people change their work life, most are proud of the expressions they put out. So the personal web sticks around.