Update April 5th:
We got everything we can.
The amount of help has been very awesome - we did good 💜

If you think you can help with more than a few TB and/or with additional contacts etc. pp.
Please write an email to [email protected]

We (@SafeguardingResearch) need urgent help with archiving websites & datasets from NOAA, specifically the stuff on Amazon Web Services (AWS):
https://forum.safeguar.de/t/noaa-all-services-urgent/569

If you can contribute, please consider doing so.

#NOAA #SafeguardingResearch #Data #Weather

NOAA all services (URGENT)

I’ll start on: https://oeab.noaa.gov/ Any additional info about the take down of these specific services? Was it announced somewhere?

Safeguarding Research & Culture (SRC) — Distributing Cultural Memory
@lavaeolus @SafeguardingResearch Hi Henrik, isn't this something that e.g. CERN or @SURF can help you with, or is that not a helpful thought?

@meliimming @SafeguardingResearch yes, helpful thought!

Would love to chat with someone from @SURF

@lavaeolus @SafeguardingResearch @SURF I will connect you via email. Do you have a link I can use that explains what you are doing, in addition to the NRC news item?
Safeguarding Research & Culture

"As researchers we often say 'we need the data'. Today, the data needs us." — Kathy Reid

@lavaeolus @SafeguardingResearch Just a note that when you set your crawler to scraping a gov site, be reasonable about how frequently your requests are posted. The AI/LLM bots are out there doing the same damn thing, and some sites have set their defenses on DROP if they don't like the look of your jib. Also, if your IP traces back to Malaysia or Lithuania or some other out-of-the-way non-US place, you'll look even more suspicious.
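
For anyone writing their own crawler, here is a minimal sketch of what "being reasonable" could look like: keep a minimum pause between requests and double it whenever the site answers with a refusal. The class name `PoliteThrottle`, the 2-second default and the choice of 403/429/503 as backoff triggers are assumptions made for illustration, not anything the sites or SRC prescribe.

```python
import time
from typing import Callable, Optional


class PoliteThrottle:
    """Enforce a minimum delay between requests and back off after refusals."""

    def __init__(self, min_delay: float = 2.0, max_delay: float = 300.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_delay = min_delay   # assumed baseline pause, in seconds
        self.max_delay = max_delay   # upper bound for the backoff
        self.delay = min_delay       # current pause between requests
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous request

    def wait(self) -> float:
        """Block until the next request is allowed; return seconds slept."""
        slept = 0.0
        if self._last is not None:
            remaining = self.delay - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
        self._last = self._clock()
        return slept

    def record(self, status: int) -> None:
        """Double the delay on 403/429/503, otherwise halve it toward min_delay."""
        if status in (403, 429, 503):
            self.delay = min(self.delay * 2, self.max_delay)
        else:
            self.delay = max(self.delay / 2, self.min_delay)
```

Call `wait()` before each request and `record(status_code)` after it, with whatever HTTP client you already use. The clock and sleep functions are injectable only so the behaviour can be checked without real waiting.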

@roadskater yes, this is an overall issue - fighting with 403s is part of the daily work

re bots: https://go-to-hellman.blogspot.com/2025/03/ai-bots-are-destroying-open-access.html?m=1

@SafeguardingResearch

AI bots are destroying Open Access

There's a war going on on the Internet. AI companies with billions to burn are hard at work destroying the websites of libraries, archives, ...

@lavaeolus @SafeguardingResearch I've been reading variants of that article about twice a week for the past month or more. The slop-bots are an increasing PITA.

@roadskater totally - on many levels.

And they destroy the whole Open Data / Open Access landscape

@SafeguardingResearch

@lavaeolus @SafeguardingResearch

If you want to archive stuff from youtube without filling up your disk, this should do:

yt-dlp -i --restrict-filenames --yes-playlist --all-subs -o "%(title).100s-%(id)s.%(ext)s" --write-playlist-metafiles --check-formats --format-sort hasvid,+res:360,vcodec:av01,+size,+tbr VIDEO_OR_PLAYLIST_OR_ACCOUNT_URL

+res:360 uses the lowest resolution that’s at least 360px high.

@lavaeolus @SafeguardingResearch research.noaa.gov is gone :(

@f4grx @SafeguardingResearch not for me?

That would be unusual, as the cutoff for stuff is still hours away

Home: EPIC and UFS - Earth Prediction Innovation Center

Unifying Innovations in Forecasting Capabilities Workshop 2025 (UIFCW25): Monday, September 8, 2025 – Friday, September 12, 2025. UFS Insights Newsletter: The latest issue of the UFS Insights newsletter is now available online! Read all about the latest and greatest goings on at EPIC and the UFS! Earth Prediction Innovation Center – Artificial Intelligence in Weather Modeling

Earth Prediction Innovation Center - Site for EPIC, the Earth Prediction Innovation Center
NOAA research websites slated to go dark get a reprieve

NOAA is under a mandate to slash its IT costs.

Axios

@OliviaVespera It's complicated.
But yes, we got more time to get 'not as accessible' data backed up.

A short reprieve.

And we got most (if not all) of the public-facing data.

@SafeguardingResearch

@lavaeolus @SafeguardingResearch
That sounds really good!
Do we know how long we have now? days? weeks?
@OliviaVespera @SafeguardingResearch "Instead of ending at midnight, the contract will now expire on July 31"
@lavaeolus @SafeguardingResearch Haha! well :) That's really good. A lot can be accomplished in 3 months. Thanks for putting the word out.

@OliviaVespera Yes - let's hope for the best.
It's not on us, but on them to give access. We're standing by.

And got some rest.

But the next things are already there. We will write a general update soon.

@SafeguardingResearch

@lavaeolus @SafeguardingResearch FYI I tried creating an account on https://forum.safeguar.de yesterday but haven't received the activation email. Any idea why?

@fshwsprr @SafeguardingResearch very weird, because it has worked for many people..

Follow up via DM

@lavaeolus @SafeguardingResearch
Please help NOAA rescue data. It's an emergency.