@kkarhan @andreasdotorg @internetarchive
With swarms of serverless harvesters?
In the end you still end up with CIDR blocking.
@simon_lucy @andreasdotorg @internetarchive
Then that's a necessary sacrifice one has to make.
If #aws doesn't combat #abuse, then it's only fair to #DROP [#DontRouteOrPeer] their systems...
And yes, I do yeet hostile networks as an act of self- and mutual ITsec...
https://github.com/greyhat-academy/lists.d/blob/main/blocklists.list.tsv
@kkarhan @andreasdotorg @internetarchive
The point I was making is that IP-specific rules aren't sufficient.
@simon_lucy @andreasdotorg @internetarchive OFC you'd have to block all CIDRs associated with the ASNs of AWS...
Which is relatively easy considering that said assignments are public...
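For what it's worth, AWS publishes its current prefixes as JSON at https://ip-ranges.amazonaws.com/ip-ranges.json, so turning them into firewall rules is a few lines of Python. A rough sketch; the EC2-only filter and the nftables syntax (table inet filter, chain input) are assumptions about your setup, not something from this thread:

```python
# Sketch: turn AWS's published IP ranges into nftables drop rules.
import json
import urllib.request

FEED = "https://ip-ranges.amazonaws.com/ip-ranges.json"

with urllib.request.urlopen(FEED) as resp:
    data = json.load(resp)

# "EC2" covers the instances scrapers are usually launched from;
# drop the filter to block every AWS service instead.
prefixes = sorted({p["ip_prefix"] for p in data["prefixes"]
                   if p["service"] == "EC2"})

for cidr in prefixes:
    print(f"nft add rule inet filter input ip saddr {cidr} drop")
```

The feed changes over time, so you'd want to re-fetch it on a schedule and flush/rebuild the rules rather than only appending.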
@kkarhan @andreasdotorg @internetarchive
Yes, and that negates archive.org, so it's a very temporary mitigation. I imagine AWS knows and has begun limiting the customer.
@CauseOfBSOD @andreasdotorg why respond with garbage data and insults when you can respond with a seven- or eight-figure invoice? (recurring monthly, of course)
Of course they can afford it; they can afford to waste their money on AWS...
...Because society doesn't prohibit bad actors when they have fat wallets?
@andreasdotorg @internetarchive direct link to blog: https://blog.archive.org/2023/05/29/let-us-serve-you-but-dont-bring-us-down/
“Those wanting to use our materials in bulk should start slowly, and ramp up.
Also, if you are starting a large project please contact us at [email protected], we are here to help.
If you find yourself blocked, please don’t just start again, reach out.”
@internetarchive
@brewsterkahle
any updates and background appreciated here in this cosy and federated place :)
it is not easy having #good #things in the presence of #capitalism.
@andreasdotorg Oh, man. I have a couple of library archive sites, and I ended up just blocking AWS and Azure entirely because of the constant high-pressure scraping.
Some people just can't keep their scraping down below a reasonable rate limit. Like, I'd be fine if you kept requests under 1000/hour or something, but if you're spinning up 50 servers to hoover up thousands of pages each as fast as I can serve them, fuck y'all. Now you get NOTHING.
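A per-IP cap like that is simple enough to sketch in Python. Illustrative only: the 1000/hour figure is from above, the in-memory dict is an assumption (a real site would lean on the web server or something shared like Redis), and as noted earlier in the thread, per-IP counters are exactly what swarms of serverless harvesters sidestep:

```python
# Minimal fixed-window, per-IP hourly cap.
import time
from collections import defaultdict

LIMIT_PER_HOUR = 1000
_windows = defaultdict(lambda: [0, 0.0])  # ip -> [count, window_start]

def allow(ip: str) -> bool:
    """Return True if this request from `ip` stays under the hourly cap."""
    count, start = _windows[ip]
    now = time.time()
    if now - start >= 3600:       # start a fresh one-hour window
        _windows[ip] = [1, now]
        return True
    if count < LIMIT_PER_HOUR:
        _windows[ip][0] += 1
        return True
    return False                  # over the cap: serve a 429 instead
```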

Add to the terms and conditions that traffic coming from bots or “non-human” interfaces will be charged based on the traffic generated.
Then send them the bill.
They need to filter out #AWS and other corporate mass-retrievers.
10-20 accesses per day or so.
1000 seems way too high to me still @andreasdotorg
@andreasdotorg we can give them hell