honestly the thing that makes me walk from alpine might not even be the LLM debate, the governance issues, etc

it might be the fact that i can no longer properly maintain my packages in alpine because the gitlab is being DoSed by AI scrapers

seriously, i have been trying to git pull *for 30+ minutes now*
incredible stuff, thanks AI companies, this is the best

the other day i caught one of these AI companies downloading the same pkgconf tarball over and over and over and over again

thank goodness that my colocation provider gives me a rather absurdly generous bandwidth allowance...

@ariadne why would they even need a tarball an absurd amount of times
@9pfs i don't know, feel free to ask the fine chaps at SpaceX
@ariadne I'd rather scream in their faces than talk to them, honestly, but that's fair
@9pfs @ariadne One of the things I'd heard about some of these chatbot harnesses is that when directed to touch a project you don't have locally, they'll point themselves at the HTTP endpoint for a Git repository, and fetch individual files, one at a time, with an absurd amount of headers with each request.
@chris @9pfs @ariadne I wonder if it would help to only support dumb-transport mode for unauthenticated HTTPS. That just needs a static file server. Authenticated users can get smart transports.
This is why dog gave us zipbombs. @ariadne
@ariadne ah so that's why it was so incredibly slow. I was going to ask what's up / if there's a problem on the server, but oh, it's overloaded :<
@colinstu our infrastructure is being decimated by LLM scrapers, and yet we can't even agree to say "fuck AI"
@ariadne @colinstu i don't know why every fucking company decided to throw away best scraping practices (caching etc.), it's ridiculous
@chirpbirb @ariadne caching? storage? why do that when we can just burn a zillion more tokens and charge for it and redo a bunch of work all the time?
@colinstu @chirpbirb @ariadne My wild guess is it's designed and coded by a generation of people who don't know what a caching HTTP proxy server is.

(kinda joking)
@diffie @colinstu @chirpbirb i don't think they are motivated to be resource efficient, considering they can bill for token use
@diffie @colinstu @chirpbirb @ariadne Well, that does remind me that all the CI setups I've seen so far just rely on CDNs existing rather than caching proxies / local mirrors.
@chirpbirb @ariadne @colinstu the current wave isn’t coming from traditional scrapers/indexers. from the perspective of the server admin it looks like a DDoS because the scrape is initiated by individual agent sessions on individual machines running user requests; and in an “agentic” workflow this is indistinguishable from a botnet
@chirpbirb @ariadne @colinstu pure desperation to feed the slopinator 5000 with fresh input
@ariadne @colinstu It’ll always be funny (and extremely hypocritical) to me when people using LLMs deploy Anubis or whatever on their own infra. I wonder if Alpine will do something similar.
@ayushnix @colinstu we already use go-away, but clearly it isn't working anymore
@ariadne @colinstu Ah, in that case, people who are advocating for using LLMs in Alpine should be fine with removing any sort of defenses like go-away from the Alpine infra if they really believe LLMs are a net positive and worth discussing rather than just banning them at the first opportunity.
@ayushnix @ariadne @colinstu Anubis is slop and literally shares goals with ‘AI’ companies—it exists solely to gatekeep access to knowledge online
@mkljczk @ayushnix @colinstu cool, anyway, it would be nice to have a functioning alpine gitlab
@ariadne @colinstu @daisy ugh I am so sorry. Definitely sympathizing from a similar situation over here
@ariadne start returning zipbombs IMO
@ariadne GNOME gave up and uses Fastly now.