We apologize for a period of extreme slowness today. The army of AI crawlers just leveled up and hit us very badly.

The good news: We're keeping up with the additional load of new users moving to Codeberg. Welcome aboard, we're happy to have you here. After adjusting the AI crawler protections, performance significantly improved again.

It seems like the AI crawlers learned how to solve the Anubis challenges. Anubis is a tool hosted on our infrastructure that requires browsers to do some heavy computation before accessing Codeberg again. It has saved us tons of nerves over the past months, because instead of manually maintaining blocklists we had a working way of telling "real browsers" apart from "AI crawlers".
However, we can confirm that at least Huawei networks now send the challenge responses, and they do seem to take a few seconds to actually compute the answers. That looks plausible, so we assume the AI crawlers have leveled up their computing power and now emulate more of a real browser's behaviour to bypass the variety of challenges that platforms enabled to keep the bot army out.
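For context on what "solving the challenge" means: Anubis hands the browser a proof-of-work puzzle, so the client has to burn CPU on hashing before it gets through. Here is a minimal sketch of such a scheme, purely for illustration; the exact challenge format, difficulty and verification logic that Anubis uses are not taken from this thread:

```python
#!/usr/bin/env python3
"""Minimal sketch of a SHA-256 proof-of-work challenge, roughly the kind of
work Anubis asks a browser to do. Format and difficulty are assumptions."""

import hashlib
import secrets

DIFFICULTY = 4  # assumed: required number of leading zero hex digits

def solve(challenge: str, difficulty: int = DIFFICULTY) -> int:
    """Client side: grind nonces until the hash has enough leading zeroes."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce
        nonce += 1

def verify(challenge: str, nonce: int, difficulty: int = DIFFICULTY) -> bool:
    """Server side: a single hash suffices to check the submitted answer."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)

if __name__ == "__main__":
    challenge = secrets.token_hex(16)   # per-visitor random challenge
    nonce = solve(challenge)            # this is the part that costs the client CPU time
    assert verify(challenge, nonce)
    print(f"challenge={challenge} nonce={nonce}")
```

The asymmetry is the point: the client has to try many nonces, while the server verifies an answer with a single hash, which is why crawlers that now compute the responses "take a few seconds" per challenge.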

We have a list of explicitly blocked IP ranges. However, a configuration oversight on our part only enforced these ranges on the "normal" routes; the "anubis-protected" routes didn't consider the blocklist at all. That was not a problem while Anubis also kept the crawlers away from those routes.
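To illustrate the shape of that oversight with a generic sketch (made-up routes and ranges, not Codeberg's actual reverse-proxy setup): the blocklist has to be evaluated before every route group, including the Anubis-protected one, otherwise a crawler that solves the challenge is never checked against it.

```python
#!/usr/bin/env python3
"""Generic sketch of the ordering bug described above; route names and IP
ranges are made up and do not reflect Codeberg's actual configuration."""

from ipaddress import ip_address, ip_network

BLOCKED_RANGES = [ip_network("203.0.113.0/24")]  # documentation range as a stand-in

def is_blocked(client_ip: str) -> bool:
    addr = ip_address(client_ip)
    return any(addr in net for net in BLOCKED_RANGES)

def handle_request(client_ip: str, path: str) -> str:
    # The fix: consult the blocklist first, for *all* routes.
    if is_blocked(client_ip):
        return "403 Forbidden"

    if path.startswith("/api/"):  # hypothetical "normal" routes
        return "proxy to backend"

    # Hypothetical "anubis-protected" routes: if the blocklist check above is
    # skipped for these, a crawler that solves the challenge walks right in.
    return "proxy to Anubis, then backend"

if __name__ == "__main__":
    print(handle_request("203.0.113.7", "/some/repo"))   # -> 403 Forbidden
    print(handle_request("198.51.100.9", "/some/repo"))  # -> proxy to Anubis, then backend
```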

However, now that they managed to break through Anubis, there was nothing stopping these armies.

It took us a while to identify and fix the config issue, but we're safe again (for now).

For the load average auction, we offer these numbers from one of our physical servers. Who can offer more?

(It was not the "wildest" moment, but the only one for which we have a screenshot)

@Codeberg In the days of single CPU servers (early 90s?) and an interesting filesystem problem, I think I may have seen ~400 at a client site!

@DamonHD @Codeberg got around 800 on a single-CPU server (well, actually a VM) this week. I had blocked lots of Huawei networks, but there's no list of all of them.

I now block some more, but I also reduced MaxServer in Apache httpd, to put an upper cap on this. Getting into swapping is not fun.

@mirabilos @Codeberg Yes, that is bad. I now run servers with no disc swap and tightly capped limits on all services to avoid the dreaded OOM Killer being needed.
@DamonHD @Codeberg no swap at all is actually a bad thing on Linux. Put some in, even if it is just 128 MiB. Linux was designed to work with it.
@mirabilos @Codeberg my scheme has been working well for me (with a little zram) for well over a decade!
@Codeberg @DamonHD it works, sure
@mirabilos @Codeberg I run all my primary services on an RPi3 off grid. It has 1GB RAM and swapping to the memory card would be very slow and would wear it fast.
@mirabilos @Codeberg It also gives better Web response than I can get out of Cloudflare for my target audience. So it meets all my goals.
@mirabilos @Codeberg Those may well be different to your goals for a server.
@mirabilos @DamonHD
AFAIK zram swap has the same positive effects as typical swap in terms of Linux being designed with swap in mind

@Codeberg ouch. This remains a cat-and-mouse game.

At least having them solve the Anubis challenge does cost them extra resources, but if they can do that at scale, it doesn't bode well.

@ikke @Codeberg it costs the planet extra resources too unfortunately :/
@Codeberg wow - that looks scary. Thanks for all your work ❤️
@Codeberg I really wish you contacted me at all about this before going public.
@cadey I'm sorry if this gave you any unwanted or negative attention. I consider it an inevitable development that crawlers will emulate more real-browser features to bypass websites' protections, and today at least one big crawler seems to have started doing so. ~f
@Codeberg Can we continue this conversation over email after my panic subsides? me@xeiaso.net.
@cadey @Codeberg in case it is because of this, don't be too hard on yourself! It sounds like the bots went out of their way to spend insane cloud compute budgets and still didn't really achieve anything. It hurt them more than they gained
@Codeberg It'd be a good time to encourage folks to sign up to https://github.com/sponsors/Xe.
@thesamesam Unfortunately, I'm not sure if encouraging anyone to reinforce the vendor lock-in of Microsoft GitHub by making maintainers financially dependent on that platform is in the spirit of our mission. ~f

@Codeberg @thesamesam

https://liberapay.com/Xe/ you can support Xe through Liberapay, too. Is that something you could more comfortably point people to in a visible post, in order to support that work?

@zkat
I think that Liberapay is currently one of the best options, thank you for pointing this out. Looks like some people found it already.

I think that Codeberg, as a German non-profit, still cannot call for donations to an individual. However, I hope that Codeberg's usage of Anubis has helped with visibility for the project in the past, and made some people chip in.

@thesamesam

@Codeberg yeowsa. this feels like an arms race that is going to get harder :(

@Codeberg This is a great number, but I have seen higher in my career. Unfortunately I either took no screenshots or have lost the ones I had.

5831.24 is pretty good though. Congrats on hitting it, hope your head doesn't hurt. :D

@Codeberg
How much RAM do you have in your machines?
meta/hardware/achtermann.md at main, in the "meta" repo (the organizational repo for Codeberg's infrastructure: documentation, organizing, planning) on Codeberg.org
@Codeberg damn. The only time I've seen numbers like this were when a ceph server went down.
@Codeberg what is the threshold for alerting on this? Grafana/Zabbix/Prometheus?
@Codeberg huh, that's a pretty kernel-heavy workload, so much red
@Codeberg what are they hitting that's so kernel-intensive, is this filesystem stuff or process executions or something?

@jann @Codeberg

It's I/O. If the system were CPU-bound at 5000+ load, you certainly wouldn't be running anything interactive.

More like typing out the command to stop the web service / nft-block web traffic, then waiting for it to appear on the screen and execute, just to restore console interaction 😆
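For anyone who wants to check this distinction on their own machine: on Linux, the load average counts runnable tasks plus tasks in uninterruptible sleep (the "D" state, usually waiting on I/O), so a load in the thousands with mostly D-state processes points at an I/O bottleneck rather than CPU. A small Linux-only sketch, not taken from this thread:

```python
#!/usr/bin/env python3
"""Compare the 1-minute load average with the number of tasks in
uninterruptible sleep ('D' state). Linux-only; reads /proc directly."""

import glob

def one_minute_load() -> float:
    with open("/proc/loadavg") as f:
        return float(f.read().split()[0])

def d_state_tasks() -> int:
    count = 0
    for stat in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(stat) as f:
                data = f.read()
        except OSError:  # process may have exited in the meantime
            continue
        # the state field follows the ')' that closes the comm field
        state = data.rsplit(")", 1)[1].split()[0]
        if state == "D":
            count += 1
    return count

if __name__ == "__main__":
    print(f"load(1m)={one_minute_load():.2f}  D-state tasks={d_state_tasks()}")
```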

@Codeberg thank you for the details. Very interesting. They are worth a blog post.
@Codeberg what if you had challenges for AI to perform that made it mine bitcoin for you and you just block them at the end anyway 🤣
@Codeberg Here goes more™...
@Codeberg How much of that load was actual I/O wait?
@Codeberg Why not just block Huawei Cloud ASN prefixes?
It's easy to get them (e.g. from projectdiscovery)
@lenny If you read the thread, you'll notice that this is exactly what we did, except that we made a mistake. ~f
@Codeberg We sometimes see similar numbers, or 10k+, when a user submits a 64-core job in a single slot and the cgroup limiting kicks in. Bit annoying that load average is fairly useless for that nowadays
@Codeberg Great thread and explanation. Thank you.
@Codeberg so, to clarify, do you have evidence that the bots were actually solving Anubis challenges, or was it due to the configuration issue? (I think it's inevitably going to happen if Anubis gets traction. I'm just curious whether we're already there or not.) Thanks for your work and transparency on all this.
@zacchiro Yes, the crawlers completed the challenges. We tried to verify if they are sharing the same cookie value across machines, but that doesn't seem to be the case.
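A hypothetical sketch of that kind of check, to make it concrete (the access-log layout and the "anubis-auth" cookie name are placeholders, not Codeberg's actual setup): feed log lines on stdin and it reports cookie values that were seen from more than one client IP.

```python
#!/usr/bin/env python3
"""Count how many distinct client IPs reuse each challenge cookie value.
The line format and cookie name below are assumptions for illustration."""

import re
import sys
from collections import defaultdict

# Assumed line shape: client IP first, the cookie somewhere later in the line.
LINE_RE = re.compile(r"^(\S+).*anubis-auth=([A-Za-z0-9._-]+)")

def main() -> None:
    ips_per_cookie: dict[str, set[str]] = defaultdict(set)
    for line in sys.stdin:
        m = LINE_RE.search(line)
        if m:
            ip, cookie = m.groups()
            ips_per_cookie[cookie].add(ip)

    shared = {c: ips for c, ips in ips_per_cookie.items() if len(ips) > 1}
    print(f"{len(shared)} of {len(ips_per_cookie)} cookie values seen from more than one IP")
    for cookie, ips in sorted(shared.items(), key=lambda kv: -len(kv[1]))[:10]:
        print(f"{cookie[:16]}...  {len(ips)} distinct IPs")

if __name__ == "__main__":
    main()
```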

I have a follow up question, though, @Codeberg, re: @zacchiro's question. Is it *possible* that giant human farms of Anubis challenge-solvers actually did it? Or did it all happen so fast that there is no way it could be that?

#Huawei surely could fund such a farm and the routing software needed to get the challenge to the human and back to the bot quickly enough that it might *seem* the bot did it.

@bkuhn
Anubis challenges are not solved by humans. It's not like a captcha. It's a challenge that the browser computes, based on the assumption that crawlers don't run real browsers for performance reasons and only implement simpler clients.

So at least one crawler now seems to emulate enough browser behaviour to pass the Anubis challenge. ~f
@zacchiro

@Codeberg I get it now.

Thanks for taking the time to clue me in.

I'm lucky that I haven't needed to learn about this until now and I'm so sorry you've had to do all this work to fight this LLM training DDoS!

Cc: @zacchiro

@Codeberg
I like the idea of them figuring out solving the Anubis challenge only to be blocked afterward
@efraim @Codeberg ...and spending a good amount of their corporate compute budgets just to walk away empty-handed. I hope they learn, or go bankrupt, or both
@Codeberg are the ip blocklists public?
@nemo Currently not. We wanted to investigate the legal situation with regard to sharing such lists. They currently could contain individuals' IP addresses and likely need to be cleaned up first. ~f
@Codeberg no worries, ty for fighting the good fight o7
@Codeberg Was the solution to increase the proof-of-work difficulty?