@algernon Are there any defenses against LLM scrapers that you would consider deploying alongside iocaine, or do you think it functions best alone? Like are there complementary approaches?

#iocaine

a very strange trackball 4d ago

@skyfaller Depends on your goal. If you want to defend against most scrapers, iocaine can be enough, in my opinion.

It's not perfect, and some scrapers do occasionally get through. Behavioural analysis can catch a bunch that iocaine can't; that is something iocaine can't do yet, and likely won't for a while [1].

Or, to put it differently: I'd never say iocaine functions best alone. There's always something where other tools can improve one's defenses. Whether that's practical, or worth it, has to be decided on a case-by-case basis.

My goal is to make iocaine "enough", so that I don't need to use anything else for my own stuff.

[1] I have run some experiments and I know how to teach iocaine to keep some state, but it's a lot of effort and requires considerably more resources than iocaine currently uses. So I'm holding off on implementing these until it becomes necessary. Hoping the bubble will burst sooner.
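
For illustration only, here is a minimal sketch of what "keeping some state" for behavioural analysis could look like. It is not iocaine's code or design: the `SlidingWindowTracker` class, its parameters, and the thresholds are all hypothetical. It simply counts requests per client in a sliding time window, which also shows where the extra resource cost comes from, since every client seen needs its own record in memory.

```python
from collections import defaultdict, deque


class SlidingWindowTracker:
    """Hypothetical per-client request tracker (an assumption, not part of iocaine)."""

    def __init__(self, window_seconds=60.0, max_requests=120):
        self.window_seconds = window_seconds
        self.max_requests = max_requests
        # One deque of timestamps per client key. This is the state
        # that grows with the number of distinct clients seen.
        self._hits = defaultdict(deque)

    def record(self, client_key, now):
        """Record a request at time `now` (in seconds).

        Returns True if the client exceeded max_requests within the window.
        """
        hits = self._hits[client_key]
        hits.append(now)
        # Discard timestamps that have fallen out of the window.
        cutoff = now - self.window_seconds
        while hits and hits[0] <= cutoff:
            hits.popleft()
        return len(hits) > self.max_requests

    def tracked_clients(self):
        """Number of clients currently held in memory."""
        return len(self._hits)
```

A caller would pick its own client key (for example an address or a user-agent string) and pass in a clock reading such as `time.monotonic()`. A real deployment would also need to evict idle clients, which this sketch leaves out.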

Amanda G in Éirinn 3d ago

@skyfaller @algernon If I’m not mistaken, Anubis works at the proxy layer, but I have not implemented it myself, so take that suggestion with a grain of salt. https://dukespace.lib.duke.edu/items/a99a4736-6542-4ef1-8492-41c80e58e1be

Anubis Pilot Project Report - June 2025

In May & June 2025, Duke University Libraries (DUL) staff successfully implemented Anubis, a configurable open source web application firewall (WAF), in order to stave off persistent onslaughts of AI-related bot scraping activity. During this pilot period (May 1 - June 10, 2025), aggressive bot scraping led to extended outages for three critical library platforms (Duke Digital Repository, Archives & Manuscripts, and the Books & Media Catalog), and in each case, implementing Anubis mitigated the problem.

Nelson 3d ago

@amanda @algernon Sadly the Anubis author is a self-described "centrist" on LLMs, uses them to write code and as a conversation partner, so I don't really trust Anubis long-term. I want to post about it but they are sick and I'm sick so now is not the time to pick a fight with a mutual.