#Business #Reports
Anthropic details how Claude crawls sites · How to block the three separate user agents https://ilo.im/16ax7y
_____
#AI #Claude #Crawlers #UserAgents #RobotsTxt #Content #Website #WebDev #Frontend #Backend
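For reference, a minimal robots.txt sketch that blocks all three agents Anthropic documents (ClaudeBot for training, Claude-User for user-initiated fetches, Claude-SearchBot for search); a site-wide Disallow is just one possible policy:

```
# Block all three of Anthropic's documented Claude crawlers:
# ClaudeBot (training), Claude-User (user-initiated fetches),
# Claude-SearchBot (search indexing).
User-agent: ClaudeBot
Disallow: /

User-agent: Claude-User
Disallow: /

User-agent: Claude-SearchBot
Disallow: /
```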
It would be nice if somebody made a user agent for the web.
You know, software that actually works on behalf, and in the interests, of the user, rather than the maker.
Of the bigger browsers, I think Safari is probably the closest to that right now, for all its flaws. It may be that Waterfox or some other Firefox fork with the AI garbage ripped out of it is better; I haven't delved into that.
Chrome has been adware for years now. Edge was actually pretty good while it was a fairly vanilla Chromium fork, but it seems MS is intent on stuffing Copilot into it too.
https://www.w3.org/WAI/UA/work/wiki/Definition_of_User_Agent
@koteisaev @craignewmark Not necessarily.
The problems re: @delta / #DeltaChat and/or @thunderbird may be caused by #eMail providers actively blocking #PGP/MIME or inline-#PGP, enforcing extremely tight quotas, or filtering #UserAgents / #Clients.
#Development #Findings
AI bots and robots.txt · How websites use robots.txt to set AI crawling rules https://ilo.im/166la4
_____
#AI #Bots #Content #Website #UserAgents #RobotsTxt #Business #SEO #WebDev #Backend
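As an illustration of the kind of rules such posts describe, here is a hypothetical robots.txt that blocks a few AI training crawlers while leaving the rest of the site open; the agent tokens (GPTBot, Google-Extended, CCBot) come from the vendors' own documentation, but the policy itself is only an example:

```
# Disallow documented AI training crawlers...
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

# ...while leaving the site open to everyone else.
# (An empty Disallow means "nothing is disallowed".)
User-agent: *
Disallow:
```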
There’s been a lot of discussion lately around AI crawlers and bots, which are used to train LLMs and/or fetch content on behalf of their users. In the past few weeks I’ve seen blog posts about the volume of traffic from these crawlers, techniques and products to control how and what they can crawl, reports of misbehaving crawlers, and more. Ironically, there are even AI-based services to mitigate AI crawler bots! Given how much interest there is, I thought I’d explore some HTTP Archive data to see how sites are using robots.txt to state their preferences on AI crawling.
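As a rough sketch of the kind of check such an analysis involves (not the author's actual HTTP Archive queries, which would run over the crawl's stored robots.txt responses rather than live fetches), Python's standard urllib.robotparser can test which agents a given robots.txt blocks; the agent list below is an illustrative assumption:

```python
"""Check which AI crawler user agents a site's robots.txt disallows."""
from urllib import robotparser

# Publicly documented AI crawler tokens (a non-exhaustive sample).
AI_AGENTS = ["ClaudeBot", "Claude-User", "Claude-SearchBot",
             "GPTBot", "Google-Extended", "CCBot"]

def check_site(robots_url: str) -> dict[str, bool]:
    """Return {agent: allowed to fetch the site root} for each AI agent."""
    rp = robotparser.RobotFileParser()
    rp.set_url(robots_url)
    rp.read()  # fetches and parses the robots.txt file
    site_root = robots_url.rsplit("/robots.txt", 1)[0] + "/"
    return {agent: rp.can_fetch(agent, site_root) for agent in AI_AGENTS}

if __name__ == "__main__":
    # example.com is a placeholder; point this at a real site to test.
    for agent, allowed in check_site("https://example.com/robots.txt").items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```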