#Development #Explorations
The overlap between search bots and AI scrapers · Why robots.txt alone won’t keep AI off your website https://ilo.im/15z9al
_____
#Business #SEO #AI #SearchEngine #SearchBot #AiBot #Website #WebDev #UserAgent #RobotsTxt
@asjo I've been seeing the same pattern for months: #OpenAI's crawlers are slurping up anything they can lay their clammy hands on, no matter what /robots.txt says.
So now I regularly grab the IP addresses from the JSON blobs mentioned on https://platform.openai.com/docs/bots/ and add them to my #iptables.
/cc #ChatGPT, #GPTBot, #OAI, #SearchBot
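For anyone wanting to do the same, here's a rough sketch of that JSON-to-iptables step in Python. Big assumption: the blobs look like `{"prefixes": [{"ipv4Prefix": "..."}, {"ipv6Prefix": "..."}]}`; check the real files before relying on it. The function names and the sample ranges (reserved documentation prefixes) are made up for illustration. It only prints the rules; downloading the blobs and running the commands as root is left to you.

```python
import ipaddress
import json

# Hypothetical sample blob; the key names are an assumption about the format,
# and the ranges are reserved documentation prefixes, not real crawler IPs.
SAMPLE = """
{
  "creationTime": "2025-01-01T00:00:00",
  "prefixes": [
    {"ipv4Prefix": "192.0.2.0/24"},
    {"ipv6Prefix": "2001:db8::/32"}
  ]
}
"""


def prefixes_from_blob(blob: str) -> list[str]:
    """Extract CIDR prefixes from one JSON blob, validating each one."""
    data = json.loads(blob)
    found = []
    for entry in data.get("prefixes", []):
        for key in ("ipv4Prefix", "ipv6Prefix"):
            cidr = entry.get(key)
            if cidr:
                # ip_network raises ValueError on malformed input, so nothing
                # unvalidated ends up in a firewall command.
                found.append(str(ipaddress.ip_network(cidr, strict=False)))
    return found


def drop_rules(cidrs: list[str]) -> list[str]:
    """Build one DROP rule per prefix, using ip6tables for IPv6 ranges."""
    rules = []
    for cidr in cidrs:
        version = ipaddress.ip_network(cidr).version
        tool = "ip6tables" if version == 6 else "iptables"
        rules.append(f"{tool} -A INPUT -s {cidr} -j DROP")
    return rules


if __name__ == "__main__":
    for rule in drop_rules(prefixes_from_blob(SAMPLE)):
        print(rule)
```

Printing instead of executing means you can eyeball the rules first, and appending with `-A` on every run will pile up duplicates, so flush or dedupe before re-applying.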
@blaine @andrew The elegant thing about @anildash's proposal is that your content would only be searchable if you were following this hypothetical #searchbot at the moment you publicly posted it, effectively opting in on a per-post basis.
Likewise, any flavor of boosted content (including quotes, if available) would presumably only be indexed if both accounts involved were opted in (via this mechanism) at the relevant posting time(s).