Mastodawn

olivierlacan Aug 7, 2023

Speaking of which, hot new robots.txt entry just dropped:

User-agent: GPTBot
Disallow: /

https://platform.openai.com/docs/gptbot


Beldantazar Aug 7, 2023

@olivierlacan Even better to return a bunch of GPT generated garbage to poison the model instead of just disallow


danwwilson🌱 Aug 7, 2023

@Beldantazar @olivierlacan Now that is a great idea. Have a single directory on your site full of lorem ipsum or nonsense-language pages that you allow GPTBot to access. Surely some nice soul will make a generator for this soon. Just make sure to point the agent only at that dodgy directory. Of course, that assumes you trust them to do the right thing, and I think they've already demonstrated they shouldn't be trusted.
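For anyone wanting to try the single-directory idea, here is a minimal sketch of what such a robots.txt might look like, checked with Python's standard `urllib.robotparser`. The `/decoy/` directory name and the `gptbot_can_fetch` helper are made up for illustration, and whether a real crawler resolves `Allow`/`Disallow` the same way this parser does (first matching rule wins) is an assumption.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: GPTBot may fetch only /decoy/ (an invented
# directory name); every other path is disallowed for it. Agents that
# are not listed are unaffected.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /decoy/
Disallow: /
"""


def gptbot_can_fetch(path: str) -> bool:
    """Return True if the rules above let GPTBot fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch("GPTBot", path)


if __name__ == "__main__":
    for path in ("/decoy/page1.html", "/blog/real-post.html"):
        print(path, gptbot_can_fetch(path))
```

The `Allow` line is placed before `Disallow: /` on purpose, since Python's parser returns the first rule that matches. As noted in the thread, this only works if the crawler chooses to honor robots.txt at all.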


Beldantazar

@Danwwilson @olivierlacan I mean, you're already trusting them if you use the Disallow anyway. But rather than returning a bunch of lorem ipsum, which is easy to detect, the ideal is to return ChatGPT-generated trash: that's harder for them to detect, and AI models degrade fast when they train on their own outputs. Ideally, have the same set of pages return either the normal data or the GPT data depending on the user agent, so it's even harder to detect.
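A rough sketch of the "same pages, different body per user agent" idea, written as a plain WSGI app using only the standard library. The `PAGES` table, the placeholder strings, and the `is_gptbot` helper are all hypothetical, and the check assumes the crawler honestly sends a User-Agent header containing "GPTBot".

```python
# Substring to look for in the User-Agent header (case-insensitive).
CRAWLER_TOKEN = "gptbot"

# Hypothetical content table: path -> (real body, decoy body).
PAGES = {
    "/": ("<p>Real article text.</p>", "<p>Placeholder decoy text.</p>"),
}


def is_gptbot(user_agent: str) -> bool:
    """True if the User-Agent string mentions GPTBot."""
    return CRAWLER_TOKEN in user_agent.lower()


def app(environ, start_response):
    """WSGI app: same URL, body chosen by the request's User-Agent."""
    path = environ.get("PATH_INFO", "/")
    if path not in PAGES:
        start_response("404 Not Found",
                       [("Content-Type", "text/plain; charset=utf-8")])
        return [b"not found"]
    real_body, decoy_body = PAGES[path]
    user_agent = environ.get("HTTP_USER_AGENT", "")
    body = decoy_body if is_gptbot(user_agent) else real_body
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [body.encode("utf-8")]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server
    # Serves on localhost:8000 until interrupted.
    make_server("127.0.0.1", 8000, app).serve_forever()
```

This only keys off a self-reported header, so it does nothing against a crawler that sends a browser-like User-Agent.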