so, let me get this straight: if i embed these names in all my web pages, ChatGPT won’t be able to plagiarize the content?

David Mayer
Jonathan Zittrain
Jonathan Turley
Brian Hood
Guido Scorza

https://arstechnica.com/information-technology/2024/12/certain-names-make-chatgpt-grind-to-a-halt-and-we-know-why/

i mean, this would be an amazing thing to put into practice, since these companies don’t respect robots.txt anyways.

this wouldn’t be poisoning the data. it would be more like embedding guardian angels onto your web pages.

@blogdiva those names are only blocked in the ChatGPT web UI, not through the API, so I don't think embedding them will make the crawler skip your website
@namori @blogdiva Yeah, it's a manually crafted filter on the output; it doesn't change what the crawler fetches or what the inference uses. The LLM outputs word after word, and they literally just added rules to fail if the next generated word is in the list, so the moment it can be taken out, it will be, without anyone having to touch the dataset or the trained model.
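
To make that concrete, here's a rough sketch of what an output-side filter like that could look like. This is purely illustrative and not OpenAI's actual code: the names `BLOCKED_NAMES`, `OutputBlocked`, `filtered_stream`, and `run` are all made up for the example, and the "tokens" are just plain strings standing in for whatever the model generates.

```python
# Hypothetical sketch of an output-side blocklist; NOT OpenAI's real implementation.
# The filter only looks at generated text. It never touches crawling or training data.

BLOCKED_NAMES = [
    "David Mayer",
    "Jonathan Zittrain",
    "Jonathan Turley",
    "Brian Hood",
    "Guido Scorza",
]


class OutputBlocked(Exception):
    """Raised when the generated text would contain a blocked name."""


def filtered_stream(tokens, blocked=BLOCKED_NAMES):
    """Yield tokens one by one, aborting before a blocked name is completed."""
    emitted = ""
    for token in tokens:
        emitted += token
        # Check the accumulated output, since a name can span several tokens.
        if any(name.lower() in emitted.lower() for name in blocked):
            raise OutputBlocked("I'm unable to produce a response.")
        yield token


def run(tokens):
    """Collect the visible output and the error message, if any."""
    shown = []
    try:
        for token in filtered_stream(tokens):
            shown.append(token)
    except OutputBlocked as err:
        return "".join(shown), str(err)
    return "".join(shown), None


if __name__ == "__main__":
    # Stand-in for model output; a real model would stream these tokens.
    print(run(["The ", "lawyer ", "Jonathan ", "Turley ", "said"]))
    print(run(["Hello ", "world"]))
```

Since the check wraps the output stream, deleting the list (or the wrapper) restores normal behavior with no retraining, which is why names embedded in a page wouldn't stop that page from being fetched or used.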