so, let me get this straight: if i embed these names in all my web pages, ChatGPT won’t be able to plagiarize the content?

David Mayer
Jonathan Zittrain
Jonathan Turley
Brian Hood
Guido Scorza

https://arstechnica.com/information-technology/2024/12/certain-names-make-chatgpt-grind-to-a-halt-and-we-know-why/

i mean, this would be an amazing thing to put into practice, since these companies don’t respect robots.txt anyways.

this wouldn’t be poisoning the data. it would be more like embedding guardian angels in your web pages.

@blogdiva @xgranade this appears to be a filter on prompts, which I do not think implies anything about training data? the article supposes that it does but it only describes experiments with prompts.
@glyph @blogdiva It's at least harder for your work to be plagiarized if it includes something that's likely to trip a filter at the end. In that sense, more poisoning training data than being excluded from it.

@xgranade @glyph @blogdiva after having read the article, it’s still unclear to me whether they employ filters only on user prompts, or on secondary input data (e.g. websites) too

The names do not affect outputs using OpenAI’s API systems or in the OpenAI Playground (a special site for developer testing).

nonetheless, this could be an interesting hack in the future