https://aphyr.com/posts/388-the-future-of-comments-is-lies-i-guess
Tl;dr: Spam+LLM is a combo that’s hard to deal with.
@timbray This xkcd solution (from 2010, I think) doesn't seem so far-fetched anymore:
https://xkcd.com/810/
Wouldn't that work by now?
(Spoilers for those who don't click on unknown links - it's a comic suggesting a spam filter that uses AI to rate comments as constructive or not constructive, forcing spammers to train their AIs to write constructive comments, while also blocking human comments that aren't constructive.)
On second thought, it would probably have the same failure modes as RLHF, ending up unable to distinguish something that is constructive from something that merely looks constructive.
The more I think about it, the less it looks like a good idea. Though still not worse than our current situation?