Mastodawn

Pun Boleh May 30, 2025

@timbray This xkcd solution (from 2010, I think) doesn't seem so far-fetched anymore:
https://xkcd.com/810/
Wouldn't that work by now?

(Spoilers for those who don't click on unknown links: it's a comic suggesting a spam filter that uses AI to rate comments as constructive or not constructive, forcing spammers to train their AIs to write constructive comments, while also blocking human comments that aren't constructive.)

On second thought, it would probably have the same failure modes as RLHF, ending up unable to distinguish something that is constructive from something that merely looks constructive.

The more I think about it, the less it looks like a good idea. Though still not worse than our current situation?

Constructive

xkcd

The Future of Comments is Lies, I Guess
