The secret to making #AIChatbots sound #smart and #spew less #toxic nonsense is to use a technique called reinforcement learning from #HumanFeedback, which uses input from people to improve the model’s answers. It relies on a small army of #human #data #annotators who evaluate whether a string of text makes sense and sounds fluent and natural. They decide whether a response should be kept in the AI model’s database or removed.
https://www.technologyreview.com/2023/06/13/1074560/we-are-all-ais-free-data-workers