"Despite making significant progress, our #InstructGPT models are far from fully aligned or fully safe; they still generate toxic or biased outputs, make up facts, and generate sexual and violent content without explicit prompting."
https://openai.com/research/instruction-following
I know the feeling: Asy the asymmetric cat hardly ever bites people, and I'm very proud of him.
#ai #catsofmastodon
Aligning language models to follow instructions
We’ve trained language models that are much better at following user intentions than GPT-3 while also making them more truthful and less toxic, using techniques developed through our alignment research. These InstructGPT models, which are trained with humans in the loop, are now deployed as the default language models on our API.