Now accepted to #ICLR2023! Look forward to our talk on open-source, efficient RLHF algorithms for natural language in Kigali, Rwanda!!!
The secret to aligning LMs to human preferences is reinforcement learning. But why & how is it used? Announcing:
💻RL4LMs: library to train any @[email protected] LM w/ RL
https://github.com/allenai/RL4LMs
👾GRUE: benchmark of 6 NLP tasks+rewards
📈NLPO: new RL alg 4 LMs
🌐https://rl4lms.apps.allenai.org
🐦🔗: https://twitter.com/rajammanabrolu/status/1577690380161585152