Now accepted to #ICLR2023! Looking forward to our talk on open-source, efficient natural language RLHF algorithms in Kigali, Rwanda!!!

RT @[email protected]

The secret to aligning LMs to human preferences is reinforcement learning. But Why&How is it used? Announcing

💻RL4LMs: library to train any @[email protected] LM w/ RL
https://github.com/allenai/RL4LMs
👾GRUE: benchmark of 6 NLP tasks+rewards
📈NLPO: new RL alg 4 LMs

🌐https://rl4lms.apps.allenai.org

🐦🔗: https://twitter.com/rajammanabrolu/status/1577690380161585152

GitHub - allenai/RL4LMs: A modular RL library to fine-tune language models to human preferences