Prithviraj (Raj) Ammanabrolu

273 Followers
69 Following
75 Posts
On the faculty market! Interactive and Grounded AI. NLP+RL at AllenAI. PhD from Georgia Tech. he/him
http://prithvirajva.com

I've been spending a lot of time thinking about #ChatGPT and academic assessment and why I am not so concerned about it (at the college level). It's not that ChatGPT is bad for education. It's that our education system is built on a foundation of doing assessment wrong.

What @hoffman writes in his blog below comes close to how I think of things (thanks for saving me a lot of time writing my thoughts down!)

http://write.guyhoffman.com/why-i-dont-care-if-students-use-gpt

Why I Don't Care if Students Use GPT

They can go ahead, use it to cheat on their essay. It won't do them much good. An Experiment: Here's a recent experience I've had with ...

Yes! Language as an interface!! Conversational information search works best when LMs are grounded in an underlying info source. See our recent TACL paper led by @[email protected] for more on this idea.
https://arxiv.org/abs/2207.00746
INSCIT: Information-Seeking Conversations with Mixed-Initiative Interactions

In an information-seeking conversation, a user may ask questions that are under-specified or unanswerable. An ideal agent would interact by initiating different response types according to the available knowledge sources. However, most current studies either fail to or artificially incorporate such agent-side initiative. This work presents InSCIt, a dataset for Information-Seeking Conversations with mixed-initiative Interactions. It contains 4.7K user-agent turns from 805 human-human conversations where the agent searches over Wikipedia and either directly answers, asks for clarification, or provides relevant information to address user queries. The data supports two subtasks, evidence passage identification and response generation, as well as a human evaluation protocol to assess model performance. We report results of two systems based on state-of-the-art models of conversational knowledge identification and open-domain question answering. Both systems significantly underperform humans, suggesting ample room for improvement in future studies.

arXiv.org
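The mixed-initiative idea above is easy to picture in code. Here's a minimal, hypothetical sketch (toy matching rules, not the paper's models, which are learned over Wikipedia passages) of an agent choosing among the response types InSCIt annotates, based on what its knowledge source supports:

```python
# Toy illustration of mixed-initiative responses as in InSCIt.
# The matching rules and response-type names below are invented
# for illustration only.

def respond(query: str, passages: list[str]) -> tuple[str, str]:
    """Pick a response type based on the available evidence."""
    matches = [p for p in passages if query.lower() in p.lower()]
    if len(matches) == 1:
        return ("direct_answer", matches[0])   # exactly one supporting passage
    if len(matches) > 1:                       # ambiguous: take the initiative
        return ("clarify", f"Found {len(matches)} candidates; which do you mean?")
    if passages:                               # nothing matches: offer related info
        return ("relevant_info", passages[0])
    return ("no_answer", "I couldn't find anything on that.")

print(respond("python", ["Python is a language.", "A python is a snake."]))
```

With two matching passages, the agent asks a clarifying question instead of guessing, which is exactly the agent-side initiative the dataset is built to measure.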

RT @[email protected]

I'm pretty optimistic that the LLM reliability / factualness issue can be fixed. The key is to use LLMs as a dialog interface and not as a store of knowledge. LLMs as the query layer between a human user and a knowledge graph with sources (which can be hybrid generated/curated).

🐦🔗: https://twitter.com/fchollet/status/1617255566812008449

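That "query layer" framing can be sketched concretely. Below is a minimal, hypothetical Python sketch (the graph, relation names, and `answer()` are all invented) where answers can only come from a curated knowledge graph, so every reply either carries a source or is an explicit refusal. In the full design, an LLM would map the user's utterance to the (subject, relation) query and verbalize the sourced result:

```python
# Sketch of the "LLM as query layer" pattern: the model never answers
# from its own weights, only from a curated, sourced knowledge graph.
# All names and data here are illustrative assumptions.

# Curated knowledge graph: (subject, relation) -> (object, source URL).
KG = {
    ("Paris", "capital_of"): ("France", "https://en.wikipedia.org/wiki/Paris"),
    ("France", "currency"): ("Euro", "https://en.wikipedia.org/wiki/France"),
}

def answer(subject: str, relation: str) -> str:
    """Answer strictly from the graph; refuse when the fact is absent."""
    hit = KG.get((subject, relation))
    if hit is None:
        return "I don't know."
    obj, source = hit
    return f"{subject} {relation.replace('_', ' ')} {obj} (source: {source})"

print(answer("Paris", "capital_of"))   # grounded answer with a citation
print(answer("Paris", "population"))   # no fact in the graph: refuse
```

Because the knowledge lives outside the model, the "I don't know" path is a lookup miss rather than a hallucination.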

RT @[email protected]

the success of chatgpt has led to investors thinking RLHF is magic (to some extent it is), but boy, they are going to be disappointed when their portfolios realize its limitations

🐦🔗: https://twitter.com/deliprao/status/1616986341363044352


RT @[email protected]

Transformers are robust reasoners, but frustratingly lack the ability to do accurate math, navigation, & other easily coded tasks. In our new work "Behavior Cloned Transformers are Neurosymbolic Reasoners", we show you can have the best of both worlds. 1/3

http://cognitiveai.org/wp-content/uploads/2022/10/wang2022-behavior-cloned-transformers-are-neurosymbolic-reasoners-arxiv.pdf

🐦🔗: https://twitter.com/peterjansen_ai/status/1580686608566583296

More good news, our paper on teaching interactive language agents to use existing symbolic tools and APIs has been accepted to #EACL2023. See y'all in Dubrovnik, Croatia!!!

https://arxiv.org/abs/2210.07382

Behavior Cloned Transformers are Neurosymbolic Reasoners

In this work, we explore techniques for augmenting interactive agents with information from symbolic modules, much like humans use tools like calculators and GPS systems to assist with arithmetic and navigation. We test our agent's abilities in text games -- challenging benchmarks for evaluating the multi-step reasoning abilities of game agents in grounded, language-based environments. Our experimental study indicates that injecting the actions from these symbolic modules into the action space of a behavior cloned transformer agent increases performance on four text game benchmarks that test arithmetic, navigation, sorting, and common sense reasoning by an average of 22%, allowing an agent to reach the highest possible performance on unseen games. This action injection technique is easily extended to new agents, environments, and symbolic modules.

arXiv.org
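The action-injection trick from the abstract is simple to sketch. Here's a hedged toy version (not the paper's code; the calculator module, action strings, and environment are invented for illustration): a symbolic module reads the observation and contributes exact-answer actions to the agent's action space.

```python
# Toy version of action injection: a symbolic module scans the text
# observation and injects exact-answer actions alongside the text game's
# native actions, so the policy can select a tool result instead of
# guessing arithmetic. Everything here is an illustrative assumption.

import re

def calculator_module(observation: str) -> list[str]:
    """Find arithmetic in the observation and emit exact-result actions."""
    actions = []
    for a, op, b in re.findall(r"(\d+)\s*([+*])\s*(\d+)", observation):
        result = int(a) + int(b) if op == "+" else int(a) * int(b)
        actions.append(f"say {result}")
    return actions

def action_space(env_actions: list[str], observation: str) -> list[str]:
    """The behavior-cloned policy then chooses over native + injected actions."""
    return env_actions + calculator_module(observation)

print(action_space(["look", "go north"], "The guard asks: what is 17 + 25?"))
```

For the example observation, the injected action is `say 42`, which is why the technique extends so easily to new modules: each one just adds candidate actions, and the agent learns when to pick them.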

Now accepted to #ICLR2023! Looking forward to our talk on open-source, efficient natural language RLHF algorithms in Kigali, Rwanda!!!

RT @[email protected]

The secret to aligning LMs to human preferences is reinforcement learning. But why & how is it used? Announcing:

💻RL4LMs: library to train any @[email protected] LM w/ RL
https://github.com/allenai/RL4LMs
👾GRUE: benchmark of 6 NLP tasks+rewards
📈NLPO: new RL alg 4 LMs

🌐https://rl4lms.apps.allenai.org

🐦🔗: https://twitter.com/rajammanabrolu/status/1577690380161585152

GitHub - allenai/RL4LMs: A modular RL library to fine-tune language models to human preferences


GitHub
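For anyone wondering what the RL part of RLHF boils down to, here is a self-contained toy sketch of the core loop: sample from the policy, score the sample with a reward model, and push the policy toward higher reward via REINFORCE. This is a conceptual illustration, not RL4LMs' actual API; the "policy" is a single softmax over two tokens and the "reward model" is hard-coded, standing in for a transformer LM and a learned preference model.

```python
# Toy REINFORCE loop illustrating the core of RLHF. Both the one-step
# policy and the hard-coded reward are illustrative stand-ins.

import math
import random

random.seed(0)
VOCAB = ["good", "bad"]
logits = {w: 0.0 for w in VOCAB}  # one-step "policy" parameters

def probs() -> dict:
    z = sum(math.exp(v) for v in logits.values())
    return {w: math.exp(v) / z for w, v in logits.items()}

def sample() -> str:
    r, acc = random.random(), 0.0
    for w, p in probs().items():
        acc += p
        if r < acc:
            return w
    return VOCAB[-1]

def reward(text: str) -> float:
    """Stand-in for a reward model trained on human preferences."""
    return 1.0 if text == "good" else -1.0

LR = 0.5
for _ in range(200):
    a = sample()
    r = reward(a)
    p = probs()
    for w in VOCAB:
        # REINFORCE: d log pi(a) / d logit_w = 1[w == a] - p_w
        grad = (1.0 if w == a else 0.0) - p[w]
        logits[w] += LR * r * grad

print(max(logits, key=logits.get))  # the trained policy now prefers "good"
```

Real systems like RL4LMs replace this with PPO-style updates over full token sequences, but the feedback loop (sample, score, update toward preference) is the same shape.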

Not sure why I felt compelled to do this
I've got a paper with 10+ citations on @[email protected] but 0 on Google Scholar for a while now. (And another paper where it's the other way around.) Any ideas what could be wrong with the indexing? Titles that are too long? Names?
Looking at how papers cite me (improvements to my methods, criticisms, etc.) helps me a lot in keeping up with progress, so this is kinda important.