
NLP Ph.D. student at University of Washington Computer Science and Engineering.

IBM PhD fellow (2022-).

Currently:
UW NLP & Part-time at Meta AI.

Formerly:
EECS undergrad at UTokyo
2x Engineering internships at Google
Research internships at Microsoft Research Asia, Megagon Labs, and Salesforce Research

In my free time, I πŸƒπŸ»β€β™€οΈπŸ‘©πŸ»β€πŸ³πŸ₯§πŸ§—πŸ»β€β™€οΈπŸ“–πŸ‘©πŸ»β€πŸ’»

Website: https://akariasai.github.io/
Twitter: @AkariAsai
Location: Seattle, WA
Pronouns: she/her

New paper 🚨

Can we rely solely on LLMs' memories (e.g., replace search with ChatGPT)? Probably not.
Is retrieval a silver bullet? Probably not either.

Our analysis reveals that LLMs' memorization is still limited, and scaling won't help much on long-tail knowledge.
We show that adaptively incorporating non-parametric memories (e.g., retrieved chunks) improves both performance and efficiency.

πŸ“œ http://tinyurl.com/2sdeuupn πŸ’» http://github.com/AlexTMallen/adaptive-retrieval

#PaperThread #newpaper
[1/N]
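The adaptive idea above can be pictured as a simple routing rule: answer from the LLM's parametric memory for popular (head) entities, and retrieve passages only for long-tail ones. A minimal sketch, where the popularity threshold and the `llm`/`retriever` callables are illustrative assumptions, not the paper's actual code:

```python
def should_retrieve(entity_popularity: int, threshold: int = 1000) -> bool:
    """Retrieve only for long-tail entities (hypothetical popularity threshold)."""
    return entity_popularity < threshold

def answer(question: str, entity_popularity: int, llm, retriever) -> str:
    """Route between parametric and non-parametric memory."""
    if should_retrieve(entity_popularity):
        passages = retriever(question)               # non-parametric memory
        prompt = "\n".join(passages) + "\n" + question
    else:
        prompt = question                            # rely on parametric memory
    return llm(prompt)
```

Since retrieval is skipped for head entities, this kind of routing saves retrieval and long-context inference cost on the queries the LLM already answers well.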

We further introduce a new and realistic setup, cross-task cross-domain evaluation, where documents for queries with diverse intents are all pooled and the queries alone don't fully capture the intents. TART largely outperforms competitive models on this setup as well.
7/N
A user query can have diverse intents (e.g., retrieve relevant documents, find a code implementation, or find similar questions previously asked in a forum).
We often build separate retrieval systems for different intents, training each retriever to model its intent implicitly. 2/N

We advocate for a new task formulation, retrieval with instructions, where a retriever takes a query AND an instruction that EXPLICITLY describes the information need.

The goal here is to build a single retriever that can find relevant documents satisfying the instruction. 3/N
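One way to picture this formulation: a single bi-encoder scores documents against the instruction-augmented query, so one retriever can serve many intents. A minimal sketch, assuming a generic `embed_fn` placeholder rather than TART's actual interface:

```python
def score_documents(instruction, query, documents, embed_fn):
    """Rank documents by similarity to the instruction-augmented query.

    embed_fn is a hypothetical text encoder returning a vector; in
    practice this would be a trained instruction-following bi-encoder.
    """
    q_vec = embed_fn(instruction + " [SEP] " + query)
    scored = []
    for doc in documents:
        d_vec = embed_fn(doc)
        sim = sum(a * b for a, b in zip(q_vec, d_vec))  # dot-product similarity
        scored.append((sim, doc))
    return [doc for _, doc in sorted(scored, reverse=True)]
```

Changing only the instruction string re-targets the same retriever to a different intent, which is the point of the formulation.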

New paper 🚨 https://arxiv.org/abs/2211.09260

Can we train a single search system that satisfies our diverse information needs?

We present 𝕋𝔸ℝ𝕋 πŸ₯§ the first multi-task instruction-following retriever, trained on 𝔹𝔼ℝℝ𝕀 🫐, a collection of 40 retrieval tasks with instructions! 1/N

#PaperThread #newpaper

Task-aware Retrieval with Instructions

We study the problem of retrieval with instructions, where users of a retrieval system explicitly describe their intent along with their queries, making the system task-aware. We aim to develop a general-purpose task-aware retrieval system using multi-task instruction tuning that can follow human-written instructions to find the best documents for a given query. To this end, we introduce the first large-scale collection of approximately 40 retrieval datasets with instructions, and present TART, a multi-task retrieval system trained on these diverse retrieval tasks with instructions. TART shows strong capabilities to adapt to a new task via instructions and advances the state of the art on two zero-shot retrieval benchmarks, BEIR and LOTTE, outperforming models up to three times larger. We further introduce a new evaluation setup to better reflect real-world scenarios, pooling diverse documents and tasks. In this setup, TART significantly outperforms competitive baselines, further demonstrating the effectiveness of guiding retrieval with instructions.
