We further introduce a new and realistic setup, cross-task cross-domain evaluation, where queries with diverse intents and documents are all pooled together, so a query alone doesn't fully capture its intent. TART largely outperforms competitive models in this setup as well.
7/N
