Learning to Rank in Generative Retrieval

Li, Yongqi; Yang, Nan; Wang, Liang; Wei, Furu; Li, Wenjie

arXiv:2306.15222v1 (cs)

[Submitted on 27 Jun 2023 (this version), latest version 16 Dec 2023 (v2)]

Abstract: Generative retrieval is a promising new paradigm in text retrieval that generates identifier strings of relevant passages as the retrieval target. It leverages powerful generation models and is distinct from traditional learning-to-rank methods. However, despite its rapid development, current generative retrieval methods are still limited. They typically rely on a heuristic function to transform predicted identifiers into a passage rank list, which creates a gap between the learning objective of generative retrieval and the desired passage ranking target. Moreover, the exposure bias problem inherent to text generation persists in generative retrieval. To address these issues, we propose a novel framework, called LTRGR, that combines generative retrieval with the classical learning-to-rank paradigm. Our approach trains an autoregressive model with a passage rank loss, which directly optimizes the model toward the optimal passage ranking. This framework requires only an additional training step to enhance current generative retrieval systems and adds no burden to the inference stage. We conducted experiments on three public datasets, and the results show that LTRGR achieves state-of-the-art performance among generative retrieval methods, indicating its effectiveness and robustness.
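The abstract does not give the heuristic function or the exact form of the passage rank loss, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a hypothetical heuristic that scores a passage by summing the scores of the predicted identifiers it contains, and it uses a generic pairwise margin (hinge) loss as a stand-in for a rank loss. The names `passage_score` and `margin_rank_loss`, the identifier strings, and the margin value are all illustrative assumptions.

```python
from typing import Dict, List


def passage_score(identifier_scores: Dict[str, float],
                  passage_identifiers: List[str]) -> float:
    """Hypothetical heuristic (an assumption, not from the paper):
    sum the model scores of the predicted identifiers found in the passage."""
    return sum(identifier_scores.get(ident, 0.0) for ident in passage_identifiers)


def margin_rank_loss(pos_score: float, neg_score: float,
                     margin: float = 1.0) -> float:
    """Generic pairwise margin loss: zero once the relevant passage
    outscores the irrelevant one by at least `margin`."""
    return max(0.0, neg_score - pos_score + margin)


# Made-up scores that a model might assign to generated identifiers.
identifier_scores = {"title: a": 2.0, "ngram: b": 0.5, "ngram: c": 1.5}

positive_passage = ["title: a", "ngram: b"]  # identifiers of a relevant passage
negative_passage = ["ngram: c"]              # identifiers of an irrelevant passage

pos = passage_score(identifier_scores, positive_passage)
neg = passage_score(identifier_scores, negative_passage)
loss = margin_rank_loss(pos, neg, margin=2.0)
print(pos, neg, loss)
```

In an actual training setup the scores would be differentiable model outputs, so minimizing such a loss would push the model to rank relevant passages above irrelevant ones directly, rather than only to generate correct identifiers.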

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
Cite as:	arXiv:2306.15222 [cs.CL]
	(or arXiv:2306.15222v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2306.15222

Submission history

From: Yongqi Li
[v1] Tue, 27 Jun 2023 05:48:14 UTC (7,097 KB)
[v2] Sat, 16 Dec 2023 13:26:02 UTC (2,299 KB)
