ScalingNote: Scaling up Retrievers with Large Language Models for Real-World Dense Retrieval

Huang, Suyuan; Zhang, Chao; Wu, Yuanyuan; Zhang, Haoxin; Wang, Yuan; Wang, Maolin; Cao, Shaosheng; Xu, Tong; Zhao, Xiangyu; Qin, Zengchang; Gao, Yan; Bai, Yunhan; Fan, Jun; Hu, Yao; Chen, Enhong

Computer Science > Information Retrieval

arXiv:2411.15766 (cs)

[Submitted on 24 Nov 2024]


Abstract: Dense retrieval in most industries employs dual-tower architectures to retrieve query-relevant documents. Due to online deployment requirements, existing real-world dense retrieval systems mainly enhance performance by designing negative sampling strategies, overlooking the advantages of scaling up. Recently, Large Language Models (LLMs) have exhibited superior performance that can be leveraged for scaling up dense retrieval. However, scaling up retrieval models significantly increases online query latency. To address this challenge, we propose ScalingNote, a two-stage method to exploit the scaling potential of LLMs for retrieval while maintaining online query latency. The first stage is training dual towers, both initialized from the same LLM, to unlock the potential of LLMs for dense retrieval. Then, we distill only the query tower using mean squared error loss and cosine similarity to reduce online costs. Through theoretical analysis and comprehensive offline and online experiments, we show the effectiveness and efficiency of ScalingNote. Our two-stage scaling method outperforms end-to-end models and verifies the scaling law of dense retrieval with LLMs in industrial scenarios, enabling cost-effective scaling of dense retrieval systems. Our online method incorporating ScalingNote significantly enhances the relevance between retrieved documents and queries.
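The second stage described in the abstract, distilling the query tower with mean squared error and cosine similarity, can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything here is an assumption made for illustration: the function names, the choice to add MSE to a (1 - cosine) penalty, and the `cos_weight` parameter are hypothetical and not taken from the paper.

```python
import math


def mse(student, teacher):
    """Mean squared error between two equal-length embedding vectors."""
    if len(student) != len(teacher):
        raise ValueError("embeddings must have the same dimension")
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def query_distillation_loss(student, teacher, cos_weight=1.0):
    """Hypothetical combined loss: MSE plus a weighted (1 - cosine) penalty.

    `student` is the small query tower's embedding and `teacher` is the
    LLM query tower's embedding for the same query. The additive form and
    the weighting are assumptions, not the paper's stated objective.
    """
    alignment_penalty = 1.0 - cosine_similarity(student, teacher)
    return mse(student, teacher) + cos_weight * alignment_penalty


# Identical embeddings incur no loss; orthogonal ones are penalised by both terms.
perfect = query_distillation_loss([1.0, 0.0], [1.0, 0.0])
orthogonal = query_distillation_loss([0.0, 1.0], [1.0, 0.0])
```

In this sketch only the query-side embedding is matched to the teacher, which mirrors the abstract's point that document embeddings can be computed offline by the large tower while the distilled query tower keeps online latency low.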

Subjects:	Information Retrieval (cs.IR)
Cite as:	arXiv:2411.15766 [cs.IR]
	(or arXiv:2411.15766v1 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2411.15766

Submission history

From: Suyuan Huang
[v1] Sun, 24 Nov 2024 09:27:43 UTC (428 KB)
