LiNR: Model Based Neural Retrieval on GPUs at LinkedIn

Borisyuk, Fedor; Song, Qingquan; Zhou, Mingzhou; Parameswaran, Ganesh; Arun, Madhu; Popuri, Siva; Bingol, Tugrul; Pei, Zhuotao; Lee, Kuang-Hsuan; Zheng, Lu; Shao, Qizhan; Naqvi, Ali; Zhou, Sen; Gupta, Aman

arXiv:2407.13218 (cs)

[Submitted on 18 Jul 2024 (v1), last revised 7 Aug 2024 (this version, v3)]

Abstract: This paper introduces LiNR, LinkedIn's large-scale, GPU-based retrieval system. LiNR supports a billion-sized index on GPU models. We discuss our experiences and challenges in creating scalable, differentiable search indexes using TensorFlow and PyTorch at production scale. In LiNR, both items and model weights are integrated into the model binary. Viewing index construction as a form of model training, we describe scaling our system for large indexes, incorporating full scans and efficient filtering. A key focus is on enabling attribute-based pre-filtering for exhaustive GPU searches, addressing the common challenge of post-filtering in KNN searches, which often reduces system quality. We further provide multi-embedding retrieval algorithms and strategies for tackling cold start issues in retrieval. Our advancements in supporting larger indexes through quantization are also discussed. We believe LiNR represents one of the industry's first live-updated model-based retrieval indexes. Applied to out-of-network post recommendations on LinkedIn Feed, LiNR has contributed to a 3% relative increase in professional daily active users. We envisage LiNR as a step towards integrating retrieval and ranking into a single GPU model, simplifying complex infrastructures and enabling end-to-end optimization of the entire differentiable infrastructure through gradient descent.
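
The contrast the abstract draws between post-filtering and attribute-based pre-filtering can be illustrated with a small CPU-only sketch. It is not LiNR's implementation: the item records, the `lang` attribute, and the function names `post_filter_search` and `pre_filter_search` are hypothetical, and a plain Python scan stands in for the exhaustive GPU search. Post-filtering picks the top k by score and then discards ineligible items, so it can return fewer than k results (here, none). Pre-filtering restricts the candidate set first, so the top k are all eligible.

```python
import heapq


def dot(a, b):
    """Inner-product similarity between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def post_filter_search(query, items, k, allowed):
    """Take the top-k items by score, then drop those failing the filter."""
    top = heapq.nlargest(k, items, key=lambda it: dot(query, it["emb"]))
    return [it["id"] for it in top if it["lang"] in allowed]


def pre_filter_search(query, items, k, allowed):
    """Keep only eligible items, then take the top-k by score among them."""
    eligible = (it for it in items if it["lang"] in allowed)
    top = heapq.nlargest(k, eligible, key=lambda it: dot(query, it["emb"]))
    return [it["id"] for it in top]


# Hypothetical toy index: five items with 2-d embeddings and one attribute.
ITEMS = [
    {"id": "a", "emb": [0.9, 0.1], "lang": "en"},
    {"id": "b", "emb": [0.8, 0.2], "lang": "en"},
    {"id": "c", "emb": [0.7, 0.1], "lang": "de"},
    {"id": "d", "emb": [0.6, 0.3], "lang": "de"},
    {"id": "e", "emb": [0.1, 0.9], "lang": "en"},
]
QUERY = [1.0, 0.0]

post_hits = post_filter_search(QUERY, ITEMS, k=2, allowed={"de"})
pre_hits = pre_filter_search(QUERY, ITEMS, k=2, allowed={"de"})
print(post_hits)  # [] because both top-2 items are "en" and get filtered out
print(pre_hits)   # ['c', 'd'], the two best-scoring eligible items
```

In this toy case the two highest-scoring items both fail the filter, so the post-filtered result is empty while the pre-filtered result still returns k eligible items, which is the quality loss the abstract attributes to post-filtering.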

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2407.13218 [cs.LG]
	(or arXiv:2407.13218v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2407.13218

Submission history

From: Fedor Borisyuk
[v1] Thu, 18 Jul 2024 07:04:33 UTC (2,139 KB)
[v2] Mon, 22 Jul 2024 18:33:25 UTC (2,145 KB)
[v3] Wed, 7 Aug 2024 16:57:06 UTC (2,145 KB)
