Evaluating Learned Query Performance Prediction Models at LinkedIn: Challenges, Opportunities, and Findings

Song, Chujun; Bouguerra, Slim; Krogen, Erik; Abadi, Daniel

arXiv:2504.17181 (cs)

[Submitted on 24 Apr 2025]

Abstract: Recent advancements in learning-based query performance prediction models have demonstrated remarkable efficacy. However, these models are predominantly validated using synthetic datasets focused on cardinality or latency estimation. This paper explores the application of these models to LinkedIn's complex real-world OLAP queries executed on Trino, addressing four primary research questions: (1) How do these models perform on real-world industrial data with limited information? (2) Can these models generalize to new tasks, such as CPU time prediction and classification? (3) What additional information available from the query plan could be utilized by these models to enhance their performance? (4) What are the theoretical performance limits of these models given the available data? To address these questions, we evaluate several models, including TLSTM, TCNN, QueryFormer, and XGBoost, against the industrial query workload at LinkedIn, and extend our analysis to CPU time regression and classification tasks. We also propose a multi-task learning approach to incorporate underutilized operator-level metrics that could enhance model understanding. Additionally, we empirically analyze the inherent upper bound on the performance that these models can achieve.

Subjects:	Databases (cs.DB)
ACM classes:	H.2.4
Cite as:	arXiv:2504.17181 [cs.DB]
	(or arXiv:2504.17181v1 [cs.DB] for this version)
	https://doi.org/10.48550/arXiv.2504.17181

Submission history

From: Chujun Song
[v1] Thu, 24 Apr 2025 01:35:34 UTC (1,829 KB)
