Generative Regression Based Watch Time Prediction for Short-Video Recommendation

Ma, Hongxu; Tian, Kai; Zhang, Tao; Zhang, Xuefeng; Zhou, Han; Chen, Chunjie; Li, Han; Guan, Jihong; Zhou, Shuigeng

arXiv:2412.20211 (cs)

[Submitted on 28 Dec 2024 (v1), last revised 12 Apr 2025 (this version, v3)]

Abstract: Watch time prediction (WTP) has emerged as a pivotal task in short-video recommendation systems, designed to quantify user engagement through continuous interaction modeling. Predicting users' watch times on videos often encounters fundamental challenges, including wide value ranges and imbalanced data distributions, which can lead to significant estimation bias when regression techniques are applied directly. Recent studies have attempted to address these issues by converting continuous watch time estimation into an ordinal regression task. While these methods are partially effective, they have notable limitations: (1) the discretization process frequently relies on bucket partitioning, which inherently reduces prediction flexibility and accuracy; and (2) the interdependencies among different partition intervals remain underutilized, missing opportunities for effective error correction.
Inspired by language modeling paradigms, we propose a novel Generative Regression (GR) framework that reformulates WTP as a sequence generation task. Our approach employs structural discretization to enable nearly lossless value reconstruction while maintaining prediction fidelity. Through carefully designed vocabulary construction and label encoding schemes, each watch time is bijectively mapped to a token sequence. To mitigate the training-inference discrepancy caused by teacher forcing, we introduce a curriculum learning with embedding mixup strategy that gradually transitions from guided to free-generation modes. We evaluate our method against state-of-the-art approaches on two public datasets and one industrial dataset, and we perform online A/B testing on the Kuaishou App to confirm its real-world effectiveness. The results show that GR significantly outperforms existing techniques.
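
The two mechanisms named in the abstract can be illustrated with a minimal sketch. The abstract does not give the vocabulary, the sequence length, or the curriculum schedule, so everything below is an assumption made for illustration rather than the authors' method: watch times are quantized to a hypothetical 0.1-second step and written as fixed-length base-10 digit tokens (a bijection between quantized values and token sequences), and a hypothetical linear schedule blends the ground-truth token embedding with the model's predicted embedding as training progresses.

```python
# Illustrative sketch only: BASE, LENGTH, RESOLUTION and the linear schedule
# are assumptions, not values taken from the paper.
BASE = 10          # assumed vocabulary size (tokens 0..9)
LENGTH = 4         # assumed fixed token-sequence length
RESOLUTION = 0.1   # assumed quantization step, in seconds


def encode(watch_time_s: float) -> list[int]:
    """Map a watch time to a fixed-length digit-token sequence."""
    units = round(watch_time_s / RESOLUTION)
    if not 0 <= units < BASE ** LENGTH:
        raise ValueError("watch time outside the representable range")
    tokens = []
    for _ in range(LENGTH):
        units, digit = divmod(units, BASE)
        tokens.append(digit)
    return tokens[::-1]  # most significant token first


def decode(tokens: list[int]) -> float:
    """Invert encode(): rebuild the quantized watch time from its tokens."""
    units = 0
    for token in tokens:
        units = units * BASE + token
    return units * RESOLUTION


def guidance_ratio(step: int, total_steps: int) -> float:
    """Assumed linear curriculum: 1.0 (fully guided) down to 0.0 (free generation)."""
    return max(0.0, 1.0 - step / total_steps)


def mix_embeddings(gold: list[float], predicted: list[float], ratio: float) -> list[float]:
    """Blend the ground-truth and predicted token embeddings element-wise."""
    return [ratio * g + (1.0 - ratio) * p for g, p in zip(gold, predicted)]


tokens = encode(37.4)
restored = decode(tokens)
mixed = mix_embeddings([1.0, 0.0], [0.0, 1.0], guidance_ratio(75, 100))
```

In this sketch the reconstruction error is bounded by half the quantization step, which is one way to read "nearly lossless"; the decoder input moves from the ground-truth embedding toward the model's own prediction as `guidance_ratio` falls from 1.0 to 0.0.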

Comments:	10 pages, 5 figures
Subjects:	Machine Learning (cs.LG); Information Retrieval (cs.IR)
Cite as:	arXiv:2412.20211 [cs.LG]
	(or arXiv:2412.20211v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2412.20211

Submission history

From: Hongxu Ma
[v1] Sat, 28 Dec 2024 16:48:55 UTC (3,530 KB)
[v2] Fri, 24 Jan 2025 11:18:26 UTC (3,529 KB)
[v3] Sat, 12 Apr 2025 13:16:19 UTC (5,345 KB)
