Learning the Optimal Recommendation from Explorative Users

Yao, Fan; Li, Chuanhao; Nekipelov, Denis; Wang, Hongning; Xu, Haifeng

arXiv:2110.03068 (cs)

[Submitted on 6 Oct 2021]

Abstract: We propose a new problem setting to study the sequential interactions between a recommender system and a user. Instead of assuming the user is omniscient, static, and explicit, as the classical practice does, we sketch a more realistic user behavior model, under which the user: 1) rejects recommendations if they are clearly worse than others; 2) updates her utility estimation based on rewards from her accepted recommendations; 3) withholds realized rewards from the system. We formulate the interactions between the system and such an explorative user in a $K$-armed bandit framework and study the problem of learning the optimal recommendation on the system side. We show that efficient system learning is still possible but is more difficult than learning from observed rewards. In particular, the system can identify the best arm with probability at least $1-\delta$ within $O(1/\delta)$ interactions, and we prove this bound is tight. Our finding contrasts with the result for best arm identification with fixed confidence, in which the best arm can be identified with probability $1-\delta$ within $O(\log(1/\delta))$ interactions. This gap illustrates the inevitable cost the system has to pay when it learns from an explorative user's revealed preferences on its recommendations rather than from the realized rewards.
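
The three user behaviors listed in the abstract can be illustrated with a small simulation. What follows is a minimal sketch under stated assumptions, not the authors' model or algorithm: it assumes Bernoulli rewards, a confidence-interval rejection rule (an arm is rejected when its upper bound falls below the best lower bound), and a naive round-robin system that only counts acceptances. The names `ExplorativeUser`, `respond`, and `round_robin_system` are hypothetical and chosen for illustration only.

```python
import math
import random


class ExplorativeUser:
    """Hypothetical user with per-arm mean estimates and confidence radii."""

    def __init__(self, true_means, seed=0):
        self.true_means = list(true_means)  # assumed Bernoulli reward means
        self.k = len(self.true_means)
        self.counts = [0] * self.k          # accepted pulls per arm
        self.sums = [0.0] * self.k          # reward totals, private to the user
        self.rng = random.Random(seed)
        self.t = 0                          # number of recommendations seen

    def _mean(self, arm):
        return self.sums[arm] / self.counts[arm] if self.counts[arm] else 0.0

    def _radius(self, arm):
        # Assumed confidence radius; an untried arm is fully uncertain.
        if self.counts[arm] == 0:
            return math.inf
        return math.sqrt(2.0 * math.log(self.t + 2) / self.counts[arm])

    def respond(self, arm):
        """Return True if the recommendation is accepted, False if rejected."""
        self.t += 1
        ucb = self._mean(arm) + self._radius(arm)
        best_lcb = max(self._mean(a) - self._radius(a) for a in range(self.k))
        if ucb < best_lcb:
            return False  # behavior 1: reject clearly worse arms
        reward = 1.0 if self.rng.random() < self.true_means[arm] else 0.0
        self.counts[arm] += 1   # behavior 2: update the utility estimate
        self.sums[arm] += reward
        return True             # behavior 3: only accept/reject is revealed


def round_robin_system(user, rounds):
    """Naive baseline: cycle through arms and count acceptances per arm."""
    accepts = [0] * user.k
    for t in range(rounds):
        arm = t % user.k
        if user.respond(arm):
            accepts[arm] += 1
    return accepts


if __name__ == "__main__":
    user = ExplorativeUser([0.1, 0.9], seed=1)
    print(round_robin_system(user, 2000))
```

In this sketch the system never observes `reward`; its only signal is the boolean returned by `respond`, which mirrors the revealed-preference feedback the abstract describes.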

Subjects: Machine Learning (cs.LG); Information Retrieval (cs.IR)
Cite as: arXiv:2110.03068 [cs.LG] (or arXiv:2110.03068v1 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2110.03068

Submission history

From: Fan Yao
[v1] Wed, 6 Oct 2021 21:01:18 UTC (240 KB)
