Contextual Multinomial Logit Bandits with General Value Functions

Zhang, Mengxiao; Luo, Haipeng

arXiv:2402.08126 (cs)

[Submitted on 12 Feb 2024 (v1), last revised 18 Feb 2024 (this version, v2)]

Abstract: Contextual multinomial logit (MNL) bandits capture many real-world assortment recommendation problems such as online retailing/advertising. However, prior work has only considered (generalized) linear value functions, which greatly limits its applicability. Motivated by this fact, in this work, we consider contextual MNL bandits with a general value function class that contains the ground truth, borrowing ideas from a recent trend of studies on contextual bandits. Specifically, we consider both the stochastic and the adversarial settings, and propose a suite of algorithms, each with a different computation-regret trade-off. When applied to the linear case, our results are not only the first ones with no dependence on a certain problem-dependent constant that can be exponentially large, but also enjoy other advantages such as computational efficiency, dimension-free regret bounds, or the ability to handle completely adversarial contexts and rewards.
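
For readers unfamiliar with the setting, the following is a minimal sketch of the standard MNL choice model that assortment problems of this kind are built on. It is an illustration under stated assumptions, not the algorithms proposed in the paper: the per-item values are fixed numbers standing in for the output of some value function evaluated on a context, the choice probability is assumed to be value / (1 + sum of values in the assortment) with the remaining mass on "no purchase", and all function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Optional, Sequence, Tuple

# Key used for the "no purchase" outcome (assumption made for this sketch).
NO_PURCHASE: Optional[int] = None


def mnl_choice_probabilities(
    values: Dict[int, float], assortment: Sequence[int]
) -> Dict[Optional[int], float]:
    """Choice probabilities under a standard MNL model.

    values[i] is the (non-negative) value of item i; in a contextual
    setting it would come from a value function applied to the context.
    """
    denom = 1.0 + sum(values[i] for i in assortment)
    probs: Dict[Optional[int], float] = {i: values[i] / denom for i in assortment}
    probs[NO_PURCHASE] = 1.0 / denom
    return probs


def expected_reward(
    values: Dict[int, float], rewards: Dict[int, float], assortment: Sequence[int]
) -> float:
    """Expected reward of offering `assortment`; no purchase earns zero."""
    probs = mnl_choice_probabilities(values, assortment)
    return sum(rewards[i] * probs[i] for i in assortment)


def best_assortment(
    values: Dict[int, float], rewards: Dict[int, float], max_size: int
) -> Tuple[Tuple[int, ...], float]:
    """Brute-force search over all non-empty assortments of size <= max_size.

    Exponential in the number of items; only meant for tiny examples.
    """
    items = sorted(values)
    best: Tuple[Tuple[int, ...], float] = ((), 0.0)
    for size in range(1, max_size + 1):
        for subset in combinations(items, size):
            score = expected_reward(values, rewards, subset)
            if score > best[1]:
                best = (subset, score)
    return best


if __name__ == "__main__":
    values = {1: 0.5, 2: 0.5, 3: 1.0}    # made-up item values
    rewards = {1: 1.0, 2: 0.5, 3: 0.2}   # made-up per-item rewards
    print(mnl_choice_probabilities(values, [1, 2]))
    print(best_assortment(values, rewards, max_size=2))
```

In a bandit setting the values are unknown and must be learned from observed purchases, which is where the choice of value function class (linear versus general) matters.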

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2402.08126 [cs.LG]
	(or arXiv:2402.08126v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2402.08126

Submission history

From: Mengxiao Zhang
[v1] Mon, 12 Feb 2024 23:50:44 UTC (55 KB)
[v2] Sun, 18 Feb 2024 20:32:08 UTC (55 KB)
