Batch Active Preference-Based Learning of Reward Functions

Bıyık, Erdem; Sadigh, Dorsa

arXiv:1810.04303 (cs)

[Submitted on 10 Oct 2018]

Abstract: Data generation and labeling are usually an expensive part of learning for robotics. While active learning methods are commonly used to tackle the former problem, preference-based learning attempts to solve the latter by querying users with preference questions. In this paper, we develop a new algorithm, batch active preference-based learning, that enables efficient learning of reward functions using as few data samples as possible while still having short query generation times. We introduce several approximations to the batch active learning problem and provide theoretical guarantees for the convergence of our algorithms. Finally, we present experimental results for a variety of robotics tasks in simulation. Our results suggest that our batch active learning algorithm requires only a few queries, which are computed in a short amount of time. We then showcase our algorithm in a study to learn human users' preferences.

Comments:	Proceedings of the 2nd Conference on Robot Learning (CoRL), October 2018
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO); Machine Learning (stat.ML)
Cite as:	arXiv:1810.04303 [cs.LG]
	(or arXiv:1810.04303v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1810.04303

Submission history

From: Erdem Bıyık
[v1] Wed, 10 Oct 2018 00:02:55 UTC (3,748 KB)
