Computer Science > Computation and Language
[Submitted on 18 Jan 2022 (this version), latest version 28 Dec 2022 (v2)]
Title: Toward Self-Learning End-to-End Dialog Systems
Abstract: End-to-end task-oriented dialog systems often suffer from out-of-distribution (OOD) inputs after being deployed in dynamic, changing, and open environments. In this work, we propose SL-Agent, a self-learning framework that combines supervised learning, reinforcement learning, and machine teaching for building end-to-end dialog systems in a more realistic changing environment setting. SL-Agent consists of a dialog model and a pre-trained reward model to judge the quality of a system response. SL-Agent enables dialog agents to automatically adapt to environments with user behavior changes by learning from human-bot interactions via reinforcement learning, with the incorporated pre-trained reward model. We validate SL-Agent in four different dialog domains. Experimental results show the effectiveness of SL-Agent for automatically adapting to changing environments using both automatic and human evaluations. Furthermore, experiments on a challenging domain extension setting demonstrate that SL-Agent can effectively adapt to new tasks using limited human corrections provided via machine teaching. We will release code, data, and pre-trained models for further research.
Submission history
From: Xiaoying Zhang
[v1] Tue, 18 Jan 2022 09:56:35 UTC (1,296 KB)
[v2] Wed, 28 Dec 2022 12:59:30 UTC (5,937 KB)