KAUCUS: Knowledge Augmented User Simulators for Training Language Model Assistants

Dhole, Kaustubh D.

arXiv:2401.16454 (cs)

[Submitted on 29 Jan 2024]

Title: KAUCUS: Knowledge Augmented User Simulators for Training Language Model Assistants

Authors: Kaustubh D. Dhole

Abstract: An effective multi-turn instruction-following assistant can be developed by creating a simulator that can generate useful interaction data. Apart from relying on its intrinsic weights, an ideal user simulator should also be able to bootstrap external knowledge rapidly in its raw form to simulate the multifarious diversity of text available over the internet. Previous user simulators generally lacked diversity, were mostly closed-domain, and necessitated rigid schemas, making them inefficient to scale rapidly to incorporate external knowledge. In this regard, we introduce Kaucus, a Knowledge-Augmented User Simulator framework, to outline a process of creating diverse user simulators that can seamlessly exploit external knowledge as well as benefit downstream assistant model training. Through two GPT-J-based simulators, viz. a Retrieval Augmented Simulator and a Summary Controlled Simulator, we generate diverse simulator-assistant interactions. Through reward and preference model-based evaluations, we find that these interactions serve as useful training data and create more helpful downstream assistants. We also find that incorporating knowledge through retrieval augmentation or summary control helps create better assistants.

Comments:	Simulation of Conversational Intelligence in Chat, EACL 2024
Subjects:	Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR)
ACM classes:	I.2.7; H.3.3
Cite as:	arXiv:2401.16454 [cs.HC]
	(or arXiv:2401.16454v1 [cs.HC] for this version)
	https://doi.org/10.48550/arXiv.2401.16454

Submission history

From: Kaustubh Dhole
[v1] Mon, 29 Jan 2024 06:57:02 UTC (1,409 KB)
