CORAL: Contextual Response Retrievability Loss Function for Training Dialog Generation Models

Santra, Bishal; Ghadia, Ravi; Gupta, Manish; Goyal, Pawan

arXiv:2205.10558 (cs)

[Submitted on 21 May 2022 (v1), last revised 20 May 2023 (this version, v3)]

Abstract: Many tasks in Natural Language Processing can be tackled effectively using the cross-entropy (CE) loss function. Dialog generation, however, poses unique challenges for CE loss, because CE loss assumes that, for any given input, the only possible output is the one available as the ground truth in the training dataset. In dialog generation, there can be multiple valid responses for a given context, and these can differ not only in surface form but also in meaning. Furthermore, CE loss computation for the dialog generation task does not take the input context into consideration and hence grades the response irrespective of the context. To grade the generated response for qualities such as relevance and engagingness, the loss function should depend on both the context and the generated response. To address these limitations, this paper proposes CORAL, a novel loss function based on a reinforcement learning (RL) view of the dialog generation task, with a reward function that estimates human preference for generated responses while considering both the context and the response. To overcome challenges such as the high sample complexity of RL training and a large action space, the paper also proposes a mix-policy training algorithm. Notably, CORAL makes it possible to train dialog generation models without assuming that the ground truth is the only correct response. Extensive comparisons on benchmark datasets demonstrate that CORAL-based models outperform strong state-of-the-art baseline models of different sizes.
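The contrast the abstract draws between CE loss and a reward-based loss can be illustrated with a toy sketch. The abstract gives no formulas, so what follows is a generic REINFORCE-style surrogate, not CORAL's actual definition. The function names, the word-overlap `toy_reward` (a hypothetical stand-in for a learned scorer of context-response pairs), and the reading of "mix-policy" as a random choice between the ground-truth response and a model sample are all assumptions made for illustration.

```python
import math
import random


def sequence_log_prob(token_probs):
    """Sum of the log-probabilities the model assigned to each response token."""
    return sum(math.log(p) for p in token_probs)


def ce_loss(token_probs):
    """Cross-entropy on the ground-truth response; the context plays no role in the grade."""
    return -sequence_log_prob(token_probs)


def reward_weighted_loss(token_probs, reward):
    """REINFORCE-style surrogate: negative log-likelihood scaled by a scalar reward."""
    return -reward * sequence_log_prob(token_probs)


def toy_reward(context, response):
    """Hypothetical scorer: fraction of distinct response words that also occur in the context.

    A real reward model would be learned; this only shows that the reward
    takes BOTH the context and the response as input.
    """
    ctx = set(context.lower().split())
    resp = set(response.lower().split())
    if not resp:
        return 0.0
    return len(ctx & resp) / len(resp)


def pick_response(ground_truth, sampled, p_on_policy, rng):
    """Assumed mixing rule: train on the model's own sample with probability p_on_policy,
    otherwise fall back to the ground-truth response."""
    return sampled if rng.random() < p_on_policy else ground_truth


if __name__ == "__main__":
    context = "do you like jazz music"
    response = "i like jazz"
    probs = [0.5, 0.5, 0.5]  # made-up per-token probabilities for the response
    r = toy_reward(context, response)
    print("CE loss:", ce_loss(probs))
    print("reward:", r)
    print("reward-weighted loss:", reward_weighted_loss(probs, r))
```

In this sketch the same response receives a different loss under different contexts, because the reward changes with the context while the CE term does not.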

Comments:	15 pages, 3 figures. TLDR: CORAL proposes a novel loss function for dialog generation, incorporating context and multiple valid responses. It outperforms existing models by optimizing human preference through reinforcement learning
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2205.10558 [cs.CL]
	(or arXiv:2205.10558v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2205.10558

Submission history

From: Bishal Santra
[v1] Sat, 21 May 2022 10:36:22 UTC (224 KB)
[v2] Tue, 13 Dec 2022 11:19:18 UTC (973 KB)
[v3] Sat, 20 May 2023 13:50:54 UTC (1,772 KB)
