Open-Domain Dialog Evaluation using Follow-Ups Likelihood

De Bruyn, Maxime; Lotfi, Ehsan; Buhmann, Jeska; Daelemans, Walter

arXiv:2209.05185 (cs)

[Submitted on 12 Sep 2022]

Abstract: Automatic evaluation of open-domain dialogs remains an unsolved problem. Moreover, existing methods do not correlate strongly with human annotations. This paper presents a new automated evaluation method using follow-ups: we measure the probability that a language model will continue the conversation with a fixed set of follow-ups (e.g., "not really relevant here", "what are you trying to say"). When compared against twelve existing methods, our new evaluation achieves the highest correlation with human evaluations.
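
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the mean aggregation over follow-ups, and the toy unigram scorer standing in for a real language model are all assumptions made for illustration. Only the two follow-up strings come from the abstract.

```python
import math
from collections import Counter
from typing import Callable, Sequence

# The two follow-ups quoted in the abstract; the paper's full set is not reproduced here.
FOLLOW_UPS = ("not really relevant here", "what are you trying to say")

# Hypothetical interface: (context, continuation) -> log-probability.
LogProbFn = Callable[[str, str], float]


def toy_log_prob(context: str, continuation: str) -> float:
    """Stand-in for a real language model (assumption, illustration only).

    Scores the continuation with an add-one-smoothed unigram model
    estimated from the whitespace-separated tokens of the context.
    """
    context_tokens = context.lower().split()
    continuation_tokens = continuation.lower().split()
    counts = Counter(context_tokens)
    vocab = set(context_tokens) | set(continuation_tokens)
    total = len(context_tokens) + len(vocab)
    return sum(math.log((counts[t] + 1) / total) for t in continuation_tokens)


def follow_up_likelihood(
    dialog: Sequence[str],
    log_prob: LogProbFn = toy_log_prob,
    follow_ups: Sequence[str] = FOLLOW_UPS,
) -> float:
    """Mean log-likelihood of the fixed follow-ups given the dialog so far.

    Averaging over follow-ups is an assumed aggregation, not one stated
    in the abstract.
    """
    context = " ".join(dialog)
    return sum(log_prob(context, f) for f in follow_ups) / len(follow_ups)


if __name__ == "__main__":
    dialog = ["Do you like hiking?", "I had pasta for lunch."]
    print(follow_up_likelihood(dialog))
```

In practice, `log_prob` would be replaced by a function that queries a pretrained dialog language model. Since both example follow-ups express confusion, a higher score would presumably indicate a weaker response; that reading is an assumption here.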

Comments:	Accepted at COLING 2022
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2209.05185 [cs.CL]
	(or arXiv:2209.05185v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2209.05185

Submission history

From: Maxime De Bruyn
[v1] Mon, 12 Sep 2022 12:22:31 UTC (7,248 KB)
