DuRecDial 2.0: A Bilingual Parallel Corpus for Conversational Recommendation

Liu, Zeming; Wang, Haifeng; Niu, Zheng-Yu; Wu, Hua; Che, Wanxiang

arXiv:2109.08877 (cs)

[Submitted on 18 Sep 2021]

Abstract: In this paper, we provide a bilingual parallel human-to-human recommendation dialog dataset (DuRecDial 2.0) to enable researchers to explore the challenging task of multilingual and cross-lingual conversational recommendation. DuRecDial 2.0 differs from existing conversational recommendation datasets in that each data item (Profile, Goal, Knowledge, Context, Response) is annotated in two languages, English and Chinese, whereas other datasets are built in a single-language setting. We collect 8.2k dialogs aligned across English and Chinese (16.5k dialogs and 255k utterances in total), annotated by crowdsourced workers under a strict quality control procedure. We then build monolingual, multilingual, and cross-lingual conversational recommendation baselines on DuRecDial 2.0. Experimental results show that using additional English data improves performance on Chinese conversational recommendation, indicating the benefits of DuRecDial 2.0. Finally, this dataset provides a challenging testbed for future studies of monolingual, multilingual, and cross-lingual conversational recommendation.
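
As an illustration of what a parallel data item might look like in code, here is a minimal Python sketch. Only the five field names (Profile, Goal, Knowledge, Context, Response) and the two languages come from the abstract. The field types, the language codes "en" and "zh", the class names, and the alignment check are all assumptions made for illustration, not the released dataset schema.

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumed language codes; the abstract only says "English and Chinese".
LANGS = ("en", "zh")


@dataclass
class DataItem:
    """One monolingual data item with the five fields named in the abstract.

    The field types are assumptions for illustration only.
    """
    profile: Dict[str, str]   # user profile as key/value pairs (assumed layout)
    goal: str                 # dialog goal description
    knowledge: List[str]      # background knowledge entries
    context: List[str]        # preceding utterances
    response: str             # the utterance to be produced


@dataclass
class ParallelItem:
    """The same data item annotated in both languages (hypothetical wrapper)."""
    versions: Dict[str, DataItem]  # maps a language code to its annotation

    def is_aligned(self) -> bool:
        """Return True if both languages are present and their contexts
        contain the same number of utterances (a simple assumed check)."""
        if any(lang not in self.versions for lang in LANGS):
            return False
        lengths = {len(self.versions[lang].context) for lang in LANGS}
        return len(lengths) == 1


def make_example() -> ParallelItem:
    """Build a toy parallel item; the contents are invented placeholders."""
    en = DataItem(
        profile={"name": "placeholder"},
        goal="recommend a movie",
        knowledge=["placeholder fact"],
        context=["Hello!", "Hi, how can I help?"],
        response="Would you like a movie recommendation?",
    )
    zh = DataItem(
        profile={"name": "placeholder"},
        goal="推荐一部电影",
        knowledge=["占位事实"],
        context=["你好！", "你好，有什么可以帮你？"],
        response="需要我给你推荐一部电影吗？",
    )
    return ParallelItem(versions={"en": en, "zh": zh})
```

Calling `make_example().is_aligned()` checks that both language versions exist and have contexts of equal length. A real loader would need to follow whatever format the released files actually use.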

Comments:	Accepted by EMNLP 2021
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2109.08877 [cs.CL]
	(or arXiv:2109.08877v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2109.08877

Submission history

From: Zeming Liu
[v1] Sat, 18 Sep 2021 08:23:21 UTC (1,719 KB)
