ETC-NLG: End-to-end Topic-Conditioned Natural Language Generation

Carbone, Ginevra; Sarti, Gabriele

doi:10.4000/ijcol.728


arXiv:2008.10875 (cs)

[Submitted on 25 Aug 2020 (v1), last revised 22 Jun 2021 (this version, v3)]



Abstract: Plug-and-play language models (PPLMs) enable topic-conditioned natural language generation by pairing large pre-trained generators with attribute models used to steer the predicted token distribution towards the selected topic. Despite their computational efficiency, PPLMs require large amounts of labeled texts to effectively balance generation fluency and proper conditioning, making them unsuitable for low-resource settings. We present ETC-NLG, an approach leveraging topic modeling annotations to enable fully unsupervised End-to-end Topic-Conditioned Natural Language Generation over emergent topics in unlabeled document collections. We first test the effectiveness of our approach in a low-resource setting for Italian, evaluating the conditioning for both topic models and gold annotations. We then perform a comparative evaluation of ETC-NLG for Italian and English using a parallel corpus. Finally, we propose an automatic approach to estimate the effectiveness of conditioning on the generated utterances.
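
As a rough illustration of what "steering the predicted token distribution towards the selected topic" can mean, the toy sketch below shifts next-token logits in favor of a small set of topic words and renormalizes. This is a deliberately simplified stand-in, not the method of the paper: actual PPLMs perturb the generator's hidden activations using gradients from the attribute model, whereas this sketch applies a fixed logit bonus. The vocabulary, logit values, topic word set, and the `strength` parameter are all hypothetical.

```python
import math


def softmax(logits):
    """Turn a {token: logit} mapping into a {token: probability} mapping."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(val - peak) for tok, val in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def steer(logits, topic_words, strength=2.0):
    """Add a fixed bonus to the logits of topic words, then renormalize.

    Simplifying assumption: a constant bonus replaces the gradient-based
    update that a real attribute model would provide.
    """
    shifted = {
        tok: val + (strength if tok in topic_words else 0.0)
        for tok, val in logits.items()
    }
    return softmax(shifted)


# Hypothetical next-token logits from an unconditioned generator.
logits = {"the": 2.0, "court": 0.5, "goal": 0.5, "match": 0.3}
sports_topic = {"goal", "match"}  # hypothetical topic word list

base = softmax(logits)
steered = steer(logits, sports_topic)
```

In this sketch a larger `strength` conditions more strongly on the topic while pulling the output further from the generator's original distribution, which mirrors the trade-off between conditioning and fluency that the abstract mentions.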

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2008.10875 [cs.CL]
	(or arXiv:2008.10875v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2008.10875
Journal reference:	Italian Journal of Computational Linguistics (IJCoL) 6-2 (2020) 61-77
Related DOI:	https://doi.org/10.4000/ijcol.728

Submission history

From: Ginevra Carbone
[v1] Tue, 25 Aug 2020 08:22:38 UTC (2,953 KB)
[v2] Fri, 5 Feb 2021 08:29:42 UTC (6,025 KB)
[v3] Tue, 22 Jun 2021 08:45:23 UTC (2,914 KB)
