CycleGT: Unsupervised Graph-to-Text and Text-to-Graph Generation via Cycle Training

Guo, Qipeng; Jin, Zhijing; Qiu, Xipeng; Zhang, Weinan; Wipf, David; Zhang, Zheng


arXiv:2006.04702v2 (cs)

[Submitted on 8 Jun 2020 (v1), revised 11 Jun 2020 (this version, v2), latest version 9 Dec 2020 (v3)]

Abstract: Two important tasks at the intersection of knowledge graphs and natural language processing are graph-to-text (G2T) and text-to-graph (T2G) conversion. Because data collection is difficult and costly, the supervised datasets available for these two tasks are usually on the order of tens of thousands of examples (for example, 18K in the WebNLG dataset), far fewer than the millions of examples available for other tasks such as machine translation. Consequently, deep learning models for these two tasks suffer considerably from scarce training data. This work presents the first attempt at unsupervised learning of T2G and G2T via cycle training. We present CycleGT, an unsupervised training framework that can bootstrap from fully non-parallel graph and text datasets, iteratively back-translate between the two forms, and use a novel pretraining strategy. Experiments on the benchmark WebNLG dataset show that our unsupervised model, trained on the same amount of data, achieves performance on par with supervised models. This validates our framework as an effective approach to overcoming the data scarcity problem in G2T and T2G.
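
As a reading aid, the cycle-training idea described in the abstract can be written as a generic cycle-consistency objective of the kind used in back-translation. This is only a sketch under stated assumptions: the abstract gives no formulas, so every symbol below (the non-parallel datasets D_G and D_T, the G2T model with parameters \theta, the T2G model with parameters \varphi, and the intermediate outputs \hat{t} and \hat{g}) is introduced here for illustration, and the paper's actual losses may differ.

```latex
% Assumed notation (not from the abstract):
%   D_G, D_T       : non-parallel graph and text datasets
%   G2T_\theta     : graph-to-text model, defining p_\theta(t \mid g)
%   T2G_\varphi    : text-to-graph model, defining p_\varphi(g \mid t)
\begin{align}
  \hat{t} &= \mathrm{G2T}_{\theta}(g), \qquad \hat{g} = \mathrm{T2G}_{\varphi}(t), \\
  \mathcal{L}_{\mathrm{cyc}}(\theta, \varphi)
    &= \mathbb{E}_{g \sim D_G}\!\left[ -\log p_{\varphi}\!\left(g \mid \hat{t}\right) \right]
     + \mathbb{E}_{t \sim D_T}\!\left[ -\log p_{\theta}\!\left(t \mid \hat{g}\right) \right].
\end{align}
```

Under this reading, the first term asks the T2G model to recover a graph from the text that the G2T model generated for it, and the second term asks the G2T model to recover a sentence from the graph that the T2G model extracted from it. Neither term needs a paired (graph, text) example, which is what would allow training on fully non-parallel data.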

Comments:	Submitted to NeurIPS 2020
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2006.04702 [cs.CL]
	(or arXiv:2006.04702v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2006.04702

Submission history

From: Zhijing Jin
[v1] Mon, 8 Jun 2020 15:59:00 UTC (399 KB)
[v2] Thu, 11 Jun 2020 17:26:44 UTC (402 KB)
[v3] Wed, 9 Dec 2020 19:29:27 UTC (7,302 KB)
