Unsupervised Cross-lingual Image Captioning

Gao, Jiahui; Zhou, Yi; Yu, Philip L. H.; Gu, Jiuxiang

arXiv:2010.01288v1 (cs)

[Submitted on 3 Oct 2020 (this version), latest version 7 Feb 2022 (v3)]

Abstract: Most recent image captioning work is conducted in English, as the majority of image-caption datasets are in English. However, there is a large number of non-native English speakers worldwide, so generating image captions in different languages is worth exploring. In this paper, we present a novel unsupervised method to generate image captions without using any caption corpus. Our method relies on 1) cross-lingual auto-encoding, which learns the scene graph mapping function along with the scene graph encoders and sentence decoders on machine translation parallel corpora, and 2) unsupervised feature mapping, which seeks to map the encoded scene graph features from the image modality to the sentence modality. By leveraging cross-lingual auto-encoding, cross-modal feature mapping, and adversarial learning, our method can learn an image captioner that generates captions in different languages. We evaluate the proposed method on Chinese image caption generation; comparisons against several baseline methods demonstrate its effectiveness.

Comments:	8 pages
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2010.01288 [cs.CL]
	(or arXiv:2010.01288v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2010.01288

Submission history

From: Jiahui Gao
[v1] Sat, 3 Oct 2020 06:14:06 UTC (3,380 KB)
[v2] Wed, 7 Apr 2021 13:11:16 UTC (2,341 KB)
[v3] Mon, 7 Feb 2022 16:17:45 UTC (6,409 KB)
