NÜWA-LIP: Language Guided Image Inpainting with Defect-free VQGAN

Ni, Minheng; Wu, Chenfei; Huang, Haoyang; Jiang, Daxin; Zuo, Wangmeng; Duan, Nan

arXiv:2202.05009 (cs)

[Submitted on 10 Feb 2022]

Abstract: Language guided image inpainting aims to fill in the defective regions of an image under the guidance of text while keeping non-defective regions unchanged. However, the encoding process of existing models suffers from either receptive spreading of defective regions or information loss in non-defective regions, which gives rise to visually unappealing inpainting results. To address these issues, this paper proposes NÜWA-LIP, which combines a defect-free VQGAN (DF-VQGAN) with multi-perspective sequence-to-sequence (MP-S2S). In particular, DF-VQGAN introduces relative estimation to control receptive spreading and adopts symmetrical connections to protect information. MP-S2S further enhances visual information from complementary perspectives, including both low-level pixels and high-level tokens. Experiments show that DF-VQGAN is more robust than VQGAN. To evaluate the inpainting performance of our model, we built three open-domain benchmarks, on which NÜWA-LIP also outperforms recent strong baselines.
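The task constraint stated in the abstract (fill defective regions, leave non-defective regions unchanged) can be illustrated with a minimal mask-compositing sketch. This is not the paper's DF-VQGAN or MP-S2S; the function name `composite`, the array shapes, and the boolean mask convention are all assumptions made for illustration, and `generated` stands in for the output of any inpainting model.

```python
import numpy as np

def composite(original, generated, defect_mask):
    """Take defective pixels from `generated` and all others from `original`.

    original, generated: arrays of shape (H, W, C)
    defect_mask: bool array of shape (H, W), True where a pixel is defective
    (hypothetical convention, not taken from the paper)
    """
    mask = defect_mask[..., None]  # add a channel axis so the mask broadcasts
    return np.where(mask, generated, original)

# Toy data: a black image, an all-ones "model output", and a 2x2 defective patch.
original = np.zeros((4, 4, 3))
generated = np.ones((4, 4, 3))
defect_mask = np.zeros((4, 4), dtype=bool)
defect_mask[1:3, 1:3] = True

result = composite(original, generated, defect_mask)
```

In this toy case only the four masked pixels change; every pixel outside the mask is exactly equal to the original, which is the property the abstract describes as keeping non-defective regions unchanged.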

Comments:	10 pages, 6 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
MSC classes:	I.5.4
Cite as:	arXiv:2202.05009 [cs.CV]
	(or arXiv:2202.05009v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2202.05009

Submission history

From: Minheng Ni
[v1] Thu, 10 Feb 2022 13:10:23 UTC (6,961 KB)
