A Text Attention Network for Spatial Deformation Robust Scene Text Image Super-resolution

Ma, Jianqi; Liang, Zhetong; Zhang, Lei

arXiv:2203.09388 (cs)

[Submitted on 17 Mar 2022 (v1), last revised 18 Mar 2022 (this version, v2)]

Abstract: Scene text image super-resolution aims to increase the resolution and readability of the text in low-resolution images. Though significant improvement has been achieved by deep convolutional neural networks (CNNs), it remains difficult to reconstruct high-resolution images for spatially deformed texts, especially rotated and curve-shaped ones. This is because current CNN-based methods adopt locality-based operations, which are not effective at handling the variation caused by deformations. In this paper, we propose a CNN-based Text ATTention network (TATT) to address this problem. The semantics of the text are first extracted by a text recognition module as text prior information. We then design a novel transformer-based module, which leverages a global attention mechanism, to exert the semantic guidance of the text prior on the text reconstruction process. In addition, we propose a text structure consistency loss to refine the visual appearance by imposing structural consistency on the reconstructions of regular and deformed texts. Experiments on the benchmark TextZoom dataset show that the proposed TATT not only achieves state-of-the-art performance in terms of PSNR/SSIM metrics, but also significantly improves the recognition accuracy in the downstream text recognition task, particularly for text instances with multi-orientation and curved shapes. Code is available at this https URL.
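
To make the contrast between locality-based operations and global attention concrete, the following is a minimal sketch of single-head scaled dot-product cross-attention, in which every image position attends to every character slot of a text prior regardless of where the character sits spatially. It is not the authors' implementation: the function names (`softmax`, `cross_attention`), the toy inputs, and the omission of learned projections, multiple heads, and positional encodings are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(image_feats, text_prior):
    """Single-head scaled dot-product cross-attention (no learned projections).

    image_feats: N query vectors, one per image position, each of length d.
    text_prior:  T key/value vectors, one per character slot, each of length d.
    Returns (outputs, weights): N guided vectors and an N x T weight matrix.
    """
    d = len(text_prior[0])
    outputs, weights = [], []
    for q in image_feats:
        # Similarity of this image position to every character slot.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in text_prior
        ]
        w = softmax(scores)
        # Weighted sum of the text-prior vectors (used here as values).
        out = [
            sum(w[t] * text_prior[t][j] for t in range(len(text_prior)))
            for j in range(d)
        ]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Toy, made-up inputs: 3 image positions, 2 character slots, feature size 2.
image_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_prior = [[4.0, 0.0], [0.0, 4.0]]
guided, attn = cross_attention(image_feats, text_prior)
```

Each row of `attn` is a probability distribution over character slots, so a position can draw on the semantics of any character, which a fixed-size convolution kernel cannot do for rotated or curved text.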

Comments:	Accepted to CVPR 2022
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2203.09388 [cs.CV]
	(or arXiv:2203.09388v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2203.09388

Submission history

From: Jianqi Ma
[v1] Thu, 17 Mar 2022 15:28:29 UTC (10,767 KB)
[v2] Fri, 18 Mar 2022 03:07:32 UTC (10,767 KB)
