TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control

Zeng, Weichao; Shu, Yan; Li, Zhenhang; Yang, Dongbao; Zhou, Yu

Abstract:Centred on content modification and style preservation, Scene Text Editing (STE) remains a challenging task despite considerable progress in text-to-image synthesis and text-driven image manipulation recently. GAN-based STE methods generally encounter a common issue of model generalization, while Diffusion-based STE methods suffer from undesired style deviations. To address these problems, we propose TextCtrl, a diffusion-based method that edits text with prior guidance control. Our method consists of two key components: (i) By constructing fine-grained text style disentanglement and robust text glyph structure representation, TextCtrl explicitly incorporates Style-Structure guidance into model design and network training, significantly improving text style consistency and rendering accuracy. (ii) To further leverage the style prior, a Glyph-adaptive Mutual Self-attention mechanism is proposed which deconstructs the implicit fine-grained features of the source image to enhance style consistency and vision quality during inference. Furthermore, to fill the vacancy of the real-world STE evaluation benchmark, we create the first real-world image-pair dataset termed ScenePair for fair comparisons. Experiments demonstrate the effectiveness of TextCtrl compared with previous methods concerning both style fidelity and text accuracy.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2410.10133 [cs.CV]
	(or arXiv:2410.10133v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2410.10133

Computer Science > Computer Vision and Pattern Recognition

Title:TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators