MegaSR: Mining Customized Semantics and Expressive Guidance for Image Super-Resolution

Li, Xinrui; Wu, Jianlong; Huang, Xinchuan; Chen, Chong; Guan, Weili; Hua, Xian-Sheng; Nie, Liqiang

arXiv:2503.08096 (cs)

[Submitted on 11 Mar 2025]

Abstract: Pioneering text-to-image (T2I) diffusion models have ushered in a new era of real-world image super-resolution (Real-ISR), significantly enhancing the visual perception of reconstructed images. However, existing methods typically integrate uniform abstract textual semantics across all blocks, overlooking the distinct semantic requirements at different depths and the fine-grained, concrete semantics inherently present in the images themselves. Moreover, relying solely on a single type of guidance further disrupts the consistency of reconstruction. To address these issues, we propose MegaSR, a novel framework that mines customized block-wise semantics and expressive guidance for diffusion-based ISR. Compared to uniform textual semantics, MegaSR enables flexible adaptation to multi-granularity semantic awareness by dynamically incorporating image attributes at each block. Furthermore, we experimentally identify HED edge maps, depth maps, and segmentation maps as the most expressive guidance, and propose a multi-stage aggregation strategy to modulate them into the T2I models. Extensive experiments demonstrate the superiority of MegaSR in terms of semantic richness and structural consistency.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2503.08096 [cs.CV]
	(or arXiv:2503.08096v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2503.08096

Submission history

From: Xinrui Li
[v1] Tue, 11 Mar 2025 07:00:20 UTC (4,027 KB)
