GenzIQA: Generalized Image Quality Assessment using Prompt-Guided Latent Diffusion Models

De, Diptanu; Mitra, Shankhanil; Soundararajan, Rajiv

arXiv:2406.04654 (eess)

[Submitted on 7 Jun 2024]

Abstract: The design of no-reference (NR) image quality assessment (IQA) algorithms is extremely important for benchmarking and calibrating user experiences in modern visual systems. A major drawback of state-of-the-art NR-IQA methods is their limited ability to generalize across diverse IQA settings with reasonable distribution shifts. Recent text-to-image generative models, such as latent diffusion models, generate meaningful visual concepts with fine details related to text concepts. In this work, we leverage the denoising process of such diffusion models for generalized IQA by understanding the degree of alignment between learnable quality-aware text prompts and images. In particular, we learn cross-attention maps from intermediate layers of the denoiser of latent diffusion models to capture quality-aware representations of images. In addition, we introduce learnable quality-aware text prompts that make the cross-attention features more quality-aware. Our extensive cross-database experiments across various user-generated, synthetic, and low-light content-based benchmarking databases show that latent diffusion models can achieve superior generalization in IQA when compared to other methods in the literature.
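
As a rough illustration of the cross-attention maps mentioned in the abstract, the following is a minimal sketch of generic scaled dot-product cross-attention between image tokens (queries) and prompt tokens (keys). It is not the authors' implementation: the sizes, the random stand-in vectors, the omission of learned query/key projections, and the mean-pooling helper `pooled_feature` are all assumptions made for illustration only.

```python
import math
import random


def cross_attention_map(image_tokens, prompt_tokens):
    """Return an N x T map of softmax(q . k / sqrt(d)) over prompt tokens.

    image_tokens: N query vectors (stand-ins for denoiser spatial features).
    prompt_tokens: T key vectors (stand-ins for learnable prompt embeddings).
    """
    d = len(prompt_tokens[0])
    scale = 1.0 / math.sqrt(d)
    attn = []
    for q in image_tokens:
        logits = [scale * sum(qi * ki for qi, ki in zip(q, k))
                  for k in prompt_tokens]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - m) for x in logits]
        total = sum(exps)
        attn.append([e / total for e in exps])
    return attn


def pooled_feature(attn):
    """Hypothetical pooling: average the map over image positions,
    giving one value per prompt token."""
    n = len(attn)
    return [sum(row[j] for row in attn) / n for j in range(len(attn[0]))]


# Hypothetical sizes: a 4x4 latent grid, 4 prompt tokens, 8-dim embeddings.
rng = random.Random(0)
N, T, D = 16, 4, 8
image_tokens = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(N)]
prompt_tokens = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(T)]

attn = cross_attention_map(image_tokens, prompt_tokens)
feature = pooled_feature(attn)
print(len(attn), len(attn[0]), feature)
```

Each row of `attn` is a distribution over prompt tokens for one image position. In a setup like the one the abstract describes, the prompt vectors would be trainable and the resulting features would feed a quality predictor; how the paper does this is not specified here.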

Subjects:	Image and Video Processing (eess.IV); Machine Learning (cs.LG)
Cite as:	arXiv:2406.04654 [eess.IV]
	(or arXiv:2406.04654v1 [eess.IV] for this version)
	https://doi.org/10.48550/arXiv.2406.04654

Submission history

From: Shankhanil Mitra
[v1] Fri, 7 Jun 2024 05:46:39 UTC (3,916 KB)
