2AFC Prompting of Large Multimodal Models for Image Quality Assessment

Zhu, Hanwei; Sui, Xiangjie; Chen, Baoliang; Liu, Xuelin; Chen, Peilin; Fang, Yuming; Wang, Shiqi

arXiv:2402.01162 (cs)

[Submitted on 2 Feb 2024]

Abstract: While abundant research has been conducted on improving the high-level visual understanding and reasoning capabilities of large multimodal models (LMMs), their image quality assessment (IQA) ability has been relatively under-explored. Here we take initial steps towards this goal by employing two-alternative forced choice (2AFC) prompting, as 2AFC is widely regarded as the most reliable way of collecting human opinions of visual quality. Subsequently, the global quality score of each image estimated by a particular LMM can be efficiently aggregated using maximum a posteriori estimation. Meanwhile, we introduce three evaluation criteria (consistency, accuracy, and correlation) to provide comprehensive quantifications and deeper insights into the IQA capability of five LMMs. Extensive experiments show that existing LMMs exhibit remarkable IQA ability on coarse-grained quality comparison, but there is room for improvement on fine-grained quality discrimination. The proposed dataset sheds light on the future development of IQA models based on LMMs. The code will be made publicly available at this https URL.
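As an illustration of the aggregation step described in the abstract, the following is a minimal sketch of how pairwise 2AFC outcomes could be turned into global quality scores by maximum a posteriori estimation. It is not the authors' implementation. The Bradley-Terry preference model, the zero-mean Gaussian prior, the plain gradient-ascent solver, the function name `map_scores`, and the toy win counts are all assumptions made for this example; the abstract does not specify the paper's observer model or optimizer.

```python
import math


def map_scores(wins, prior_var=1.0, lr=0.1, steps=2000):
    """Estimate one quality score per image from 2AFC win counts.

    wins[i][j] is the number of times image i was preferred over image j.
    Assumed model (not taken from the paper): Bradley-Terry likelihood
    with an independent zero-mean Gaussian prior on every score.
    """
    n = len(wins)
    q = [0.0] * n  # start all scores at the prior mean
    for _ in range(steps):
        # Gradient of the Gaussian log-prior.
        grad = [-q[i] / prior_var for i in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j or wins[i][j] == 0:
                    continue
                # Probability that i is preferred over j under Bradley-Terry.
                p = 1.0 / (1.0 + math.exp(q[j] - q[i]))
                # Gradient of wins[i][j] * log(p) with respect to q[i] and q[j].
                grad[i] += wins[i][j] * (1.0 - p)
                grad[j] -= wins[i][j] * (1.0 - p)
        # Gradient-ascent step on the (concave) log-posterior.
        q = [q[i] + lr * grad[i] for i in range(n)]
    return q


# Toy data (hypothetical): three images, each pair compared five times.
wins = [
    [0, 4, 5],  # image 0 beat image 1 four times and image 2 five times
    [1, 0, 4],  # image 1 beat image 0 once and image 2 four times
    [0, 1, 0],  # image 2 beat image 1 once
]
scores = map_scores(wins)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

In this sketch each entry of `wins` would be filled by counting the LMM's answers to 2AFC prompts for the corresponding image pair. The prior keeps the scores finite even when one image wins every comparison, which is the usual reason to prefer MAP over plain maximum likelihood here. The fixed step size is adequate for small comparison counts like these; larger counts would call for a smaller `lr` or a proper optimizer.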

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2402.01162 [cs.CV]
	(or arXiv:2402.01162v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2402.01162

Submission history

From: Hanwei Zhu
[v1] Fri, 2 Feb 2024 06:05:18 UTC (6,235 KB)
