Exploring Cognitive and Aesthetic Causality for Multimodal Aspect-Based Sentiment Analysis

Xiao, Luwei; Mao, Rui; Zhao, Shuai; Lin, Qika; Jia, Yanhao; He, Liang; Cambria, Erik

Abstract: Multimodal aspect-based sentiment classification (MASC) is an emerging task, driven by the growth of user-generated multimodal content on social platforms, that aims to predict sentiment polarity toward specific aspect targets (i.e., entities or attributes explicitly mentioned in text-image pairs). Despite extensive efforts and significant achievements in existing MASC research, substantial gaps remain in understanding fine-grained visual content and the cognitive rationales derived from semantic content and impressions (cognitive interpretations of emotions evoked by image content). In this study, we present Chimera, a cognitive and aesthetic sentiment causality understanding framework that derives fine-grained holistic features of aspects and infers the fundamental drivers of sentiment expression from both semantic perspectives and affective-cognitive resonance (the synergistic effect between emotional responses and cognitive interpretations). Specifically, the framework first incorporates visual patch features for patch-word alignment. Meanwhile, it extracts coarse-grained visual features (e.g., overall image representation) and fine-grained visual regions (e.g., aspect-related regions) and translates them into corresponding textual descriptions (e.g., facial, aesthetic). Finally, we leverage the sentimental causes and impressions generated by a large language model (LLM) to enhance the model's awareness of sentimental cues evoked by semantic content and affective-cognitive resonance. Experimental results on standard MASC datasets demonstrate the effectiveness of the proposed model, which also exhibits greater flexibility on MASC than LLMs such as GPT-4o. We have publicly released the complete implementation and dataset at this https URL.
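
The abstract mentions patch-word alignment without saying how it is computed. As a rough illustration of the general idea only, the sketch below scores every image patch against every word with cosine similarity and normalizes the scores per word with a softmax. The function names, the toy 2-dimensional features, and the choice of cosine similarity are all assumptions made for this example; this is not Chimera's implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def align_patches_to_words(patch_feats, word_feats):
    """Hypothetical helper: for each word, return a distribution over patches."""
    return [
        softmax([cosine(word, patch) for patch in patch_feats])
        for word in word_feats
    ]


# Toy features (assumed values): three image patches and two words.
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
words = [[1.0, 0.0], [0.0, 2.0]]

weights = align_patches_to_words(patches, words)
# Index of the highest-weighted patch for each word.
best = [max(range(len(row)), key=row.__getitem__) for row in weights]
print(best)
```

In a real model the patch and word features would come from learned visual and text encoders, and the alignment would typically be trained end to end rather than computed from fixed vectors.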

Comments:	Accepted by TAFFC 2025
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2504.15848 [cs.CL]
	(or arXiv:2504.15848v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2504.15848
