Lightweight Models for Emotional Analysis in Video

Nguyen, Quoc-Tien; Nguyen, Hong-Hai; Huynh, Van-Thong

arXiv:2503.10530 (cs)

[Submitted on 13 Mar 2025 (v1), last revised 25 Mar 2025 (this version, v2)]

Abstract: In this study, we present an approach for efficient spatiotemporal feature extraction using MobileNetV4 and a multi-scale 3D MLP-Mixer-based temporal aggregation module. MobileNetV4, with its Universal Inverted Bottleneck (UIB) blocks, serves as the backbone for extracting hierarchical feature representations from input image sequences, providing both computational efficiency and rich semantic encoding. To capture temporal dependencies, we introduce a three-level MLP-Mixer module, which processes spatial features at multiple resolutions while maintaining structural integrity. Experimental results from the 8th ABAW competition demonstrate the effectiveness of our approach, showing promising performance in affective behavior analysis. By integrating an efficient vision backbone with a structured temporal modeling mechanism, the proposed framework balances computational efficiency and predictive accuracy, making it well suited for real-time applications in mobile and embedded computing environments.
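
The abstract gives no implementation details, so the following is only a minimal NumPy sketch of the general idea: per-frame feature maps at three resolutions are pooled, passed through one Mixer block per scale, and fused into per-frame outputs. It is not the authors' implementation. Random arrays stand in for the MobileNetV4 backbone, and the names `MixerBlock` and `aggregate`, the channel widths, spatial sizes, hidden size, fusion by concatenation, and the two-value output head are all hypothetical choices made for illustration. The sketch mixes along time and channels only, which simplifies the 3D mixing named in the abstract.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize over the last (channel) axis.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def gelu(x):
    # Tanh approximation of the GELU activation.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def mlp(x, w1, w2):
    # Two-layer perceptron applied along the last axis.
    return gelu(x @ w1) @ w2

class MixerBlock:
    """One MLP-Mixer block over a (T, C) sequence of per-frame features."""

    def __init__(self, num_frames, channels, rng, hidden=32):
        scale = 0.02
        self.tw1 = rng.normal(0, scale, (num_frames, hidden))
        self.tw2 = rng.normal(0, scale, (hidden, num_frames))
        self.cw1 = rng.normal(0, scale, (channels, hidden))
        self.cw2 = rng.normal(0, scale, (hidden, channels))

    def __call__(self, x):
        # Token (temporal) mixing: transpose to (C, T), mix across frames.
        x = x + mlp(layer_norm(x).T, self.tw1, self.tw2).T
        # Channel mixing: mix across channels within each frame.
        x = x + mlp(layer_norm(x), self.cw1, self.cw2)
        return x

def aggregate(feature_maps, blocks, head):
    # feature_maps: list of (T, C_i, H_i, W_i) arrays, one per scale.
    # Global average pooling over space gives (T, C_i) per scale.
    pooled = [f.mean(axis=(2, 3)) for f in feature_maps]
    mixed = [blk(p) for blk, p in zip(blocks, pooled)]
    fused = np.concatenate(mixed, axis=-1)  # (T, sum of C_i)
    return fused @ head                     # (T, num_outputs)

rng = np.random.default_rng(0)
T = 8                                        # frames per clip (assumed)
shapes = [(32, 28), (64, 14), (128, 7)]      # (channels, spatial size), assumed
# Random stand-ins for backbone outputs at three resolutions.
feature_maps = [rng.normal(size=(T, c, s, s)) for c, s in shapes]
features = [f.mean(axis=(2, 3)) for f in feature_maps]
blocks = [MixerBlock(T, c, rng) for c, _ in shapes]
head = rng.normal(0, 0.02, (sum(c for c, _ in shapes), 2))

preds = aggregate(feature_maps, blocks, head)
print(preds.shape)  # (8, 2): one output vector per frame
```

Giving each scale its own block keeps the per-scale cost proportional to that scale's channel width, which is one plausible way to keep a temporal module lightweight. Whether the paper fuses scales this way is not stated in the abstract.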

Comments:	this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2503.10530 [cs.CV]
	(or arXiv:2503.10530v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2503.10530

Submission history

From: Van Thong Huynh
[v1] Thu, 13 Mar 2025 16:38:33 UTC (27 KB)
[v2] Tue, 25 Mar 2025 03:50:11 UTC (28 KB)
