Modelling Temporal Information Using Discrete Fourier Transform for Recognizing Emotions in User-generated Videos

Zhang, Haimin; Xu, Min

arXiv:1603.06568 (cs)

[Submitted on 20 Mar 2016 (v1), last revised 3 Aug 2016 (this version, v2)]

Abstract:With the spread of user-generated Internet videos, emotion recognition in those videos is attracting increasing research effort. However, most existing work relies on frame-level visual features and/or audio features, which may fail to model temporal information, e.g. characteristics accumulated over time. To capture this temporal information, we propose analysing features in the frequency domain, obtained with the discrete Fourier transform (DFT features). Frame-level features are first extracted by a pre-trained deep convolutional neural network (CNN). These time-domain features are then transformed and interpolated into DFT features. The CNN and DFT features are further encoded and fused for emotion classification. In this way, static image features extracted from a pre-trained deep CNN and temporal information represented by DFT features are jointly considered for video emotion recognition. Experimental results demonstrate that adding DFT features effectively captures temporal information and therefore improves emotion recognition performance. Our approach achieves state-of-the-art performance on the largest video emotion dataset (VideoEmotion-8), improving accuracy from 51.1% to 62.6%.
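
The step from frame-level features to DFT features can be illustrated with a minimal sketch. The abstract does not specify the details, so everything here is an assumption made for illustration: the function names (`dft_magnitudes`, `resample_linear`, `dft_features`), the use of magnitude spectra, linear interpolation to a fixed number of bins, and the default `n_bins=16`. It is not the authors' implementation, and the CNN feature extraction, encoding, and fusion stages are omitted.

```python
import cmath
from typing import List, Sequence


def dft_magnitudes(signal: Sequence[float]) -> List[float]:
    """Magnitudes of the non-redundant DFT bins (k = 0 .. T//2) of a real signal."""
    T = len(signal)
    mags = []
    for k in range(T // 2 + 1):
        # Direct evaluation of X[k] = sum_t x[t] * exp(-2*pi*i*k*t / T)
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / T)
                    for t, x in enumerate(signal))
        mags.append(abs(coeff))
    return mags


def resample_linear(values: Sequence[float], n_bins: int) -> List[float]:
    """Linearly interpolate a sequence to exactly n_bins samples."""
    if len(values) == 1:
        return [values[0]] * n_bins
    if n_bins == 1:
        return [values[0]]
    out = []
    for i in range(n_bins):
        pos = i * (len(values) - 1) / (n_bins - 1)  # fractional source index
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


def dft_features(frame_feats: Sequence[Sequence[float]],
                 n_bins: int = 16) -> List[List[float]]:
    """Map T frames x D dims of per-frame features to n_bins x D DFT features.

    Assumption: each feature dimension is treated as a separate 1-D signal
    over time, so videos of different lengths yield same-sized outputs.
    """
    per_dim_signals = list(zip(*frame_feats))  # D signals, each of length T
    per_dim = [resample_linear(dft_magnitudes(sig), n_bins)
               for sig in per_dim_signals]
    return [list(row) for row in zip(*per_dim)]  # back to n_bins x D


# Two hypothetical videos with 10 and 25 frames and 3-dimensional features.
short_video = [[float(t % 2), 1.0, float(t)] for t in range(10)]
long_video = [[float(t % 5), 1.0, float(t)] for t in range(25)]
print(len(dft_features(short_video)), len(dft_features(long_video)))
```

The direct DFT sum is quadratic in the number of frames and is written out only to keep the sketch dependency-free; a fast Fourier transform routine such as `numpy.fft.rfft` would normally be preferred.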

Comments: 5 pages. arXiv admin note: substantial text overlap with arXiv:1603.06182
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:1603.06568 [cs.CV] (or arXiv:1603.06568v2 [cs.CV] for this version)
DOI: https://doi.org/10.48550/arXiv.1603.06568

Submission history

From: Haimin Zhang
[v1] Sun, 20 Mar 2016 04:46:00 UTC (587 KB)
[v2] Wed, 3 Aug 2016 00:53:23 UTC (518 KB)
