NPVForensics: Jointing Non-critical Phonemes and Visemes for Deepfake Detection

Chen, Yu; Yu, Yang; Ni, Rongrong; Zhao, Yao; Li, Haoliang

Abstract:Deepfake technologies empowered by deep learning are rapidly evolving, creating new security concerns for society. Existing multimodal detection methods usually capture audio-visual inconsistencies to expose Deepfake videos. More seriously, the advanced Deepfake technology realizes the audio-visual calibration of the critical phoneme-viseme regions, achieving a more realistic tampering effect, which brings new challenges. To address this problem, we propose a novel Deepfake detection method to mine the correlation between Non-critical Phonemes and Visemes, termed NPVForensics. Firstly, we propose the Local Feature Aggregation block with Swin Transformer (LFA-ST) to construct non-critical phoneme-viseme and corresponding facial feature streams effectively. Secondly, we design a loss function for the fine-grained motion of the talking face to measure the evolutionary consistency of non-critical phoneme-viseme. Next, we design a phoneme-viseme awareness module for cross-modal feature fusion and representation alignment, so that the modality gap can be reduced and the intrinsic complementarity of the two modalities can be better explored. Finally, a self-supervised pre-training strategy is leveraged to thoroughly learn the audio-visual correspondences in natural videos. In this manner, our model can be easily adapted to the downstream Deepfake datasets with fine-tuning. Extensive experiments on existing benchmarks demonstrate that the proposed approach outperforms state-of-the-art methods.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2306.06885 [cs.CV]
	(or arXiv:2306.06885v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2306.06885

Computer Science > Computer Vision and Pattern Recognition

Title:NPVForensics: Jointing Non-critical Phonemes and Visemes for Deepfake Detection

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators