4D Multimodal Co-attention Fusion Network with Latent Contrastive Alignment for Alzheimer's Diagnosis

Wei, Yuxiang; Zhang, Yanteng; Xiao, Xi; Wang, Tianyang; Wang, Xiao; Calhoun, Vince D.

Computer Science > Multimedia

arXiv:2504.16798 (cs)

[Submitted on 23 Apr 2025]

Title:4D Multimodal Co-attention Fusion Network with Latent Contrastive Alignment for Alzheimer's Diagnosis

Authors:Yuxiang Wei, Yanteng Zhang, Xi Xiao, Tianyang Wang, Xiao Wang, Vince D. Calhoun

View PDF HTML (experimental)

Abstract:Multimodal neuroimaging provides complementary structural and functional insights into both human brain organization and disease-related dynamics. Recent studies demonstrate enhanced diagnostic sensitivity for Alzheimer's disease (AD) through synergistic integration of neuroimaging data (e.g., sMRI, fMRI) with behavioral cognitive scores tabular data biomarkers. However, the intrinsic heterogeneity across modalities (e.g., 4D spatiotemporal fMRI dynamics vs. 3D anatomical sMRI structure) presents critical challenges for discriminative feature fusion. To bridge this gap, we propose M2M-AlignNet: a geometry-aware multimodal co-attention network with latent alignment for early AD diagnosis using sMRI and fMRI. At the core of our approach is a multi-patch-to-multi-patch (M2M) contrastive loss function that quantifies and reduces representational discrepancies via geometry-weighted patch correspondence, explicitly aligning fMRI components across brain regions with their sMRI structural substrates without one-to-one constraints. Additionally, we propose a latent-as-query co-attention module to autonomously discover fusion patterns, circumventing modality prioritization biases while minimizing feature redundancy. We conduct extensive experiments to confirm the effectiveness of our method and highlight the correspondance between fMRI and sMRI as AD biomarkers.

Subjects:	Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:2504.16798 [cs.MM]
	(or arXiv:2504.16798v1 [cs.MM] for this version)
	https://doi.org/10.48550/arXiv.2504.16798

Submission history

From: Yuxiang Wei [view email]
[v1] Wed, 23 Apr 2025 15:18:55 UTC (1,746 KB)

Computer Science > Multimedia

Title:4D Multimodal Co-attention Fusion Network with Latent Contrastive Alignment for Alzheimer's Diagnosis

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Multimedia

Title:4D Multimodal Co-attention Fusion Network with Latent Contrastive Alignment for Alzheimer's Diagnosis

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators