What are You Looking at? Modality Contribution in Multimodal Medical Deep Learning Methods

Gapp, Christian; Tappeiner, Elias; Welk, Martin; Fritscher, Karl; Gizewski, Elke Ruth; Schubert, Rainer

arXiv:2503.01904 (cs)

[Submitted on 28 Feb 2025]

Abstract:

Purpose: High-dimensional, multimodal data can now be analyzed by very large deep neural networks with little effort. Several fusion methods for combining different modalities have been developed. Particularly in medicine, where high-dimensional multimodal patient data are abundant, multimodal models represent the next step. However, how these models process the source information in detail remains largely unexplored.

Methods: To this end, we implemented an occlusion-based modality contribution method that is both model- and performance-agnostic and that quantitatively measures the importance of each modality in the dataset for the model to fulfill its task. We applied our method to three different multimodal medical problems for experimental purposes.

Results: We found that some networks have modality preferences that tend toward unimodal collapse, while some datasets are imbalanced from the ground up. Moreover, we could determine a link between our metric and the performance of networks trained on a single modality.

Conclusion: The information gained through our metric holds remarkable potential to improve the development of multimodal models and the creation of datasets in the future. With our method we make a crucial contribution to the field of interpretability in deep-learning-based multimodal research and thereby notably advance the integration of multimodal AI into clinical practice. Our code is publicly available at this https URL.
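The abstract does not spell out how the contribution metric is computed, so the following is only a minimal sketch of the general idea of an occlusion-based, model- and performance-agnostic modality score, not the authors' implementation. It assumes that occlusion means replacing a whole modality with a fixed baseline, that the effect is measured as the absolute change in the model output, and that scores are normalized to sum to 1. The names `occlude`, `modality_contributions`, `toy_model`, and the baseline values are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Sample = Dict[str, List[float]]               # modality name -> feature vector
Model = Callable[[Sample], Sequence[float]]   # any black-box predictor


def occlude(sample: Sample, modality: str, baseline: Sample) -> Sample:
    """Return a copy of `sample` with one modality replaced by its baseline."""
    occluded = dict(sample)
    occluded[modality] = list(baseline[modality])
    return occluded


def modality_contributions(
    model: Model, samples: List[Sample], baseline: Sample
) -> Dict[str, float]:
    """Share of the total output change caused by occluding each modality.

    Only model outputs are compared, so no labels and no performance
    metric are needed (assumed reading of "performance agnostic").
    """
    modalities = list(baseline)
    change = {m: 0.0 for m in modalities}
    for sample in samples:
        reference = model(sample)
        for m in modalities:
            perturbed = model(occlude(sample, m, baseline))
            change[m] += sum(abs(r - p) for r, p in zip(reference, perturbed))
    total = sum(change.values())
    if total == 0.0:
        # The model output never changed under occlusion: report no contribution.
        return {m: 0.0 for m in modalities}
    return {m: change[m] / total for m in modalities}


def toy_model(sample: Sample) -> List[float]:
    """Hypothetical stand-in model that relies mostly on the image modality."""
    return [0.9 * sum(sample["image"]) + 0.1 * sum(sample["tabular"])]


samples = [{"image": [1.0, 2.0], "tabular": [3.0, 1.0]}]
baseline = {"image": [0.0, 0.0], "tabular": [0.0, 0.0]}  # assumed zero baseline
scores = modality_contributions(toy_model, samples, baseline)
print(scores)
```

Because the function only calls `model` on original and occluded inputs, it works for any architecture or fusion scheme. A score close to 1 for a single modality would correspond to the kind of unimodal collapse the abstract mentions. The choice of baseline (zeros here) is an assumption and can strongly affect the scores.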

Comments:	Contribution to Conference for Computer Assisted Radiology and Surgery (CARS 2025)
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
ACM classes:	I.2.1
Cite as:	arXiv:2503.01904 [cs.CV]
	(or arXiv:2503.01904v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2503.01904

Submission history

From: Christian Gapp
[v1] Fri, 28 Feb 2025 12:39:39 UTC (2,265 KB)
