How You Split Matters: Data Leakage and Subject Characteristics Studies in Longitudinal Brain MRI Analysis

Rumala, Dewinda Julianensi

Electrical Engineering and Systems Science > Image and Video Processing

arXiv:2309.00350 (eess)

[Submitted on 1 Sep 2023]

Title:How You Split Matters: Data Leakage and Subject Characteristics Studies in Longitudinal Brain MRI Analysis

Authors:Dewinda Julianensi Rumala

View PDF

Abstract:Deep learning models have revolutionized the field of medical image analysis, offering significant promise for improved diagnostics and patient care. However, their performance can be misleadingly optimistic due to a hidden pitfall called 'data leakage'. In this study, we investigate data leakage in 3D medical imaging, specifically using 3D Convolutional Neural Networks (CNNs) for brain MRI analysis. While 3D CNNs appear less prone to leakage than 2D counterparts, improper data splitting during cross-validation (CV) can still pose issues, especially with longitudinal imaging data containing repeated scans from the same subject. We explore the impact of different data splitting strategies on model performance for longitudinal brain MRI analysis and identify potential data leakage concerns. GradCAM visualization helps reveal shortcuts in CNN models caused by identity confounding, where the model learns to identify subjects along with diagnostic features. Our findings, consistent with prior research, underscore the importance of subject-wise splitting and evaluating our model further on hold-out data from different subjects to ensure the integrity and reliability of deep learning models in medical image analysis.

Comments:	submitted to MICCAI FAIMI 2023
Subjects:	Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2309.00350 [eess.IV]
	(or arXiv:2309.00350v1 [eess.IV] for this version)
	https://doi.org/10.48550/arXiv.2309.00350

Submission history

From: Dewinda Julianensi Rumala [view email]
[v1] Fri, 1 Sep 2023 09:15:06 UTC (3,107 KB)

Electrical Engineering and Systems Science > Image and Video Processing

Title:How You Split Matters: Data Leakage and Subject Characteristics Studies in Longitudinal Brain MRI Analysis

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Electrical Engineering and Systems Science > Image and Video Processing

Title:How You Split Matters: Data Leakage and Subject Characteristics Studies in Longitudinal Brain MRI Analysis

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators