Gaze-directed Vision GNN for Mitigating Shortcut Learning in Medical Image

Wu, Shaoxuan; Zhang, Xiao; Wang, Bin; Jin, Zhuo; Li, Hansheng; Feng, Jun

Computer Science > Computer Vision and Pattern Recognition

arXiv:2406.14050 (cs)

[Submitted on 20 Jun 2024 (v1), last revised 30 Jul 2024 (this version, v2)]



Abstract: Deep neural networks have demonstrated remarkable performance in medical image analysis. However, their susceptibility to spurious correlations due to shortcut learning raises concerns about network interpretability and reliability. Furthermore, shortcut learning is exacerbated in medical contexts, where disease indicators are often subtle and sparse. In this paper, we propose a novel gaze-directed Vision GNN (called GD-ViG) that leverages radiologists' visual patterns, captured from gaze, as expert knowledge to direct the network toward disease-relevant regions, thereby mitigating shortcut learning. GD-ViG consists of a gaze map generator (GMG) and a gaze-directed classifier (GDC). Combining the global modelling ability of GNNs with the locality of CNNs, the GMG generates a gaze map based on radiologists' visual patterns. Notably, it eliminates the need for real gaze data during inference, enhancing the network's practical applicability. Using gaze as expert knowledge, the GDC directs the construction of graph structures by incorporating both feature distances and gaze distances, enabling the network to focus on disease-relevant foregrounds, thereby avoiding shortcut learning and improving the network's interpretability. Experiments on two public medical image datasets demonstrate that GD-ViG outperforms state-of-the-art methods and effectively mitigates shortcut learning. Our code is available at this https URL.
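The abstract states that graph construction uses both feature distances and gaze distances but does not give the formula. The following is a minimal sketch of one way such a graph could be built, not the authors' implementation. It assumes a k-nearest-neighbour graph over image patches, a scalar gaze intensity per patch, and an additive combination weighted by a hypothetical parameter `alpha`; the function name `knn_graph` and the toy data are likewise illustrative.

```python
import math


def knn_graph(features, gaze, k, alpha=0.5):
    """Return, for each patch, the indices of its k nearest neighbours.

    Distance between patches i and j is assumed (for illustration only) to be
        feature_distance(i, j) + alpha * |gaze[i] - gaze[j]|
    so alpha = 0 reduces to a purely feature-based graph.
    """
    n = len(features)
    neighbours = []
    for i in range(n):
        scored = []
        for j in range(n):
            if j == i:
                continue  # no self-loops
            feat_d = math.dist(features[i], features[j])  # Euclidean distance
            gaze_d = abs(gaze[i] - gaze[j])               # gaze-intensity gap
            scored.append((feat_d + alpha * gaze_d, j))
        scored.sort()
        neighbours.append([j for _, j in scored[:k]])
    return neighbours


# Toy data: four patches with 2-D features and a gaze intensity in [0, 1].
features = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.2), (5.0, 5.0)]
gaze = [0.9, 0.1, 0.8, 0.85]

feature_only = knn_graph(features, gaze, k=1, alpha=0.0)
gaze_aware = knn_graph(features, gaze, k=1, alpha=1.0)
print(feature_only[0], gaze_aware[0])
```

In this toy case, patch 0 links to patch 1 when only features are used, but switches to patch 2 once the gaze term is added, because patches 0 and 2 receive similar gaze intensity. This illustrates the general idea of gaze steering edges toward regions that experts attend to; the actual distance definition in GD-ViG may differ.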

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2406.14050 [cs.CV]
	(or arXiv:2406.14050v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2406.14050

Submission history

From: Shaoxuan Wu
[v1] Thu, 20 Jun 2024 07:16:41 UTC (2,222 KB)
[v2] Tue, 30 Jul 2024 07:41:44 UTC (890 KB)
