DSCA: A Dual-Stream Network with Cross-Attention on Whole-Slide Image Pyramids for Cancer Prognosis

Liu, Pei; Fu, Bo; Ye, Feng; Yang, Rui; Xu, Bin; Ji, Luping

doi:10.1016/j.eswa.2023.120280

arXiv:2206.05782 (eess)

[Submitted on 12 Jun 2022 (v1), last revised 28 Mar 2023 (this version, v4)]

Abstract: Cancer prognosis on gigapixel Whole-Slide Images (WSIs) has always been a challenging task. To further enhance WSI visual representations, existing methods have explored image pyramids in WSIs instead of single-resolution images. Even so, they still face two major problems: high computational cost and an overlooked semantic gap in multi-resolution feature fusion. To tackle these problems, this paper proposes to exploit WSI pyramids efficiently from a new perspective, the dual-stream network with cross-attention (DSCA). Our key idea is to use two sub-streams to process WSI patches at two resolutions: a square pooling is devised in the high-resolution stream to significantly reduce computational cost, and a cross-attention-based method is proposed to properly handle the fusion of dual-stream features. We validate DSCA on three publicly available datasets with a total of 3,101 WSIs from 1,911 patients. Our experiments and ablation studies verify that (i) the proposed DSCA outperforms existing state-of-the-art methods in cancer prognosis, with an average C-Index improvement of around 4.6%; (ii) our DSCA network is more efficient in computation: it has more learnable parameters (6.31M vs. 860.18K) but a lower computational cost (2.51G vs. 4.94G) than a typical existing multi-resolution network; and (iii) the key components of DSCA, dual-stream and cross-attention, indeed contribute to our model's performance, yielding an average C-Index rise of around 2.0% while maintaining a relatively small computational load. DSCA could serve as an alternative and effective tool for WSI-based cancer prognosis.
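
The two mechanisms named in the abstract, square pooling in the high-resolution stream and cross-attention for fusing the two streams, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that each low-resolution patch covers a `scale` x `scale` square of high-resolution patches whose features are stored consecutively, that square pooling is a plain average over that square, and that the low-resolution features act as queries over the pooled high-resolution features. The function names (`square_pool`, `cross_attention`), the projection matrices, and all sizes are hypothetical.

```python
import numpy as np


def square_pool(high_feats, scale=4):
    """Average each group of scale*scale consecutive high-resolution patch
    features (assumed to lie under one low-resolution patch) into one token."""
    group = scale * scale
    n_low = high_feats.shape[0] // group
    return high_feats[: n_low * group].reshape(n_low, group, -1).mean(axis=1)


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(query_feats, context_feats, w_q, w_k, w_v):
    """Scaled dot-product attention: queries come from one stream,
    keys and values from the other."""
    q = query_feats @ w_q
    k = context_feats @ w_k
    v = context_feats @ w_v
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return weights @ v, weights


# Toy data: 3 low-resolution patches, each covering a 4x4 square of
# high-resolution patches, with 8-dimensional features (illustrative sizes).
rng = np.random.default_rng(0)
n_low, dim, scale = 3, 8, 4
low = rng.normal(size=(n_low, dim))
high = rng.normal(size=(n_low * scale * scale, dim))
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) * 0.1 for _ in range(3))

pooled = square_pool(high, scale)  # 48 high-resolution tokens become 3
fused, attn = cross_attention(low, pooled, w_q, w_k, w_v)
print(pooled.shape, fused.shape, attn.shape)
```

In this toy setting, pooling shrinks the high-resolution stream from 48 tokens to 3, so the attention matrix is 3 x 3 rather than 3 x 48. That is the kind of saving the abstract attributes to square pooling, although the actual layer design in the paper may differ.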

Comments:	12 pages, 6 figures, 7 tables
Subjects:	Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:2206.05782 [eess.IV]
	(or arXiv:2206.05782v4 [eess.IV] for this version)
	https://doi.org/10.48550/arXiv.2206.05782
Journal reference:	Expert Systems with Applications, 120280 (2023)
Related DOI:	https://doi.org/10.1016/j.eswa.2023.120280

Submission history

From: Pei Liu
[v1] Sun, 12 Jun 2022 16:29:56 UTC (4,146 KB)
[v2] Wed, 22 Jun 2022 02:21:33 UTC (4,047 KB)
[v3] Fri, 16 Sep 2022 15:51:47 UTC (5,533 KB)
[v4] Tue, 28 Mar 2023 10:44:40 UTC (5,353 KB)
