AResNet-ViT: A Hybrid CNN-Transformer Network for Benign and Malignant Breast Nodule Classification in Ultrasound Images

Zhao, Xin; Zhu, Qianqian; Wu, Jialing

Abstract:To address the challenges of similarity between lesions and surrounding tissues, overlapping appearances of partially benign and malignant nodules, and difficulty in classification, a deep learning network that integrates CNN and Transformer is proposed for the classification of benign and malignant breast lesions in ultrasound images. This network adopts a dual-branch architecture for local-global feature extraction, making full use of the advantages of CNN in extracting local features and the ability of ViT to extract global features to enhance the network's feature extraction capabilities for breast nodules. The local feature extraction branch employs a residual network with multiple attention-guided modules, which can effectively capture the local details and texture features of breast nodules, enhance sensitivity to subtle changes within the nodules, and thus can aid in accurate classification of their benign and malignancy. The global feature extraction branch utilizes the multi-head self-attention ViT network, which can capture the overall shape, boundary, and relationship with surrounding tissues, and thereby enhancing the understanding and modeling of both nodule and global image features. Experimental results on a public ultrasound breast nodule data set show that the proposed method is better than other comparison networks, This indicates that the fusion of CNN and Transformer networks can effectively improve the performance of the classification model and provide a powerful solution for the benign-malignant classification of ultrasound breast.

Comments:	12 pages, 3 figures
Subjects:	Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2407.19316 [eess.IV]
	(or arXiv:2407.19316v1 [eess.IV] for this version)
	https://doi.org/10.48550/arXiv.2407.19316

Electrical Engineering and Systems Science > Image and Video Processing

Title:AResNet-ViT: A Hybrid CNN-Transformer Network for Benign and Malignant Breast Nodule Classification in Ultrasound Images

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators