Electrical Engineering and Systems Science > Image and Video Processing
[Submitted on 7 Jan 2025]
Title: Efficient and Accurate Tuberculosis Diagnosis: Attention Residual U-Net and Vision Transformer Based Detection Framework
Abstract: Tuberculosis (TB), an infectious disease caused by Mycobacterium tuberculosis, continues to be a major global health threat despite being preventable and curable. The burden is particularly high in low- and middle-income countries. Microscopy remains essential for diagnosing TB because it enables direct visualization of Mycobacterium tuberculosis in sputum smear samples, offering a cost-effective approach to early detection and effective treatment. Because microscopy is labour-intensive, automating the detection of bacilli in microscopic images is crucial for improving both the speed and the reliability of TB diagnosis. Current methods for detecting tuberculosis bacilli in bright-field microscopic sputum smear images are hindered by limited automation, inconsistent segmentation quality, and constrained classification precision. This paper proposes a two-stage deep learning methodology for tuberculosis bacilli detection, comprising bacilli segmentation followed by classification. In the first stage, an advanced U-Net model employing attention blocks and residual connections segments the microscopic sputum smear images, enabling the extraction of Regions of Interest (ROIs). In the second stage, the extracted ROIs are classified using a Vision Transformer, customized as TBViT, to improve the precision of bacilli detection within the images. The experiments use a newly developed dataset of microscopic sputum smear images derived from Ziehl-Neelsen-stained slides in conjunction with existing public datasets. Qualitative and quantitative evaluation using various metrics demonstrates that the proposed model achieves significantly improved segmentation performance, higher classification accuracy, and a greater level of automation, surpassing existing methods.
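The abstract does not say how ROIs are obtained from the segmentation output, so the following is only a minimal sketch of one common way to bridge the two stages. It assumes the segmentation stage yields a binary mask, and it uses 4-connected component labeling to turn that mask into bounding boxes that could then be cropped and passed to a classifier. The function names `extract_rois` and `crop`, the `min_area` parameter, and the connectivity choice are all hypothetical and are not taken from the paper.

```python
from collections import deque


def extract_rois(mask, min_area=1):
    """Return bounding boxes (top, left, bottom, right), inclusive, for each
    4-connected foreground component in a binary mask (list of lists).

    Assumption: this stands in for the ROI extraction step; the paper's
    actual procedure is not described in the abstract.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected component.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Discard components smaller than min_area (hypothetical noise filter).
            if area >= min_area:
                boxes.append((top, left, bottom, right))
    return boxes


def crop(image, box):
    """Crop an image (list of rows) to an inclusive bounding box."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Toy 4x5 mask with two separate blobs.
toy_mask = [
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
toy_boxes = extract_rois(toy_mask)
toy_patches = [crop(toy_mask, box) for box in toy_boxes]
```

In a full pipeline each cropped patch would be resized and handed to the classification stage. A production implementation would more likely rely on an image-processing library for labeling, and might pad each box so that the classifier sees some surrounding context.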