SmartFL: Semantics Based Probabilistic Fault Localization

Wu, Yiqian; Liu, Yujie; Yin, Yi; Zeng, Muhan; Ye, Zhentao; Zhang, Xin; Xiong, Yingfei; Zhang, Lu

arXiv:2503.23224 (cs)

[Submitted on 29 Mar 2025 (v1), last revised 3 Apr 2025 (this version, v2)]

Abstract: Testing-based fault localization has been a research focus in software engineering for decades. It localizes faulty program elements based on a set of passing and failing test executions. Because whether a test triggers and detects a fault depends on program semantics, it is crucial to model program semantics in fault localization approaches. Existing approaches either consider the full semantics of the program (e.g., mutation-based fault localization and angelic debugging), leading to scalability issues, or ignore the semantics of the program (e.g., spectrum-based fault localization), leading to imprecise localization results. Our key idea is that modeling only the correctness of program values, rather than their full semantics, strikes a balance between effectiveness and scalability. To realize this idea, we introduce a probabilistic model based on an efficient approximation of program semantics, along with several techniques that address scalability challenges. Our approach, SmartFL (SeMantics bAsed pRobabilisTic Fault Localization), is evaluated on a real-world dataset, Defects4J 2.0. The top-1 statement-level accuracy of our approach is 14%, a 130% improvement over the best SBFL and MBFL methods. The average time cost is 205 seconds per fault, which is half that of SBFL methods. When combined with existing approaches using the CombineFL framework, our approach boosts top-1, top-3, and top-5 accuracy by an average of 10% compared to state-of-the-art combination methods.
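To make the contrast in the abstract concrete, the following is a minimal sketch of a spectrum-based baseline, not of SmartFL itself. It assumes the widely used Ochiai formula as the SBFL scoring function; the abstract does not name a specific formula, and the function name `ochiai_scores`, the test names, and the statement ids are all hypothetical. The sketch shows how a purely coverage-based score cannot separate statements that are always executed together, which is the kind of imprecision that modeling value correctness is meant to reduce.

```python
import math


def ochiai_scores(coverage, outcomes):
    """Score statements with the Ochiai formula: ef / sqrt(total_failed * (ef + ep)).

    coverage: test name -> set of executed statement ids (hypothetical data layout)
    outcomes: test name -> True if the test passed, False if it failed
    """
    total_failed = sum(1 for passed in outcomes.values() if not passed)
    statements = set().union(*coverage.values())
    scores = {}
    for stmt in statements:
        # ef / ep: number of failing / passing tests that execute this statement
        ef = sum(1 for t, cov in coverage.items() if stmt in cov and not outcomes[t])
        ep = sum(1 for t, cov in coverage.items() if stmt in cov and outcomes[t])
        denom = math.sqrt(total_failed * (ef + ep))
        scores[stmt] = ef / denom if denom > 0 else 0.0
    return scores


# Toy spectrum: one failing test (t1) and two passing tests (t2, t3).
coverage = {
    "t1": {"s1", "s2", "s3"},
    "t2": {"s1", "s2"},
    "t3": {"s1", "s2", "s3"},
}
outcomes = {"t1": False, "t2": True, "t3": True}

scores = ochiai_scores(coverage, outcomes)
# Rank by descending score, breaking ties by statement id.
ranking = sorted(scores, key=lambda s: (-scores[s], s))
print(ranking)
```

In this toy spectrum, `s1` and `s2` are covered by exactly the same tests and therefore receive identical scores: coverage alone carries no information about which of the two computed a wrong value.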

Comments:	Submitted to IEEE Transactions on Software Engineering. Code: this https URL. This update corrects the author's name.
Subjects:	Software Engineering (cs.SE)
Cite as:	arXiv:2503.23224 [cs.SE]
	(or arXiv:2503.23224v2 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.2503.23224

Submission history

From: Yiqian Wu
[v1] Sat, 29 Mar 2025 21:00:51 UTC (189 KB)
[v2] Thu, 3 Apr 2025 16:35:04 UTC (189 KB)
