DeforestVis: Behavior Analysis of Machine Learning Models with Surrogate Decision Stumps

Chatzimparmpas, Angelos; Martins, Rafael M.; Telea, Alexandru C.; Kerren, Andreas

doi:10.1111/cgf.15004

arXiv:2304.00133 (cs)

[Submitted on 31 Mar 2023 (v1), last revised 18 Apr 2024 (this version, v5)]

Abstract: As the complexity of machine learning (ML) models increases and their application in different (and critical) domains grows, there is a strong demand for more interpretable and trustworthy ML. A direct, model-agnostic way to interpret such models is to train surrogate models, such as rule sets and decision trees, that sufficiently approximate the original ones while being simpler and easier to explain. Yet, rule sets can become very lengthy, with many if-else statements, and decision tree depth grows rapidly when accurately emulating complex ML models. In such cases, both approaches can fail to meet their core goal: providing users with model interpretability. To tackle this, we propose DeforestVis, a visual analytics tool that summarizes the behaviour of complex ML models by providing surrogate decision stumps (one-level decision trees) generated with the Adaptive Boosting (AdaBoost) technique. DeforestVis helps users to explore the complexity versus fidelity trade-off by incrementally generating more stumps, creating attribute-based explanations with weighted stumps to justify decision making, and analysing the impact of rule overriding on training instance allocation between one or more stumps. An independent test set allows users to monitor the effectiveness of manual rule changes and form hypotheses based on case-by-case analyses. We show the applicability and usefulness of DeforestVis with two use cases and expert interviews with data analysts and model developers.
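The surrogate idea described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: it is a plain discrete AdaBoost over decision stumps written with the Python standard library only, and `black_box`, the synthetic two-feature data, and all function names are hypothetical stand-ins chosen for illustration. The stumps are trained on the black-box model's predictions rather than on ground-truth labels, and fidelity (agreement with the black box) is reported as more stumps are used.

```python
import math
import random

def black_box(x):
    # Hypothetical stand-in for a complex model: a diagonal decision boundary.
    return 1 if x[0] + x[1] > 1.0 else -1

def stump_predict(stump, row):
    # A stump is (feature index, threshold, polarity): a one-level decision tree.
    j, t, polarity = stump
    return polarity if row[j] > t else -polarity

def fit_stump(X, y, w):
    # Exhaustively search for the stump with the lowest weighted error.
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive feature values.
        for t in ((a + b) / 2 for a, b in zip(values, values[1:])):
            for polarity in (1, -1):
                stump = (j, t, polarity)
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(stump, row) != yi)
                if best is None or err < best[0]:
                    best = (err, stump)
    return best

def fit_surrogate(X, y, n_stumps):
    # Discrete AdaBoost: reweight instances after each stump is added.
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_stumps):
        err, stump = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # weight of this stump
        ensemble.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def surrogate_predict(ensemble, row, k=None):
    # Weighted vote of the first k stumps (all stumps when k is None).
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble[:k])
    return 1 if score >= 0 else -1

def fidelity(ensemble, X, y_bb, k=None):
    # Fraction of instances where the surrogate agrees with the black box.
    agree = sum(surrogate_predict(ensemble, row, k) == yb
                for row, yb in zip(X, y_bb))
    return agree / len(X)

rng = random.Random(0)
X = [(rng.random(), rng.random()) for _ in range(200)]
y_bb = [black_box(row) for row in X]  # surrogate targets: model outputs, not true labels
ensemble = fit_surrogate(X, y_bb, n_stumps=10)

for k in (1, 3, 10):
    print(f"{k:2d} stumps -> fidelity {fidelity(ensemble, X, y_bb, k):.3f}")
```

Each `(alpha, stump)` pair reads as a single weighted if-else rule on one attribute, which is what makes a stump ensemble easier to inspect than a deep tree. Evaluating prefixes of the ensemble (the `k` argument) is one simple way to look at the complexity versus fidelity trade-off without refitting.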

Comments:	This manuscript is accepted for publication in Computer Graphics Forum (CGF)
Subjects:	Machine Learning (cs.LG); Human-Computer Interaction (cs.HC)
Cite as:	arXiv:2304.00133 [cs.LG]
	(or arXiv:2304.00133v5 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2304.00133
Related DOI:	https://doi.org/10.1111/cgf.15004

Submission history

From: Angelos Chatzimparmpas
[v1] Fri, 31 Mar 2023 21:17:15 UTC (5,608 KB)
[v2] Tue, 4 Jul 2023 19:01:09 UTC (6,033 KB)
[v3] Mon, 23 Oct 2023 19:37:22 UTC (4,597 KB)
[v4] Mon, 18 Mar 2024 17:35:16 UTC (13,087 KB)
[v5] Thu, 18 Apr 2024 16:46:45 UTC (13,087 KB)
