Enhancing CBMs Through Binary Distillation with Applications to Test-Time Intervention

Shen, Matthew; Hsu, Aliyah; Agarwal, Abhineet; Yu, Bin

arXiv:2503.06730 (cs)

[Submitted on 9 Mar 2025]

Abstract: Concept bottleneck models (CBMs) aim to improve model interpretability by predicting human-level "concepts" in a bottleneck within a deep learning model architecture. However, how the predicted concepts are used to predict the target either remains a black box or is simplified to maintain interpretability at the cost of prediction performance. We propose to use Fast Interpretable Greedy Sum-Trees (FIGS) to obtain Binary Distillation (BD). This new method, called FIGS-BD, distills a binary-augmented concept-to-target portion of the CBM into an interpretable tree-based model while mimicking the competitive prediction performance of the CBM teacher. FIGS-BD can be used in downstream tasks to explain and decompose CBM predictions into interpretable binary-concept-interaction attributions and to guide adaptive test-time intervention. Across four datasets, we demonstrate that adaptive test-time intervention identifies key concepts that significantly improve performance in realistic human-in-the-loop settings that allow only a limited number of concept interventions.

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2503.06730 [cs.LG]
	(or arXiv:2503.06730v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2503.06730

Submission history

From: Matthew Shen
[v1] Sun, 9 Mar 2025 19:03:48 UTC (939 KB)
