Generalized Attention Flow: Feature Attribution for Transformer Models via Maximum Flow

Azarkhalili, Behrooz; Libbrecht, Maxwell

arXiv:2502.15765 (cs)

[Submitted on 14 Feb 2025]

Abstract: This paper introduces Generalized Attention Flow (GAF), a novel feature attribution method for Transformer-based models to address the limitations of current approaches. By extending Attention Flow and replacing attention weights with the generalized Information Tensor, GAF integrates attention weights, their gradients, the maximum flow problem, and the barrier method to enhance the performance of feature attributions. The proposed method exhibits key theoretical properties and mitigates the shortcomings of prior techniques that rely solely on simple aggregation of attention weights. Our comprehensive benchmarking on sequence classification tasks demonstrates that a specific variant of GAF consistently outperforms state-of-the-art feature attribution methods in most evaluation settings, providing a more reliable interpretation of Transformer model outputs.
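
To make the maximum-flow idea in the abstract concrete, the following is a minimal sketch of plain attention-flow-style attribution, not the GAF method itself. It assumes a layered graph in which each token at layer l connects to each token at layer l+1, with the attention weight as edge capacity, and scores an input token by the maximum flow from that token to a chosen top-layer position. The names `max_flow` and `attention_flow`, the toy attention matrices, and the use of raw attention weights as capacities are illustrative assumptions; GAF's Information Tensor and the barrier method are not reproduced here.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dense capacity matrix (list of lists)."""
    n = len(capacity)
    residual = [row[:] for row in capacity]
    total = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = [-1] * n
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            return total  # no augmenting path left
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while v != source:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push flow and update residual capacities in both directions.
        v = sink
        while v != source:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


def attention_flow(attn, target=0):
    """Score each input token by max flow to position `target` at the top layer.

    attn[l][i][j] is assumed to be the weight with which position i in
    layer l+1 attends to position j in layer l (hypothetical toy input).
    """
    num_layers, n = len(attn), len(attn[0])
    size = (num_layers + 1) * n

    def node(layer, tok):
        return layer * n + tok

    cap = [[0.0] * size for _ in range(size)]
    for l in range(num_layers):
        for i in range(n):
            for j in range(n):
                cap[node(l, j)][node(l + 1, i)] = attn[l][i][j]
    sink = node(num_layers, target)
    return [max_flow(cap, node(0, j), sink) for j in range(n)]


# Toy example: 2 layers, 2 tokens; attribute the top-layer position 0.
toy_attn = [
    [[0.7, 0.3], [0.4, 0.6]],
    [[0.5, 0.5], [0.2, 0.8]],
]
scores = attention_flow(toy_attn, target=0)
print(scores)  # roughly [0.9, 0.8]
```

In this toy case the first token receives the higher score because its two paths to the target carry 0.5 and 0.4 units of flow, against 0.3 and 0.5 for the second token. Per the abstract, GAF departs from this baseline by replacing the raw attention capacities with a generalized Information Tensor that also incorporates attention gradients.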

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2502.15765 [cs.LG]
	(or arXiv:2502.15765v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2502.15765

Submission history

From: Behrooz Azarkhalili
[v1] Fri, 14 Feb 2025 19:50:58 UTC (16,524 KB)
