Mitigating Transformer Overconfidence via Lipschitz Regularization

Ye, Wenqian; Ma, Yunsheng; Cao, Xu; Tang, Kun

arXiv:2306.06849 (cs)

[Submitted on 12 Jun 2023 (v1), last revised 18 Jul 2023 (this version, v2)]

Abstract: Though Transformers have achieved promising results in many computer vision tasks, they tend to be over-confident in their predictions, because the standard Dot Product Self-Attention (DPSA) can barely preserve distance over an unbounded input domain. In this work, we fill this gap by proposing a novel Lipschitz Regularized Transformer (LRFormer). Specifically, we present a new similarity function based on distance within a Banach space to ensure Lipschitzness, and we regularize the term with a contractive Lipschitz bound. The proposed method is analyzed with a theoretical guarantee, providing a rigorous basis for its effectiveness and reliability. Extensive experiments on standard vision benchmarks demonstrate that our method outperforms state-of-the-art single-forward-pass approaches in prediction, calibration, and uncertainty estimation.
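
The contrast the abstract draws between dot-product similarity and a distance-based similarity can be illustrated with a small sketch. This is not the LRFormer formulation, which the abstract does not spell out: the negative squared Euclidean distance, the `alpha` scale, the function names, and the use of identity query/key projections are all assumptions made for illustration, and the contractive Lipschitz bound is not modeled here.

```python
import math


def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot_scores(queries, keys):
    # Standard scaled dot-product similarity: grows without bound with token norm.
    d = len(queries[0])
    return [
        [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        for q in queries
    ]


def distance_scores(queries, keys, alpha=1.0):
    # Assumed stand-in similarity: negative squared Euclidean distance.
    # Scores are at most 0 and depend only on how far apart q and k are.
    return [
        [-alpha * sum((a - b) ** 2 for a, b in zip(q, k)) for k in keys]
        for q in queries
    ]


def attention_weights(scores):
    # Row-wise softmax turns similarity scores into attention weights.
    return [softmax(row) for row in scores]


if __name__ == "__main__":
    # Hypothetical tokens; the third has a much larger norm than the others.
    tokens = [[1.0, 0.0], [0.0, 1.0], [10.0, 10.0]]
    for name, fn in (("dot", dot_scores), ("distance", distance_scores)):
        weights = attention_weights(fn(tokens, tokens))
        top = [max(range(len(r)), key=r.__getitem__) for r in weights]
        print(name, "most-attended token per query:", top)
```

With these toy tokens, the dot-product weights of every query concentrate on the large-norm third token, while the distance-based weights of each query concentrate on the nearest key (here, the token itself). That is only the intuition behind replacing DPSA, under the assumptions stated above.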

Comments:	Accepted by UAI 2023.
Subjects:	Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2306.06849 [cs.LG]
	(or arXiv:2306.06849v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2306.06849

Submission history

From: Wenqian Ye
[v1] Mon, 12 Jun 2023 03:47:43 UTC (863 KB)
[v2] Tue, 18 Jul 2023 16:20:43 UTC (765 KB)
