On Learning the Transformer Kernel

Chowdhury, Sankalan Pal; Solomou, Adamos; Dubey, Avinava; Sachan, Mrinmaya

arXiv:2110.08323 (cs)

[Submitted on 15 Oct 2021 (v1), last revised 21 Jul 2022 (this version, v2)]

Abstract: In this work we introduce KERNELIZED TRANSFORMER, a generic, scalable, data-driven framework for learning the kernel function in Transformers. Our framework approximates the Transformer kernel as a dot product between spectral feature maps and learns the kernel by learning the spectral distribution. This not only enables learning a generic kernel end-to-end, but also reduces the time and space complexity of Transformers from quadratic to linear. We show that KERNELIZED TRANSFORMERS achieve performance comparable to existing efficient Transformer architectures in terms of both accuracy and computational efficiency. Our study also demonstrates that the choice of kernel has a substantial impact on performance, and that kernel-learning variants are competitive alternatives to fixed-kernel Transformers on both long- and short-sequence tasks.
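
The idea summarized in the abstract, a kernel written as a dot product of spectral feature maps so that attention costs linear rather than quadratic time in sequence length, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes random Fourier features with frequencies `omega` drawn once from a standard Gaussian, which corresponds to a fixed Gaussian kernel; in the framework described above, the distribution of those frequencies would be learned instead. The function names `spectral_features`, `linear_attention`, and `quadratic_attention` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def spectral_features(x, omega):
    """Map x of shape (n, d) to random Fourier features of shape (n, 2m).

    omega has shape (d, m); each column is one sampled frequency.
    The dot product of two feature vectors averages cos(w . (x - y)) over
    the m frequencies, a Monte Carlo estimate of a shift-invariant kernel.
    """
    proj = x @ omega
    m = omega.shape[1]
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1) / np.sqrt(m)


def linear_attention(q, k, v, omega):
    """Kernel-normalized attention without forming the (n, n) score matrix."""
    phi_q = spectral_features(q, omega)      # (n, 2m)
    phi_k = spectral_features(k, omega)      # (n, 2m)
    kv = phi_k.T @ v                         # (2m, d_v), one pass over the keys
    z = phi_k.sum(axis=0)                    # (2m,), used for the row normalizer
    # Assumption: the approximate kernel row sums stay positive, which holds
    # for the small-norm inputs used below but is not guaranteed in general.
    return (phi_q @ kv) / (phi_q @ z)[:, None]


def quadratic_attention(q, k, v, omega):
    """Same quantity computed the quadratic way, for comparison only."""
    phi_q = spectral_features(q, omega)
    phi_k = spectral_features(k, omega)
    scores = phi_q @ phi_k.T                 # (n, n) approximate kernel matrix
    return (scores @ v) / scores.sum(axis=1, keepdims=True)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, d_v, m = 64, 8, 16, 4096
    q = 0.3 * rng.standard_normal((n, d))
    k = 0.3 * rng.standard_normal((n, d))
    v = rng.standard_normal((n, d_v))
    omega = rng.standard_normal((d, m))      # fixed sample: Gaussian kernel

    fast = linear_attention(q, k, v, omega)
    slow = quadratic_attention(q, k, v, omega)
    print("max |fast - slow|:", np.max(np.abs(fast - slow)))
```

Both functions return the same result up to floating-point error, because matrix multiplication is associative; the linear version simply multiplies the key features with the values first, so its cost grows with the sequence length times the number of features rather than with the square of the sequence length. Making the frequencies in `omega` the output of a trainable sampler is the point at which a kernel-learning variant would differ from this fixed-kernel sketch.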

Comments: Accepted to TMLR
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)
Cite as: arXiv:2110.08323 [cs.LG] (or arXiv:2110.08323v2 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2110.08323

Submission history

From: Sankalan Pal Chowdhury
[v1] Fri, 15 Oct 2021 19:20:25 UTC (9,776 KB)
[v2] Thu, 21 Jul 2022 16:07:06 UTC (5,715 KB)
