FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention

Nguyen, Tan M.; Suliafu, Vai; Osher, Stanley J.; Chen, Long; Wang, Bao

arXiv:2108.02347 (cs)

[Submitted on 5 Aug 2021]

Abstract: We propose FMMformers, a class of efficient and flexible transformers inspired by the celebrated fast multipole method (FMM) for accelerating interacting particle simulation. FMM decomposes particle-particle interaction into near-field and far-field components and then performs direct and coarse-grained computation, respectively. Similarly, FMMformers decompose the attention into near-field and far-field attention, modeling the near-field attention by a banded matrix and the far-field attention by a low-rank matrix. Computing the attention matrix for FMMformers requires linear complexity in computational time and memory footprint with respect to the sequence length. In contrast, standard transformers suffer from quadratic complexity. We analyze and validate the advantage of FMMformers over the standard transformer on the Long Range Arena and language modeling benchmarks. FMMformers can even outperform the standard transformer in terms of accuracy by a significant margin. For instance, FMMformers achieve an average classification accuracy of $60.74\%$ over the five Long Range Arena tasks, which is significantly better than the standard transformer's average accuracy of $58.70\%$.
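The near-field/far-field decomposition described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the `elu(x) + 1` feature map used for the low-rank term, and the fixed blending weights `w_near` and `w_far` are assumptions made for illustration only. The sketch shows why both terms avoid forming the full N x N attention matrix: the banded term touches at most `2 * bandwidth + 1` keys per query, and the low-rank term first reduces keys and values to a small summary that is independent of sequence length.

```python
import numpy as np

def near_field_attention(Q, K, V, bandwidth):
    """Banded softmax attention: query i attends only to keys j with |i - j| <= bandwidth."""
    n, d = Q.shape
    out = np.zeros_like(V, dtype=float)
    for i in range(n):
        lo, hi = max(0, i - bandwidth), min(n, i + bandwidth + 1)
        scores = Q[i] @ K[lo:hi].T / np.sqrt(d)
        weights = np.exp(scores - scores.max())  # numerically stable softmax
        weights /= weights.sum()
        out[i] = weights @ V[lo:hi]
    return out

def feature_map(X):
    """Positive feature map elu(x) + 1 (an assumed choice for this sketch)."""
    return np.where(X > 0, X + 1.0, np.exp(X))

def far_field_attention(Q, K, V):
    """Low-rank (kernelized) attention that never builds the N x N matrix."""
    phi_q, phi_k = feature_map(Q), feature_map(K)
    kv = phi_k.T @ V          # (d, d_v) summary of keys and values
    z = phi_k.sum(axis=0)     # (d,) normalizer summary
    return (phi_q @ kv) / (phi_q @ z)[:, None]

def fmm_style_attention(Q, K, V, bandwidth=2, w_near=0.5, w_far=0.5):
    """Blend of banded and low-rank attention; the weights are hypothetical constants here."""
    near = near_field_attention(Q, K, V, bandwidth)
    far = far_field_attention(Q, K, V)
    return w_near * near + w_far * far

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(8, 4)) for _ in range(3))
out = fmm_style_attention(Q, K, V)
print(out.shape)
```

For sequence length N, head dimension d, and bandwidth k, the banded term costs on the order of N·k·d operations and the low-rank term on the order of N·d², so both scale linearly in N under these assumptions.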

Comments:	18 pages, 8 figures
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Numerical Analysis (math.NA)
MSC classes:	68T07
ACM classes:	I.2
Cite as:	arXiv:2108.02347 [cs.LG]
	(or arXiv:2108.02347v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2108.02347

Submission history

From: Bao Wang
[v1] Thu, 5 Aug 2021 03:21:30 UTC (5,158 KB)