Cached Adaptive Token Merging: Dynamic Token Reduction and Redundant Computation Elimination in Diffusion Model

Saghatchian, Omid; Moghadam, Atiyeh Gh.; Nickabadi, Ahmad

Computer Science > Computer Vision and Pattern Recognition

arXiv:2501.00946 (cs)

[Submitted on 1 Jan 2025]

Title:Cached Adaptive Token Merging: Dynamic Token Reduction and Redundant Computation Elimination in Diffusion Model

Authors:Omid Saghatchian, Atiyeh Gh. Moghadam, Ahmad Nickabadi

View PDF HTML (experimental)

Abstract:Diffusion models have emerged as a promising approach for generating high-quality, high-dimensional images. Nevertheless, these models are hindered by their high computational cost and slow inference, partly due to the quadratic computational complexity of the self-attention mechanisms with respect to input size. Various approaches have been proposed to address this drawback. One such approach focuses on reducing the number of tokens fed into the self-attention, known as token merging (ToMe). In our method, which is called cached adaptive token merging(CA-ToMe), we calculate the similarity between tokens and then merge the r proportion of the most similar tokens. However, due to the repetitive patterns observed in adjacent steps and the variation in the frequency of similarities, we aim to enhance this approach by implementing an adaptive threshold for merging tokens and adding a caching mechanism that stores similar pairs across several adjacent steps. Empirical results demonstrate that our method operates as a training-free acceleration method, achieving a speedup factor of 1.24 in the denoising process while maintaining the same FID scores compared to existing approaches.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2501.00946 [cs.CV]
	(or arXiv:2501.00946v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2501.00946

Submission history

From: Omid Saghatchian [view email]
[v1] Wed, 1 Jan 2025 20:16:27 UTC (7,565 KB)

Computer Science > Computer Vision and Pattern Recognition

Title:Cached Adaptive Token Merging: Dynamic Token Reduction and Redundant Computation Elimination in Diffusion Model

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computer Vision and Pattern Recognition

Title:Cached Adaptive Token Merging: Dynamic Token Reduction and Redundant Computation Elimination in Diffusion Model

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators