STAA-SNN: Spatial-Temporal Attention Aggregator for Spiking Neural Networks

Zhang, Tianqing; Yu, Kairong; Zhong, Xian; Wang, Hongwei; Xu, Qi; Zhang, Qiang

arXiv:2503.02689 (cs)

[Submitted on 4 Mar 2025 (v1), last revised 5 Mar 2025 (this version, v2)]

Abstract: Spiking Neural Networks (SNNs) have gained significant attention due to their biological plausibility and energy efficiency, making them promising alternatives to Artificial Neural Networks (ANNs). However, the performance gap between SNNs and ANNs remains a substantial obstacle to the widespread adoption of SNNs. In this paper, we propose a Spatial-Temporal Attention Aggregator SNN (STAA-SNN) framework, which dynamically focuses on and captures both spatial and temporal dependencies. First, we introduce a spike-driven self-attention mechanism specifically designed for SNNs. Additionally, we pioneer the use of position encoding to integrate latent temporal relationships into the incoming features. For spatial-temporal information aggregation, we employ step attention to selectively amplify relevant features at different time steps. Finally, we implement a time-step random dropout strategy to avoid local optima. As a result, STAA-SNN effectively captures both spatial and temporal dependencies, enabling the model to analyze complex patterns and make accurate predictions. The framework performs strongly across diverse datasets and generalizes well. Notably, STAA-SNN achieves state-of-the-art results on the neuromorphic dataset CIFAR10-DVS, and accuracies of 97.14%, 82.05%, and 70.40% on the static datasets CIFAR-10, CIFAR-100, and ImageNet, respectively. Furthermore, our model improves performance by 0.33% to 2.80% while using fewer time steps. The code for the model is available on GitHub.
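
The abstract names step attention and time-step random dropout without giving their formulas, so the following is only a minimal illustrative sketch of one plausible reading, not the authors' implementation. It assumes step attention can be approximated by a softmax-weighted sum over per-time-step feature vectors, and that time-step random dropout zeroes out whole time steps at random during training. The function names `step_attention` and `time_step_dropout`, the `scores` input, and the `drop_prob` parameter are all hypothetical.

```python
import math
import random


def step_attention(features, scores):
    """Aggregate T per-step feature vectors with softmax weights.

    features: list of T equal-length lists of floats (one per time step).
    scores:   list of T raw relevance scores (hypothetical; a real model
              would learn these rather than take them as input).
    Returns (aggregated_vector, weights).
    """
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    aggregated = [
        sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)
    ]
    return aggregated, weights


def time_step_dropout(features, drop_prob, rng):
    """Zero out whole time steps at random (assumed reading of the idea).

    Each step is dropped with probability drop_prob; at least one step
    is always kept so the sequence never becomes entirely empty.
    Returns (masked_features, keep_mask).
    """
    steps = len(features)
    keep = [rng.random() >= drop_prob for _ in range(steps)]
    if not any(keep):
        keep[rng.randrange(steps)] = True
    masked = [f if k else [0.0] * len(f) for f, k in zip(features, keep)]
    return masked, keep


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # T = 3 steps, 2 features
    masked, keep = time_step_dropout(feats, drop_prob=0.3, rng=random.Random(0))
    aggregated, weights = step_attention(masked, scores=[0.2, 1.5, 0.7])
    print(keep, weights, aggregated)
```

In this sketch the dropout is applied before aggregation, so the attention weights are computed over a sequence in which some steps may be blank; whether the paper orders the two operations this way is not stated in the abstract.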

Comments:	Accepted by CVPR 2025
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2503.02689 [cs.CV]
	(or arXiv:2503.02689v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2503.02689

Submission history

From: Kairong Yu
[v1] Tue, 4 Mar 2025 15:02:32 UTC (4,550 KB)
[v2] Wed, 5 Mar 2025 03:41:41 UTC (4,550 KB)
