Spatial-Temporal Transformer for Video Snapshot Compressive Imaging

Wang, Lishun; Cao, Miao; Zhong, Yong; Yuan, Xin

arXiv:2209.01578 (eess)

[Submitted on 4 Sep 2022 (v1), last revised 8 Sep 2022 (this version, v2)]

Abstract: Video snapshot compressive imaging (SCI) captures multiple sequential video frames in a single measurement using the idea of computational imaging. The underlying principle is to modulate high-speed frames with different masks and sum the modulated frames into a single measurement captured by a low-speed 2D sensor (dubbed the optical encoder); algorithms are then employed to reconstruct the desired high-speed frames (dubbed the software decoder) if needed. In this paper, we consider the reconstruction algorithm in video SCI, i.e., recovering a series of video frames from a compressed measurement. Specifically, we propose a Spatial-Temporal transFormer (STFormer) to exploit the correlation in both the spatial and temporal domains. The STFormer network is composed of a token generation block and a video reconstruction block, connected by a series of STFormer blocks. Each STFormer block consists of a spatial self-attention branch and a temporal self-attention branch, whose outputs are integrated by a fusion network. Extensive results on both simulated and real data demonstrate the state-of-the-art performance of STFormer. The code and models are publicly available at this https URL
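
The optical encoding step described in the abstract (mask each high-speed frame, then sum the masked frames into one 2D measurement) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function name `sci_forward`, the binary random masks, and the toy sizes are assumptions chosen for illustration only, and sensor noise is ignored.

```python
import numpy as np


def sci_forward(frames, masks):
    """Simulate one video SCI measurement (illustrative sketch only).

    frames: array of shape (T, H, W), the high-speed video frames.
    masks:  array of shape (T, H, W), one modulation mask per frame.
    Returns an (H, W) array: the element-wise masked frames summed over time.
    """
    frames = np.asarray(frames, dtype=np.float64)
    masks = np.asarray(masks, dtype=np.float64)
    if frames.shape != masks.shape:
        raise ValueError("frames and masks must have the same (T, H, W) shape")
    # Modulate each frame by its mask, then integrate over the T frames,
    # as a low-speed sensor would during a single exposure.
    return (frames * masks).sum(axis=0)


# Toy example: 8 frames of 4x4 pixels with random binary masks (assumed values).
rng = np.random.default_rng(0)
T, H, W = 8, 4, 4
frames = rng.random((T, H, W))
masks = rng.integers(0, 2, size=(T, H, W))
measurement = sci_forward(frames, masks)
print(measurement.shape)
```

A reconstruction network such as STFormer would take `measurement` together with `masks` as input and estimate the `T` original frames; that decoding step is not sketched here.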

Subjects:	Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2209.01578 [eess.IV]
	(or arXiv:2209.01578v2 [eess.IV] for this version)
	https://doi.org/10.48550/arXiv.2209.01578

Submission history

From: Lishun Wang
[v1] Sun, 4 Sep 2022 09:24:17 UTC (21,715 KB)
[v2] Thu, 8 Sep 2022 04:56:25 UTC (21,715 KB)
