Efficient Masked Image Compression with Position-Indexed Self-Attention

Dai, Chengjie; Song, Tiantian; Tang, Hui; Chen, Fangdong; Yang, Bowei; Song, Guanghua

arXiv:2504.12923 (cs)

[Submitted on 17 Apr 2025]

Abstract: Image compression for high-level vision tasks has attracted considerable attention in recent years. Because object information matters far more to downstream tasks than background information, some studies semantically structure the bitstream so that only the information those tasks require is transmitted and reconstructed. However, these methods structure the bitstream after encoding, so the coding process still operates on the entire image even though much of the encoded information is never transmitted, which results in redundant computation. Traditional image compression methods also require a two-dimensional image as input: even when a semantic mask sets the unimportant regions of the image to zero, those regions still take part in all subsequent computation as part of the image. To address these limitations, we propose an image compression method based on a position-indexed self-attention mechanism that encodes and decodes only the visible parts of the masked image. Compared with existing semantically structured compression methods, our approach can significantly reduce computational costs.
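
The idea of processing only the visible parts of a masked image can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the architecture, so the patch size, the sinusoidal position code, the unparameterised single-head attention, and every function name below are assumptions chosen for illustration. The sketch drops masked patches before attention, keeps each surviving patch's original grid index, and compares the number of pairwise attention scores with the zero-masked full-image case.

```python
import math

def patchify(image, patch):
    """Split an H x W image (list of rows) into flat patches, row-major."""
    h, w = len(image), len(image[0])
    return [[image[y][x]
             for y in range(py, py + patch)
             for x in range(px, px + patch)]
            for py in range(0, h, patch)
            for px in range(0, w, patch)]

def gather_visible(patches, mask):
    """Keep patches whose mask entry is 1, plus their original grid indices."""
    idx = [i for i, m in enumerate(mask) if m]
    return idx, [patches[i] for i in idx]

def position_code(index, dim):
    """Sinusoidal code of the ORIGINAL grid index (an assumed choice)."""
    return [math.sin(index / 10000 ** (k / dim)) if k % 2 == 0
            else math.cos(index / 10000 ** ((k - 1) / dim))
            for k in range(dim)]

def self_attention(tokens):
    """Unparameterised single-head attention with Q = K = V = tokens."""
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in tokens]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * v[d] for w, v in zip(weights, tokens))
                    for d in range(dim)])
    return out

# Toy 4 x 4 image cut into four 2 x 2 patches; the mask keeps patches 0 and 3.
image = [[float(4 * y + x) for x in range(4)] for y in range(4)]
mask = [1, 0, 0, 1]

patches = patchify(image, patch=2)
positions, visible = gather_visible(patches, mask)

# Each visible token carries a code of where it came from in the full grid,
# so attention over the shortened sequence still sees spatial layout.
tokens = [[v + p for v, p in zip(tok, position_code(i, len(tok)))]
          for i, tok in zip(positions, visible)]
encoded = self_attention(tokens)

pairs_full = len(patches) ** 2     # scores if masked patches were only zeroed
pairs_visible = len(visible) ** 2  # scores when masked patches are dropped
print(positions, pairs_full, pairs_visible)
```

In this toy case, half of the patches are visible, so the number of attention scores falls from 16 to 4. The quadratic dependence on sequence length is a property of standard self-attention in general; the savings reported in the paper depend on its actual architecture.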

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2504.12923 [cs.CV]
	(or arXiv:2504.12923v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2504.12923

Submission history

From: Chengjie Dai
[v1] Thu, 17 Apr 2025 13:12:39 UTC (29,302 KB)
