Synchronous Multi-modal Semantic Communication System with Packet-level Coding

Tian, Yun; Ying, Jingkai; Qin, Zhijin; Jin, Ye; Tao, Xiaoming


arXiv:2408.04535 (eess)

[Submitted on 8 Aug 2024 (v1), last revised 11 Aug 2024 (this version, v2)]

Abstract: Although semantic communication with joint semantic-channel coding design has shown promising performance in transmitting data of different modalities over physical-layer channels, the synchronization and packet-level forward error correction of multimodal semantics have not been well studied. Because the semantic encoders are designed independently, synchronizing multimodal features in both the semantic and time domains is a challenging problem. In this paper, we take facial video and speech transmission as an example and propose a Synchronous Multimodal Semantic Communication System (SyncSC) with Packet-Level Coding. To achieve semantic and time synchronization, 3D Morphable Model (3DMM) coefficients and text are transmitted as semantics, and we propose a semantic codec that achieves similar reconstruction quality and synchronization at lower bandwidth than traditional methods. To protect semantic packets over the erasure channel, we propose a packet-level Forward Error Correction (FEC) method, called PacSC, that maintains a certain level of visual quality even at high packet loss rates. In particular, for text packets, a text packet loss concealment module, called TextPC, based on Bidirectional Encoder Representations from Transformers (BERT) is proposed, which significantly improves the performance of traditional FEC methods. The simulation results show that the proposed SyncSC reduces transmission overhead and achieves high-quality synchronous transmission of video and speech over packet-loss networks.
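
As a generic illustration of what packet-level FEC over an erasure channel means, the sketch below uses a single XOR parity packet per group. This is not the PacSC method, whose design is not described in the abstract; the function names (xor_bytes, add_parity, recover) and the equal-length packet assumption are hypothetical choices made only for this example.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def add_parity(packets: List[bytes]) -> List[bytes]:
    """Append one XOR parity packet to a group of equal-length packets."""
    parity = reduce(xor_bytes, packets)
    return packets + [parity]


def recover(received: List[Optional[bytes]]) -> List[bytes]:
    """Return the data packets of a group in which erased packets are None.

    A single parity packet can repair at most one erasure per group.
    """
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("single-parity code cannot repair more than one erasure")
    repaired = list(received)
    if missing:
        # XOR of all surviving packets equals the erased packet.
        survivors = [p for p in received if p is not None]
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired[:-1]  # drop the parity packet


if __name__ == "__main__":
    group = add_parity([b"abcd", b"efgh", b"ijkl"])
    damaged = list(group)
    damaged[1] = None  # simulate one packet erased by the network
    print(recover(damaged))
```

A code of this kind fails once two packets of a group are lost, which is the regime where stronger erasure codes or concealment of the missing content become relevant.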

Comments:	12 pages, 9 figures
Subjects:	Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2408.04535 [eess.IV]
	(or arXiv:2408.04535v2 [eess.IV] for this version)
	https://doi.org/10.48550/arXiv.2408.04535

Submission history

From: Yun Tian
[v1] Thu, 8 Aug 2024 15:42:00 UTC (3,787 KB)
[v2] Sun, 11 Aug 2024 02:37:42 UTC (3,787 KB)
