Information Theory
- [1] arXiv:2407.00579 [pdf, html, other]
Title: Active-RIS-Aided Covert Communications in NOMA-Inspired ISAC Wireless Systems
Authors: Miaomiao Zhu, Pengxu Chen, Liang Yang, Alexandros-Apostolos A. Boulogeorgos, Theodoros A. Tsiftsis, Hongwu Liu
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
Non-orthogonal multiple access (NOMA)-inspired integrated sensing and communication (ISAC) facilitates spectrum sharing for radar sensing and NOMA communications, while facing privacy and security challenges due to open wireless propagation. In this paper, an active reconfigurable intelligent surface (RIS) is employed to aid covert communications in a NOMA-inspired ISAC wireless system with the aim of maximizing the covert rate. Specifically, a dual-function base station (BS) transmits the superposition signal to sense multiple targets, while achieving covert and reliable communications for a pair of NOMA covert and public users, respectively, in the presence of a warden. Two superposition transmission schemes, namely, the transmissions with dedicated sensing signal (w-DSS) and without dedicated sensing signal (w/o-DSS), are respectively considered in the formulations of the joint transmission and reflection beamforming optimization problems. Numerical results demonstrate that the active-RIS-aided NOMA-ISAC system outperforms its passive-RIS-aided and without-RIS counterparts in terms of covert rate and the trade-off between covert communication and sensing performance metrics. Finally, the w/o-DSS scheme, which omits the dedicated sensing signal, achieves a higher covert rate than the w-DSS scheme by allocating more transmit power to the covert transmissions, while preserving comparable multi-target sensing performance.
- [2] arXiv:2407.00677 [pdf, html, other]
Title: Combinatorial Multi-Access Coded Caching with Private Caches
Comments: 13 pages and 6 figures
Subjects: Information Theory (cs.IT)
We consider a variant of the coded caching problem where users connect to two types of caches, called private and access caches. The problem setting consists of a server with a library of files and a set of access caches. Each user, equipped with a private cache, connects to a distinct $r$-subset of the access caches. The server populates both types of caches with files in uncoded format. For this setting, we provide an achievable scheme and derive a lower bound on the number of transmissions for this scheme. We also present a lower and upper bound for the optimal worst-case rate under uncoded placement for this setting using the rates of the Maddah-Ali--Niesen scheme for dedicated and combinatorial multi-access coded caching settings, respectively. Further, we derive a lower bound on the optimal worst-case rate for any general placement policy using cut-set arguments. We also provide numerical plots comparing the rate of the proposed achievability scheme with the above bounds, from which it can be observed that the proposed scheme approaches the lower bound when the amount of memory accessed by a user is large. Finally, we discuss optimality with respect to the worst-case rate when the system has four access caches.
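As a point of reference for the bounds mentioned in the abstract above, the following minimal sketch evaluates the well-known worst-case rate of the Maddah-Ali--Niesen scheme for the plain dedicated-cache setting, R = (K - t)/(1 + t) with t = KM/N. It is not the scheme proposed in the paper. The function name `man_rate` and its parameters are illustrative assumptions, and the formula is used only at corner points where t is an integer and the library has at least as many files as there are users.

```python
from fractions import Fraction


def man_rate(num_users: int, num_files: int, cache_size: int) -> Fraction:
    """Worst-case rate of the Maddah-Ali--Niesen scheme with dedicated caches.

    Assumptions (not taken from the listed paper): num_files >= num_users,
    and t = K*M/N is an integer, i.e. we sit at a corner point of the
    memory-rate curve.
    """
    if num_files < num_users:
        raise ValueError("this sketch assumes at least as many files as users")
    t = Fraction(num_users * cache_size, num_files)
    if t.denominator != 1:
        raise ValueError("K*M/N must be an integer at a corner point")
    # Each coded transmission serves t + 1 users at once.
    return (num_users - t) / (1 + t)


if __name__ == "__main__":
    # Hypothetical system: 4 users, 4 files, cache sizes 0..4 files.
    for cache_size in range(5):
        print(cache_size, man_rate(4, 4, cache_size))
```

Using `Fraction` keeps the corner-point rates exact, which is convenient when comparing against other bounds by hand.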
- [3] arXiv:2407.00955 [pdf, html, other]
Title: Task-oriented Over-the-air Computation for Edge-device Co-inference with Balanced Classification Accuracy
Comments: This paper was accepted by IEEE Transactions on Vehicular Technology on June 30, 2024
Subjects: Information Theory (cs.IT); Artificial Intelligence (cs.AI); Signal Processing (eess.SP)
Edge-device co-inference, which concerns the cooperation between edge devices and an edge server for completing inference tasks over wireless networks, has been a promising technique for enabling various kinds of intelligent services at the network edge, e.g., auto-driving. In this paradigm, the design objective of the network shifts from the traditional communication throughput to the effective and efficient execution of the inference task underpinned by the network, measured by, e.g., the inference accuracy and latency. In this paper, a task-oriented over-the-air computation scheme is proposed for a multidevice artificial intelligence system. Particularly, a novel tractable inference accuracy metric is proposed for classification tasks, which is called minimum pair-wise discriminant gain. Unlike prior work measuring the average of all class pairs in feature space, it measures the minimum distance of all class pairs. By maximizing the minimum pair-wise discriminant gain instead of its average counterpart, any pair of classes can be better separated in the feature space, thus leading to a balanced and improved inference accuracy for all classes. Besides, this paper jointly optimizes the minimum discriminant gain of all feature elements instead of separately maximizing that of each element as in existing designs. As a result, the transmit power can be adaptively allocated to the feature elements according to their different contributions to the inference accuracy, opening an extra degree of freedom to improve inference performance. Extensive experiments are conducted using a concrete use case of human motion recognition to verify the superiority of the proposed design over the benchmarking scheme.
- [4] arXiv:2407.01018 [pdf, other]
Title: Experimental Comparison of Average-Power Constrained and Peak-Power Constrained 64QAM under Optimal Clipping in 400Gbps Unamplified Coherent Links
Comments: Submitted to European Conference on Optical Communications (ECOC) 2024
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
We experimentally demonstrated an end-to-end link budget optimization over clipping in 400Gbps unamplified links, showing that the clipped MB distribution outperforms the peak-power constrained 64QAM by 1dB link budget.
- [5] arXiv:2407.01167 [pdf, html, other]
Title: Information Density Bounds for Privacy
Authors: Sara Saeidian (1), Leonhard Grosse (1), Parastoo Sadeghi (2), Mikael Skoglund (1), Tobias J. Oechtering (1) ((1) KTH Royal Institute of Technology, (2) University of New South Wales)
Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR)
This paper explores the implications of guaranteeing privacy by imposing a lower bound on the information density between the private and the public data. We introduce an operationally meaningful privacy measure called pointwise maximal cost (PMC) and demonstrate that imposing an upper bound on PMC is equivalent to enforcing a lower bound on the information density. PMC quantifies the information leakage about a secret to adversaries who aim to minimize non-negative cost functions after observing the outcome of a privacy mechanism. When restricted to finite alphabets, PMC can equivalently be defined as the information leakage to adversaries aiming to minimize the probability of incorrectly guessing randomized functions of the secret. We study the properties of PMC and apply it to standard privacy mechanisms to demonstrate its practical relevance. Through a detailed examination, we connect PMC with other privacy measures that impose upper or lower bounds on the information density. Our results highlight that lower bounding the information density is a more stringent requirement than upper bounding it. Overall, our work significantly bridges the gaps in understanding the relationships between various privacy frameworks and provides insights for selecting a suitable framework for a given application.
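To make the central quantity of the abstract above concrete, the sketch below computes the information density i(x;y) = log(P(x,y) / (P(x)P(y))) for a small joint distribution and reports its smallest and largest values, which are the quantities that lower and upper bounds on information density constrain. The function name and the randomized-response example are illustrative assumptions. The sketch does not implement pointwise maximal cost itself.

```python
import math


def information_density(joint):
    """Return {(x, y): i(x;y)} in nats for all pairs with P(x, y) > 0.

    joint[x][y] is the joint probability of secret x and released value y.
    """
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    density = {}
    for x, row in enumerate(joint):
        for y, p in enumerate(row):
            if p > 0:
                density[(x, y)] = math.log(p / (p_x[x] * p_y[y]))
    return density


if __name__ == "__main__":
    # Hypothetical mechanism: a uniform secret bit released through
    # randomized response that flips the bit with probability 0.25.
    joint = [[0.375, 0.125],
             [0.125, 0.375]]
    density = information_density(joint)
    print("min:", min(density.values()), "max:", max(density.values()))
```

In this example the most negative value comes from the flipped outcomes and the most positive from the truthful ones, so the two bounds constrain different events.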
- [6] arXiv:2407.01229 [pdf, html, other]
Title: On the Parameters of Codes for Data Access
Subjects: Information Theory (cs.IT)
This paper studies two crucial problems in the context of coded distributed storage systems directly related to their performance: 1) for a fixed alphabet size, determine the minimum number of servers the system must have for its service rate region to contain a prescribed set of points; 2) for a given number of servers, determine the minimum alphabet size for which the service rate region of the system contains a prescribed set of points. The paper establishes rigorous upper and lower bounds, as well as code constructions based on techniques from coding theory, optimization, and projective geometry.
- [7] arXiv:2407.01263 [pdf, html, other]
Title: Capacity-Maximizing Input Symbol Selection for Discrete Memoryless Channels
Subjects: Information Theory (cs.IT)
Motivated by communication systems with constrained complexity, we consider the problem of input symbol selection for discrete memoryless channels (DMCs). Given a DMC, the goal is to find a subset of its input alphabet, so that the optimal input distribution that is only supported on these symbols maximizes the capacity among all other subsets of the same size (or smaller). We observe that the resulting optimization problem is non-concave and non-submodular, and so generic methods for such cases do not have theoretical guarantees. We derive an analytical upper bound on the capacity loss when selecting a subset of input symbols based only on the properties of the transition matrix of the channel. We propose a selection algorithm that is based on input-symbols clustering, and an appropriate choice of representatives for each cluster, which uses the theoretical bound as a surrogate objective function. We provide numerical experiments to support the findings.
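The selection problem in the abstract above can be illustrated with a brute-force baseline: compute the capacity of every size-k input subset with the classical Blahut-Arimoto iteration and keep the best one. This is an exhaustive sketch for tiny alphabets, not the clustering-based algorithm the paper proposes. The function names and the example channel are assumptions made for illustration.

```python
import math
from itertools import combinations


def blahut_arimoto(channel, max_iters=5000, tol=1e-10):
    """Capacity in bits of a DMC, where channel[x][y] = P(y | x).

    Assumes every nonzero transition probability is far above 1e-300.
    """
    n = len(channel)
    num_outputs = len(channel[0])
    p = [1.0 / n] * n  # start from the uniform input distribution
    lower = 0.0
    for _ in range(max_iters):
        q = [sum(p[x] * channel[x][y] for x in range(n)) for y in range(num_outputs)]
        # d[x] = KL divergence between row x and the current output law (nats).
        d = [sum(w * math.log(w / q[y]) for y, w in enumerate(row) if w > 0)
             for row in channel]
        weights = [p[x] * math.exp(d[x]) for x in range(n)]
        total = sum(weights)
        lower = math.log(total)      # lower bound on capacity
        if max(d) - lower < tol:     # max(d) is an upper bound
            break
        # Tiny floor keeps the output law strictly positive on reachable outputs.
        p = [max(w / total, 1e-300) for w in weights]
    return lower / math.log(2)


def best_input_subset(channel, k):
    """Exhaustively search all size-k input subsets; returns (subset, capacity)."""
    best_subset, best_capacity = None, -1.0
    for subset in combinations(range(len(channel)), k):
        capacity = blahut_arimoto([channel[x] for x in subset])
        if capacity > best_capacity:
            best_subset, best_capacity = subset, capacity
    return best_subset, best_capacity


if __name__ == "__main__":
    # Hypothetical channel: inputs 0 and 1 are nearly indistinguishable.
    channel = [[0.90, 0.10, 0.00],
               [0.85, 0.15, 0.00],
               [0.00, 0.10, 0.90]]
    print(best_input_subset(channel, 2))
```

Searching only subsets of size exactly k is enough here, because any smaller subset is contained in some size-k subset whose capacity is at least as large. The exhaustive search grows combinatorially with the alphabet size, which is what motivates surrogate objectives such as the bound described in the abstract.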
- [8] arXiv:2407.01336 [pdf, html, other]
Title: Compressed Sensing Inspired User Acquisition for Downlink Integrated Sensing and Communication Transmissions
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
This paper investigates radar-assisted user acquisition for downlink multi-user multiple-input multiple-output (MIMO) transmission using Orthogonal Frequency Division Multiplexing (OFDM) signals. Specifically, we formulate a concise mathematical model for the user acquisition problem, where each user is characterized by its delay and beamspace response. We then propose a two-stage method for user acquisition, where the Multiple Signal Classification (MUSIC) algorithm is adopted for delay estimation, and then a least absolute shrinkage and selection operator (LASSO) is applied for estimating the user response in the beamspace. Furthermore, we provide a comprehensive performance analysis of the considered problem based on the pair-wise error probability (PEP). Particularly, we show that the rank and the geometric mean of the non-zero eigenvalues of the squared beamspace difference matrix determine the user acquisition performance. More importantly, we reveal that simultaneously probing multiple beams outperforms concentrating power on a specific beam direction in each time slot under the power constraint, when only limited OFDM symbols are transmitted. Our numerical results confirm our conclusions and also demonstrate a promising acquisition performance of the proposed two-stage method.
- [9] arXiv:2407.01401 [pdf, html, other]
Title: Finite-Length Analysis of Polar Secrecy Codes for Wiretap Channels
Subjects: Information Theory (cs.IT)
In a classical wiretap channel setting, Alice communicates with Bob through a main communication channel, while her transmission also reaches an eavesdropper Eve through a wiretap channel. In this paper, we consider a general class of polar secrecy codes for wiretap channels and study their finite-length performance. In particular, bounds on the normalized mutual information security (MIS) leakage, a fundamental measure of secrecy in information-theoretic security frameworks, are presented for polar secrecy codes. The bounds are utilized to characterize the finite-length scaling behavior of polar secrecy codes, where scaling here refers to the non-asymptotic behavior of both the gap to the secrecy capacity as well as the MIS leakage. Furthermore, the bounds are shown to facilitate characterizing numerical bounds on the secrecy guarantees of polar secrecy codes in finite block lengths of practical relevance, where directly calculating the MIS leakage is in general infeasible.
- [10] arXiv:2407.01498 [pdf, html, other]
Title: The Inverted 3-Sum Box: General Formulation and Quantum Information Theoretic Optimality
Subjects: Information Theory (cs.IT)
The $N$-sum box protocol specifies a class of $\mathbb{F}_d$ linear functions $f(W_1,\cdots,W_K)=V_1W_1+V_2W_2+\cdots+V_KW_K\in\mathbb{F}_d^{m\times 1}$ that can be computed at information theoretically optimal communication cost (minimum number of qudits $\Delta_1,\cdots,\Delta_K$ sent by the transmitters Alice$_1$, Alice$_2$,$\cdots$, Alice$_K$, respectively, to the receiver, Bob, per computation instance) over a noise-free quantum multiple access channel (QMAC), when the input data streams $W_k\in\mathbb{F}_d^{m_k\times 1}, k\in[K]$, originate at the distributed transmitters, who share quantum entanglement in advance but are not otherwise allowed to communicate with each other. In prior work this set of optimally computable functions is identified in terms of a strong self-orthogonality (SSO) condition on the transfer function of the $N$-sum box. In this work we consider an `inverted' scenario, where instead of a feasible $N$-sum box transfer function, we are given an arbitrary $\mathbb{F}_d$ linear function, i.e., arbitrary matrices $V_k\in\mathbb{F}_d^{m\times m_k}$ are specified, and the goal is to characterize the set of all feasible communication cost tuples $(\Delta_1,\cdots,\Delta_K)$, not just based on $N$-sum box protocols, but across all possible quantum coding schemes. As our main result, we fully solve this problem for $K=3$ transmitters ($K\geq 4$ settings remain open). Coding schemes based on the $N$-sum box protocol (along with elementary ideas such as treating qudits as classical dits, time-sharing and batch-processing) are shown to be information theoretically optimal in all cases. As an example, in the symmetric case where rk$(V_1)$=rk$(V_2)$=rk$(V_3) \triangleq r_1$, rk$([V_1, V_2])$=rk$([V_2, V_3])$=rk$([V_3, V_1])\triangleq r_2$, and rk$([V_1, V_2, V_3])\triangleq r_3$ (rk = rank), the minimum total-download cost is $\max \{1.5r_1 + 0.75(r_3 - r_2), r_3\}$.
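The closing formula of the abstract above, $\max \{1.5r_1 + 0.75(r_3 - r_2), r_3\}$, can be evaluated directly. The short sketch below does so for two hypothetical rank triples, one in which the second term of the maximum is active and one in which the first term is. The function name and the rank values are assumptions for illustration only.

```python
def min_total_download_cost(r1: int, r2: int, r3: int) -> float:
    """Symmetric-case minimum total-download cost quoted in the abstract.

    r1: rank of each single matrix V_k
    r2: rank of each pair [V_i, V_j]
    r3: rank of [V_1, V_2, V_3]
    """
    return max(1.5 * r1 + 0.75 * (r3 - r2), r3)


if __name__ == "__main__":
    # Hypothetical rank triples (r1, r2, r3).
    for ranks in [(2, 3, 4), (4, 5, 6)]:
        print(ranks, min_total_download_cost(*ranks))
```

For (2, 3, 4) the first term is 3 + 0.75 = 3.75, so the maximum is the second term, 4. For (4, 5, 6) the first term is 6 + 0.75 = 6.75, which exceeds 6.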
New submissions for Tuesday, 2 July 2024 (showing 10 of 10 entries )
- [11] arXiv:2407.00020 (cross-list from cs.CV) [pdf, html, other]
Title: Visual Language Model based Cross-modal Semantic Communication Systems
Comments: 12 pages, 10 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Theory (cs.IT); Machine Learning (cs.LG)
Semantic Communication (SC) has emerged as a novel communication paradigm in recent years, successfully transcending the Shannon physical capacity limits through innovative semantic transmission concepts. Nevertheless, extant Image Semantic Communication (ISC) systems face several challenges in dynamic environments, including low semantic density, catastrophic forgetting, and uncertain Signal-to-Noise Ratio (SNR). To address these challenges, we propose a novel Vision-Language Model-based Cross-modal Semantic Communication (VLM-CSC) system. The VLM-CSC comprises three novel components: (1) Cross-modal Knowledge Base (CKB) is used to extract high-density textual semantics from the semantically sparse image at the transmitter and reconstruct the original image based on textual semantics at the receiver. The transmission of high-density semantics contributes to alleviating bandwidth pressure. (2) Memory-assisted Encoder and Decoder (MED) employ a hybrid long/short-term memory mechanism, enabling the semantic encoder and decoder to overcome catastrophic forgetting in dynamic environments when there is a drift in the distribution of semantic features. (3) Noise Attention Module (NAM) employs attention mechanisms to adaptively adjust the semantic coding and the channel coding based on SNR, ensuring the robustness of the CSC system. The experimental simulations validate the effectiveness, adaptability, and robustness of the CSC system.
- [12] arXiv:2407.00064 (cross-list from cs.DB) [pdf, html, other]
Title: Constraint based Modeling according to Reference Design
Journal-ref: Conference on Perspectives in Business Informatics Research (BIR 2023)
Subjects: Databases (cs.DB); Information Retrieval (cs.IR); Information Theory (cs.IT); Software Engineering (cs.SE)
Reference models in the form of best practices are an essential element for ensuring knowledge as design for reuse. Popular modeling approaches do not offer mechanisms to embed reference models in a supporting way, let alone a repository of them. Therefore, it is hardly possible to profit from this expertise. The problem is that the reference models are not described formally enough to be helpful in developing solutions. Consequently, the challenge lies in the process, namely how a user can be supported in designing dedicated solutions assisted by reference models. In this paper, we present a generic approach for the formal description of reference models using semantic technologies and their application. Our modeling assistant allows the construction of solution models using different techniques based on reference building blocks. This environment enables the subsequent verification of the developed designs against the reference models for conformity. Therefore, our reference modeling assistant highlights the interdependency. The application of these techniques contributes to the formalization of requirements and finally to quality assurance in the context of maturity models. It is possible to use multiple reference models in the context of system-of-systems designs. The approach is evaluated in an industrial setting and can be integrated into different modeling landscapes.
- [13] arXiv:2407.00241 (cross-list from quant-ph) [pdf, html, other]
Title: Exploiting Structure in Quantum Relative Entropy Programs
Comments: 36 pages, 8 tables
Subjects: Quantum Physics (quant-ph); Information Theory (cs.IT); Optimization and Control (math.OC)
Quantum relative entropy programs are convex optimization problems which minimize a linear functional over an affine section of the epigraph of the quantum relative entropy function. Recently, the self-concordance of a natural barrier function was proved for this set. This has opened up the opportunity to use interior-point methods for nonsymmetric cone programs to solve these optimization problems. In this paper, we show how common structures arising from applications in quantum information theory can be exploited to improve the efficiency of solving quantum relative entropy programs using interior-point methods. First, we show that the natural barrier function for the epigraph of the quantum relative entropy composed with positive linear operators is optimally self-concordant, even when these linear operators map to singular matrices. Second, we show how we can exploit a catalogue of common structures in these linear operators to compute the inverse Hessian products of the barrier function more efficiently. This step is typically the bottleneck when solving quantum relative entropy programs using interior-point methods, and therefore improving the efficiency of this step can significantly improve the computational performance of the algorithm. We demonstrate how these methods can be applied to important applications in quantum information theory, including quantum key distribution, quantum rate-distortion, quantum channel capacities, and estimating the ground state energy of Hamiltonians. Our numerical results show that these techniques improve computation times by up to several orders of magnitude, and allow previously intractable problems to be solved.
- [14] arXiv:2407.00289 (cross-list from eess.SY) [pdf, html, other]
Title: Personalised Outfit Recommendation via History-aware Transformers
Subjects: Systems and Control (eess.SY); Information Theory (cs.IT)
We present the history-aware transformer (HAT), a transformer-based model that uses shoppers' purchase history to personalise outfit predictions. The aim of this work is to recommend outfits that are internally coherent while matching an individual shopper's style and taste. To achieve this, we stack two transformer models, one that produces outfit representations and another one that processes the history of purchased outfits for a given shopper. We use these models to score an outfit's compatibility in the context of a shopper's preferences as inferred from their previous purchases. During training, the model learns to discriminate between purchased and random outfits using 3 losses: the focal loss for outfit compatibility typically used in the literature, a contrastive loss to bring closer learned outfit embeddings from a shopper's history, and an adaptive margin loss to facilitate learning from weak negatives. Together, these losses enable the model to make personalised recommendations based on a shopper's purchase history.
Our experiments on the IQON3000 and Polyvore datasets show that HAT outperforms strong baselines on the outfit Compatibility Prediction (CP) and the Fill In The Blank (FITB) tasks. The model improves AUC for the CP hard task by 15.7% (IQON3000) and 19.4% (Polyvore) compared to previous SOTA results. It further improves accuracy on the FITB hard task by 6.5% and 9.7%, respectively. We provide ablation studies on the personalisation, contrastive loss, and adaptive margin loss that highlight the importance of these modelling choices.
- [15] arXiv:2407.00412 (cross-list from cs.RO) [pdf, html, other]
Title: C-MASS: Combinatorial Mobility-Aware Sensor Scheduling for Collaborative Perception with Second-Order Topology Approximation
Comments: 14 pages, 10 figures
Subjects: Robotics (cs.RO); Information Theory (cs.IT); Multiagent Systems (cs.MA); Networking and Internet Architecture (cs.NI)
Collaborative Perception (CP) has been a promising solution to address occlusions in the traffic environment by sharing sensor data among collaborative vehicles (CoV) via vehicle-to-everything (V2X) network. With limited wireless bandwidth, CP necessitates task-oriented and receiver-aware sensor scheduling to prioritize important and complementary sensor data. However, due to vehicular mobility, it is challenging and costly to obtain the up-to-date perception topology, i.e., whether a combination of CoVs can jointly detect an object. In this paper, we propose a combinatorial mobility-aware sensor scheduling (C-MASS) framework for CP with minimal communication overhead. Specifically, detections are replayed with sensor data from individual CoVs and pairs of CoVs to maintain an empirical perception topology up to the second order, which approximately represents the complete perception topology. A hybrid greedy algorithm is then proposed to solve a variant of the budgeted maximum coverage problem with a worst-case performance guarantee. The C-MASS scheduling algorithm adapts the greedy algorithm by incorporating the topological uncertainty and the unexplored time of CoVs to balance exploration and exploitation, addressing the mobility challenge. Extensive numerical experiments demonstrate the near-optimality of the proposed C-MASS framework in both edge-assisted and distributed CP configurations. The weighted recall improvements over object-level CP are 5.8% and 4.2%, respectively. Compared to distance-based and area-based greedy heuristics, the gaps to the offline optimal solutions are reduced by up to 75% and 71%, respectively.
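For readers unfamiliar with the budgeted maximum coverage problem named in the abstract above, the sketch below shows a textbook-style heuristic: a cost-benefit greedy pass combined with a fallback to the best single affordable set. It is a generic illustration under stated assumptions, not the hybrid greedy or C-MASS algorithm of the paper. All names and the example instance are hypothetical.

```python
def budgeted_max_coverage(sets, costs, weights, budget):
    """Generic greedy heuristic for budgeted maximum coverage.

    sets:    name -> set of elements covered by that candidate
    costs:   name -> strictly positive cost of the candidate
    weights: element -> non-negative weight
    Returns (chosen_names, covered_weight).
    """
    def value(elements):
        return sum(weights[e] for e in elements)

    chosen, covered, spent = [], set(), 0.0
    candidates = sorted(sets)  # sorted for deterministic tie-breaking
    while candidates:
        # Candidate with the best marginal covered weight per unit cost.
        best = max(candidates, key=lambda s: value(sets[s] - covered) / costs[s])
        candidates.remove(best)
        if spent + costs[best] <= budget and value(sets[best] - covered) > 0:
            chosen.append(best)
            covered |= sets[best]
            spent += costs[best]

    # Fallback: the single most valuable candidate that fits the budget alone.
    affordable = [s for s in sorted(sets) if costs[s] <= budget]
    single = max(affordable, key=lambda s: value(sets[s]), default=None)
    if single is not None and value(sets[single]) > value(covered):
        return [single], value(sets[single])
    return chosen, value(covered)


if __name__ == "__main__":
    # Hypothetical instance where the ratio-greedy pass alone does poorly.
    sets = {"x": {1}, "y": {2}}
    costs = {"x": 1, "y": 10}
    weights = {1: 1, 2: 9}
    print(budgeted_max_coverage(sets, costs, weights, budget=10))
```

The fallback matters because a pure ratio-greedy pass can spend budget on a cheap, low-value candidate and then be unable to afford a far more valuable one, as in the example instance.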
- [16] arXiv:2407.00482 (cross-list from cs.LG) [pdf, html, other]
Title: Quantifying Spuriousness of Biased Datasets Using Partial Information Decomposition
Comments: Accepted at ICML 2024 Workshop on Data-centric Machine Learning Research (DMLR): Datasets for Foundation Models
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Information Theory (cs.IT)
Spurious patterns refer to a mathematical association between two or more variables in a dataset that are not causally related. However, this notion of spuriousness, which is usually introduced due to sampling biases in the dataset, has classically lacked a formal definition. To address this gap, this work presents the first information-theoretic formalization of spuriousness in a dataset (given a split of spurious and core features) using a mathematical framework called Partial Information Decomposition (PID). Specifically, we disentangle the joint information content that the spurious and core features share about another target variable (e.g., the prediction label) into distinct components, namely unique, redundant, and synergistic information. We propose the use of unique information, with roots in Blackwell Sufficiency, as a novel metric to formally quantify dataset spuriousness and derive its desirable properties. We empirically demonstrate how higher unique information in the spurious features in a dataset could lead a model into choosing the spurious features over the core features for inference, often having low worst-group-accuracy. We also propose a novel autoencoder-based estimator for computing unique information that is able to handle high-dimensional image data. Finally, we also show how this unique information in the spurious feature is reduced across several dataset-based spurious-pattern-mitigation techniques such as data reweighting and varying levels of background mixing, demonstrating a novel tradeoff between unique information (spuriousness) and worst-group-accuracy.
- [17] arXiv:2407.00750 (cross-list from cs.CR) [pdf, html, other]
Title: Physical Layer Deception with Non-Orthogonal Multiplexing
Comments: Submitted to IEEE Transactions on Wireless Communications
Subjects: Cryptography and Security (cs.CR); Information Theory (cs.IT)
Physical layer security (PLS) is a promising technology to secure wireless communications by exploiting the physical properties of the wireless channel. However, the passive nature of PLS creates a significant imbalance between the effort required by eavesdroppers and legitimate users to secure data. To address this imbalance, in this article, we propose a novel framework of physical layer deception (PLD), which combines PLS with deception technologies to actively counteract wiretapping attempts. Combining a two-stage encoder with randomized ciphering and non-orthogonal multiplexing, the PLD approach enables the wireless communication system to proactively counter eavesdroppers with deceptive messages. Relying solely on the superiority of the legitimate channel over the eavesdropping channel, the PLD framework can effectively protect the confidentiality of the transmitted messages, even against eavesdroppers who possess knowledge equivalent to that of the legitimate receiver. We prove the validity of the PLD framework with in-depth analyses and demonstrate its superiority over conventional PLS approaches with comprehensive numerical benchmarks.
- [18] arXiv:2407.01250 (cross-list from cs.LG) [pdf, html, other]
Title: Metric-Entropy Limits on Nonlinear Dynamical System Learning
Subjects: Machine Learning (cs.LG); Information Theory (cs.IT); Dynamical Systems (math.DS)
This paper is concerned with the fundamental limits of nonlinear dynamical system learning from input-output traces. Specifically, we show that recurrent neural networks (RNNs) are capable of learning nonlinear systems that satisfy a Lipschitz property and forget past inputs fast enough in a metric-entropy optimal manner. As the sets of sequence-to-sequence maps realized by the dynamical systems we consider are significantly more massive than function classes generally considered in deep neural network approximation theory, a refined metric-entropy characterization is needed, namely in terms of order, type, and generalized dimension. We compute these quantities for the classes of exponentially-decaying and polynomially-decaying Lipschitz fading-memory systems and show that RNNs can achieve them.
- [19] arXiv:2407.01305 (cross-list from eess.SP) [pdf, html, other]
Title: Linear and Nonlinear MMSE Estimation in One-Bit Quantized Systems under a Gaussian Mixture Prior
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT)
We present new fundamental results for the mean square error (MSE)-optimal conditional mean estimator (CME) in one-bit quantized systems for a Gaussian mixture model (GMM) distributed signal of interest, possibly corrupted by additive white Gaussian noise (AWGN). We first derive novel closed-form analytic expressions for the Bussgang estimator, the well-known linear minimum mean square error (MMSE) estimator in quantized systems. Afterward, closed-form analytic expressions for the CME in special cases are presented, revealing that the optimal estimator is linear in the one-bit quantized observation, in contrast to higher-resolution cases. Through a comparison to the recently studied Gaussian case, we establish a novel MSE inequality and show that the signal of interest is correlated with the auxiliary quantization noise. We extend our analysis to multiple observation scenarios, examining the MSE-optimal transmit sequence and conducting an asymptotic analysis, yielding analytic expressions for the MSE and its limit. These contributions have broad impact for the analysis and design of various signal processing applications.
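As a baseline for the abstract above, the sketch below treats only the simplest case, which is an assumption on our part and not the GMM setting of the paper: a scalar zero-mean Gaussian signal x with standard deviation sigma, observed without noise through r = sign(x). The linear MMSE gain is E[x r] / E[r^2] = sigma * sqrt(2/pi), and the resulting MSE is sigma^2 * (1 - 2/pi). A seeded Monte Carlo run checks the closed form. Function names are illustrative.

```python
import math
import random


def one_bit_gaussian_lmmse(sigma: float):
    """Closed-form linear MMSE gain and MSE for x ~ N(0, sigma^2), r = sign(x)."""
    gain = sigma * math.sqrt(2.0 / math.pi)        # E[x r] / E[r^2], with E[r^2] = 1
    mse = sigma ** 2 * (1.0 - 2.0 / math.pi)       # sigma^2 - gain^2
    return gain, mse


def simulate_mse(sigma: float, num_samples: int = 200_000, seed: int = 0) -> float:
    """Empirical MSE of the estimate gain * sign(x) over seeded Gaussian draws."""
    rng = random.Random(seed)
    gain, _ = one_bit_gaussian_lmmse(sigma)
    squared_error = 0.0
    for _ in range(num_samples):
        x = rng.gauss(0.0, sigma)
        r = 1.0 if x >= 0 else -1.0
        squared_error += (x - gain * r) ** 2
    return squared_error / num_samples


if __name__ == "__main__":
    print(one_bit_gaussian_lmmse(1.0), simulate_mse(1.0))
```

In this scalar noiseless case the conditional mean E[x | r] equals the same gain times r, so the linear estimator is already MSE-optimal.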
Cross submissions for Tuesday, 2 July 2024 (showing 9 of 9 entries )
- [20] arXiv:2106.02797 (replaced) [pdf, html, other]
Title: Neural Distributed Source Coding
Comments: To be published in JSAIT
Subjects: Information Theory (cs.IT); Machine Learning (cs.LG)
Distributed source coding (DSC) is the task of encoding an input in the absence of correlated side information that is only available to the decoder. Remarkably, Slepian and Wolf showed in 1973 that an encoder without access to the side information can asymptotically achieve the same compression rate as when the side information is available to it. While there is vast prior work on this topic, practical DSC has been limited to synthetic datasets and specific correlation structures. Here we present a framework for lossy DSC that is agnostic to the correlation structure and can scale to high dimensions. Rather than relying on hand-crafted source modeling, our method utilizes a conditional Vector-Quantized Variational Autoencoder (VQ-VAE) to learn the distributed encoder and decoder. We evaluate our method on multiple datasets and show that our method can handle complex correlations and achieves state-of-the-art PSNR. Our code is made available at this https URL.
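The Slepian-Wolf statement in the abstract above concerns the lossless case, where the achievable rate for encoding X with side information Y at the decoder is the conditional entropy H(X|Y). The sketch below computes H(X) and H(X|Y) for a small joint distribution to show the gap between coding with and without side information. It illustrates the classical result only, not the lossy VQ-VAE framework of the paper. The function names and the example source are assumptions.

```python
import math


def entropy(pmf) -> float:
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)


def conditional_entropy(joint) -> float:
    """H(X | Y) = H(X, Y) - H(Y) in bits, where joint[x][y] = P(x, y)."""
    h_xy = entropy([p for row in joint for p in row])
    h_y = entropy([sum(col) for col in zip(*joint)])
    return h_xy - h_y


if __name__ == "__main__":
    # Hypothetical source: X is a uniform bit, Y equals X flipped with probability 0.1.
    joint = [[0.45, 0.05],
             [0.05, 0.45]]
    h_x = entropy([sum(row) for row in joint])
    print("H(X) =", h_x, "H(X|Y) =", conditional_entropy(joint))
```

For this source the encoder needs 1 bit per symbol without side information, whereas the Slepian-Wolf rate is the binary entropy of 0.1, even though the encoder never observes Y.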
- [21] arXiv:2301.10980 (replaced) [pdf, html, other]
Title: Beyond scalar quasi-arithmetic means: Quasi-arithmetic averages and quasi-arithmetic mixtures in information geometry
Comments: 21 pages
Subjects: Information Theory (cs.IT)
We generalize quasi-arithmetic means beyond scalars by considering the gradient map of a Legendre type real-valued function. The gradient map of a Legendre type function is proven strictly comonotone with a global inverse. It thus yields a generalization of strictly monotone and differentiable functions generating scalar quasi-arithmetic means. Furthermore, the Legendre transformation gives rise to pairs of dual quasi-arithmetic averages via the convex duality. We study the invariance and equivariance properties under affine transformations of quasi-arithmetic averages via the lens of dually flat spaces of information geometry. We show how these quasi-arithmetic averages are used to express points on dual geodesics and sided barycenters in the dual affine coordinate systems. We then consider quasi-arithmetic mixtures and describe several parametric and non-parametric statistical models which are closed under the quasi-arithmetic mixture operation.
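The scalar object that the abstract above generalizes is the quasi-arithmetic mean M_f(x_1, ..., x_n) = f^{-1}(sum_i w_i f(x_i)) for a strictly monotone generator f. The sketch below implements only this scalar case. The function name and argument layout are illustrative assumptions, and the generalization via gradient maps of Legendre type functions is not covered.

```python
import math


def quasi_arithmetic_mean(values, f, f_inv, weights=None):
    """Scalar quasi-arithmetic mean f_inv(sum_i w_i * f(x_i)).

    f must be strictly monotone on the range of the values and f_inv its inverse;
    weights default to uniform and are assumed to sum to one.
    """
    if weights is None:
        weights = [1.0 / len(values)] * len(values)
    return f_inv(sum(w * f(v) for w, v in zip(weights, values)))


if __name__ == "__main__":
    data = [1.0, 4.0]
    print("arithmetic:", quasi_arithmetic_mean(data, lambda x: x, lambda y: y))
    print("geometric: ", quasi_arithmetic_mean(data, math.log, math.exp))
    print("harmonic:  ", quasi_arithmetic_mean(data, lambda x: 1 / x, lambda y: 1 / y))
```

Choosing the identity, the logarithm, or the reciprocal as generator recovers the arithmetic, geometric, and harmonic means, respectively.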
- [22] arXiv:2304.10391 (replaced) [pdf, html, other]
Title: DNA-Correcting Codes: End-to-end Correction in DNA Storage Systems
Comments: Extended version of the paper that appeared in ISIT 2023
Subjects: Information Theory (cs.IT)
This paper introduces a new solution to DNA storage that integrates all three steps of retrieval, namely clustering, reconstruction, and error correction. DNA-correcting codes are presented as a unique solution to the problem of ensuring that the output of the storage system is unique for any valid set of input strands. To this end, we introduce a novel distance metric to capture the unique behavior of the DNA storage system and provide necessary and sufficient conditions for DNA-correcting codes. The paper also includes several bounds and constructions of DNA-correcting codes.
- [23] arXiv:2306.16861 (replaced) [pdf, html, other]
-
Title: Beamfocusing Optimization for Near-Field Wideband Multi-User CommunicationsComments: 16 pages, 14 figuresSubjects: Information Theory (cs.IT); Signal Processing (eess.SP)
A near-field wideband communication system is investigated in which a base station (BS) employs an extra-large scale antenna array (ELAA) to serve multiple users in its near-field region. To facilitate near-field multi-user beamforming and mitigate the spatial wideband effect, the BS employs a hybrid beamforming architecture based on true-time delayers (TTDs). In addition to the conventional fully-connected TTD-based hybrid beamforming architecture, a new sub-connected architecture is proposed to improve energy efficiency and reduce hardware requirements. Two wideband beamforming optimization approaches are proposed to maximize spectral efficiency for both architectures. 1) Fully-digital approximation (FDA) approach: In this method, the TTD-based hybrid beamformer is optimized by the block-coordinate descent and penalty method to approximate the optimal digital beamformer. This approach ensures convergence to the stationary point of the spectral efficiency maximization problem. 2) Heuristic two-stage (HTS) approach: In this approach, the analog and digital beamformers are designed in two stages. In particular, two low-complexity methods are proposed to design the high-dimensional analog beamformers based on approximate and exact line-of-sight channels, respectively. Subsequently, the low-dimensional digital beamformer is optimized based on the low-dimensional equivalent channels, resulting in reduced computational complexity and channel estimation complexity. Our numerical results show that 1) the proposed approach effectively eliminates the spatial wideband effect, and 2) the proposed sub-connected architecture is more energy efficient and has fewer hardware constraints on the TTD and system bandwidth compared to the fully-connected architecture.
- [24] arXiv:2308.12619 (replaced) [pdf, html, other]
-
Title: Low-complexity eigenvector prediction-based precoding matrix prediction in massive MIMO with mobilityComments: 13 pages, 8 figures, 1 table, journalSubjects: Information Theory (cs.IT); Signal Processing (eess.SP)
In practical massive multiple-input multiple-output (MIMO) systems, the precoding matrix is often obtained from the eigenvectors of channel matrices and is challenging to update in time due to finite computation resources at the base station, especially in mobile scenarios. In order to reduce the precoding complexity while enhancing the spectral efficiency (SE), a novel precoding matrix prediction method based on the eigenvector prediction (EGVP) is proposed. The basic idea is to decompose the periodic uplink channel eigenvector samples into a linear combination of the channel state information (CSI) and channel weights. We further prove that the channel weights can be interpolated by an exponential model corresponding to the Doppler characteristics of the CSI. A fast matrix pencil prediction (FMPP) method is also devised to predict the CSI. We also prove that our scheme achieves asymptotically error-free precoder prediction with a distinct complexity advantage. Simulation results show that under the perfect non-delayed CSI, the proposed EGVP method reduces floating point operations by 80\% without losing SE performance compared to the traditional full-time precoding scheme. In more realistic cases with CSI delays, the proposed EGVP-FMPP scheme has clear SE performance gains compared to the precoding scheme widely used in current communication systems.
- [25] arXiv:2309.14145 (replaced) [pdf, html, other]
-
Title: Feedback Increases the Capacity of Queues with Bounded Service TimesComments: 11 pages; two-columnSubjects: Information Theory (cs.IT)
In the "Bits Through Queues" paper, it was hypothesized that full feedback always increases the capacity of first-in-first-out queues, except when the service time distribution is memoryless. More recently, a non-explicit sufficient condition under which feedback increases capacity was provided, along with simple examples of service times meeting this condition. While this condition yields examples where feedback is beneficial, it does not offer explicit structural properties of such service times.
In this paper, we show that full feedback increases capacity whenever the service time has bounded support. This is achieved by investigating a generalized notion of feedback, with full feedback and weak feedback as particular cases.
- [26] arXiv:2401.16647 (replaced) [pdf, html, other]
-
Title: A Family of Low-Complexity Binary Codes with Constant Hamming WeightsComments: Submitted to Designs, Codes and CryptographySubjects: Information Theory (cs.IT); Combinatorics (math.CO)
In this paper, we focus on the design of binary constant weight codes that admit low-complexity encoding and decoding algorithms, and that have a size $M=2^k$. For every integer $\ell \geq 3$, we construct an $(n=2^\ell, M=2^{k_{\ell}}, d=2)$ constant weight code ${\cal C}[\ell]$ of weight $\ell$ by encoding information in the gaps between successive $1$'s. The code is associated with an integer sequence of length $\ell$ with a constraint defined as {\em anchor-decodability} that ensures low complexity for encoding and decoding. The complexity of the encoding is linear in the input size $k$, and that of the decoding is poly-logarithmic in the input size $n$, discounting the linear time spent on parsing the input. Neither algorithm requires expensive computation of binomial coefficients, unlike many existing schemes. Among codes generated by all anchor-decodable sequences, we show that ${\cal C}[\ell]$ has the maximum size with $k_{\ell} \geq \ell^2-\ell\log_2\ell + \log_2\ell - 0.279\ell - 0.721$. As $k$ is upper bounded by $\ell^2-\ell\log_2\ell +O(\ell)$ information-theoretically, the code ${\cal C}[\ell]$ is optimal in its size with respect to two higher order terms of $\ell$. In particular, $k_\ell$ meets the upper bound for $\ell=3$ and is one bit away from it for $\ell=4$. On the other hand, we show that ${\cal C}[\ell]$ is not unique in attaining $k_{\ell}$ by constructing an alternate code ${\cal \hat{C}}[\ell]$ again parameterized by an integer $\ell \geq 3$ with a different low-complexity decoder, yet having the same size $2^{k_{\ell}}$ when $3 \leq \ell \leq 7$. Finally, we also derive new codes by modifying ${\cal C}[\ell]$ that offer a wider range on blocklength and weight while retaining low complexity for encoding and decoding. For certain selected values of parameters, these modified codes too have an optimal $k$.
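The two bounds quoted in this abstract can be compared numerically. The sketch below assumes that the information-theoretic upper bound on $k$ is the counting bound $\lfloor \log_2 \binom{2^\ell}{\ell} \rfloor$ (there are only $\binom{n}{\ell}$ binary words of length $n=2^\ell$ and weight $\ell$), and evaluates the lower bound on $k_\ell$ exactly as stated above. The function names are hypothetical, and the code does not implement the paper's construction.

```python
import math


def counting_upper_bound_k(ell):
    """floor(log2(C(2^ell, ell))): the largest k with 2^k <= number of weight-ell words."""
    n = 2 ** ell
    count = math.comb(n, ell)          # exact integer binomial coefficient
    return count.bit_length() - 1      # floor(log2(count)) without floating point


def stated_lower_bound_k(ell):
    """Lower bound on k_ell quoted in the abstract (a real number, not rounded)."""
    log_ell = math.log2(ell)
    return ell ** 2 - ell * log_ell + log_ell - 0.279 * ell - 0.721


if __name__ == "__main__":
    for ell in range(3, 8):
        print(ell, round(stated_lower_bound_k(ell), 3), counting_upper_bound_k(ell))
```

Under this assumption, the counting bound is 5 for $\ell=3$ and 10 for $\ell=4$, which is consistent with the abstract's remark that $k_\ell$ meets the upper bound for $\ell=3$ and is one bit short for $\ell=4$.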
- [27] arXiv:2402.05625 (replaced) [pdf, html, other]
-
Title: Coded Many-User Multiple Access via Approximate Message PassingComments: 23 pages, 8 figures. A shorter version of this paper to appear in the Proceedings of IEEE ISIT 2024Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
We consider communication over the Gaussian multiple-access channel in the regime where the number of users grows linearly with the codelength. In this regime, schemes based on sparse superposition coding can achieve a near-optimal tradeoff between spectral efficiency and signal-to-noise ratio. However, these schemes are feasible only for small values of user payload. This paper investigates efficient schemes for larger user payloads, focusing on coded CDMA schemes where each user's information is encoded via a linear code before being modulated with a signature sequence. We propose an efficient approximate message passing (AMP) decoder that can be tailored to the structure of the linear code, and provide an exact asymptotic characterization of its performance. Based on this result, we consider a decoder that integrates AMP and belief propagation and characterize its tradeoff between spectral efficiency and signal-to-noise ratio, for a given target error rate. Simulation results show that the decoder achieves state-of-the-art performance at finite lengths, with a coded CDMA scheme defined using LDPC codes and a spatially coupled matrix of signature sequences.
- [28] arXiv:2402.11656 (replaced) [pdf, html, other]
-
Title: Integrating Pre-Trained Language Model with Physical Layer CommunicationsSubjects: Information Theory (cs.IT); Computation and Language (cs.CL); Machine Learning (cs.LG); Signal Processing (eess.SP)
The burgeoning field of on-device AI communication, where devices exchange information directly through embedded foundation models, such as language models (LMs), requires robust, efficient, and generalizable communication frameworks. However, integrating these frameworks with existing wireless systems and effectively managing noise and bit errors pose significant challenges. In this work, we introduce a practical on-device AI communication framework, integrated with physical layer (PHY) communication functions, demonstrated through its performance on a link-level simulator. Our framework incorporates end-to-end training with channel noise to enhance resilience, incorporates vector quantized variational autoencoders (VQ-VAE) for efficient and robust communication, and utilizes pre-trained encoder-decoder transformers for improved generalization capabilities. Simulations, across various communication scenarios, reveal that our framework achieves a 50% reduction in transmission size while demonstrating substantial generalization ability and noise robustness under standardized 3GPP channel models.
- [29] arXiv:2403.16458 (replaced) [pdf, html, other]
-
Title: Next Generation Advanced Transceiver Technologies for 6G and BeyondChangsheng You, Yunlong Cai, Yuanwei Liu, Marco Di Renzo, Tolga M. Duman, Aylin Yener, A. Lee SwindlehurstComments: This paper gives a comprehensive tutorial overview of next generation advanced transceiver (NGAT) technologies for 6G and beyondSubjects: Information Theory (cs.IT); Signal Processing (eess.SP)
To accommodate new applications such as extended reality, fully autonomous vehicular networks and the metaverse, next generation wireless networks are going to be subject to much more stringent performance requirements than the fifth-generation (5G) in terms of data rates, reliability, latency, and connectivity. It is thus necessary to develop next generation advanced transceiver (NGAT) technologies for efficient signal transmission and reception. In this tutorial, we explore the evolution of NGAT from three different perspectives. Specifically, we first provide an overview of new-field NGAT technology, which shifts from conventional far-field channel models to new near-field channel models. Then, three new-form NGAT technologies and their design challenges are presented, including reconfigurable intelligent surfaces, flexible antennas, and holographic multi-input multi-output (MIMO) systems. Subsequently, we discuss recent advances in semantic-aware NGAT technologies, which can utilize new metrics for advanced transceiver designs. Finally, we point out other promising transceiver technologies for future research.
- [30] arXiv:2405.19965 (replaced) [pdf, html, other]
-
Title: Several classes of BCH codes of length $n=\frac{q^{m}-1}{2}$Subjects: Information Theory (cs.IT)
BCH codes are an important class of linear codes and are widely used in communication and disk storage systems. This paper mainly analyzes negacyclic and cyclic BCH codes of length $\frac{q^m-1}{2}$. For negacyclic BCH codes, we give the dimensions of $C_{(n,-1,\left\lceil \frac{\delta+1}{2}\right\rceil,0)}$ for $\delta =a\frac{q^m-1}{q-1},aq^{m-1}-1$ ($1\leq a <\frac{q-1}{2}$) and $\delta =a\frac{q^m-1}{q-1}+b\frac{q^m-1}{q^2-1},aq^{m-1}+(a+b)q^{m-2}-1$ $(2\mid m,1\leq a+b \leq q-1$, $\left\lceil \frac{q-a-2}{2}\right\rceil\geq 1)$. Furthermore, the dimensions of negacyclic BCH codes $C_{(n,-1,\delta,0)}$ with few nonzeros and $C_{(n,-1,\delta,b)}$ with $b\neq 0$ are settled. For cyclic BCH codes, we give the weight distribution of the extended code $\overline{C}_{(n,1,\delta,1)}$ and the parameters of the dual code $C^{\perp}_{(n,1,\delta,1)}$, where $\delta_2\leq \delta \leq \delta_1$.
- [31] arXiv:2406.19583 (replaced) [pdf, html, other]
-
Title: Interference Cancellation Information Geometry Approach for Massive MIMO Channel EstimationComments: 38 pages, 9 figuresSubjects: Information Theory (cs.IT)
In this paper, the interference cancellation information geometry approaches (IC-IGAs) for massive MIMO channel estimation are proposed. The proposed algorithms are low-complexity approximations of the minimum mean square error (MMSE) estimation. To illustrate the proposed algorithms, a unified framework of the information geometry approach for channel estimation and its geometric explanation are described first. Then, a modified form that has the same mean as the MMSE estimation is constructed. Based on this, the IC-IGA algorithm and the interference cancellation simplified information geometry approach (IC-SIGA) are derived by applying the information geometry framework. The a posteriori means on the equilibrium of the proposed algorithms are proved to be equal to the mean of MMSE estimation, and the complexity of the IC-SIGA algorithm in practical massive MIMO systems is further reduced by considering the beam-based statistical channel model (BSCM) and fast Fourier transform (FFT). Simulation results show that the proposed methods achieve similar performance as the existing information geometry approach (IGA) with lower complexity.
- [32] arXiv:2303.02432 (replaced) [pdf, html, other]
-
Title: Good Gottesman-Kitaev-Preskill codes from the NTRU cryptosystemComments: 23 pages, 10 figures, comments welcome! The final contains added clarifications and an additional proof of the Gaussian heuristic for a class of NTRU-like latticesSubjects: Quantum Physics (quant-ph); Cryptography and Security (cs.CR); Information Theory (cs.IT)
We introduce a new class of random Gottesman-Kitaev-Preskill (GKP) codes derived from the cryptanalysis of the so-called NTRU cryptosystem. The derived codes are good in that they exhibit constant rate and average distance scaling $\Delta \propto \sqrt{n}$ with high probability, where $n$ is the number of bosonic modes, which is a distance scaling equivalent to that of a GKP code obtained by concatenating single mode GKP codes into a qubit-quantum error correcting code with linear distance. The derived class of NTRU-GKP codes has the additional property that decoding for a stochastic displacement noise model is equivalent to decrypting the NTRU cryptosystem, such that every random instance of the code naturally comes with an efficient decoder. This construction highlights how the GKP code bridges aspects of classical error correction, quantum error correction as well as post-quantum cryptography. We underscore this connection by discussing the computational hardness of decoding GKP codes and propose, as a new application, a simple public key quantum communication protocol with security inherited from the NTRU cryptosystem.
- [33] arXiv:2304.07906 (replaced) [pdf, html, other]
-
Title: Sidon sets, sum-free sets and linear codesComments: Fixed an issue in Lemma 2.6 of the arXiv versionJournal-ref: Advances in Mathematics of Communications, 2024, 18(2): 549-566Subjects: Combinatorics (math.CO); Information Theory (cs.IT)
Finding the maximum size of a Sidon set in $\mathbb{F}_2^t$ has been of research interest for more than 40 years. In order to tackle this problem we recall a one-to-one correspondence between sum-free Sidon sets and linear codes with minimum distance greater than or equal to 5. Our main contribution about codes is a new non-existence result for linear codes with minimum distance 5 based on a sharpening of the Johnson bound. This gives, on the Sidon set side, an improvement of the general upper bound for the maximum size of a Sidon set. Additionally, we characterise maximal Sidon sets, that is, those Sidon sets which cannot be extended by adding elements without losing the Sidon property, up to dimension 6 and give all possible sizes for dimensions 7 and 8 determined by computer calculations.
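To make the object of study concrete, here is a minimal brute-force check of the Sidon property in $\mathbb{F}_2^t$. It assumes the standard definition (all sums $a+b$ of two distinct elements are pairwise different) and represents vectors of $\mathbb{F}_2^t$ as non-negative integers, so that vector addition is bitwise XOR. The function name is hypothetical, and this is not the method used in the paper.

```python
from itertools import combinations


def is_sidon_f2(elements):
    """Return True if the set is Sidon in F_2^t.

    Elements are integers encoding bit vectors; addition in F_2^t is XOR.
    The set is Sidon when the XORs of all unordered pairs of distinct
    elements are pairwise different.
    """
    elems = sorted(set(elements))
    seen_sums = set()
    for a, b in combinations(elems, 2):
        s = a ^ b
        if s in seen_sums:
            return False  # two different pairs share the same sum
        seen_sums.add(s)
    return True


if __name__ == "__main__":
    print(is_sidon_f2([0, 1, 2, 4]))  # pairwise XORs 1, 2, 4, 3, 5, 6 are distinct
    print(is_sidon_f2([0, 1, 2, 3]))  # 0^1 == 2^3, so not Sidon
```

The check is quadratic in the set size, which is adequate for small dimensions but not for the exhaustive searches mentioned in the abstract.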
- [34] arXiv:2309.15709 (replaced) [pdf, html, other]
-
Title: Distributed Pilot Assignment for Distributed Massive-MIMO NetworksComments: Presented at the IEEE Wireless Communications and Networking Conference (WCNC) 2024Subjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT)
Pilot contamination is a critical issue in distributed massive MIMO networks, where the reuse of pilot sequences due to limited availability of orthogonal pilots for channel estimation leads to performance degradation. In this work, we propose a novel distributed pilot assignment scheme to effectively mitigate the impact of pilot contamination. Our proposed scheme not only reduces signaling overhead, but it also enhances fault-tolerance. Extensive numerical simulations are conducted to evaluate the performance of the proposed scheme. Our results establish that the proposed scheme outperforms existing centralized and distributed schemes in terms of mitigating pilot contamination and significantly enhancing network throughput.
- [35] arXiv:2310.06742 (replaced) [pdf, html, other]
-
Title: Reinforcement Learning for Optimal Transmission of Markov Sources: Belief Quantization vs Sliding Finite Window CodesComments: Submitted to Journal of Machine Learning Research. 41 pages, 5 figuresSubjects: Optimization and Control (math.OC); Information Theory (cs.IT)
We study the problem of zero-delay coding for the transmission of a Markov source over a noisy channel with feedback and present a rigorous reinforcement theoretic solution which is guaranteed to achieve near-optimality. To this end, we formulate the problem as a Markov decision process (MDP) where the state is a probability-measure valued predictor/belief and the actions are quantizer maps. This MDP formulation has been used to show the optimality of certain classes of encoder policies in prior work. Despite such an analytical approach in determining optimal policies, their computation is prohibitively complex due to the uncountable nature of the constructed state space and the lack of minorization or strong ergodicity results which are commonly assumed for average cost optimal stochastic control. These challenges invite rigorous reinforcement learning methods, which entail several open questions addressed in our paper. We present two complementary approaches for this problem. In the first approach, we approximate the set of all beliefs by a finite set and use nearest-neighbor quantization to obtain a finite state MDP, whose optimal policies become near-optimal for the original MDP as the quantization becomes arbitrarily fine. In the second approach, a sliding finite window of channel outputs and quantizers together with a prior belief state serve as the state of the MDP. We then approximate this state by marginalizing over all possible beliefs, so that our policies only use the finite window term to encode the source. Under an appropriate notion of predictor stability, we show that such policies are near-optimal for the zero-delay coding problem as the window length increases. We give sufficient conditions for predictor stability to hold. Finally, we propose a reinforcement learning algorithm to compute near-optimal policies and provide a detailed comparison of the coding policies.
- [36] arXiv:2402.07025 (replaced) [pdf, html, other]
-
Title: Generalization Error of Graph Neural Networks in the Mean-field RegimeComments: Accepted in ICML 2024Subjects: Machine Learning (stat.ML); Information Theory (cs.IT); Machine Learning (cs.LG)
This work provides a theoretical framework for assessing the generalization error of graph neural networks in the over-parameterized regime, where the number of parameters surpasses the quantity of data points. We explore two widely utilized types of graph neural networks: graph convolutional neural networks and message passing graph neural networks. Prior to this study, existing bounds on the generalization error in the over-parameterized regime were uninformative, limiting our understanding of over-parameterized network performance. Our novel approach involves deriving upper bounds within the mean-field regime for evaluating the generalization error of these graph neural networks. We establish upper bounds with a convergence rate of $O(1/n)$, where $n$ is the number of graph samples. These upper bounds offer a theoretical assurance of the networks' performance on unseen data in the challenging over-parameterized regime and overall contribute to our understanding of their performance.
- [37] arXiv:2403.16986 (replaced) [pdf, html, other]
-
Title: Dynamic Relative Representations for Goal-Oriented Semantic CommunicationsSubjects: Networking and Internet Architecture (cs.NI); Information Theory (cs.IT); Machine Learning (cs.LG)
In future 6G wireless networks, semantic and effectiveness aspects of communications will play a fundamental role, incorporating meaning and relevance into transmissions. However, obstacles arise when devices employ diverse languages, logic, or internal representations, leading to semantic mismatches that might jeopardize understanding. In latent space communication, this challenge manifests as misalignment within high-dimensional representations where deep neural networks encode data. This paper presents a novel framework for goal-oriented semantic communication, leveraging relative representations to mitigate semantic mismatches via latent space alignment. We propose a dynamic optimization strategy that adapts relative representations, communication parameters, and computation resources for energy-efficient, low-latency, goal-oriented semantic communications. Numerical results demonstrate our methodology's effectiveness in mitigating mismatches among devices, while optimizing energy consumption, delay, and effectiveness.
- [38] arXiv:2404.05271 (replaced) [pdf, html, other]
-
Title: Scheduling Multi-Server Jobs is Not EasySubjects: Data Structures and Algorithms (cs.DS); Information Theory (cs.IT)
The problem of online scheduling of multi-server jobs is considered, where there are a total of $K$ servers, and each job requires concurrent service from multiple servers for it to be processed. Each job on its arrival reveals its processing time and the number of servers from which it needs concurrent service, and an online algorithm has to make scheduling decisions using only causal information, with the goal of minimizing the response/flow time. The worst case input model is considered and the performance metric is the competitive ratio. For the case when all job processing times (sizes) are the same, we show that the competitive ratio of any deterministic/randomized algorithm is at least $\Omega(K)$ and propose an online algorithm whose competitive ratio is at most $K+1$. With equal job sizes, we also consider the resource augmentation regime where an online algorithm has access to more servers than an optimal offline algorithm. With resource augmentation, we propose a simple algorithm and show that it has a competitive ratio of $1$ when provided with $2K$ servers with respect to an optimal offline algorithm with $K$ servers. With unequal job sizes, we propose an online algorithm whose competitive ratio is at most $2K \log (K w_{\max})$, where $w_{\max}$ is the maximum size of any job.
- [39] arXiv:2404.05962 (replaced) [pdf, html, other]
-
Title: Wasserstein Dependent Graph Attention Network for Collaborative Filtering with UncertaintyComments: Accepted by IEEE TCSSSubjects: Information Retrieval (cs.IR); Information Theory (cs.IT)
Collaborative filtering (CF) is an essential technique in recommender systems that provides personalized recommendations by only leveraging user-item interactions. However, most CF methods represent users and items as fixed points in the latent space, lacking the ability to capture uncertainty. While probabilistic embeddings have been proposed to integrate uncertainty, they suffer from several limitations when introduced to graph-based recommender systems. The graph convolutional network framework can confuse the semantics of uncertainty in the nodes, and similarity measured by Kullback-Leibler (KL) divergence suffers from a degradation problem and demands an exponential number of samples. To address these challenges, we propose a novel approach, called the Wasserstein dependent Graph Attention network (W-GAT), for collaborative filtering with uncertainty. We utilize graph attention network and Wasserstein distance to learn Gaussian embedding for each user and item. Additionally, our method incorporates Wasserstein-dependent mutual information to further increase the similarity between positive pairs. Experimental results on three benchmark datasets show the superiority of W-GAT compared to several representative baselines. Extensive experimental analysis validates the effectiveness of W-GAT in capturing uncertainty by modeling the range of user preferences and categories associated with items.
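The abstract does not give the exact distance used between Gaussian embeddings. As a hedged illustration only, the sketch below uses the well-known closed form of the 2-Wasserstein distance between two Gaussians with diagonal covariances, $W_2^2 = \lVert \mu_1 - \mu_2 \rVert^2 + \lVert \sigma_1 - \sigma_2 \rVert^2$, where $\sigma$ denotes the vector of per-dimension standard deviations. The diagonal-covariance assumption and the function name are choices made for this example, not details confirmed by the paper.

```python
import math


def w2_diagonal_gaussians(mu1, sigma1, mu2, sigma2):
    """2-Wasserstein distance between N(mu1, diag(sigma1^2)) and N(mu2, diag(sigma2^2)).

    All arguments are equal-length sequences of floats; sigma entries are
    standard deviations and must be non-negative.
    """
    if not (len(mu1) == len(sigma1) == len(mu2) == len(sigma2)):
        raise ValueError("all inputs must have the same length")
    if any(s < 0 for s in list(sigma1) + list(sigma2)):
        raise ValueError("standard deviations must be non-negative")
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    spread_term = sum((a - b) ** 2 for a, b in zip(sigma1, sigma2))
    return math.sqrt(mean_term + spread_term)


if __name__ == "__main__":
    user = ([0.0, 0.0], [1.0, 1.0])   # hypothetical user embedding (mean, std)
    item = ([3.0, 4.0], [1.0, 1.0])   # hypothetical item embedding (mean, std)
    print(w2_diagonal_gaussians(*user, *item))
```

Unlike the KL divergence, this distance stays finite and symmetric even when one embedding has zero variance in some dimension, which is one commonly cited reason for preferring it with Gaussian embeddings.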