Information Theory
- [1] arXiv:2406.10556 [pdf, html, other]
Title: Multi-User Semantic Fusion for Semantic Communications over Degraded Broadcast Channels
Comments: accepted by China Communications
Subjects: Information Theory (cs.IT); Artificial Intelligence (cs.AI)
Degraded broadcast channels (DBC) are a typical multiuser communication scenario, yet semantic communications over DBC still lack in-depth research. In this paper, we design a semantic communications approach based on multi-user semantic fusion for wireless image transmission over DBC. In the proposed method, the transmitter extracts semantic features for two users separately. It then effectively fuses these semantic features for broadcasting by leveraging semantic similarity. Unlike traditional allocation of time, power, or bandwidth, the semantic fusion scheme can dynamically control the weight of the semantic features of the two users to balance the performance between the two users. Considering the different channel state information (CSI) of both users over DBC, a DBC-Aware method is developed that embeds the CSI of both users into the joint source-channel coding encoder and fusion module to adapt to the channel. Experimental results show that the proposed system outperforms the traditional broadcasting schemes.
- [2] arXiv:2406.10730 [pdf, html, other]
Title: Order-theoretic models for decision-making: Learning, optimization, complexity and computation
Comments: PhD thesis
Subjects: Information Theory (cs.IT); Artificial Intelligence (cs.AI); Logic in Computer Science (cs.LO)
The study of intelligent systems explains behaviour in terms of economic rationality. This results in an optimization principle involving a function or utility, which states that the system will evolve until the configuration of maximum utility is achieved. Recently, this theory has incorporated constraints, i.e., the optimum is achieved when the utility is maximized while respecting some information-processing constraints. This is reminiscent of thermodynamic systems. As such, the study of intelligent systems has benefited from the tools of thermodynamics. The first aim of this thesis is to clarify the applicability of these results in the study of intelligent systems.
We can think of the local transition steps in thermodynamic or intelligent systems as being driven by uncertainty. In fact, the transitions in both systems can be described in terms of majorization. Hence, real-valued uncertainty measures like Shannon entropy are simply a proxy for their more involved behaviour. More generally, real-valued functions are fundamental to the study of optimization and complexity in the order-theoretic approach to several topics, including economics, thermodynamics, and quantum mechanics. The second aim of this thesis is to improve on this classification.
The basic similarity between thermodynamic and intelligent systems is based on an uncertainty notion expressed by a preorder. We can also think of the transitions in the steps of a computational process as a decision-making procedure. In fact, by adding some requirements on the considered order structures, we can build an abstract model of uncertainty reduction that allows us to incorporate computability, that is, to distinguish the objects that can be constructed by following a finite set of instructions from those that cannot. The third aim of this thesis is to clarify the requirements on the order structure that allow such a framework.
- [3] arXiv:2406.10825 [pdf, html, other]
Title: Griesmer and Optimal Linear Codes from the Affine Solomon-Stiffler Construction
Comments: 21 pages
Subjects: Information Theory (cs.IT)
In their fundamental paper published in 1965, G. Solomon and J. J. Stiffler invented infinite families of codes meeting the Griesmer bound. These codes are now called Solomon-Stiffler codes and have motivated various constructions of codes meeting or close to the Griesmer bound.
In this paper, we give a geometric construction of infinite families of affine and modified affine Solomon-Stiffler codes. Projective Solomon-Stiffler codes are special cases of our modified affine Solomon-Stiffler codes. Several infinite families of $q$-ary Griesmer, optimal, almost optimal two-weight, three-weight, four-weight and five-weight linear codes are constructed as special cases of our construction. Weight distributions of these Griesmer, optimal or almost optimal codes are determined. Many optimal linear codes documented in Grassl's list are re-constructed as (modified) affine Solomon-Stiffler codes.
Several infinite families of optimal or Griesmer codes were constructed in two published papers in IEEE Transactions on Information Theory 2017 and 2019, via Gray images of codes over finite rings. Parameters and weight distributions of these Griesmer or optimal codes and very special case codes in our construction are the same. We also indicate that more general distance-optimal binary linear codes than those constructed in a recent paper of IEEE Transactions on Information Theory can be obtained directly from codimension one subcodes in binary Solomon-Stiffler codes.
- [4] arXiv:2406.10872 [pdf, html, other]
Title: On entropy Marton-type inequalities and small symmetric differences with cosets of abelian groups
Comments: 10 pages, submitted
Subjects: Information Theory (cs.IT); Combinatorics (math.CO); Group Theory (math.GR); Number Theory (math.NT); Probability (math.PR)
We recognise that an entropy inequality akin to the main intermediate goal of recent works (Gowers, Green, Manners, Tao [3],[2]) regarding a conjecture of Marton provides a black box from which we can also through a short deduction recover another description: if a finite subset $A$ of an abelian group $G$ is such that the distribution of the sums $a+b$ with $(a,b) \in A \times A$ is only slightly more spread out than the uniform distribution on $A$, then $A$ has small symmetric difference with some finite coset of $G$. The resulting bounds are necessarily sharp up to a logarithmic factor.
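To make the hypothesis of this statement concrete, here is a minimal sketch that measures how much more spread out the sums $a+b$ are than the uniform distribution on $A$, using the entropy gap $H(a+b) - \log_2 |A|$ for $(a,b)$ uniform on $A \times A$. The choice of this gap as the measure of spread, the cyclic group $\mathbb{Z}_n$, and the function name `entropy_gap` are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from itertools import product
from math import log2

def entropy_gap(A, n):
    """Return H(a+b mod n) - log2|A| in bits, for (a, b) uniform on A x A.

    Hypothetical helper: the group is assumed to be Z_n for illustration.
    """
    counts = Counter((a + b) % n for a, b in product(A, repeat=2))
    total = len(A) ** 2
    h_sum = -sum(c / total * log2(c / total) for c in counts.values())
    return h_sum - log2(len(A))

# A subgroup of Z_12 and one of its cosets: the sums stay uniform on a coset.
print(entropy_gap([0, 3, 6, 9], 12))   # gap is 0
print(entropy_gap([1, 4, 7, 10], 12))  # gap is 0
# An interval is far from any coset of the same size: the gap is clearly positive.
print(entropy_gap([0, 1, 2, 3], 12))
```

In this toy setting a zero gap occurs for the coset examples, while the interval, which has large symmetric difference with every coset of comparable size, shows a gap of more than half a bit.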
- [5] arXiv:2406.10910 [pdf, html, other]
Title: Fast Fractional Programming for Multi-Cell Integrated Sensing and Communications
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
This paper concerns the coordinate multi-cell beamforming design for integrated sensing and communications (ISAC). In particular, we assume that each base station (BS) has massive antennas. The optimization objective is to maximize a weighted sum of the data rates (for communications) and the Fisher information (for sensing). We first show that the conventional beamforming method for the multiple-input multiple-output (MIMO) transmission, i.e., the weighted minimum mean square error (WMMSE) algorithm, has a natural extension to the ISAC problem scenario from a fractional programming (FP) perspective. However, the extended WMMSE algorithm requires computing the $N\times N$ matrix inverse extensively, where $N$ is proportional to the antenna array size, so the algorithm becomes quite costly when antennas are massively deployed. To address this issue, we develop a nonhomogeneous bound and use it in conjunction with the FP technique to solve the ISAC beamforming problem without the need to invert any large matrices. It is further shown that the resulting new FP algorithm has an intimate connection with gradient projection, based on which we can accelerate the convergence via Nesterov's gradient extrapolation.
- [6] arXiv:2406.11038 [pdf, html, other]
Title: Physical-Layer Security for 6G: Safe Jamming against Malicious Sensing
Comments: accepted for presentation at 2024 IEEE/CIC International Conference on Communications in China (ICCC)
Subjects: Information Theory (cs.IT)
The integration of sensing, communications, array signal processing, etc. into 6G mobile networks has ushered in an era of heightened situational awareness. However, this progress brings forth significant concerns regarding privacy and security, particularly due to the proliferation of devices equipped with radar-like sensing capability, including malicious ones. In response, this paper proposes a novel actor-critic (AC) method-based frequency selection scheme for noise jamming, in order to effectively counter malicious multifunction frequency agility sensing. Meanwhile, to mitigate potential interference (caused by sidelobes of the jamming beam) with uplink transmissions conducted by legitimate but non-cooperative users, a robust action correction mechanism, which is capable of learning and predicting the spectrum utilization state, is proposed to find a feasible but near-optimal frequency configuration for jamming. Numerical results demonstrate that, benefiting from the robust action correction mechanism, the proposed AC-based safe jamming can not only keep the malicious sensing device stuck in the searching mode but also guarantee minimal disruption to the legitimate non-cooperative users.
- [7] arXiv:2406.11082 [pdf, html, other]
Title: Simultaneously Transmitting and Reflecting Surfaces for Ubiquitous Next Generation Multiple Access in 6G and Beyond
Comments: 25 pages, 18 figures, 7 tables
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
The ultimate goal of next generation multiple access (NGMA) is to support massive terminals and facilitate multiple functionalities over the limited radio resources of wireless networks in the most efficient manner possible. However, the random and uncontrollable wireless radio environment is a major obstacle to realizing this NGMA vision. Given the prominent feature of achieving 360° smart radio environment, simultaneously transmitting and reflecting surfaces (STARS) are emerging as one key enabling technology among the family of reconfigurable intelligent surfaces for NGMA. This paper provides a comprehensive overview of the recent research progress of STARS, focusing on fundamentals, performance analysis, and full-space beamforming design, as well as promising employments of STARS in NGMA. In particular, we first introduce the basics of STARS by elaborating on the foundational principles and operating protocols as well as discussing different STARS categories and prototypes. Moreover, we systematically survey the existing performance analysis and beamforming design for STARS-aided wireless communications in terms of diverse objectives and different mathematical approaches. Given the superiority of STARS, we further discuss advanced STARS applications as well as the attractive interplay between STARS and other emerging techniques to motivate future works for realizing efficient NGMA.
- [8] arXiv:2406.11241 [pdf, other]
Title: Reconfigurable Intelligent Surface Equipped UAV in Emergency Wireless Communications: A New Fading-Shadowing Model and Performance Analysis
Journal-ref: IEEE Transactions on Communications (Volume: 72, Issue: 3, March 2024)
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
Communication infrastructure is often severely disrupted in post-disaster areas, which interrupts communications and impedes rescue. Recently, the technology of reconfigurable intelligent surface (RIS)-equipped-UAV has been investigated as a feasible approach to assist communication under such conditions. However, the channel characteristics in the post-disaster area rapidly change due to the topographical changes caused by secondary disasters and the high mobility of UAVs. In this paper we develop a new fading-shadowing model to fit the path loss caused by the debris. Following this, we derive the exact distribution of the new channel statistics for a small number of RIS elements and the approximate distribution for a large number of RIS elements, respectively. Then, we derive the closed-form expressions for performance analysis, including average capacity (AC), energy efficiency (EE), and outage probability (OP). Based on the above analytical derivations, we maximize the energy efficiency by optimizing the number of RIS elements and the coverage area by optimizing the altitude of the RIS-equipped UAV, respectively. Finally, simulation results validate the accuracy of derived expressions and show insights related to the optimal number of RIS elements and the optimal UAV altitude for emergency wireless communication (EWC).
- [9] arXiv:2406.11782 [pdf, html, other]
Title: Soft-output Guessing Codeword Decoding
Subjects: Information Theory (cs.IT)
We establish that it is possible to extract accurate blockwise and bitwise soft output from Guessing Codeword Decoding with minimal additional computational complexity by considering it as a variant of Guessing Random Additive Noise Decoding. Blockwise soft output can be used to control decoding misdetection rate while bitwise soft output results in a soft-input soft-output decoder that can be used for efficient iterative decoding of long, high redundancy codes.
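For readers unfamiliar with the guessing-based family this abstract refers to, the following is a minimal sketch of basic hard-decision Guessing Random Additive Noise Decoding (GRAND): noise patterns are tried from most to least likely, and the first pattern whose removal yields a codeword is accepted. It is not the soft-output Guessing Codeword Decoding of the paper. The (7,4) Hamming code, the binary symmetric channel assumption (so that lower-weight patterns are more likely), and all function names are illustrative choices.

```python
from itertools import combinations

# Parity-check matrix of the (7,4) Hamming code (column j encodes j+1 in binary).
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def is_codeword(word):
    """A word is a codeword when every parity check sums to zero mod 2."""
    return all(sum(h * w for h, w in zip(row, word)) % 2 == 0 for row in H)

def noise_patterns(n):
    """Yield error patterns by increasing Hamming weight.

    On a binary symmetric channel with crossover probability below 1/2,
    this is an order of non-increasing likelihood.
    """
    for weight in range(n + 1):
        for positions in combinations(range(n), weight):
            pattern = [0] * n
            for i in positions:
                pattern[i] = 1
            yield pattern

def grand_decode(received):
    """Return (decoded codeword, number of guesses used)."""
    for guesses, pattern in enumerate(noise_patterns(len(received)), start=1):
        candidate = [r ^ e for r, e in zip(received, pattern)]
        if is_codeword(candidate):
            return candidate, guesses
    raise ValueError("no codeword found")  # unreachable for a linear code

# All-zero codeword with a single bit flipped at index 2.
decoded, guesses = grand_decode([0, 0, 1, 0, 0, 0, 0])
print(decoded, guesses)  # [0, 0, 0, 0, 0, 0, 0] 4
```

The guess count returned here is the kind of by-product that soft-output variants build on; how the paper turns such information into blockwise and bitwise soft output is not reproduced in this sketch.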
New submissions for Tuesday, 18 June 2024 (showing 9 of 9 entries)
- [10] arXiv:2406.11075 (cross-list from cs.DS) [pdf, html, other]
Title: Perturbation-Resilient Sets for Dynamic Service Balancing
Comments: arXiv admin note: text overlap with arXiv:2303.12996
Subjects: Data Structures and Algorithms (cs.DS); Information Theory (cs.IT); Combinatorics (math.CO)
A combinatorial trade is a pair of sets of blocks of elements that can be exchanged while preserving relevant subset intersection constraints. The class of balanced and swap-robust minimal trades was proposed in [1] for exchanging blocks of data chunks stored on distributed storage systems in an access- and load-balanced manner. More precisely, data chunks in the trades of interest are labeled by popularity ranks and the blocks are required to have both balanced overall popularity and stability properties with respect to swaps in chunk popularities. The original construction of such trades relied on computer search and paired balanced sets obtained through iterative combining of smaller sets that have provable stability guarantees. To reduce the substantial gap between the results of prior approaches and the known theoretical lower bound, we present new analytical upper and lower bounds on the minimal disbalance of blocks introduced by limited-magnitude popularity ranking swaps. Our constructive and near-optimal approach relies on pairs of graphs whose vertices are two balanced sets with edges/arcs that capture the balance and potential balance changes induced by limited-magnitude popularity swaps. In particular, we show that if we start with carefully selected balanced trades and limit the magnitude of rank swaps to one, the new upper and lower bounds on the maximum block disbalance caused by a swap differ only by a factor of $1.07$. We also extend these results to larger popularity swap magnitudes.
- [11] arXiv:2406.11504 (cross-list from cs.LG) [pdf, html, other]
Title: On the Feasibility of Fidelity$^-$ for Graph Pruning
Comments: 6 pages, 3 figures, 2 tables; IJCAI Workshop on Explainable AI (XAI 2024) (to appear) (Please cite our workshop version.)
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Information Theory (cs.IT); Neural and Evolutionary Computing (cs.NE); Social and Information Networks (cs.SI)
As one of the popular quantitative metrics to assess the quality of explanation of graph neural networks (GNNs), fidelity measures the output difference after removing unimportant parts of the input graph. Fidelity has been widely used due to its straightforward interpretation that the underlying model should produce similar predictions when features deemed unimportant from the explanation are removed. This raises a natural question: "Does fidelity induce a global (soft) mask for graph pruning?" To answer this, we aim to explore the potential of the fidelity measure to be used for graph pruning, eventually enhancing the GNN models for better efficiency. To this end, we propose Fidelity$^-$-inspired Pruning (FiP), an effective framework to construct global edge masks from local explanations. Our empirical observations using 7 edge attribution methods demonstrate that, surprisingly, general eXplainable AI methods outperform methods tailored to GNNs in terms of graph pruning performance.
- [12] arXiv:2406.11569 (cross-list from cs.LG) [pdf, html, other]
Title: Pre-Training and Personalized Fine-Tuning via Over-the-Air Federated Meta-Learning: Convergence-Generalization Trade-Offs
Comments: 37 pages, 7 figures, submitted for possible journal publication
Subjects: Machine Learning (cs.LG); Information Theory (cs.IT); Signal Processing (eess.SP)
For modern artificial intelligence (AI) applications such as large language models (LLMs), the training paradigm has recently shifted to pre-training followed by fine-tuning. Furthermore, owing to dwindling open repositories of data and thanks to efforts to democratize access to AI models, pre-training is expected to increasingly migrate from the current centralized deployments to federated learning (FL) implementations. Meta-learning provides a general framework in which pre-training and fine-tuning can be formalized. Meta-learning-based personalized FL (meta-pFL) moves beyond basic personalization by targeting generalization to new agents and tasks. This paper studies the generalization performance of meta-pFL for a wireless setting in which the agents participating in the pre-training phase, i.e., meta-learning, are connected via a shared wireless channel to the server. Adopting over-the-air computing, we study the trade-off between generalization to new agents and tasks, on the one hand, and convergence, on the other hand. The trade-off arises from the fact that channel impairments may enhance generalization, while degrading convergence. Extensive numerical results validate the theory.
Cross submissions for Tuesday, 18 June 2024 (showing 3 of 3 entries)
- [13] arXiv:2112.13787 (replaced) [pdf, html, other]
Title: Degree-of-Freedom of Modulating Information in the Phases of Reconfigurable Intelligent Surface
Comments: 20 pages, 7 figures, published in IEEE Transactions on Information Theory. Comments are most welcome and appreciated
Journal-ref: IEEE Transactions on Information Theory (Volume: 70, Issue: 1, Pages 170-188, January 2024)
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
This paper investigates the information theoretic limit of a reconfigurable intelligent surface (RIS) aided communication scenario in which the RIS and the transmitter either jointly or independently send information to the receiver. The RIS is an emerging technology that uses a large number of passive reflective elements with adjustable phases to intelligently reflect the transmit signal to the intended receiver. While most previous studies of the RIS focus on its ability to beamform and to boost the received signal-to-noise ratio (SNR), this paper shows that if the information data stream is also available at the RIS and can be modulated through the adjustable phases at the RIS, significant improvement in the degree-of-freedom (DoF) of the overall channel is possible. For example, for an RIS system in which the signals are reflected from a transmitter with $M$ antennas to a receiver with $K$ antennas through an RIS with $N$ reflective elements, assuming no direct path between the transmitter and the receiver, joint transmission of the transmitter and the RIS can achieve a DoF of $\min\left(M+\frac{N}{2}-\frac{1}{2},N,K\right)$ as compared to the DoF of $\min(M,K)$ for the conventional multiple-input multiple-output (MIMO) channel. This result is obtained by establishing a connection between the RIS system and the MIMO channel with phase noise and by using results for characterizing the information dimension under projection. The result is further extended to the case with a direct path between the transmitter and the receiver, and also to the multiple access scenario, in which the transmitter and the RIS send independent information. Finally, this paper proposes a symbol-level precoding approach for modulating data through the phases of the RIS, and provides numerical simulation results to verify the theoretical DoF results.
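The two DoF expressions quoted in this abstract are easy to compare numerically. The sketch below simply evaluates $\min\left(M+\frac{N}{2}-\frac{1}{2},N,K\right)$ against $\min(M,K)$ for the no-direct-path case; the function names and the sample values of $M$, $N$, $K$ are illustrative assumptions, and exact fractions are used so that the half-integer values are not rounded.

```python
from fractions import Fraction

def dof_joint_ris(M: int, N: int, K: int) -> Fraction:
    """DoF quoted in the abstract for joint transmitter/RIS transmission
    with no direct path: min(M + N/2 - 1/2, N, K)."""
    return min(Fraction(M) + Fraction(N - 1, 2), Fraction(N), Fraction(K))

def dof_mimo(M: int, K: int) -> int:
    """DoF of the conventional MIMO channel: min(M, K)."""
    return min(M, K)

# Hypothetical configuration: 2 transmit antennas, 8 RIS elements, 8 receive antennas.
print(dof_joint_ris(2, 8, 8), dof_mimo(2, 8))  # 11/2 2
```

With few transmit antennas and many receive antennas, the first term of the minimum is the active one, which is where modulating data through the RIS phases raises the DoF above $\min(M,K)$.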
- [14] arXiv:2206.13459 (replaced) [pdf, other]
Title: An Efficient Frequency Diversity Scheme for Ultra-Reliable Communications in Two-Path Fading Channels
Comments: 15 pages, 13 figures
Subjects: Information Theory (cs.IT)
We consider a two-ray ground reflection scenario with unknown distance between transmitter and receiver. By utilizing two frequencies in parallel, we can mitigate possible destructive interference and ensure ultra-reliability with only very limited knowledge at the transmitter. In order to achieve this ultra-reliability, we optimize the frequency spacing such that the worst-case receive power is maximized. Additionally, we provide an algorithm to calculate the optimal frequency spacing. Besides the receive power, we also analyze the achievable rate and outage probability. It is shown that the frequency diversity scheme achieves a significant improvement in terms of reliability over using a single frequency. In particular, we demonstrate the effectiveness of the proposed approach by a numerical simulation of an unmanned aerial vehicle (UAV) flying above flat terrain.
- [15] arXiv:2311.04831 (replaced) [pdf, html, other]
Title: Derivatives of entropy and the MMSE conjecture
Comments: Comments most welcome!
Subjects: Information Theory (cs.IT); Probability (math.PR)
We investigate the entropy $H(\mu,t)$ of a probability measure $\mu$ along the heat flow and, more precisely, we seek closed algebraic representations of its derivatives. Provided that $\mu$ admits moments of any order, it is indeed proved in [Guo et al., 2010] that $t\mapsto H(\mu,t)$ is smooth, and in [Ledoux, 2016] that its derivatives at zero can be expressed as multivariate polynomials evaluated in the moments (or cumulants) of $\mu$. In the seminal contribution [Ledoux, 2016], these algebraic expressions are derived through $\Gamma$-calculus techniques which provide implicit recursive formulas for these polynomials. Our main contribution consists in a fine combinatorial analysis of these inductive relations, which allows us, for the first time, to derive closed formulas for the leading coefficients of these polynomial expressions.
Building upon these explicit formulas we revisit the so-called "MMSE conjecture" from [Guo et al., 2010] which asserts that two distributions on the real line with the same entropy along the heat flow must coincide up to translation and symmetry. Our approach enables us to provide new conditions on the source distributions ensuring that the MMSE conjecture holds and to refine several criteria proved in [Ledoux, 2016]. As illustrating examples, our findings cover the cases of uniform and Rademacher distributions, for which previous results in the literature were inapplicable.
- [16] arXiv:2311.10561 (replaced) [pdf, html, other]
Title: A Universal Framework for Multiport Network Analysis of Reconfigurable Intelligent Surfaces
Comments: Accepted by IEEE for publication
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
Reconfigurable intelligent surface (RIS) is an emerging paradigm able to control the propagation environment in wireless systems. Most of the research on RIS has been dedicated to system optimization and, with the advent of beyond diagonal RIS (BD-RIS), to RIS architecture design. However, developing general and unified electromagnetic (EM)-consistent models for RIS-aided systems remains an open problem. In this study, we propose a universal framework for the multiport network analysis of RIS-aided systems. With our framework, we model RIS-aided systems and RIS architectures through impedance, admittance, and scattering parameter analysis. Based on these analyses, three equivalent models are derived accounting for the effects of impedance mismatching and mutual coupling. The three models are then simplified by assuming large transmission distances, perfect matching, and no mutual coupling to understand the role of the RIS in the communication model. The derived simplified models are consistent with the typical model used in related literature, although we show that an additional approximation is commonly considered in the literature. We discuss the benefits of each analysis in characterizing and optimizing the RIS and how to select the most suitable parameters according to the needs. Numerical results provide additional evidence of the equivalence of the three analyses.
- [17] arXiv:2311.12443 (replaced) [pdf, html, other]
Title: Knowledge Base Enabled Semantic Communication: A Generative Perspective
Comments: This paper has been accepted by IEEE Wireless Communications and is scheduled for publication
Subjects: Information Theory (cs.IT); Artificial Intelligence (cs.AI); Networking and Internet Architecture (cs.NI)
Semantic communication is widely touted as a key technology for propelling the sixth-generation (6G) wireless networks. However, providing effective semantic representation is quite challenging in practice. To address this issue, this article takes a crack at exploiting semantic knowledge base (KB) to usher in a new era of generative semantic communication. Via semantic KB, source messages can be characterized in low-dimensional subspaces without compromising their desired meanings, thus significantly enhancing the communication efficiency. The fundamental principle of semantic KB is first introduced, and a generative semantic communication architecture is developed by presenting three sub-KBs, namely source, task, and channel KBs. Then, the detailed construction approaches for each sub-KB are described, followed by their utilization in terms of semantic coding and transmission. A case study is also provided to showcase the superiority of generative semantic communication over conventional syntactic communication and classical semantic communication. In a nutshell, this article establishes a scientific foundation for the exciting uncharted frontier of generative semantic communication.
- [18] arXiv:2311.12609 (replaced) [pdf, html, other]
Title: Reinforcement Learning for Near-Optimal Design of Zero-Delay Codes for Markov Sources
Comments: 15 pages, 3 figures; accepted for publication in IEEE Transactions on Information Theory
Subjects: Information Theory (cs.IT); Optimization and Control (math.OC)
In the classical lossy source coding problem, one encodes long blocks of source symbols, which enables the distortion to approach the ultimate Shannon limit. Such a block-coding approach introduces large delays, which is undesirable in many delay-sensitive applications. We consider the zero-delay case, where the goal is to encode and decode a finite-alphabet Markov source without any delay. It has been shown that this problem lends itself to stochastic control techniques, which lead to existence, structural, and general structural approximation results. However, these techniques so far have resulted only in computationally prohibitive algorithmic implementations for code design. To address this problem, we present a reinforcement learning design algorithm and rigorously prove its asymptotic optimality. In particular, we show that a quantized Q-learning algorithm can be used to obtain a near-optimal coding policy for this problem. The proof builds on recent results on quantized Q-learning for weakly Feller controlled Markov chains whose application necessitates the development of supporting technical results on regularity and stability properties, and relating the optimal solutions for discounted and average cost infinite horizon criteria problems. These theoretical results are supported by simulations.
- [19] arXiv:2402.14297 (replaced) [pdf, html, other]
Title: Semantics-Empowered Space-Air-Ground-Sea Integrated Network: New Paradigm, Frameworks, and Challenges
Comments: This paper has been accepted by IEEE Communication Surveys & Tutorials
Subjects: Information Theory (cs.IT)
In the coming sixth generation (6G) communication era, to provide seamless and ubiquitous connections, the space-air-ground-sea integrated network (SAGSIN) is envisioned to address the challenges of communication coverage in areas with difficult conditions, such as the forest, desert, and sea. Considering the fundamental limitations of the SAGSIN including large-scale scenarios, highly dynamic channels, and limited device capabilities, traditional communications based on Shannon information theory cannot satisfy the communication demands. Moreover, bit-level reconstruction is usually redundant for many human-to-machine or machine-to-machine applications in the SAGSIN. Therefore, it is imperative to consider high-level communications towards semantics exchange, called semantic communications. In this survey, according to the interpretations of the term "semantics", including "significance", "meaning", and "effectiveness-related information", we review state-of-the-art works on semantic communications from three perspectives, which are 1) significance representation and protection, 2) meaning similarity measurement and meaning enhancement, and 3) ultimate effectiveness and effectiveness yielding. Subsequently, three types of semantic communication systems can be correspondingly introduced, namely the significance-oriented, meaning-oriented, and effectiveness/task-oriented semantic communication systems. Implementation of the above three types of systems in the SAGSIN necessitates a new perception-communication-computing-actuation-integrated paradigm (PCCAIP), where all the available perception, computing, and actuation techniques jointly facilitate significance-oriented sampling & transmission, semantic extraction & reconstruction, and task decision. Finally, we point out some future challenges on semantic communications in the SAGSIN. ...
- [20] arXiv:2403.15221 (replaced) [pdf, html, other]
Title: Mutual Information of a class of Poisson-type Channels using Markov Renewal Theory
Comments: 5 main pages, 1 main figure, 5 appendix pages, 3 appendix figures, Accepted at ISIT 2024 conference
Subjects: Information Theory (cs.IT)
The mutual information (MI) of Poisson-type channels has been linked to a filtering problem since the 70s, but its evaluation for specific continuous-time, discrete-state systems remains a demanding task. As an advantage, Markov renewal processes (MrP) retain their renewal property under state space filtering. This offers a way to solve the filtering problem analytically for small systems. We consider a class of communication systems $X \to Y$ that can be derived from an MrP by a custom filtering procedure. For the subclasses, where (i) $Y$ is a renewal process or (ii) $(X,Y)$ belongs to a class of MrPs, we provide an evolution equation for finite transmission duration $T>0$ and limit theorems for $T \to \infty$ that facilitate simulation-free evaluation of the MI $\mathbb{I}(X_{[0,T]}; Y_{[0,T]})$ and its associated mutual information rate (MIR). In other cases, simulation cost is reduced to the marginal system $(X,Y)$ or $Y$. We show that systems with an additional $X$-modulating level $C$, which statically chooses between different processes $X_{[0,T]}(c)$, can naturally be included in our framework, thereby giving an expression for $\mathbb{I}(C; Y_{[0,T]})$. Our primary contribution is to apply the results of classical (Markov renewal) filtering theory in a novel manner to the problem of exactly computing the MI/MIR. The theoretical framework is showcased in an application to bacterial gene expression, where filtering is analytically tractable.
- [21] arXiv:2404.03455 (replaced) [pdf, html, other]
Title: Synergy as the failure of distributivity
Comments: 22 pages, 8 figures. Reformatted, corrected typos, added acknowledgements. Simplified the proofs of theorems 3,4, lemma 5 without affecting the results. Clarified the choice of examples in the main text. Removed Lemma 11 (in the original enumeration) for being too basic
Subjects: Information Theory (cs.IT); Biological Physics (physics.bio-ph); Data Analysis, Statistics and Probability (physics.data-an)
A physical system is synergistic if it cannot be reduced to its constituents. Intuitively this is paraphrased into the common statement that 'the whole is greater than the sum of its parts'. In this manner, many basic elements in combination may give rise to some unexpected collective behavior. A paradigmatic example of such a phenomenon is information. Several sources, which are already known individually, may provide some new knowledge when joined together. Here we take the trivial case of discrete random variables and explore whether and how it is possible to get more information out of lesser parts. Our approach is inspired by set theory as the fundamental description of part-whole relations. If taken unaltered, synergistic behavior is forbidden by the set theoretical axioms. Indeed, the union of sets cannot contain extra elements not found in any particular one of them. However, random variables are not a perfect analogy of sets. We formalise the distinction, finding a single broken axiom - union/intersection distributivity. Nevertheless, it remains possible to describe information using Venn-type diagrams. We directly connect the existence of synergy to the failure of distributivity for random variables. When compared to the partial information decomposition framework (PID), our technique fully reproduces previous results while resolving the self-contradictions that plagued the field and providing additional constraints on the solutions. This opens the way towards quantifying emergence in large systems.
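The idea that joined sources can carry knowledge absent from each source alone can be illustrated with the textbook XOR case. The sketch below is not taken from the paper; the helper `mutual_information` and the choice of XOR as the example are assumptions made here for illustration. It computes, in bits, how much each input bit alone and both bits together tell us about their XOR.

```python
from collections import defaultdict
from itertools import product
from math import log2


def mutual_information(joint, a_idx, b_idx):
    """I(A;B) in bits. `joint` maps outcome tuples to probabilities;
    a_idx and b_idx select which coordinates of an outcome form A and B."""
    pa, pb, pab = defaultdict(float), defaultdict(float), defaultdict(float)
    for outcome, p in joint.items():
        a = tuple(outcome[i] for i in a_idx)
        b = tuple(outcome[i] for i in b_idx)
        pa[a] += p
        pb[b] += p
        pab[(a, b)] += p
    return sum(p * log2(p / (pa[a] * pb[b])) for (a, b), p in pab.items() if p > 0)


# Outcomes are (x1, x2, y): x1 and x2 are independent fair bits, y = x1 XOR x2.
xor_joint = {(x1, x2, x1 ^ x2): 0.25 for x1, x2 in product((0, 1), repeat=2)}

i_x1_y = mutual_information(xor_joint, (0,), (2,))      # one source alone
i_x2_y = mutual_information(xor_joint, (1,), (2,))      # the other source alone
i_both_y = mutual_information(xor_joint, (0, 1), (2,))  # both sources jointly

print(i_x1_y, i_x2_y, i_both_y)
```

Each bit on its own is independent of the output, so the single-source terms vanish, while the pair determines the output completely. This gap is the kind of behavior a set-like picture of information has to account for.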
- [22] arXiv:2406.05481 (replaced) [pdf, html, other]
-
Title: Joint Cooperative Clustering and Power Control for Energy-Efficient Cell-Free XL-MIMO with Multi-Agent Reinforcement Learning
Subjects: Information Theory (cs.IT)
In this paper, we investigate the amalgamation of cell-free (CF) and extremely large-scale multiple-input multiple-output (XL-MIMO) technologies, referred to as CF XL-MIMO, as a promising advancement for enabling future mobile networks. To address the computational complexity and communication power consumption associated with conventional centralized optimization, we focus on user-centric dynamic networks in which each user is served by an adaptive subset of access points (APs) rather than all of them. We begin our research by analyzing a joint resource allocation problem for energy-efficient CF XL-MIMO systems, encompassing cooperative clustering and power control design, where all clusters are adaptively adjustable. Then, we propose an innovative double-layer multi-agent reinforcement learning (MARL)-based scheme, which offers an effective strategy to tackle the challenges of high-dimensional signal processing. In the numerical results, we compare various algorithms with different network architectures. These comparisons reveal that the proposed MARL-based cooperative architecture can effectively strike a balance between system performance and communication overhead, thereby improving energy efficiency (EE). It is important to note that increasing the number of user equipments participating in information sharing can effectively enhance spectral efficiency (SE) but also increases power consumption, resulting in a non-trivial trade-off between the number of participants and EE.
- [23] arXiv:2406.09846 (replaced) [pdf, html, other]
-
Title: Multiple Intelligent Reflecting Surfaces Collaborative Wireless Localization System
Comments: 13 pages, 8 figures
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
This paper studies a multiple intelligent reflecting surfaces (IRSs) collaborative localization system where multiple semi-passive IRSs are deployed in the network to locate one or more targets based on time-of-arrival. It is assumed that each semi-passive IRS is equipped with reflective elements and sensors, which are used to establish the line-of-sight links from the base station (BS) to multiple targets and process echo signals, respectively. Based on the above model, we derive the Fisher information matrix of the echo signal with respect to the time delay. By employing the chain rule and exploiting the geometric relationship between time delay and position, the Cramér-Rao bound (CRB) for estimating the target's Cartesian coordinate position is derived. Then, we propose a two-stage algorithmic framework to minimize the CRB in single- and multi-target localization systems by jointly optimizing active beamforming at the BS, passive beamforming at multiple IRSs, and IRS selection. For the single-target case, we derive the optimal closed-form solution for the coefficient design of multiple IRSs and propose a low-complexity algorithm based on the alternating direction method of multipliers to obtain the optimal solution for the active beamforming design. For the multi-target case, alternating optimization is used to transform the original problem into two subproblems where semi-definite relaxation and successive convex approximation are applied to tackle the quadraticity and indefiniteness in the CRB expression, respectively. Finally, numerical simulation results validate the effectiveness of the proposed algorithm for the multiple-IRS collaborative localization system compared to other benchmark schemes, as well as the significant performance gains.
- [24] arXiv:2201.08464 (replaced) [pdf, html, other]
-
Title: On Good Infinite Families of Toric Codes or the Lack Thereof
Comments: to appear in Involve, a Journal of Mathematics
Subjects: Algebraic Geometry (math.AG); Information Theory (cs.IT)
A toric code, introduced by Hansen to extend the Reed-Solomon code as a $k$-dimensional subspace of $\mathbb{F}_q^n$, is determined by a toric variety or its associated integral convex polytope $P \subseteq [0,q-2]^n$, where $k=|P \cap \mathbb{Z}^n|$ (the number of integer lattice points of $P$). There are two relevant parameters that determine the quality of a code: the information rate, which measures how much information is contained in a single bit of each codeword; and the relative minimum distance, which measures how many errors can be corrected relative to how many bits each codeword has. Soprunov and Soprunova defined a good infinite family of codes to be a sequence of codes of unbounded polytope dimension such that neither the corresponding information rates nor relative minimum distances go to 0 in the limit. We examine different ways of constructing families of codes by considering polytope operations such as the join and direct sum. In doing so, we give conditions under which no good family can exist and strong evidence that there is no such good family of codes.
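To make the two quality parameters concrete, here is a minimal sketch for one small toric code, under assumptions chosen here rather than taken from the paper: a prime field size `q = 5`, a two-dimensional triangle polytope with side `s = 2`, and the standard construction in which monomials indexed by lattice points are evaluated on the torus $(\mathbb{F}_q^*)^m$, giving block length $(q-1)^m$. The polytope dimension is called `m` here to keep it distinct from the block length. The minimum distance is found by brute force, which is only feasible for tiny examples.

```python
from itertools import product
from math import prod

q = 5  # field size; assumed prime so field arithmetic is plain arithmetic mod q
m = 2  # polytope dimension
s = 2  # triangle P = conv{(0,0), (s,0), (0,s)}, with s <= q - 2

# Lattice points of P: exponent vectors of the monomials that span the code.
exponents = [e for e in product(range(s + 1), repeat=m) if sum(e) <= s]
# Evaluation points: all of (F_q^*)^m.
points = list(product(range(1, q), repeat=m))

# Generator matrix, stored column by column: one column per evaluation point,
# one entry per monomial.
columns = [
    [prod(pow(x, a, q) for x, a in zip(pt, e)) % q for e in exponents]
    for pt in points
]


def weight(coeffs):
    """Hamming weight of the codeword with the given monomial coefficients."""
    return sum(
        1 for col in columns
        if sum(c * g for c, g in zip(coeffs, col)) % q != 0
    )


k = len(exponents)  # dimension = number of lattice points of P
n = len(points)     # block length = (q - 1) ** m
# Minimum distance of a linear code = smallest weight of a nonzero codeword.
d = min(weight(c) for c in product(range(q), repeat=k) if any(c))

rate = k / n
relative_distance = d / n
print(k, n, d, rate, relative_distance)
```

A family of such codes is good in the sense described above only if both `rate` and `relative_distance` stay bounded away from 0 as the polytope dimension grows.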
- [25] arXiv:2212.05015 (replaced) [pdf, other]
-
Title: Robustness Implies Privacy in Statistical Estimation
Comments: 90 pages, 2 tables. Appeared in STOC, 2023
Subjects: Data Structures and Algorithms (cs.DS); Cryptography and Security (cs.CR); Information Theory (cs.IT); Machine Learning (stat.ML)
We study the relationship between adversarial robustness and differential privacy in high-dimensional algorithmic statistics. We give the first black-box reduction from privacy to robustness which can produce private estimators with optimal tradeoffs among sample complexity, accuracy, and privacy for a wide range of fundamental high-dimensional parameter estimation problems, including mean and covariance estimation. We show that this reduction can be implemented in polynomial time in some important special cases. In particular, using nearly-optimal polynomial-time robust estimators for the mean and covariance of high-dimensional Gaussians which are based on the Sum-of-Squares method, we design the first polynomial-time private estimators for these problems with nearly-optimal samples-accuracy-privacy tradeoffs. Our algorithms are also robust to a nearly optimal fraction of adversarially-corrupted samples.
- [26] arXiv:2312.10438 (replaced) [pdf, html, other]
-
Title: Bayes-Optimal Unsupervised Learning for Channel Estimation in Near-Field Holographic MIMO
Comments: 16 pages, 7 figures, 3 tables, accepted by IEEE Journal of Selected Topics in Signal Processing
Subjects: Signal Processing (eess.SP); Information Theory (cs.IT)
Holographic MIMO (HMIMO) is being increasingly recognized as a key enabling technology for 6G wireless systems through the deployment of an extremely large number of antennas within a compact space to fully exploit the potential of the electromagnetic (EM) channel. Nevertheless, the benefits of HMIMO systems cannot be fully unleashed without an efficient means to estimate the high-dimensional channel, whose distribution becomes increasingly complicated due to the accessibility of the near-field region. In this paper, we address the fundamental challenge of designing a low-complexity Bayes-optimal channel estimator in near-field HMIMO systems operating in unknown EM environments. The core idea is to estimate the HMIMO channels solely based on Stein's score function of the received pilot signals and an estimated noise level, without relying on priors or supervision that is not feasible in practical deployment. A neural network is trained with the unsupervised denoising score matching objective to learn the parameterized score function. Meanwhile, a principal component analysis (PCA)-based algorithm is proposed to estimate the noise level by leveraging the low-rank near-field spatial correlation. Building upon these techniques, we develop a Bayes-optimal score-based channel estimator for fully-digital HMIMO transceivers in a closed form. The optimal score-based estimator is also extended to hybrid analog-digital HMIMO systems by incorporating it into a low-complexity message passing algorithm. The (quasi-) Bayes-optimality of the proposed estimators is validated both in theory and by extensive simulation results. In addition to optimality, it is shown that our proposal is robust to various mismatches and can quickly adapt to dynamic EM environments in an online manner thanks to its unsupervised nature, demonstrating its potential in real-world deployment.
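One standard identity that links a score function and a noise level to a Bayes-optimal estimate is Tweedie's formula: for $y = x + n$ with Gaussian noise of variance $\sigma^2$, the posterior mean is $\mathbb{E}[x \mid y] = y + \sigma^2 \nabla_y \log p(y)$. Whether the paper uses exactly this form is an assumption here. The sketch below checks the identity in a deliberately simplified setting (real scalar, zero-mean Gaussian prior) where the MMSE estimate is known in closed form; all function names are illustrative, and the score is obtained numerically rather than from a trained network.

```python
import math


def log_marginal(y, prior_var, noise_var):
    """log p(y) for y = x + n with x ~ N(0, prior_var) and n ~ N(0, noise_var)."""
    var = prior_var + noise_var
    return -0.5 * math.log(2 * math.pi * var) - y * y / (2 * var)


def score(y, prior_var, noise_var, h=1e-5):
    """Score of the observation, d/dy log p(y), via a central difference.
    In a learned system this would come from a score network instead."""
    upper = log_marginal(y + h, prior_var, noise_var)
    lower = log_marginal(y - h, prior_var, noise_var)
    return (upper - lower) / (2 * h)


def tweedie_estimate(y, noise_var, score_value):
    """Posterior mean from the observation, the noise level, and the score only."""
    return y + noise_var * score_value


prior_var, noise_var, y = 2.0, 0.5, 1.3

x_hat_score = tweedie_estimate(y, noise_var, score(y, prior_var, noise_var))
# Closed-form MMSE estimate for the Gaussian case, used only as a reference.
x_hat_mmse = prior_var / (prior_var + noise_var) * y

print(x_hat_score, x_hat_mmse)
```

The estimator itself never touches `prior_var`; it needs only the score of the observations and the noise variance, which is what makes a prior-free, unsupervised approach plausible.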
- [27] arXiv:2402.07025 (replaced) [pdf, html, other]
-
Title: Generalization Error of Graph Neural Networks in the Mean-field Regime
Comments: Accepted in ICML 2024
Subjects: Machine Learning (stat.ML); Information Theory (cs.IT); Machine Learning (cs.LG)
This work provides a theoretical framework for assessing the generalization error of graph neural networks in the over-parameterized regime, where the number of parameters surpasses the quantity of data points. We explore two widely utilized types of graph neural networks: graph convolutional neural networks and message passing graph neural networks. Prior to this study, existing bounds on the generalization error in the over-parameterized regime were uninformative, limiting our understanding of over-parameterized network performance. Our novel approach involves deriving upper bounds within the mean-field regime for evaluating the generalization error of these graph neural networks. We establish upper bounds with a convergence rate of $O(1/n)$, where $n$ is the number of graph samples. These upper bounds offer a theoretical assurance of the networks' performance on unseen data in the challenging over-parameterized regime and overall contribute to our understanding of their performance.
- [28] arXiv:2406.09194 (replaced) [pdf, html, other]
-
Title: Benign overfitting in Fixed Dimension via Physics-Informed Learning with Smooth Inductive Bias
Subjects: Machine Learning (stat.ML); Information Theory (cs.IT); Machine Learning (cs.LG); Numerical Analysis (math.NA); Statistics Theory (math.ST)
Recent advances in machine learning have inspired a surge of research into reconstructing specific quantities of interest from measurements that comply with certain physical laws. These efforts focus on inverse problems that are governed by partial differential equations (PDEs). In this work, we develop an asymptotic Sobolev norm learning curve for kernel ridge(less) regression when addressing (elliptical) linear inverse problems. Our results show that the PDE operators in the inverse problem can stabilize the variance and even lead to benign overfitting for fixed-dimensional problems, exhibiting different behaviors from regression problems. In addition, our investigation demonstrates the impact of various inductive biases introduced by minimizing different Sobolev norms as a form of implicit regularization. For the regularized least squares estimator, we find that all considered inductive biases can achieve the optimal convergence rate, provided the regularization parameter is appropriately chosen. The convergence rate is in fact independent of the choice of (smooth enough) inductive bias for both ridge and ridgeless regression. Surprisingly, our smoothness requirement recovers the condition found in the Bayesian setting and extends the conclusion to the minimum norm interpolation estimators.