Information Theory
Showing new listings for Wednesday, 2 October 2024
- [1] arXiv:2410.00150 [pdf, html, other]
Title: What If We Had Used a Different App? Reliable Counterfactual KPI Analysis in Wireless Systems
Comments: This paper has been submitted to a journal
Subjects: Information Theory (cs.IT); Machine Learning (cs.LG); Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)
In modern wireless network architectures, such as Open Radio Access Network (O-RAN), the operation of the radio access network (RAN) is managed by applications, or apps for short, deployed at intelligent controllers. These apps are selected from a given catalog based on current contextual information. For instance, a scheduling app may be selected on the basis of current traffic and network conditions. Once an app is chosen and run, it is no longer possible to directly test the performance that would have been obtained with another app. This test, however, would be potentially valuable to monitor and optimize the network operation. With this goal in mind, this paper addresses the "what-if" problem of estimating the values of key performance indicators (KPIs) that would have been obtained if a different app had been implemented by the RAN. To this end, we propose a conformal-prediction-based counterfactual analysis method for wireless systems that provides reliable "error bars" for the estimated KPIs, containing the true KPIs with a user-defined probability, despite the inherent covariate shift between logged and test data. Experimental results for medium access control-layer apps and for physical-layer apps demonstrate the merits of the proposed method.
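To make the "error bars" idea concrete, here is a minimal sketch of plain split conformal prediction in Python. It is not the paper's method: it assumes calibration and test data are exchangeable, so it omits the covariate-shift handling that the abstract describes. The function name `split_conformal_interval` and all inputs are hypothetical.

```python
import math

def split_conformal_interval(cal_preds, cal_targets, test_pred, alpha=0.1):
    """Return (lo, hi) around test_pred with target coverage 1 - alpha.

    Plain split conformal prediction with absolute-residual scores.
    Assumption: calibration and test points are exchangeable
    (no covariate shift correction is applied here).
    """
    # Nonconformity scores: absolute errors of the predictor on calibration data.
    scores = sorted(abs(y - p) for p, y in zip(cal_preds, cal_targets))
    n = len(scores)
    # 1-based rank of the conformal quantile: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return (-math.inf, math.inf)
    q = scores[k - 1]
    return (test_pred - q, test_pred + q)
```

A KPI estimate `test_pred` from any predictor can be wrapped this way; the interval widens as `alpha` shrinks or as calibration errors grow.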
- [2] arXiv:2410.00239 [pdf, html, other]
Title: Modulation and Coding for NOMA and RSMA
Comments: Invited paper; to appear in the Proceedings of the IEEE
Subjects: Information Theory (cs.IT); Machine Learning (cs.LG)
Next-generation multiple access (NGMA) serves as an umbrella term for transmission schemes distinct from conventional orthogonal methods. A key candidate of NGMA, non-orthogonal multiple access (NOMA), emerges as a solution to enhance connectivity by allowing multiple users to share time, frequency, and space concurrently. However, NOMA faces challenges in implementation, particularly in canceling inter-user interference. In this paper, we discuss the principles behind NOMA and review conventional NOMA methods. Then, to address these challenges, we present asynchronous transmission and interference-aware modulation techniques, enabling decoding without successive interference cancellation. The goal is to design constellations that dynamically adapt to interference, minimizing bit error rates (BERs) and enhancing user throughput in the presence of inter-user, inter-carrier, and inter-cell interference. The traditional link between minimizing BER and increasing spectral efficiency is explored, with deep autoencoders for end-to-end communication emerging as a potential solution to improve BERs. Interference-aware modulation can revolutionize constellation design for non-orthogonal channels. Rate-splitting multiple access (RSMA) is another promising interference management technique in multi-user systems. In addition to addressing challenges in finite-alphabet NOMA, this paper offers new insights and provides an overview of code-domain NOMA, trellis-coded NOMA, and RSMA as key NGMA candidates. We also discuss the evolution of channel coding toward low-latency communication and examine modulation and coding schemes in 5G networks. Finally, we highlight future research directions, emphasizing their importance for realizing NOMA from concept to functional technology.
- [3] arXiv:2410.00313 [pdf, html, other]
Title: Pre-Chirp-Domain Index Modulation for Full-Diversity Affine Frequency Division Multiplexing towards 6G
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
Affine frequency division multiplexing (AFDM), tailored as a superior multicarrier technique utilizing chirp signals for high-mobility communications, is envisioned as a promising candidate for the sixth-generation (6G) wireless network. AFDM is based on the discrete affine Fourier transform (DAFT) with two adjustable parameters of the chirp signals, termed the pre-chirp and post-chirp parameters, respectively. We show that the pre-chirp counterpart can be flexibly manipulated for an additional degree of freedom (DoF). Therefore, this paper proposes a novel AFDM scheme with the pre-chirp index modulation (PIM) philosophy (AFDM-PIM), which can implicitly convey extra information bits through dynamic pre-chirp parameter assignment, thus enhancing both spectral and energy efficiency. Specifically, we first demonstrate that the subcarrier orthogonality is still maintained by applying distinct pre-chirp parameters to various subcarriers in the AFDM modulation process. Inspired by this property, each AFDM subcarrier is constituted with a unique pre-chirp signal according to the incoming bits. By such arrangement, extra binary bits can be embedded into the index patterns of pre-chirp parameter assignment without additional energy consumption. For performance analysis, we derive the asymptotically tight upper bounds on the average bit error rates (BERs) of the proposed schemes with maximum-likelihood (ML) detection, and validate that the proposed AFDM-PIM can achieve the optimal diversity order under doubly dispersive channels. Based on the derivations, we further propose an optimal pre-chirp alphabet design to enhance the BER performance via intelligent optimization algorithms. Simulations demonstrate that the proposed AFDM-PIM outperforms the classical benchmarks under doubly dispersive channels.
- [4] arXiv:2410.00376 [pdf, html, other]
Title: Frequency Diverse Array-enabled RIS-aided Integrated Sensing and Communication
Comments: 36 pages, 9 figures
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
Integrated sensing and communication (ISAC) has been envisioned as a prospective technology to enable ubiquitous sensing and communications in next-generation wireless networks. In contrast to existing works on reconfigurable intelligent surface (RIS) aided ISAC systems using conventional phased arrays (PAs), this paper investigates a frequency diverse array (FDA)-enabled RIS-aided ISAC system, where the FDA aims to provide a distance-angle-dependent beampattern to effectively suppress the clutter, and the RIS is employed to establish high-quality links between the base station (BS) and users/target. We aim to maximize the sum rate by jointly optimizing the BS transmit beamforming vectors, the covariance matrix of the dedicated radar signal, the RIS phase shift matrix, the FDA frequency offsets and the radar receive equalizer, while guaranteeing the required signal-to-clutter-plus-noise ratio (SCNR) of the radar echo signal. To tackle this challenging problem, we first theoretically prove that the dedicated radar signal is unnecessary for enhancing target sensing performance, based on which the original problem is much simplified. Then, we turn our attention to the single-user single-target (SUST) scenario to demonstrate that the FDA-RIS-aided ISAC system always achieves a higher SCNR than its PA-RIS-aided counterpart. Moreover, it is revealed that the SCNR increment exhibits linear growth with the BS transmit power and the number of BS receive antennas. In order to effectively solve this simplified problem, we leverage the fractional programming (FP) theory and subsequently develop an efficient alternating optimization (AO) algorithm based on symmetric alternating direction method of multipliers (SADMM) and successive convex approximation (SCA) techniques. Numerical results demonstrate the superior performance of our proposed algorithm in terms of sum rate and radar SCNR.
- [5] arXiv:2410.00698 [pdf, html, other]
Title: Analysis of Cross-Domain Message Passing for OTFS Transmissions
Subjects: Information Theory (cs.IT); Signal Processing (eess.SP)
In this paper, we investigate the performance of the cross-domain iterative detection (CDID) framework with orthogonal time frequency space (OTFS) modulation, where two distinct CDID algorithms are presented. The proposed schemes estimate/detect the information symbols iteratively across the frequency domain and the delay-Doppler (DD) domain via passing either the a posteriori or extrinsic information. Building upon this framework, we investigate the error performance by considering the bias evolution and state evolution. Furthermore, we discuss their error performance in convergence and the DD domain error state lower bounds in each iteration. Specifically, we demonstrate that in convergence, the ultimate error performance of the CDID passing the a posteriori information can be characterized by two potential convergence points. In contrast, the ultimate error performance of the CDID passing the extrinsic information has only one convergence point, which, interestingly, aligns with the matched filter bound. Our numerical results confirm our analytical findings and unveil the promising error performance achieved by the proposed designs.
New submissions (showing 5 of 5 entries)
- [6] arXiv:2410.00078 (cross-list from math.ST) [pdf, html, other]
Title: Shuffled Linear Regression via Spectral Matching
Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT); Machine Learning (cs.LG); Signal Processing (eess.SP); Spectral Theory (math.SP); Machine Learning (stat.ML)
Shuffled linear regression (SLR) seeks to estimate latent features through a linear transformation, complicated by unknown permutations in the measurement dimensions. This problem extends traditional least-squares (LS) and Least Absolute Shrinkage and Selection Operator (LASSO) approaches by jointly estimating the permutation, resulting in shuffled LS and shuffled LASSO formulations. Existing methods, constrained by the combinatorial complexity of permutation recovery, often address small-scale cases with limited measurements. In contrast, we focus on large-scale SLR, particularly suited for environments with abundant measurement samples. We propose a spectral matching method that efficiently resolves permutations by aligning spectral components of the measurement and feature covariances. Rigorous theoretical analyses demonstrate that our method achieves accurate estimates in both shuffled LS and shuffled LASSO settings, given a sufficient number of samples. Furthermore, we extend our approach to address simultaneous pose and correspondence estimation in image registration tasks. Experiments on synthetic datasets and real-world image registration scenarios show that our method outperforms existing algorithms in both estimation accuracy and registration performance.
- [7] arXiv:2410.00509 (cross-list from cs.LG) [pdf, html, other]
Title: Learning Personalized Treatment Decisions in Precision Medicine: Disentangling Treatment Assignment Bias in Counterfactual Outcome Prediction and Biomarker Identification
Authors: Michael Vollenweider, Manuel Schürch, Chiara Rohrer, Gabriele Gut, Michael Krauthammer, Andreas Wicki
Comments: 9 pages, 5 figures, conference
Subjects: Machine Learning (cs.LG); Information Theory (cs.IT); Quantitative Methods (q-bio.QM)
Precision medicine offers the potential to tailor treatment decisions to individual patients, yet it faces significant challenges due to the complex biases in clinical observational data and the high-dimensional nature of biological data. This study models various types of treatment assignment biases using mutual information and investigates their impact on machine learning (ML) models for counterfactual prediction and biomarker identification. Unlike traditional counterfactual benchmarks that rely on fixed treatment policies, our work focuses on modeling different characteristics of the underlying observational treatment policy in distinct clinical settings. We validate our approach through experiments on toy datasets, semi-synthetic data from The Cancer Genome Atlas (TCGA), and real-world biological outcomes from drug and CRISPR screens. By incorporating empirical biological mechanisms, we create a more realistic benchmark that reflects the complexities of real-world data. Our analysis reveals that different biases lead to varying model performances, with some biases, especially those unrelated to outcome mechanisms, having minimal effect on prediction accuracy. This highlights the crucial need to account for specific biases in clinical observational data in counterfactual ML model development, ultimately enhancing the personalization of treatment decisions in precision medicine.
- [8] arXiv:2410.00535 (cross-list from cs.LG) [pdf, html, other]
Title: Optimal Causal Representations and the Causal Information Bottleneck
Comments: Submitted to ICLR 2025. Code available at this http URL
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Information Theory (cs.IT); Machine Learning (stat.ML)
To effectively study complex causal systems, it is often useful to construct representations that simplify parts of the system by discarding irrelevant details while preserving key features. The Information Bottleneck (IB) method is a widely used approach in representation learning that compresses random variables while retaining information about a target variable. Traditional methods like IB are purely statistical and ignore underlying causal structures, making them ill-suited for causal tasks. We propose the Causal Information Bottleneck (CIB), a causal extension of the IB, which compresses a set of chosen variables while maintaining causal control over a target variable. This method produces representations which are causally interpretable, and which can be used when reasoning about interventions. We present experimental results demonstrating that the learned representations accurately capture causality as intended.
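For reference, the standard IB objective that the CIB extends is usually written as the Lagrangian below. The symbols $X$ (input), $Y$ (target), $T$ (compressed representation) and $\beta$ (trade-off parameter) are the conventional ones rather than notation taken from this abstract, and the CIB objective itself is not reproduced here.

```latex
\min_{p(t \mid x)} \; I(X;T) \;-\; \beta \, I(T;Y), \qquad \beta > 0
```

The first term rewards compression of $X$; the second rewards retaining information about $Y$.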
- [9] arXiv:2410.00611 (cross-list from math.CO) [pdf, html, other]
Title: The combinatorial structure and value distributions of plateaued functions
Comments: 19 pages. Comments are welcome
Subjects: Combinatorics (math.CO); Information Theory (cs.IT)
We study combinatorial properties of plateaued functions. All quadratic functions, bent functions and most known APN functions are plateaued, so many cryptographic primitives rely on plateaued functions as building blocks. The main focus of our study is the interplay of the Walsh transform and linearity of a plateaued function, its differential properties, and their value distributions, i.e., the sizes of image and preimage sets. In particular, we study the special case of "almost balanced" plateaued functions, which only have two nonzero preimage set sizes, generalizing for instance all monomial functions. We achieve several direct connections and (non)existence conditions for these functions, showing for instance that plateaued $d$-to-$1$ functions (and thus plateaued monomials) only exist for a very select choice of $d$, and we derive for all these functions their linearity as well as bounds on their differential uniformity. We also specifically study the Walsh transform of plateaued APN functions and their relation to their value distribution.
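As a small illustration of the plateaued property, the following Python sketch computes the Walsh transform of a single-output Boolean function given as a truth table and checks whether all nonzero Walsh values share one absolute value. It is a brute-force sketch restricted to Boolean (not vectorial) functions, which is a simplifying assumption; the helper names `walsh_spectrum` and `is_plateaued` are hypothetical.

```python
def walsh_spectrum(truth_table, n):
    """Walsh transform W_f(a) = sum_x (-1)^(f(x) XOR a.x) for all a in F_2^n."""
    assert len(truth_table) == 1 << n
    spectrum = []
    for a in range(1 << n):
        total = 0
        for x in range(1 << n):
            dot = bin(a & x).count("1") & 1  # inner product a.x over F_2
            total += -1 if (truth_table[x] ^ dot) else 1
        spectrum.append(total)
    return spectrum

def is_plateaued(truth_table, n):
    """True if all nonzero Walsh values have the same absolute value."""
    nonzero = {abs(w) for w in walsh_spectrum(truth_table, n) if w != 0}
    return len(nonzero) <= 1

# f(x1, x2) = x1 * x2: a quadratic function, expected to be plateaued.
quadratic = [(x & 1) & ((x >> 1) & 1) for x in range(4)]
# g(x1, x2, x3) = x1 * x2 * x3: a cubic function that is not plateaued.
cubic = [1 if x == 7 else 0 for x in range(8)]

print(walsh_spectrum(quadratic, 2), is_plateaued(quadratic, 2))
print(walsh_spectrum(cubic, 3), is_plateaued(cubic, 3))
```

The quadratic example is consistent with the abstract's statement that all quadratic functions are plateaued; the cubic example takes Walsh values of absolute value 6 and 2.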
Cross submissions (showing 4 of 4 entries)
- [10] arXiv:2308.03547 (replaced) [pdf, html, other]
Title: Near-optimal pilot assignment in cell-free massive MIMO
Comments: This version updates metadata and expands content
Subjects: Information Theory (cs.IT); Networking and Internet Architecture (cs.NI)
Cell-free massive MIMO systems are currently being considered as potential enablers of future (6G) technologies for wireless communications. By combining distributed processing and massive MIMO, they are expected to deliver improved user coverage and efficiency. A possible source of performance degradation in such systems is pilot contamination, which contributes to interference during uplink training and degrades channel estimation. Contamination occurs when the same pilot sequence is assigned to more than one user. This is in general inevitable, as the number of mutually orthogonal pilot sequences corresponds to only a fraction of the coherence interval. We introduce a new algorithm for pilot assignment and analyze its performance both from a theoretical perspective and in computational experiments. We show that it has an approximation ratio close to 1 for a plausibly large number of orthogonal pilot sequences, as well as low computational complexity under massive parallelism. We also show that, on average, it outperforms other methods in terms of per-user SINR and throughput on the uplink.
- [11] arXiv:2308.07663 (replaced) [pdf, html, other]
Title: Coherent set identification via direct low rank maximum likelihood estimation
Subjects: Information Theory (cs.IT); Dynamical Systems (math.DS)
We analyze connections between two low rank modeling approaches from the last decade for treating dynamical data. The first one is the coherence problem (or coherent set approach), where groups of states are sought that evolve under the action of a stochastic transition matrix in a way maximally distinguishable from other groups. The second one is a low rank factorization approach for stochastic matrices, called Direct Bayesian Model Reduction (DBMR), which estimates the low rank factors directly from observed data. We show that DBMR results in a low rank model that is a projection of the full model, and exploit this insight to infer bounds on a quantitative measure of coherence within the reduced model. Both approaches can be formulated as optimization problems, and we also prove a bound between their respective objectives. On a broader scope, this work relates the two classical loss functions of nonnegative matrix factorization, namely the Frobenius norm and the generalized Kullback-Leibler divergence, and suggests new links between likelihood-based and projection-based estimation of probabilistic models.
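For readers unfamiliar with the two loss functions of nonnegative matrix factorization named in this abstract, they are commonly written as follows for a nonnegative matrix $A$ and a low rank approximation $B$. The symbols $A$ and $B$ are illustrative assumptions, not the paper's notation, and the Frobenius loss is shown in its usual squared form.

```latex
\|A - B\|_F^2 = \sum_{i,j} \left( A_{ij} - B_{ij} \right)^2,
\qquad
D_{\mathrm{KL}}(A \,\|\, B) = \sum_{i,j} \left( A_{ij} \log \frac{A_{ij}}{B_{ij}} - A_{ij} + B_{ij} \right)
```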
- [12] arXiv:2409.03467 (replaced) [pdf, html, other]
Title: Cubic power functions with optimal second-order differential uniformity
Subjects: Information Theory (cs.IT); Cryptography and Security (cs.CR); Number Theory (math.NT)
We discuss the second-order differential uniformity of vectorial Boolean functions. The closely related notion of second-order zero differential uniformity has recently been studied in connection to resistance to the boomerang attack. We prove that monomial functions with univariate form $x^d$ where $d=2^{2k}+2^k+1$ and $\gcd(k,n)=1$ have optimal second-order differential uniformity. Computational results suggest that, up to affine equivalence, these might be the only optimal cubic power functions. We begin work towards generalising such conditions to all monomial functions of algebraic degree 3. We also discuss further questions arising from computational results.
- [13] arXiv:2409.19674 (replaced) [pdf, html, other]
Title: Alternating Maximization Algorithm for Mismatch Capacity with Oblivious Relaying
Subjects: Information Theory (cs.IT); Numerical Analysis (math.NA)
An approach is established for maximizing the Lower bound on the Mismatch capacity (hereafter abbreviated as LM rate), a key performance bound in mismatched decoding, by optimizing the channel input probability distribution. Under a fixed channel input probability distribution, the computation of the corresponding LM rate is a convex optimization problem. When optimizing the channel input probability distribution, however, the corresponding optimization problem adopts a max-min formulation, which is generally non-convex and is intractable with standard approaches. To solve this problem, a novel dual form of the LM rate is proposed, thereby transforming the max-min formulation into an equivalent double maximization formulation. This new formulation leads to a maximization problem setup wherein each individual optimization direction is convex. Consequently, an alternating maximization algorithm is established to solve the resultant maximization problem setup. Each step of the algorithm only involves a closed-form iteration, which is efficiently implemented with standard optimization procedures. Numerical experiments show the proposed approach for optimizing the LM rate leads to noticeable rate gains.
- [14] arXiv:2211.02032 (replaced) [pdf, html, other]
Title: To spike or not to spike: the whims of the Wonham filter in the strong noise regime
Comments: v1, v2: Preliminary versions. v3: Submitted version
Subjects: Probability (math.PR); Information Theory (cs.IT); Optimization and Control (math.OC); Statistics Theory (math.ST)
We study the celebrated Shiryaev-Wonham filter (1964) in its historical setup where the hidden Markov jump process has two states. We are interested in the weak noise regime for the observation equation. Interestingly, this becomes a strong noise regime for the filtering equations.
Earlier results of the authors show the appearance of spikes in the filtered process, akin to a metastability phenomenon. This paper is aimed at understanding the smoothed optimal filter, which is relevant for any system with feedback. In particular, we exhibit a sharp phase transition between a spiking regime and a regime with perfect smoothing.
- [15] arXiv:2212.04223 (replaced) [pdf, html, other]
Title: Vicious Classifiers: Assessing Inference-time Data Reconstruction Risk in Edge Computing
Comments: Published at BMVC 2024 workshop on Privacy, Fairness, Accountability and Transparency in Computer Vision
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Information Theory (cs.IT)
Privacy-preserving inference in edge computing paradigms encourages the users of machine-learning services to locally run a model on their private input and only share the model's outputs for a target task with the server. We study how a vicious server can reconstruct the input data by observing only the model's outputs while keeping the target accuracy very close to that of an honest server by jointly training a target model (to run at the users' side) and an attack model for data reconstruction (to secretly use at the server's side). We present a new measure to assess the inference-time reconstruction risk. Evaluations on six benchmark datasets show the model's input can be approximately reconstructed from the outputs of a single inference. We propose a primary defense mechanism to distinguish vicious versus honest classifiers at inference time. By studying such a risk associated with emerging ML services, our work has implications for enhancing privacy in edge computing. We discuss open challenges and directions for future studies and release our code as a benchmark for the community at this https URL.
- [16] arXiv:2310.00090 (replaced) [pdf, html, other]
Title: On the Counting of Involutory MDS Matrices
Subjects: Cryptography and Security (cs.CR); Information Theory (cs.IT)
The optimal branch number of MDS matrices has established their importance in designing diffusion layers for various block ciphers and hash functions. As a result, numerous matrix structures, including Hadamard and circulant matrices, have been proposed for constructing MDS matrices. Also, in the literature, significant attention is typically given to identifying MDS candidates with optimal implementations or proposing new constructions across different orders. However, this paper takes a different approach by not emphasizing efficiency issues or introducing new constructions. Instead, its primary objective is to enumerate Hadamard MDS and involutory Hadamard MDS matrices of order $4$ within the field $\mathbb{F}_{2^r}$. Specifically, it provides an explicit formula for the count of both Hadamard MDS and involutory Hadamard MDS matrices of order $4$ over $\mathbb{F}_{2^r}$. Additionally, it derives the count of Hadamard Near-MDS (NMDS) and involutory Hadamard NMDS matrices, each with exactly one zero in each row, of order $4$ over $\mathbb{F}_{2^r}$. Furthermore, the paper discusses some circulant-like matrices for constructing NMDS matrices and proves that when $n$ is even, any $2n \times 2n$ Type-II circulant-like matrix can never be an NMDS matrix. While it is known that NMDS matrices may be singular, this paper establishes that singular Hadamard matrices can never be NMDS matrices. Moreover, it proves that there exist exactly two orthogonal Type-I circulant-like matrices of order $4$ over $\mathbb{F}_{2^r}$.
- [17] arXiv:2409.01247 (replaced) [pdf, html, other]
Title: Conversational Complexity for Assessing Risk in Large Language Models
Comments: 15 pages, 6 figures
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Theory (cs.IT)
Large Language Models (LLMs) present a dual-use dilemma: they enable beneficial applications while harboring potential for harm, particularly through conversational interactions. Despite various safeguards, advanced LLMs remain vulnerable. A watershed case was Kevin Roose's notable conversation with Bing, which elicited harmful outputs after extended interaction. This contrasts with simpler early jailbreaks that produced similar content more easily, raising the question: How much conversational effort is needed to elicit harmful information from LLMs? We propose two measures: Conversational Length (CL), which quantifies the conversation length used to obtain a specific response, and Conversational Complexity (CC), defined as the Kolmogorov complexity of the user's instruction sequence leading to the response. To address the incomputability of Kolmogorov complexity, we approximate CC using a reference LLM to estimate the compressibility of user instructions. Applying this approach to a large red-teaming dataset, we perform a quantitative analysis examining the statistical distribution of harmful and harmless conversational lengths and complexities. Our empirical findings suggest that this distributional analysis and the minimisation of CC serve as valuable tools for understanding AI safety, offering insights into the accessibility of harmful information. This work establishes a foundation for a new perspective on LLM safety, centered around the algorithmic complexity of pathways to harm.
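The two measures can be illustrated with a rough Python sketch. Two things here are assumptions rather than the paper's definitions: CL is measured in characters of user text, and the compressibility estimate uses `zlib` as a crude stand-in for the reference LLM mentioned in the abstract. The function names are hypothetical.

```python
import zlib

def conversational_length(user_turns):
    """CL proxy: total number of characters across the user's instructions.

    Assumption: length is counted in characters; the abstract does not fix a unit.
    """
    return sum(len(turn) for turn in user_turns)

def compressed_size_bits(user_turns):
    """CC proxy: size in bits of the zlib-compressed instruction sequence.

    Assumption: zlib replaces the reference LLM used in the paper, so this is
    only a coarse upper-bound style estimate of Kolmogorov complexity.
    """
    data = "\n".join(user_turns).encode("utf-8")
    return 8 * len(zlib.compress(data, 9))

# A highly repetitive conversation is long but compresses well.
repetitive = ["tell me a story"] * 20
print(conversational_length(repetitive), compressed_size_bits(repetitive))
```

Under this proxy, a conversation can have a large CL but a small CC when the user's instructions are repetitive, which is the distinction the two measures are meant to capture.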