Data Analysis, Statistics and Probability
Showing new listings for Friday, 11 April 2025
- [1] arXiv:2306.03829 (replaced)
Title: Small-Coupling Dynamic Cavity: a Bayesian mean-field framework for epidemic inference
Comments: 28 pages, 11 figures, 2 tables (including appendices)
Subjects: Disordered Systems and Neural Networks (cond-mat.dis-nn); Statistical Mechanics (cond-mat.stat-mech); Data Analysis, Statistics and Probability (physics.data-an); Populations and Evolution (q-bio.PE)
We present the Small-Coupling Dynamic Cavity (SCDC) method, a novel generalized mean-field approximation for epidemic inference and risk assessment within a fully Bayesian framework. SCDC accounts for non-causal effects of observations and uses a graphical model representation of epidemic processes to derive self-consistent equations for edge probability marginals. A small-coupling expansion yields time-dependent cavity messages capturing individual infection probabilities and observational conditioning. With linear computational cost per iteration in the epidemic duration, SCDC is particularly efficient and valid even for recurrent epidemic processes, where standard methods are exponentially complex. Tested on synthetic networks, it matches Belief Propagation in accuracy and outperforms individual-based mean-field methods. Notably, despite being derived as a small-infectiousness expansion, SCDC maintains good accuracy even for relatively large infection probabilities. While convergence issues may arise on graphs with long-range correlations, SCDC reliably estimates risk. Future extensions include non-Markovian models and higher-order terms in the dynamic cavity framework.
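For context on the baseline named in this abstract, the following is a minimal sketch of an individual-based mean-field (IBMF) forward update. It is not the SCDC method, and it does not condition on observations. The discrete-time SI model, the function names `ibmf_si_step` and `ibmf_si`, and the single infection probability `lam` are assumptions made for illustration only.

```python
def ibmf_si_step(p, neighbors, lam):
    """One IBMF step for a discrete-time SI model (illustrative assumption).

    p[i] is the current probability that node i is infected,
    neighbors[i] lists the nodes adjacent to i, and lam is the
    per-contact infection probability per time step.
    """
    new_p = []
    for i, p_i in enumerate(p):
        # Mean-field factorization: treat neighbors as independent, so the
        # probability that no neighbor infects i is a plain product.
        escape = 1.0
        for j in neighbors[i]:
            escape *= 1.0 - lam * p[j]
        new_p.append(p_i + (1.0 - p_i) * (1.0 - escape))
    return new_p


def ibmf_si(p0, neighbors, lam, steps):
    """Iterate the IBMF update and return the list of marginals per step."""
    trajectory = [list(p0)]
    for _ in range(steps):
        trajectory.append(ibmf_si_step(trajectory[-1], neighbors, lam))
    return trajectory


# Three-node chain 0 - 1 - 2 with node 0 initially infected.
chain = {0: [1], 1: [0, 2], 2: [1]}
marginals = ibmf_si([1.0, 0.0, 0.0], chain, lam=0.5, steps=2)
```

The independence assumption in the product is what makes this approach cheap, and it is also the source of the inaccuracy that correlation-aware methods aim to reduce.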
- [2] arXiv:2407.08343 (replaced)
Title: Many wrong models approach to localize an odor source in turbulence with static sensors
Subjects: Fluid Dynamics (physics.flu-dyn); Atmospheric and Oceanic Physics (physics.ao-ph); Data Analysis, Statistics and Probability (physics.data-an)
The problem of locating an odor source in turbulent flows is central to key applications such as environmental monitoring and disaster response. We address this challenge by designing an algorithm based on Bayesian inference, which uses odor measurements from an ensemble of static sensors to estimate the source position through a stochastic model of the environment. The problem is difficult because of the multiscale and out-of-equilibrium properties of turbulent transport, which lack accurate analytical and phenomenological modeling, thus preventing guaranteed convergence for Bayesian approaches. To overcome the risk of relying on a single unavoidably wrong model approximation, we propose a method to rank "many wrong models" and to blend their predictions. We evaluate our weighted Bayesian update algorithm by its ability to estimate the source location with predefined accuracy and/or within a specified time frame and compare it to standard Monte Carlo sampling methods. To demonstrate the robustness and potential applications of both approaches under realistic environmental conditions, we use high-quality direct numerical simulations of the Navier-Stokes equations to mimic the turbulent transport of odors in the presence of a strong mean wind. Despite minimal prior information on the source and environmental conditions, our proposed approach consistently proves to be more accurate, reliable, and robust than Monte Carlo methods, thus showing promise as a new tool for addressing the odor source localization problem in real-world scenarios.
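The idea of ranking several imperfect models and blending their predictions can be illustrated with textbook Bayesian model averaging. The sketch below is not the paper's weighted Bayesian update algorithm. The exponential detection model, the `length_scales` parameter, and the function names are all hypothetical choices made for the example.

```python
import math


def likelihood(source, sensors, hits, length_scale):
    """Toy likelihood of binary sensor detections for one candidate source.

    Assumption: detection probability decays exponentially with distance.
    """
    value = 1.0
    for sensor, hit in zip(sensors, hits):
        p_hit = math.exp(-math.dist(source, sensor) / length_scale)
        p_hit = min(max(p_hit, 1e-9), 1.0 - 1e-9)  # keep away from 0 and 1
        value *= p_hit if hit else 1.0 - p_hit
    return value


def blend_models(candidates, sensors, hits, length_scales):
    """Weight each model by its evidence and average the posteriors."""
    prior = 1.0 / len(candidates)  # uniform prior over candidate sources
    posteriors, evidences = [], []
    for scale in length_scales:
        joint = [prior * likelihood(c, sensors, hits, scale) for c in candidates]
        evidence = sum(joint)  # marginal likelihood of the data under this model
        posteriors.append([j / evidence for j in joint])
        evidences.append(evidence)
    total = sum(evidences)
    weights = [e / total for e in evidences]  # uniform prior over models
    blended = [
        sum(w * post[i] for w, post in zip(weights, posteriors))
        for i in range(len(candidates))
    ]
    return weights, blended


candidates = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
sensors = [(0.0, 0.0), (10.0, 0.0)]
hits = [True, False]  # only the sensor at the origin detected odor
weights, blended = blend_models(candidates, sensors, hits, [1.0, 5.0])
best = candidates[blended.index(max(blended))]
```

Here each "wrong model" differs only in its length scale. Models that explain the observed detections better receive larger weights, so no single misspecified model dominates the blended posterior by default.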
- [3] arXiv:2411.00062 (replaced)
Title: Scalable Reinforcement Post-Training Beyond Static Human Prompts: Evolving Alignment via Asymmetric Self-Play
Authors: Ziyu Ye, Rishabh Agarwal, Tianqi Liu, Rishabh Joshi, Sarmishta Velury, Quoc V. Le, Qijun Tan, Yuan Liu
Comments: Spotlight @ NeurIPS Language Gamification Workshop. Updated the problem description and added new online RL experiments in this version
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Data Analysis, Statistics and Probability (physics.data-an); Machine Learning (stat.ML)
Current reinforcement learning (RL) frameworks for large language model (LLM) post-training typically assume a fixed prompt distribution, which is sub-optimal and bottlenecks scalability. Prior works have explored prompt evolving, but are often limited to the supervised fine-tuning stage, and prompts are sampled and evolved uniformly without signals. This empirical work presents a paradigm shift: Evolving Alignment via Asymmetric Self-Play (eva), which casts post-training as an infinite game with regret-based signals for 2 players: (i) a creator, who strategically samples and creates new informative prompts and (ii) a solver, who learns to produce preferred responses. eva is the first method that allows language models to adaptively create training prompts in both offline and online RL post-training. The design is simple and easy to use, yet remarkably effective: eva sets a new SOTA on challenging benchmarks, without any extra human prompts, e.g. it boosts the win-rate of gemma-2-9b-it on Arena-Hard from 51.6% to 60.1% for DPO and from 52.6% to 62.4% for RLOO, surpassing claude-3-opus and catching up to gemini-1.5-pro, both of which are orders of magnitude larger. Extensive experiments show eva can create effective RL curricula and is robust across ablations. We believe adaptively evolving prompts are key to designing the next-generation RL post-training scheme.