Data Analysis, Statistics and Probability
- [1] arXiv:2408.12072 [pdf, html, other]
Title: Preservation of the Direct Photon and Neutral Meson Analysis in the PHENIX Experiment at RHIC
Comments: Paper submitted to the Proceedings of the Advanced Computing and Analysis Techniques 2024 (ACAT) conference
Subjects: Data Analysis, Statistics and Probability (physics.data-an); Nuclear Experiment (nucl-ex)
The PHENIX Collaboration has actively pursued a Data and Analysis Preservation program since 2019, the first such dedicated effort at RHIC. A particularly challenging aspect of this endeavor is preservation of complex physics analyses, selected for their scientific importance and the value of the specific techniques developed as a part of the research. For this, we have chosen one of the most impactful PHENIX results, the joint study of direct photons and neutral pions in high-energy d+Au collisions. To ensure reproducibility of this analysis going forward, we partitioned it into self-contained tasks and used a combination of containerization techniques, code management, and robust documentation. We then leveraged REANA (the platform for reproducible analysis developed at CERN) to run the required software. We present our experience based on this example, and outline our future plans for analysis preservation.
New submissions for Friday, 23 August 2024 (showing 1 of 1 entries)
- [2] arXiv:2408.11870 (cross-list from cond-mat.mtrl-sci) [pdf, html, other]
Title: Improved precision and accuracy of electron energy-loss spectroscopy quantification via fine structure fitting with constrained optimization
Comments: 15 pages, 9 figures
Subjects: Materials Science (cond-mat.mtrl-sci); Applied Physics (physics.app-ph); Atomic Physics (physics.atom-ph); Data Analysis, Statistics and Probability (physics.data-an)
By working out the Bethe sum rule, a boundary condition that takes the form of a linear equality is derived for the fine structure observed in ionization edges present in electron energy-loss spectra. This condition is subsequently used as a constraint in the estimation process of the elemental abundances, demonstrating starkly improved precision and accuracy and reduced sensitivity to the number of model parameters. Furthermore, the fine structure is reliably extracted from the spectra in an automated way, thus providing critical information on the sample's electronic properties that is hard or impossible to obtain otherwise. Since this approach allows dispensing with the need for user-provided input, a potential source of bias is prevented.
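The effect of a linear equality constraint on a least-squares fit can be illustrated with a generic sketch. The function names, the toy design matrix, and the constraint vector below are assumptions made for illustration only; they are not the sum-rule condition or the fitting code described in the abstract. The sketch solves the KKT system of the constrained problem directly.

```python
def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Augmented matrix so that row operations act on both sides at once.
    aug = [list(map(float, row)) + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def constrained_least_squares(design, data, constraint, target):
    """Minimise ||design @ x - data||^2 subject to constraint . x == target."""
    n = len(constraint)
    # Normal-equation pieces: A^T A and A^T b.
    ata = [[sum(row[i] * row[j] for row in design) for j in range(n)] for i in range(n)]
    atb = [sum(row[i] * y for row, y in zip(design, data)) for i in range(n)]
    # KKT system: stationarity of the Lagrangian plus the constraint row.
    kkt = [[2.0 * ata[i][j] for j in range(n)] + [float(constraint[i])] for i in range(n)]
    kkt.append([float(c) for c in constraint] + [0.0])
    solution = solve(kkt, [2.0 * v for v in atb] + [float(target)])
    return solution[:n]  # drop the Lagrange multiplier


# Hypothetical toy problem: two parameters whose sum is fixed to 3.
params = constrained_least_squares(
    design=[[1, 0], [0, 1], [1, 1]],
    data=[1.0, 2.0, 3.5],
    constraint=[1, 1],
    target=3.0,
)
```

In this toy problem the constraint removes one degree of freedom from the fit, which is the generic mechanism by which an equality constraint can reduce sensitivity to the number of free parameters.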
- [3] arXiv:2408.11872 (cross-list from cond-mat.mtrl-sci) [pdf, html, other]
Title: Two points are enough
Authors: Hao Liu, Yanbin Zhao, Huarong Zheng, Xiulin Fan, Zhihua Deng, Mengchi Chen, Xingkai Wang, Zhiyang Liu, Jianguo Lu, Jian Chen
Subjects: Materials Science (cond-mat.mtrl-sci); Data Analysis, Statistics and Probability (physics.data-an)
Prognosis and diagnosis play an important role in accelerating the development of lithium-ion batteries, as well as in ensuring their reliable and long-life operation. In this work, we answer an important question: What is the minimum amount of data required to extract features for accurate battery prognosis and diagnosis? Based on first principles, we successfully extracted the best two-point feature (BTPF) for accurate battery prognosis and diagnosis using the fewest data points (only two) and the simplest feature selection method (Pearson correlation coefficient). The BTPF extraction method is tested on 820 cells from 6 open-source datasets (covering five different chemistry types, seven manufacturers, and three data types). It achieves accuracy comparable to state-of-the-art features in both prognosis and diagnosis tasks. This work challenges the prevailing view in existing studies of how difficult battery prognosis and diagnosis tasks are, overturns the established pattern of building prognosis and diagnosis methods for complex dynamic systems through deliberate feature engineering, highlights the promise of data-driven methods for field battery prognosis and diagnosis applications, and provides a new benchmark for future studies.
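A minimal sketch of feature selection by Pearson correlation over pairs of data points follows. The abstract does not say how the two points are combined into a feature, so the difference `curve[j] - curve[i]`, the function names, and the synthetic data are all assumptions made for illustration, not the authors' method.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_two_point_feature(curves, targets):
    """Return (i, j, r) maximising |r| for the assumed feature curve[j] - curve[i]."""
    n_points = len(curves[0])
    best = (0, 1, 0.0)
    for i, j in combinations(range(n_points), 2):
        feature = [curve[j] - curve[i] for curve in curves]
        r = pearson(feature, targets)
        if abs(r) > abs(best[2]):
            best = (i, j, r)
    return best


# Synthetic example: four hypothetical cells, each sampled at four points.
curves = [
    [0.0, 1.0, 5.0, 2.0],
    [1.0, 0.5, 2.0, 2.5],
    [0.5, 2.0, 1.0, 5.0],
    [2.0, 0.0, 3.0, 4.0],
]
targets = [10.0, 20.0, 30.0, 40.0]  # e.g. a lifetime label per cell (made up)
i, j, r = best_two_point_feature(curves, targets)
```

The search is exhaustive over all point pairs, which is affordable because each candidate feature costs only one correlation evaluation.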
- [4] arXiv:2408.12083 (cross-list from physics.ed-ph) [pdf, html, other]
Title: Dominant misconceptions and alluvial flows between Engineering and Physical Science students
Comments: 25 pages, 8 figures, APA7
Subjects: Physics Education (physics.ed-ph); Data Analysis, Statistics and Probability (physics.data-an)
In this article we assess the comprehension of physics concepts by Physical Science and Engineering students enrolled in their first semester at the University of Johannesburg (UJ), South Africa (2022). We employ different graphical measures to explore similarities and differences using the results of both pre- and post-test data from the Force Concept Inventory assessment tool, from which we calculate dominant misconceptions (DMs) and gains. We also use alluvial diagrams to track the choices made by these two groups of students from pre- to post-test stages. In our analysis, the DM results indicate that participating Engineering students outperformed Physical Science students on average. However, the same types of normalised DMs persist at the post-test level. This is useful for tracking persistent misconceptions: using repeated measures and alluvial diagrams with smaller groups of students, we find that Physical Science students tend to make more chaotic choices.
- [5] arXiv:2408.12271 (cross-list from quant-ph) [pdf, html, other]
Title: Domino-cooling Oscillator Networks with Deep Reinforcement Learning
Comments: The submission contains 6 (main text) + 8 (supplementary) pages with (5 + 6) figures and 1 table. For a demonstration of the cooling, see this https URL
Subjects: Quantum Physics (quant-ph); Data Analysis, Statistics and Probability (physics.data-an)
The exploration of deep neural networks for optimal control has attracted considerable interest in recent years. Here, we use deep reinforcement learning to control the individual evolutions of coupled harmonic oscillators in an oscillator network. Our work showcases a numerical approach to actively cool internal oscillators to their thermal ground states through modulated forces imparted to the external oscillators in the network. We present our results for thermal cooling of all oscillators in multiple network configurations and introduce the utility of our scheme in the quantum regime.
- [6] arXiv:2408.12296 (cross-list from hep-ph) [pdf, html, other]
Title: Multiple testing for signal-agnostic searches of new physics with machine learning
Comments: 17 pages, 5 tables, 6 figures
Subjects: High Energy Physics - Phenomenology (hep-ph); Machine Learning (cs.LG); High Energy Physics - Experiment (hep-ex); Data Analysis, Statistics and Probability (physics.data-an); Methodology (stat.ME)
In this work, we address the question of how to enhance signal-agnostic searches by leveraging multiple testing strategies. Specifically, we consider hypothesis tests relying on machine learning, where model selection can introduce a bias towards specific families of new physics signals. We show that it is beneficial to combine different tests, characterised by distinct choices of hyperparameters, and that performances comparable to the best available test are generally achieved while providing a more uniform response to various types of anomalies. Focusing on the New Physics Learning Machine, a methodology to perform a signal-agnostic likelihood-ratio test, we explore a number of approaches to multiple testing, such as combining p-values and aggregating test statistics.
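As a rough illustration of what combining p-values can look like, the sketch below implements two textbook rules: Fisher's method and a Bonferroni-corrected minimum p-value. These are generic choices, not necessarily the combinations studied in the paper, and the function names are hypothetical. Fisher's method assumes independent tests, which may not hold when several tests are run on the same data; the Bonferroni rule remains valid under dependence but is more conservative.

```python
from math import exp, factorial, log


def fisher_combined_pvalue(p_values):
    """Fisher's method; assumes the individual tests are independent."""
    k = len(p_values)
    # Half of the Fisher statistic -2 * sum(log p), which is chi-squared with 2k dof.
    half_stat = -sum(log(p) for p in p_values)
    # Chi-squared survival function in closed form (available for even dof).
    return exp(-half_stat) * sum(half_stat ** i / factorial(i) for i in range(k))


def bonferroni_min_pvalue(p_values):
    """Smallest p-value with a Bonferroni correction; valid under any dependence."""
    return min(1.0, len(p_values) * min(p_values))


# Hypothetical p-values from tests with different hyperparameter choices.
example = [0.05, 0.05]
combined_fisher = fisher_combined_pvalue(example)
combined_min = bonferroni_min_pvalue(example)
```

With a single input, Fisher's method returns that p-value unchanged, which is a convenient sanity check on the closed-form survival function.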
Cross submissions for Friday, 23 August 2024 (showing 5 of 5 entries)
- [7] arXiv:2207.10710 (replaced) [pdf, html, other]
Title: Interpretable Boosted Decision Tree Analysis for the Majorana Demonstrator
Authors: I. J. Arnquist, F. T. Avignone III, A. S. Barabash, C. J. Barton, K. H. Bhimani, E. Blalock, B. Bos, M. Busch, M. Buuck, T. S. Caldwell, Y -D. Chan, C. D. Christofferson, P. -H. Chu, M. L. Clark, C. Cuesta, J. A. Detwiler, Yu. Efremenko, S. R. Elliott, G. K. Giovanetti, M. P. Green, J. Gruszko, I. S. Guinn, V. E. Guiseppe, C. R. Haufe, R. Henning, D. Hervas Aguilar, E. W. Hoppe, A. Hostiuc, M. F. Kidd, I. Kim, R. T. Kouzes, T. E. Lannen V, A. Li, J. M. Lopez-Castano, E. L. Martin, R. D. Martin, R. Massarczyk, S. J. Meijer, T. K. Oli, G. Othman, L. S. Paudel, W. Pettus, A. W. P. Poon, D. C. Radford, A. L. Reine, K. Rielage, N. W. Ruof, D. C. Schaper, D. Tedeschi, R. L. Varner, S. Vasilyev, J. F. Wilkerson, C. Wiseman, W. Xu, C. -H. Yu
Comments: 13 pages, 9 figures
Journal-ref: Phys. Rev. C, Vol. 107, Iss. 1, January 2023
Subjects: Data Analysis, Statistics and Probability (physics.data-an); Machine Learning (cs.LG); Nuclear Experiment (nucl-ex)
The Majorana Demonstrator is a leading experiment searching for neutrinoless double-beta decay with high-purity germanium (HPGe) detectors. Machine learning provides a new way to maximize the amount of information provided by these detectors, but its data-driven nature makes it less interpretable than traditional analysis. An interpretability study reveals the machine's decision-making logic, allowing us to learn from the machine and feed the insights back into the traditional analysis. In this work, we present the first machine learning analysis of the data from the Majorana Demonstrator; this is also the first interpretable machine learning analysis of any germanium detector experiment. Two gradient boosted decision tree models are trained to learn from the data, and a game-theory-based model interpretability study is conducted to understand the origin of the classification power. By learning from data, this analysis recognizes the correlations among reconstruction parameters to further enhance the background rejection performance. By learning from the machine, this analysis reveals the importance of new background categories to reciprocally benefit the standard Majorana analysis. This model is highly compatible with next-generation germanium detector experiments like LEGEND since it can be simultaneously trained on a large number of detectors.
- [8] arXiv:2405.12411 (replaced) [pdf, other]
Title: Decomposing causality into its synergistic, unique, and redundant components
Comments: arXiv admin note: text overlap with arXiv:2310.20544
Subjects: Data Analysis, Statistics and Probability (physics.data-an); Fluid Dynamics (physics.flu-dyn)
Causality lies at the heart of scientific inquiry, serving as the fundamental basis for understanding interactions among variables in physical systems. Despite its central role, current methods for causal inference face significant challenges due to nonlinear dependencies, stochastic interactions, self-causation, collider effects, and influences from exogenous factors, among others. While existing methods can effectively address some of these challenges, no single approach has successfully integrated all these aspects. Here, we address these challenges with SURD: Synergistic-Unique-Redundant Decomposition of causality. SURD quantifies causality as the increments of redundant, unique, and synergistic information gained about future events from past observations. The formulation is non-intrusive and applicable to both computational and experimental investigations, even when samples are scarce. We benchmark SURD in scenarios that pose significant challenges for causal inference and demonstrate that it offers a more reliable quantification of causality compared to previous methods.