Neural and Evolutionary Computing
Showing new listings for Wednesday, 2 October 2024
- [1] arXiv:2410.00129 [pdf, html, other]
Title: Cartesian Genetic Programming Approach for Designing Convolutional Neural Networks
Journal-ref: Progress in Polish Artificial Intelligence Research, pp. 512-519, 2024
Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
The present study covers an approach to neural architecture search (NAS) using Cartesian genetic programming (CGP) for the design and optimization of Convolutional Neural Networks (CNNs). A crucial aspect of innovation in artificial neural network design is proposing a novel neural architecture. Currently used architectures have mostly been developed manually by human experts, which is a time-consuming and error-prone process. In this work, we use a pure genetic programming approach to design CNNs, which employs only one genetic operation, i.e., mutation. In preliminary experiments, our methodology yields promising results.
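To make the "mutation only" idea concrete, here is a minimal sketch of a CGP-style (1 + lambda) loop on a toy Boolean task. It is not the authors' method: the function set, genome sizes, mutation rate, and the XNOR target are all assumptions chosen for illustration, and no CNN is built.

```python
import random

# Function set for the nodes (an illustrative choice, not taken from the paper).
FUNCS = [
    lambda a, b: a & b,        # AND
    lambda a, b: a | b,        # OR
    lambda a, b: a ^ b,        # XOR
    lambda a, b: 1 - (a & b),  # NAND
]
N_INPUTS, N_NODES = 2, 6  # hypothetical sizes for this toy problem


def random_gene(position, rng):
    """One node gene: [function index, input 1, input 2], feed-forward only."""
    limit = N_INPUTS + position  # may connect to inputs or earlier nodes
    return [rng.randrange(len(FUNCS)), rng.randrange(limit), rng.randrange(limit)]


def random_genome(rng):
    nodes = [random_gene(i, rng) for i in range(N_NODES)]
    output = rng.randrange(N_INPUTS + N_NODES)
    return nodes, output


def evaluate(genome, inputs):
    """Decode the genotype by evaluating the nodes from left to right."""
    nodes, output = genome
    values = list(inputs)
    for func, a, b in nodes:
        values.append(FUNCS[func](values[a], values[b]))
    return values[output]


def fitness(genome, cases):
    """Number of test cases the program gets right (higher is better)."""
    return sum(evaluate(genome, x) == y for x, y in cases)


def mutate(genome, rng, rate=0.2):
    """Resample each node gene (and the output gene) with probability `rate`."""
    nodes, output = genome
    new_nodes = [random_gene(i, rng) if rng.random() < rate else list(g)
                 for i, g in enumerate(nodes)]
    if rng.random() < rate:
        output = rng.randrange(N_INPUTS + N_NODES)
    return new_nodes, output


def evolve(cases, generations=500, offspring=4, seed=0):
    """(1 + lambda) loop with mutation as the only operator; ties go to offspring."""
    rng = random.Random(seed)
    parent = random_genome(rng)
    best = fitness(parent, cases)
    for _ in range(generations):
        children = [mutate(parent, rng) for _ in range(offspring)]
        for child in children:
            score = fitness(child, cases)
            if score >= best:
                parent, best = child, score
        if best == len(cases):
            break
    return parent, best


# Toy target: XNOR on two bits.
XNOR_CASES = [((a, b), 1 - (a ^ b)) for a in (0, 1) for b in (0, 1)]
champion, score = evolve(XNOR_CASES)
print(score, "of", len(XNOR_CASES), "cases correct")
```

Accepting offspring that tie with the parent allows neutral drift through inactive genes, which is a common design choice in CGP.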
- [2] arXiv:2410.00518 [pdf, html, other]
Title: Analysing the Influence of Reorder Strategies for Cartesian Genetic Programming
Authors: Henning Cui (1), Andreas Margraf (2), Jörg Hähner (1) ((1) University of Augsburg, (2) Fraunhofer IGCV)
Subjects: Neural and Evolutionary Computing (cs.NE)
Cartesian Genetic Programming (CGP) suffers from a specific limitation: positional bias, a phenomenon in which mostly genes at the start of the genome contribute to a program output, while genes at the end rarely do. This can lead to an overall worse performance of CGP. One solution to overcome positional bias is to introduce reordering methods, which shuffle the current genotype without changing its corresponding phenotype. There are currently two different reorder operators that extend the classic CGP formula and improve its fitness value. In this work, we discuss possible shortcomings of these two existing operators. Afterwards, we introduce three novel operators which reorder the genotype of a graph defined by CGP. We show empirically on four Boolean and four symbolic regression benchmarks that the number of iterations until a solution is found and/or the fitness value improves by using CGP with a reorder method. However, there is no consistently best performing reorder operator. Furthermore, their behaviour is analysed by investigating their convergence plots and we show that all behave the same in terms of convergence type.
- [3] arXiv:2410.00584 [pdf, html, other]
Title: Asymmetrically connected reservoir networks learn better
Comments: 6 pages, 4 figures, supplementary material
Subjects: Neural and Evolutionary Computing (cs.NE); Chaotic Dynamics (nlin.CD)
We show that connectivity within the high-dimensional recurrent layer of a reservoir network is crucial for its performance. To this end, we systematically investigate the impact of network connectivity on its performance, i.e., we examine the symmetry and structure of the reservoir in relation to its computational power. Reservoirs with random and asymmetric connections are found to perform better for an exemplary Mackey-Glass time series than all structured reservoirs, including biologically inspired connectivities, such as small-world topologies. This result is quantified by the information processing capacity of the different network topologies which becomes highest for asymmetric and randomly connected networks.
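The comparison described above can be sketched with a small echo state network. This is an illustrative setup under stated assumptions, not the paper's protocol: reservoir size, density, spectral radius, the ridge parameter, and the Euler-discretised Mackey-Glass generator are all hypothetical choices, and the score is one-step prediction NRMSE rather than the information processing capacity used in the abstract.

```python
import numpy as np


def mackey_glass(length, tau=17, beta=0.2, gamma=0.1, n=10, washout=500):
    """Euler-discretised Mackey-Glass series with unit time step."""
    x = np.full(length + washout + tau, 1.2)
    for t in range(tau, len(x) - 1):
        delayed = x[t - tau]
        x[t + 1] = x[t] + beta * delayed / (1 + delayed ** n) - gamma * x[t]
    return x[-length:]


def make_reservoir(size, symmetric, rng, density=0.1, radius=0.9):
    """Sparse random recurrent weights, optionally symmetrised, rescaled to `radius`."""
    w = rng.normal(size=(size, size)) * (rng.random((size, size)) < density)
    if symmetric:
        w = np.triu(w) + np.triu(w, 1).T  # mirror the upper triangle
    return w * (radius / np.max(np.abs(np.linalg.eigvals(w))))


def one_step_nrmse(w, series, rng, train=1000, skip=100, ridge=1e-6):
    """Drive the reservoir, fit a ridge readout, and score one-step prediction."""
    w_in = rng.uniform(-0.5, 0.5, size=w.shape[0])
    state = np.zeros(w.shape[0])
    states = []
    for u in series[:-1]:
        state = np.tanh(w @ state + w_in * u)
        states.append(state)
    states = np.array(states)
    targets = series[1:]
    x_train, y_train = states[skip:train], targets[skip:train]
    gram = x_train.T @ x_train + ridge * np.eye(x_train.shape[1])
    w_out = np.linalg.solve(gram, x_train.T @ y_train)
    error = states[train:] @ w_out - targets[train:]
    return np.sqrt(np.mean(error ** 2)) / np.std(targets[train:])


rng = np.random.default_rng(0)
series = mackey_glass(1500)
for name, symmetric in [("asymmetric", False), ("symmetric", True)]:
    reservoir = make_reservoir(100, symmetric, rng)
    print(name, one_step_nrmse(reservoir, series, rng))
```

A single seed on a one-step task says little on its own; averaging over many seeds and topologies would be needed before drawing any conclusion.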
- [4] arXiv:2410.00595 [pdf, html, other]
Title: On the Interaction of Adaptive Population Control with Cumulative Step-Size Adaptation
Subjects: Neural and Evolutionary Computing (cs.NE)
Three state-of-the-art adaptive population control strategies (PCS) are theoretically and empirically investigated for a multi-recombinative, cumulative step-size adaptation Evolution Strategy $(\mu/\mu_I, \lambda)$-CSA-ES. First, scaling properties for the generation number and mutation strength rescaling are derived on the sphere in the limit of large population sizes. Then, the adaptation properties of three standard CSA-variants are studied as a function of the population size and dimensionality, and compared to the predicted scaling results. Thereafter, three PCS are implemented alongside the CSA-ES and studied on a test bed of sphere, random, and Rastrigin functions. The CSA-adaptation properties significantly influence the performance of the PCS, which is shown in more detail. Given the test bed, well-performing parameter sets (in terms of scaling, efficiency, and success rate) for both the CSA- and PCS-subroutines are identified.
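For readers unfamiliar with the base algorithm, the following is a minimal sketch of a $(\mu/\mu_I, \lambda)$-ES with cumulative step-size adaptation on the sphere. It covers only one CSA variant and no population control; the cumulation and damping constants follow commonly used CMA-ES defaults and are assumptions here, not the parameter sets studied in the paper.

```python
import math
import random


def sphere(x):
    return sum(v * v for v in x)


def csa_es(f, x0, sigma0, mu=3, lam=10, generations=200, seed=1):
    """(mu/mu_I, lambda)-ES with isotropic mutations and a CSA step-size rule."""
    rng = random.Random(seed)
    n = len(x0)
    # Cumulation and damping constants: widely used defaults, assumed here.
    c = (mu + 2) / (n + mu + 5)
    d = 1 + 2 * max(0.0, math.sqrt((mu - 1) / (n + 1)) - 1) + c
    # Approximation of the expected length of an n-dimensional standard normal vector.
    chi_n = math.sqrt(n) * (1 - 1 / (4 * n) + 1 / (21 * n * n))
    parent, sigma = list(x0), sigma0
    path = [0.0] * n  # search path accumulating the selected mutation steps
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            z = [rng.gauss(0, 1) for _ in range(n)]
            y = [p + sigma * zi for p, zi in zip(parent, z)]
            offspring.append((f(y), z))
        offspring.sort(key=lambda pair: pair[0])  # minimisation
        selected = [z for _, z in offspring[:mu]]
        # Intermediate recombination: average the mu best mutation vectors.
        z_mean = [sum(column) / mu for column in zip(*selected)]
        parent = [p + sigma * zm for p, zm in zip(parent, z_mean)]
        scale = math.sqrt(c * (2 - c) * mu)
        path = [(1 - c) * s + scale * zm for s, zm in zip(path, z_mean)]
        path_length = math.sqrt(sum(s * s for s in path))
        # Lengthen sigma if the path is longer than expected, shorten it otherwise.
        sigma *= math.exp((c / d) * (path_length / chi_n - 1))
    return parent, sigma


start = [1.0] * 10
best, final_sigma = csa_es(sphere, start, sigma0=1.0)
print(sphere(start), "->", sphere(best), "final sigma:", final_sigma)
```

A population control strategy would additionally adjust `mu` and `lam` between generations, which this sketch keeps fixed.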
New submissions (showing 4 of 4 entries)
- [5] arXiv:2410.00149 (cross-list from cs.CL) [pdf, html, other]
Title: Are Large Language Models In-Context Personalized Summarizers? Get an iCOPERNICUS Test Done!
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
Large Language Models (LLMs) have succeeded considerably in In-Context-Learning (ICL) based summarization. However, saliency is subject to the users' specific preference histories. Hence, we need reliable In-Context Personalization Learning (ICPL) capabilities within such LLMs. For any arbitrary LLM to exhibit ICPL, it needs to have the ability to discern contrast in user profiles. A recent study proposed a measure for degree-of-personalization called EGISES for the first time. EGISES measures a model's responsiveness to user profile differences. However, it cannot test if a model utilizes all three types of cues provided in ICPL prompts: (i) example summaries, (ii) user's reading histories, and (iii) contrast in user profiles. To address this, we propose the iCOPERNICUS framework, a novel In-COntext PERsonalization learNIng sCrUtiny of Summarization capability in LLMs that uses EGISES as a comparative measure. As a case-study, we evaluate 17 state-of-the-art LLMs based on their reported ICL performances and observe that 15 models' ICPL degrades (min: 1.6%; max: 3.6%) when probed with richer prompts, thereby showing lack of true ICPL.
- [6] arXiv:2410.00665 (cross-list from q-bio.NC) [pdf, html, other]
Title: TAVRNN: Temporal Attention-enhanced Variational Graph RNN Captures Neural Dynamics and Behavior
Comments: 31 pages, 6 figures, 4 supplemental figures, 4 tables, 8 supplemental tables
Subjects: Neurons and Cognition (q-bio.NC); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
We introduce Temporal Attention-enhanced Variational Graph Recurrent Neural Network (TAVRNN), a novel framework for analyzing the evolving dynamics of neuronal connectivity networks in response to external stimuli and behavioral feedback. TAVRNN captures temporal changes in network structure by modeling sequential snapshots of neuronal activity, enabling the identification of key connectivity patterns. Leveraging temporal attention mechanisms and variational graph techniques, TAVRNN uncovers how connectivity shifts align with behavior over time. We validate TAVRNN on two datasets: in vivo calcium imaging data from freely behaving rats and novel in vitro electrophysiological data from the DishBrain system, where biological neurons control a simulated environment during the game of Pong. We show that TAVRNN outperforms previous baseline models in classification, clustering tasks, and computational efficiency while accurately linking connectivity changes to performance variations. Crucially, TAVRNN reveals that high game performance in the DishBrain system correlates with the alignment of sensory and motor subregion channels, a relationship not evident in earlier models. This framework represents the first application of dynamic graph representation of electrophysiological (neuronal) data from the DishBrain system, providing insights into the reorganization of neuronal networks during learning. TAVRNN's ability to differentiate between neuronal states associated with successful and unsuccessful learning outcomes offers significant implications for real-time monitoring and manipulation of biological neuronal systems.
Cross submissions (showing 2 of 2 entries)
- [7] arXiv:2402.01373 (replaced) [pdf, html, other]
Title: cmaes: A Simple yet Practical Python Library for CMA-ES
Subjects: Neural and Evolutionary Computing (cs.NE); Mathematical Software (cs.MS)
The covariance matrix adaptation evolution strategy (CMA-ES) has been highly effective in black-box continuous optimization, as demonstrated by its success in both benchmark problems and various real-world applications. To address the need for an accessible yet potent tool in this domain, we developed cmaes, a simple and practical Python library for CMA-ES. cmaes is characterized by its simplicity, offering intuitive use and high code readability. This makes it suitable for quickly using CMA-ES, as well as for educational purposes and seamless integration into other libraries. Despite its simplistic design, cmaes maintains enhanced functionality. It incorporates recent advancements in CMA-ES, such as learning rate adaptation for challenging scenarios, transfer learning, and mixed-integer optimization capabilities. These advanced features are accessible through a user-friendly API, ensuring that cmaes can be easily adopted in practical applications. We regard cmaes as the first choice for a Python CMA-ES library among practitioners. The software is available under the MIT license at this https URL.
- [8] arXiv:2312.07987 (replaced) [pdf, html, other]
Title: SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
Comments: Accepted to NeurIPS 2024
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)
Despite many recent works on Mixture of Experts (MoEs) for resource-efficient Transformer language models, existing methods mostly focus on MoEs for feedforward layers. Previous attempts at extending MoE to the self-attention layer fail to match the performance of the parameter-matched baseline. Our novel SwitchHead is an effective MoE method for the attention layer that successfully reduces both the compute and memory requirements, achieving wall-clock speedup, while matching the language modeling performance of the baseline Transformer. Our novel MoE mechanism allows SwitchHead to compute up to 8 times fewer attention matrices than the standard Transformer. SwitchHead can also be combined with MoE feedforward layers, resulting in fully-MoE "SwitchAll" Transformers. For our 262M parameter model trained on C4, SwitchHead matches the perplexity of standard models with only 44% compute and 27% memory usage. Zero-shot experiments on downstream tasks confirm the performance of SwitchHead, e.g., achieving more than 3.5% absolute improvements on BLiMP compared to the baseline with an equal compute resource.