Quantitative Biology

New submissions

New submissions for Tue, 7 May 24

[1]  arXiv:2405.02374 [pdf, other]
Title: Protein binding affinity prediction under multiple substitutions applying eGNNs on Residue and Atomic graphs combined with Language model information: eGRAL
Subjects: Quantitative Methods (q-bio.QM); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

Protein-protein interactions (PPIs) play a crucial role in numerous biological processes. Developing methods that predict binding affinity changes under substitution mutations is fundamental for modelling and re-engineering biological systems. Deep learning is increasingly recognized as a powerful tool capable of bridging the gap between in-silico predictions and in-vitro observations. With this contribution, we propose eGRAL, a novel SE(3) equivariant graph neural network (eGNN) architecture designed for predicting binding affinity changes from multiple amino acid substitutions in protein complexes. eGRAL leverages residue, atomic and evolutionary scales, thanks to features extracted from protein large language models. To address the limited availability of large-scale affinity assays with structural information, we generate a simulated dataset comprising approximately 500,000 data points. Our model is pre-trained on this dataset, then fine-tuned and tested on experimental data.

[2]  arXiv:2405.02379 [pdf, other]
Title: Modelling the Stochastic Importation Dynamics and Establishment of Novel Pathogenic Strains using a General Branching Processes Framework
Subjects: Populations and Evolution (q-bio.PE); Probability (math.PR); Quantitative Methods (q-bio.QM)

The importation and subsequent establishment of novel pathogenic strains in a population is subject to a large degree of uncertainty due to the stochastic nature of the disease dynamics. Mathematical models need to take this stochasticity into account in the early phase of an outbreak in order to adequately capture the uncertainty in disease forecasts. We propose a general branching process model of disease spread that includes host-level heterogeneity, and that can be straightforwardly tailored to capture the salient aspects of a particular disease outbreak. We combine this with a model of case importation that occurs via an independent marked Poisson process. We use this framework to investigate the impact of different control strategies, particularly on the time to establishment of an invading, exogenous strain, using parameters taken from the literature for COVID-19 as an example. We also demonstrate how to combine our model with a deterministic approximation, such that longer term projections can be generated that still incorporate the uncertainty from the early growth phase of the epidemic. Our approach produces meaningful short- and medium-term projections of the course of a disease outbreak when model parameters are still uncertain and when stochasticity still has a large effect on the population dynamics.
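The interplay between importation and early stochastic growth described in this abstract can be illustrated with a toy simulation. The sketch below is not the authors' model: it assumes discrete generations, Poisson-distributed offspring with mean `r0`, a plain (unmarked) Poisson number of imported cases per generation, and a hypothetical "establishment" threshold on the number of active cases. All function and parameter names are illustrative.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def time_to_establishment(r0, import_rate, threshold=50, max_steps=200, seed=0):
    """Return the first generation with at least `threshold` active cases, or None.

    Assumptions (illustrative only): each active case independently infects
    Poisson(r0) new cases in the next generation, and Poisson(import_rate)
    additional cases are imported per generation.
    """
    rng = random.Random(seed)
    active = 0
    for step in range(1, max_steps + 1):
        offspring = sum(poisson(rng, r0) for _ in range(active))
        imported = poisson(rng, import_rate)
        active = offspring + imported
        if active >= threshold:
            return step
    return None
```

Calling `time_to_establishment` over many seeds for two values of `import_rate` (for example, with and without a hypothetical border control) gives an empirical distribution of the time to establishment under each scenario.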

[3]  arXiv:2405.02381 [pdf, ps, other]
Title: Interactions between eagles and semidomestic reindeer: lessons learned from field surveys and deterrents
Comments: 17 pages, 4 Figures, 1 Table
Subjects: Populations and Evolution (q-bio.PE)

Predation by eagles on semi-domesticated reindeer (Rangifer tarandus) is an emerging human-wildlife conflict in Fennoscandia. Both the Golden (Aquila chrysaetos) and the White-tailed eagle (Haliaeetus albicilla) are believed by herders to predate on reindeer; however, there is a considerable knowledge gap regarding the extent of predation and scavenging by each species, and their distribution and behaviour within the reindeer herding areas. Lethal and non-lethal methods have been suggested to reduce this conflict with eagles. We developed this project to fill the existing knowledge gaps by investigating the patterns of eagle abundance before, during, and after reindeer calving in a herding district in northern Sweden, and testing the effect of two potential deterrents (air ventilators and rotating prisms) in diverting eagles away from reindeer calving areas. During the single study period, we made 12, 47, and 17 eagle observations before, during, and after calving respectively. Out of these observations, 34 were of Golden eagles, 33 of White-tailed eagles, and for 9 observations the species could not be confirmed. Eagle abundance increased during calving and decreased again after calving ended. No attacks by eagles on calves were observed. The odds of observing eagles were significantly higher in the control area compared to areas with deterrents. More sub-adults were observed during calving, and both species were present in the area. The extent of predation was difficult to infer using direct observations, and deterrents seem to show promise in diverting eagles away from calving grounds. These studies should be replicated to get a general picture of the issue and to test the efficiency of deterrents in diverting eagles away from reindeer across reindeer herding districts.

[4]  arXiv:2405.02524 [pdf, other]
Title: Simulation-based Inference of Developmental EEG Maturation with the Spectral Graph Model
Comments: 58 pages, 6 figures, 18 supplementary figures
Subjects: Neurons and Cognition (q-bio.NC)

The spectral content of macroscopic neural activity evolves throughout development, yet how this maturation relates to underlying brain network formation and dynamics remains unknown. To gain mechanistic insights into this process, we evaluate developmental EEG spectral changes via Bayesian model inversion of the spectral graph model (SGM), a parsimonious model of whole-brain spatiospectral activity derived from linearized neural field models coupled by the structural connectome. Simulation-based inference was used to estimate age-varying SGM parameter posterior distributions from EEG spectra spanning the developmental period. We found this model-fitting approach accurately captures the developmental maturation of EEG spectra via a neurobiologically consistent progression of key neural parameters: long-range coupling, axonal conductance speed, and excitatory:inhibitory balance. These results suggest that spectral maturation of brain activity observed during normal development is supported by functional adaptations, specifically age-dependent tuning of localized neural dynamics and their long-range coupling within the macroscopic, structural network.

[5]  arXiv:2405.02593 [pdf, ps, other]
Title: An Interdisciplinary Perspective of the Built-Environment Microbiome
Comments: 23 pages
Subjects: Populations and Evolution (q-bio.PE)

The built environment provides an excellent setting for interdisciplinary research on the dynamics of microbial communities. The system is simplified compared to many natural settings, and to some extent the entire environment can be manipulated, from architectural design, to materials use, air flow, human traffic, and capacity to disrupt microbial communities through cleaning. Here we provide an overview of the ecology of the microbiome in the built environment. We address niche space and refugia, population and community (metagenomic) dynamics, spatial ecology within a building, including the major microbial transmission mechanisms, as well as evolution. We also address the landscape ecology connecting microbiomes between physically separated buildings. At each stage we pay particular attention to the actual and potential interface between disciplines, such as ecology, epidemiology, materials science, and human social behavior. We end by identifying some opportunities for future interdisciplinary research on the microbiome of the built environment.

[6]  arXiv:2405.02674 [pdf, other]
Title: Ambush strategy enhances organisms' performance in rock-paper-scissors games
Comments: 8 pages, 5 figures
Subjects: Populations and Evolution (q-bio.PE); Adaptation and Self-Organizing Systems (nlin.AO); Pattern Formation and Solitons (nlin.PS); Biological Physics (physics.bio-ph); Quantitative Methods (q-bio.QM)

We study a five-species cyclic system wherein individuals of one species strategically adapt their movements to enhance their performance in the spatial rock-paper-scissors game. Environmental cues enable the awareness of the presence of organisms targeted for elimination in the cyclic game. If the local density of target organisms is sufficiently high, individuals move towards concentrated areas for direct attack; otherwise, they employ an ambush tactic, maximising the chances of success by targeting regions likely to be dominated by opponents. Running stochastic simulations, we discover that the ambush strategy enhances the likelihood of individual success compared to direct attacks alone, leading to uneven spatial patterns characterised by spiral waves. We compute the autocorrelation function and measure how the ambush tactic unbalances the organisms' spatial organisation by calculating the characteristic length scale of typical spatial domains of each species. We demonstrate that the threshold for local species density influences the ambush strategy's effectiveness, while the neighbourhood perception range significantly impacts decision-making accuracy. The outcomes show that long-range perception improves performance by over 60%, although there is potential interference in decision-making under high attack triggers. Understanding how organisms' adaptation to their environment enhances their performance may be helpful not only for ecologists but also for data scientists aiming to improve artificial intelligence systems.

[7]  arXiv:2405.02767 [pdf, ps, other]
Title: A Decade in a Systematic Review: The Evolution and Impact of Cell Painting
Comments: Supplementary Table/Code here: this https URL
Subjects: Subcellular Processes (q-bio.SC); Cell Behavior (q-bio.CB)

High-content image-based assays have fueled significant discoveries in the life sciences in the past decade (2013-2023), including novel insights into disease etiology, mechanism of action, new therapeutics, and toxicology predictions. Here, we systematically review the substantial methodological advancements and applications of Cell Painting. Advancements include improvements in the Cell Painting protocol, assay adaptations for different types of perturbations and applications, and improved methodologies for feature extraction, quality control, and batch effect correction. Moreover, machine learning methods recently surpassed classical approaches in their ability to extract biologically useful information from Cell Painting images. Cell Painting data have been used alone or in combination with other -omics data to decipher the mechanism of action of a compound, its toxicity profile, and many other biological effects. Overall, key methodological advances have expanded the ability of Cell Painting to capture cellular responses to various perturbations. Future advances will likely lie in advancing computational and experimental techniques, developing new publicly available datasets, and integrating them with other high-content data types.

[8]  arXiv:2405.02786 [pdf, other]
Title: A causal inference approach of monosynapses from spike trains
Subjects: Neurons and Cognition (q-bio.NC)

Neuroscientists have worked on the problem of estimating synaptic properties, such as connectivity and strength, from simultaneously recorded spike trains since the 1960s. Recent years have seen renewed interest in the problem, coinciding with rapid advances in the technology of high-density neural recordings and optogenetics, which can be used to calibrate causal hypotheses about functional connectivity. Here, a rigorous causal inference framework for pairwise excitatory and inhibitory monosynaptic effects between spike trains is developed. Causal interactions are identified by separating spike interactions in pairwise spike trains by their timescales. Fast algorithms for computing accurate estimates of associated quantities are also developed. Through the lens of this framework, the link between biophysical parameters and statistical definitions of causality between spike trains is examined across a spectrum of dynamical systems simulations. In an idealized setting, we demonstrate a correspondence between the synaptic causal metric developed here and the probabilities of causation developed by Tian and Pearl. Since the probabilities of causation are derived under distinct assumptions and include data from experimental randomization, this opens up the possibility of testing the synaptic inference framework's assumptions with juxtacellular or optogenetic stimulation. We simulate such an experiment with a biophysically detailed channelrhodopsin model and show that randomization is not achieved; strong confounding persists even with strong stimulations. A principal goal is to ask how carefully articulated causal assumptions might better inform the design of neural stimulation experiments and, in turn, support experimental tests of those assumptions.

[9]  arXiv:2405.02820 [pdf, other]
Title: Coat stiffening explains the consensus pathway of clathrin-mediated endocytosis
Authors: Felix Frey (IST Austria), Ulrich S. Schwarz (Heidelberg University)
Comments: revtex, 12 pages, 5 figures in PDF-format
Subjects: Subcellular Processes (q-bio.SC); Soft Condensed Matter (cond-mat.soft)

Clathrin-mediated endocytosis is the main pathway used by eukaryotic cells to take up extracellular material, but the dominant physical mechanisms driving this process are still elusive. Recently several high-resolution imaging techniques have been used on different cell lines to measure the geometrical properties of clathrin-coated pits over their whole lifetime. Here we first show that all datasets follow the same consensus pathway, which is well described by the recently introduced cooperative curvature model, which predicts a flat-to-curved transition at finite area, followed by linear growth and subsequent saturation of curvature. We then apply an energetic model for the composite of plasma membrane and clathrin coat to the consensus pathway to show that the dominant mechanism for invagination is coat stiffening, which results from cooperative interactions between the different clathrin molecules and progressively drives the system towards its intrinsic curvature. Our theory predicts that two length scales determine the time course of invagination, namely the patch size at which the flat-to-curved transition occurs and the final pit radius.

[10]  arXiv:2405.02853 [pdf, ps, other]
Title: Development and validation of a short form of the medication literacy scale for Chinese College Students
Authors: Chen Zhenzhen (1,2), Ren Jiabao (1,2), Duan Tingyu (3), Chen Ke (4), Hou Ruyi (5), Li Yimiao (5), Zeng Leixiao (5), Meng Xiaoxuan (6), Wu Yibo (7), Liu Yu (2), ((1) College of Science, Minzu University of China, Beijing, China, (2) School of Nursing, China Medical University, Shenyang, Liaoning Province, China, (3) Hebei Institute of Communications, Hebei, China, (4) Department of Social Science and Humanities, Harbin Medical University, Harbin, Heilongjiang Province, China, (5) School of Journalism and Communication, Renmin University of China, Beijing, China, (6) Tianjin Medical University, Tianjin, China, (7) School of Public Health, Peking University, Beijing, China)
Comments: 25 pages, 3 figures, 3 tables
Subjects: Other Quantitative Biology (q-bio.OT)

Medication literacy is integral to health literacy, pivotal for medication safety and adherence. It denotes an individual's capacity to discern, comprehend, and convey medication-related information. Existing scales, however, are time-consuming and predominantly cater to patients and community dwellers, necessitating a more succinct instrument. This study presents the development of a brief Medication Literacy Scale (MLS-14) utilizing classical test theory (CTT) and item response theory (IRT), targeting a college student demographic. The MLS-14's abbreviated version, a 6-item scale (MLS-SF), was distilled through CTT and IRT methodologies, engaging 2431 Chinese college students to scrutinize its psychometric properties. The MLS-SF demonstrated a Cronbach's $\alpha$ of 0.765, with three extracted factors via exploratory factor analysis, accounting for 66% of the cumulative variance. All items exhibited factor loadings above 0.5. The scale's three-factor structure was substantiated through confirmatory factor analysis with satisfactory fit indices ($\chi^2$/df=5.11, RMSEA=0.063, GFI=0.990, AGFI=0.966, NFI=0.984, IFI=0.987, CFI=0.987). IRT modeling confirmed reasonable discrimination and location parameters for all items, free of differential item functioning (DIF) by gender. Except for items 4 and 10, the remaining items were informative at medium theta levels, indicating their utility in assessing medication literacy efficiently. The developed 6-item Medication Literacy Short Form (MLS-SF) proves to be a reliable and valid instrument for the expedited evaluation of college students' medication literacy, offering a valuable addition to the arsenal of health literacy assessment tools.
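For readers unfamiliar with the reliability coefficient quoted in this abstract, the standard Cronbach's alpha formula for a $k$-item scale is $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_i \sigma_i^2}{\sigma_T^2}\right)$, where $\sigma_i^2$ is the variance of item $i$ and $\sigma_T^2$ is the variance of the total score. The sketch below is a minimal illustration of that formula only; the function name and the input layout (one list of item scores per respondent) are assumptions, and it does not reproduce the authors' data or analysis pipeline.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for `scores`, a list of respondents, each a list of k item scores.

    Uses sample variances (n - 1 denominator) for both items and total scores.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

With perfectly consistent items, such as the hypothetical responses `[[1, 1, 1], [2, 2, 2], [3, 3, 3]]`, the function returns 1; less consistent items pull the value down.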

[11]  arXiv:2405.03148 [pdf, other]
Title: Counting Subnetworks Under Gene Duplication in Genetic Regulatory Networks
Subjects: Molecular Networks (q-bio.MN); Combinatorics (math.CO)

Gene duplication is a fundamental evolutionary mechanism that contributes to biological complexity and diversity (Fortna et al., 2004). Traditionally, research has focused on the duplication of gene sequences (Zhang, 1914). However, evidence suggests that the duplication of regulatory elements may also play a significant role in the evolution of genomic functions (Teichmann and Babu, 2004; Hallin and Landry, 2019). In this work, the evolution of regulatory relationships belonging to gene-specific substructures in a genetic regulatory network (GRN) is modeled. In the model, a network grows from an initial configuration by repeatedly choosing a random gene to duplicate. The likelihood that the regulatory relationships associated with the selected gene are retained through duplication is determined by a vector of probabilities. Occurrences of gene-family-specific substructures are counted under the gene duplication model. In this thesis, gene-family-specific substructures are referred to as subnetwork motifs. These subnetwork motifs are motivated by network motifs, which are patterns of interconnections that recur more often in a specialized network than in a random network (Milo et al., 2002). Subnetwork motifs differ from network motifs in that subnetwork motifs are instances of gene-family-specific substructures, while network motifs are isomorphic substructures. These subnetwork motifs are counted under Full and Partial Duplication, which differ in the way in which regulatory relationships are inherited. Full Duplication occurs when all regulatory links are inherited at each duplication step, and Partial Duplication occurs when regulation inheritance varies at each duplication step. Moments for the number of occurrences of subnetwork motifs are determined in each model. The results presented offer a method for discovering subnetwork motifs that are significant in a GRN under gene duplication.

[12]  arXiv:2405.03346 [pdf, other]
Title: Population dynamics and games of variable size
Comments: 24 pages, 4 figures
Journal-ref: Matheus Hansen, Fabio A. C. C. Chalub. Population dynamics and games of variable size. Journal of Theoretical Biology 589, 111842, 2024
Subjects: Populations and Evolution (q-bio.PE)

This work introduces the concept of Variable Size Game Theory (VSGT), in which the number of players in a game is a strategic decision made by the players themselves. We start by discussing the main examples in game theory: dominance, coexistence, and coordination. We show that the same set of pay-offs can result in coordination-like or coexistence-like games depending on the strategic decision of each player type. We also solve an inverse problem to find a $d$-player game that reproduces the same fixation pattern of the VSGT. In the sequel, we consider a game involving prosocial and antisocial players, i.e., individuals who tend to play with large groups and small groups, respectively. In this game, a certain task should be performed, that will benefit one of the participants at the expense of the other players. We show that individuals able to gather large groups to perform the task may prevail, even if this task is costly, providing a possible scenario for the evolution of eusociality. The next example shows that different strategies regarding game size may lead to spontaneous separation of different types, a possible scenario for speciation without physical separation (sympatric speciation). In the last example, we generalize to three types of populations from the previous analysis and study compartmental epidemic models: in particular, we recast the SIRS model into the VSGT framework: Susceptibles play 2-player games, while Infectious and Removed play a 1-player game. The SIRS epidemic model is then obtained as the replicator equation of the VSGT. We finish with possible applications of VSGT to be addressed in the future.

[13]  arXiv:2405.03370 [pdf, other]
Title: AntiFold: Improved antibody structure-based design using inverse folding
Subjects: Biomolecules (q-bio.BM); Quantitative Methods (q-bio.QM)

The design and optimization of antibodies requires an intricate balance across multiple properties. Protein inverse folding models, capable of generating diverse sequences folding into the same structure, are promising tools for maintaining structural integrity during antibody design. Here, we present AntiFold, an antibody-specific inverse folding model, fine-tuned from ESM-IF1 on solved and predicted antibody structures. AntiFold outperforms existing inverse folding tools on sequence recovery across complementarity-determining regions, with designed sequences showing high structural similarity to their solved counterpart. It additionally achieves stronger correlations when predicting antibody-antigen binding affinity in a zero-shot manner, while performance is augmented further when including antigen information. AntiFold assigns low probabilities to mutations that disrupt antigen binding, synergizing with protein language model residue probabilities, and demonstrates promise for guiding antibody optimization while retaining structure-related properties. AntiFold is freely available under the BSD 3-Clause license as a web server at https://opig.stats.ox.ac.uk/webapps/antifold/ and as a pip-installable package at https://github.com/oxpig/AntiFold

[14]  arXiv:2405.03601 [pdf, other]
Title: Firing rate model for brain rhythms controlled by astrocytes
Comments: 12 pages, 4 figures
Subjects: Neurons and Cognition (q-bio.NC)

We propose a new mean-field model of brain rhythms governed by astrocytes. This theoretical framework describes how astrocytes can regulate neuronal activity and contribute to the generation of brain rhythms. The model describes at the population level the interactions between two large groups of excitatory and inhibitory neurons. The excitatory population is governed by astrocytes via a so-called tripartite synapse. This approach allows us to describe how the interactions between different groups of neurons and astrocytes can give rise to various patterns of synchronized activity and transitions between them. Using methods of nonlinear analysis we show that astrocytic modulation can lead to a change in the period and amplitude of oscillations in the populations of neurons.

[15]  arXiv:2405.03602 [pdf, other]
Title: One nose but two nostrils: Learn to align with sparse connections between two olfactory cortices
Subjects: Neurons and Cognition (q-bio.NC); Biological Physics (physics.bio-ph)

The integration of neural representations in the two hemispheres is an important problem in neuroscience. Recent experiments revealed that odor responses in cortical neurons driven by separate stimulation of the two nostrils are highly correlated. This bilateral alignment points to structured inter-hemispheric connections, but the detailed mechanism remains unclear. Here, we hypothesized that continuous exposure to environmental odors shapes these projections and modeled it as online learning with a local Hebbian rule. We found that Hebbian learning with sparse connections achieves bilateral alignment, exhibiting a linear trade-off between speed and accuracy. We identified an inverse scaling relationship between the number of cortical neurons and the inter-hemispheric projection density required for desired alignment accuracy, i.e., more cortical neurons allow sparser inter-hemispheric projections. We next compared the alignment performance of the local Hebbian rule and global stochastic-gradient-descent (SGD) learning for artificial neural networks. We found that although SGD leads to the same alignment accuracy with modestly sparser connectivity, the same inverse scaling relation holds. We showed that their similar performance originates from the fact that the update vectors of the two learning rules align significantly throughout the learning process. This insight may inspire efficient sparse local learning algorithms for more complex problems.

Cross-lists for Tue, 7 May 24

[16]  arXiv:2405.02354 (cross-list from cs.LG) [pdf, ps, other]
Title: Heterogeneous network and graph attention auto-encoder for LncRNA-disease association prediction
Comments: 10 pages, 8 figures
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Quantitative Methods (q-bio.QM)

Emerging research shows that lncRNAs are associated with a series of complex human diseases. However, most of the existing methods have limitations in identifying nonlinear lncRNA-disease associations (LDAs), and it remains a huge challenge to predict new LDAs. Therefore, the accurate identification of LDAs is very important for the warning and treatment of diseases. In this work, multiple sources of biomedical data are fully utilized to construct characteristics of lncRNAs and diseases, and linear and nonlinear characteristics are effectively integrated. Furthermore, a novel deep learning model based on a graph attention auto-encoder is proposed, called HGATELDA. To begin with, the linear characteristics of lncRNAs and diseases are created from the miRNA-lncRNA interaction matrix and the miRNA-disease interaction matrix. Following this, the nonlinear features of diseases and lncRNAs are extracted using a graph attention auto-encoder, which largely retains the critical information and effectively aggregates the neighborhood information of nodes. In the end, LDAs can be predicted by fusing the linear and nonlinear characteristics of diseases and lncRNAs. The HGATELDA model achieves an impressive AUC value of 0.9692 when evaluated using 5-fold cross-validation, indicating its superior performance in comparison to several recent prediction models. Meanwhile, the effectiveness of HGATELDA in identifying novel LDAs is further demonstrated by case studies. The HGATELDA model appears to be a viable computational model for predicting LDAs.

[17]  arXiv:2405.02449 (cross-list from stat.ML) [pdf, other]
Title: Quality-Weighted Vendi Scores And Their Application To Diverse Experimental Design
Comments: Published in International Conference on Machine Learning, ICML 2024. Code can be found in the Vertaix GitHub: this https URL. Paper dedicated to Kwame Nkrumah
Subjects: Machine Learning (stat.ML); Materials Science (cond-mat.mtrl-sci); Machine Learning (cs.LG); Biomolecules (q-bio.BM)

Experimental design techniques such as active search and Bayesian optimization are widely used in the natural sciences for data collection and discovery. However, existing techniques tend to favor exploitation over exploration of the search space, which causes them to get stuck in local optima. This "collapse" problem prevents experimental design algorithms from yielding diverse high-quality data. In this paper, we extend the Vendi scores -- a family of interpretable similarity-based diversity metrics -- to account for quality. We then leverage these quality-weighted Vendi scores to tackle experimental design problems across various applications, including drug discovery, materials discovery, and reinforcement learning. We found that quality-weighted Vendi scores allow us to construct policies for experimental design that flexibly balance quality and diversity, and ultimately assemble rich and diverse sets of high-performing data points. Our algorithms led to a 70%-170% increase in the number of effective discoveries compared to baselines.

[18]  arXiv:2405.02534 (cross-list from cs.LG) [pdf, other]
Title: A Multi-Domain Multi-Task Approach for Feature Selection from Bulk RNA Datasets
Subjects: Machine Learning (cs.LG); Genomics (q-bio.GN)

In this paper, a multi-domain multi-task algorithm for feature selection in bulk RNA-seq data is proposed. Two datasets arising from the mouse host immune response to Salmonella infection are investigated. Data are collected from several strains of Collaborative Cross mice. Samples from the spleen and liver serve as the two domains. Several machine learning experiments are conducted, and in each case a small subset of features that are discriminative across domains is extracted. The algorithm proves viable and underlines the benefits of cross-domain feature selection by extracting a new subset of discriminative features that could not be extracted by a one-domain approach alone.

[19]  arXiv:2405.02564 (cross-list from cs.CV) [pdf, ps, other]
Title: Leveraging the Human Ventral Visual Stream to Improve Neural Network Robustness
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neurons and Cognition (q-bio.NC)

Human object recognition exhibits remarkable resilience in cluttered and dynamic visual environments. In contrast, despite their unparalleled performance across numerous visual tasks, Deep Neural Networks (DNNs) remain far less robust than humans, showing, for example, a surprising susceptibility to adversarial attacks involving image perturbations that are (almost) imperceptible to humans. Human object recognition likely owes its robustness, in part, to the increasingly resilient representations that emerge along the hierarchy of the ventral visual cortex. Here we show that DNNs, when guided by neural representations from a hierarchical sequence of regions in the human ventral visual stream, display increasing robustness to adversarial attacks. These neural-guided models also exhibit a gradual shift towards more human-like decision-making patterns and develop hierarchically smoother decision surfaces. Importantly, the resulting representational spaces differ in important ways from those produced by conventional smoothing methods, suggesting that such neural-guidance may provide previously unexplored robustness solutions. Our findings support the gradual emergence of human robustness along the ventral visual hierarchy and suggest that the key to DNN robustness may lie in increasing emulation of the human brain.

[20]  arXiv:2405.02845 (cross-list from cs.LG) [pdf, other]
Title: Data-Efficient Molecular Generation with Hierarchical Textual Inversion
Subjects: Machine Learning (cs.LG); Molecular Networks (q-bio.MN)

Developing an effective molecular generation framework even with a limited number of molecules is often important for its practical deployment, e.g., drug discovery, since acquiring task-related molecular data requires expensive and time-consuming experimental costs. To tackle this issue, we introduce Hierarchical textual Inversion for Molecular generation (HI-Mol), a novel data-efficient molecular generation method. HI-Mol is inspired by the importance of hierarchical information, e.g., both coarse- and fine-grained features, in understanding the molecule distribution. We propose to use multi-level embeddings to reflect such hierarchical features based on the adoption of the recent textual inversion technique in the visual domain, which achieves data-efficient image generation. Compared to the conventional textual inversion method in the image domain using a single-level token embedding, our multi-level token embeddings allow the model to effectively learn the underlying low-shot molecule distribution. We then generate molecules based on the interpolation of the multi-level token embeddings. Extensive experiments demonstrate the superiority of HI-Mol with notable data-efficiency. For instance, on QM9, HI-Mol outperforms the prior state-of-the-art method with 50x less training data. We also show the effectiveness of molecules generated by HI-Mol in low-shot molecular property prediction.

Replacements for Tue, 7 May 24

[21]  arXiv:2208.06348 (replaced) [pdf, other]
Title: Can Brain Signals Reveal Inner Alignment with Human Languages?
Comments: EMNLP 2023 Findings
Subjects: Neurons and Cognition (q-bio.NC); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[22]  arXiv:2301.01445 (replaced) [pdf, other]
Title: A compositional account of motifs, mechanisms, and dynamics in biochemical regulatory networks
Comments: Final version published in Compositionality
Journal-ref: Compositionality, Volume 6, Issue 2 (2024)
Subjects: Molecular Networks (q-bio.MN); Category Theory (math.CT)
[23]  arXiv:2305.19367 (replaced) [pdf, other]
Title: Fibration symmetries and cluster synchronization in the Caenorhabditis elegans connectome
Comments: Word count Text: 12054 words. 99 references. 17 figures
Subjects: Neurons and Cognition (q-bio.NC); Quantitative Methods (q-bio.QM)
[24]  arXiv:2307.16767 (replaced) [pdf, ps, other]
Title: Infection-induced Cascading Failures -- Impact and Mitigation
Authors: Bo Li, David Saad
Journal-ref: Communications Physics 7, 144 (2024)
Subjects: Physics and Society (physics.soc-ph); Disordered Systems and Neural Networks (cond-mat.dis-nn); Optimization and Control (math.OC); Populations and Evolution (q-bio.PE)
[25]  arXiv:2310.12952 (replaced) [pdf, other]
Title: Cousins Of The Vendi Score: A Family Of Similarity-Based Diversity Metrics For Science And Machine Learning
Comments: Published in the proceedings of Artificial Intelligence and Statistics, AISTATS 2024. This paper is dedicated to Aline Sitoe Diatta. The code can be found on Vertaix's GitHub. Code for evaluating diversity using the Vendi scores can be found at this https URL. Code for using the scores within Vendi Sampling can be found at this https URL.
Subjects: Machine Learning (cs.LG); Chemical Physics (physics.chem-ph); Populations and Evolution (q-bio.PE)
[26]  arXiv:2402.13392 (replaced) [pdf, other]
Title: An SEIR network epidemic model with manual and digital contact tracing allowing delays
Subjects: Physics and Society (physics.soc-ph); Probability (math.PR); Populations and Evolution (q-bio.PE)
[27]  arXiv:2403.03134 (replaced) [pdf, other]
Title: Simplicity in Complexity: Explaining Visual Complexity using Deep Segmentation Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neurons and Cognition (q-bio.NC)
[28]  arXiv:2404.05468 (replaced) [pdf, other]
Title: Mind-to-Image: Projecting Visual Mental Imagination of the Brain from fMRI
Comments: Pre-print to be updated. Work in progress
Subjects: Neurons and Cognition (q-bio.NC); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[29]  arXiv:2404.05696 (replaced) [pdf, ps, other]
Title: BOLD v4: A Centralized Bioinformatics Platform for DNA-based Biodiversity Data
Subjects: Databases (cs.DB); Quantitative Methods (q-bio.QM)
[30]  arXiv:2404.17026 (replaced) [pdf, other]
Title: Catalytic Coagulation
Comments: 8 pages, 1 figure. Version 2: terminology of the model changed. No other changes
Subjects: Statistical Mechanics (cond-mat.stat-mech); Biological Physics (physics.bio-ph); Chemical Physics (physics.chem-ph); Quantitative Methods (q-bio.QM)
[31]  arXiv:2404.19041 (replaced) [pdf, other]
Title: Stochastic dynamics of two-compartment models with regulatory mechanisms for hematopoiesis
Subjects: Populations and Evolution (q-bio.PE)