Quantitative Biology

New submissions

New submissions for Mon, 3 Jun 24

[1]  arXiv:2405.20359 [pdf, ps, other]
Title: Life history shapes variation in egg composition in the blue tit Cyanistes caeruleus
Journal-ref: Communications Biology (2019) 2:6
Subjects: Populations and Evolution (q-bio.PE)

Maternal investment directly shapes early developmental conditions and therefore has long-term fitness consequences for the offspring. In oviparous species, prenatal maternal investment is fixed at the time of laying. To ensure the best survival chances for most of their offspring, females must equip their eggs with the resources required to perform well under various circumstances, yet the actual mechanisms remain unknown. Here we describe the blue tit egg albumen and yolk proteomes and evaluate their potential to mediate maternal effects. We show that variation in egg composition (proteins, lipids, carotenoids) primarily depends on laying order and female age. Egg proteomic profiles are mainly driven by laying order, and investment in the egg proteome is functionally biased among eggs. Our results suggest that maternal effects on egg composition result from both passive and active (partly compensatory) mechanisms, and that variation in egg composition creates diverse biochemical environments for embryonic development.

[2]  arXiv:2405.20523 [pdf, other]
Title: Systems-level health of patients living with end-stage kidney disease using standard lab values
Comments: 50 pages (15 for main, 35 supplemental), 10 figures in main
Subjects: Quantitative Methods (q-bio.QM)

We present a systems-level analysis of end-stage kidney disease (ESKD) with a dynamical network analysis of 14 commonly measured blood-based biomarkers in patients undergoing regular haemodialysis. Utilizing a validated pipeline for declining homeostatic systems, our approach learns a dynamical model together with an invertible transformation that simplifies the behaviour of observed biomarkers into natural variables. Within the natural variables, we identified two distinct dynamical behaviours: (i) stochastic accumulation, the random accumulation of abnormal values, and (ii) mallostasis, a deterministic drift towards worse health. These behaviours are identified by persistent fluctuations indicating weak stability, or a gradual shift in homeostatic set point, respectively. Both lead to worsening natural variable values, making the natural variables salient survival predictors with preferred directions of increasing risk. When this worsening is transformed back into observable biomarkers, it generates a coherent spectrum of worsening medical signs characteristic of a medical syndrome. Specifically, we found that small modules of natural variables corresponded to two existing syndromes commonly afflicting ESKD patients: protein-energy wasting and sepsis. We also identified new prospective syndromes. Our findings suggest that natural variables are robust, systems-level biomarkers, capturing the complex, holistic changes in health associated with ESKD.

[3]  arXiv:2405.20591 [pdf, other]
Title: Weak-Form Inference for Hybrid Dynamical Systems in Ecology
Subjects: Populations and Evolution (q-bio.PE); Machine Learning (cs.LG); Dynamical Systems (math.DS)

Species subject to predation and environmental threats commonly exhibit variable periods of population boom and bust over long timescales. Understanding and predicting such behavior, especially given the inherent heterogeneity and stochasticity of exogenous driving factors over short timescales, is an ongoing challenge. A modeling paradigm gaining popularity in the ecological sciences for such multi-scale effects is to couple short-term continuous dynamics to long-term discrete updates. We develop a data-driven method utilizing weak-form equation learning to extract such hybrid governing equations for population dynamics and to estimate the requisite parameters using sparse intermittent measurements of the discrete and continuous variables. The method produces a set of short-term continuous dynamical system equations parametrized by long-term variables, and long-term discrete equations parametrized by short-term variables, allowing direct assessment of interdependencies between the two time scales. We demonstrate the utility of the method on a variety of ecological scenarios and provide extensive tests using models previously derived for epizootics experienced by the North American spongy moth (Lymantria dispar dispar).

[4]  arXiv:2405.20619 [pdf, other]
Title: Investigation of P. Vivax Elimination via Mass Drug Administration
Subjects: Populations and Evolution (q-bio.PE)

Plasmodium vivax is the most geographically widespread malaria parasite due to its ability to remain dormant (as a hypnozoite) in the human liver and subsequently reactivate. Given that the majority of P. vivax infections are due to hypnozoite reactivation, targeting the hypnozoite reservoir with a radical cure is crucial for achieving P. vivax elimination. Stochastic effects can strongly influence dynamics when disease prevalence is low or when the population size is small. Hence, it is important to account for this when modelling malaria elimination. We use a stochastic multiscale model of P. vivax transmission to study the impacts of multiple rounds of mass drug administration (MDA) with a radical cure, accounting for superinfection and hypnozoite dynamics. Our results indicate multiple rounds of MDA with a high-efficacy drug are needed to achieve a substantial probability of elimination. This work has the potential to help guide P. vivax elimination strategies by quantifying elimination probabilities for an MDA approach.

[5]  arXiv:2405.20668 [pdf, other]
Title: Improving Paratope and Epitope Prediction by Multi-Modal Contrastive Learning and Interaction Informativeness Estimation
Comments: This paper is accepted by IJCAI 2024
Subjects: Biomolecules (q-bio.BM); Machine Learning (cs.LG); Quantitative Methods (q-bio.QM)

Accurately predicting antibody-antigen binding residues, i.e., paratopes and epitopes, is crucial in antibody design. However, existing methods solely focus on uni-modal data (either sequence or structure), disregarding the complementary information present in multi-modal data, and most methods predict paratopes and epitopes separately, overlooking their specific spatial interactions. In this paper, we propose a novel Multi-modal contrastive learning and Interaction informativeness estimation-based method for Paratope and Epitope prediction, named MIPE, by using both sequence and structure data of antibodies and antigens. MIPE implements a multi-modal contrastive learning strategy, which maximizes representations of binding and non-binding residues within each modality and meanwhile aligns uni-modal representations towards effective modal representations. To exploit the spatial interaction information, MIPE also incorporates an interaction informativeness estimation that computes the estimated interaction matrices between antibodies and antigens, thereby approximating them to the actual ones. Extensive experiments demonstrate the superiority of our method compared to baselines. Additionally, the ablation studies and visualizations demonstrate the superiority of MIPE owing to the better representations acquired through multi-modal contrastive learning and the interaction patterns comprehended by the interaction informativeness estimation.

[6]  arXiv:2405.20702 [pdf, other]
Title: Effect of antibody levels on the spread of disease in multiple infections
Comments: 14 pages, 9 figures
Subjects: Populations and Evolution (q-bio.PE); Physics and Society (physics.soc-ph)

There are complex interactions between antibody levels and epidemic propagation: the antibody level of an individual influences the probability of infection, and the spread of the virus influences the antibody level of each individual. Some viruses, in their natural state, cause antibody levels in an infected individual to decay gradually. When these antibody levels decay to a certain point, the individual can be reinfected, as with COVID-19. To describe this interaction, we introduce a novel mathematical model that incorporates an antibody retention rate to investigate the infection patterns of individuals who survive multiple infections. The model is composed of a system of stochastic differential equations, from which we derive the equilibrium point and threshold of the model, and we present numerical simulation results to further elucidate its propagation properties. We find that the antibody decay rate strongly affects the propagation process, that different network structures have different sensitivities to the antibody decay rate, and that changes in the antibody decay rate cause stronger changes in the propagation process in Barabási-Albert networks. Furthermore, we investigate the stationary distribution of the number of infection states and the final antibody levels, and find that both follow a normal distribution, but the standard deviation is small in the Barabási-Albert network. Finally, we explore the effect of individual antibody differences and decay rates on the final population antibody levels, and find that individual antibody differences do not affect the final mean antibody levels. The study offers insights for epidemic prevention and control in practical applications.

[7]  arXiv:2405.20747 [pdf, other]
Title: Generalized Inverse Optimal Control and its Application in Biology
Subjects: Quantitative Methods (q-bio.QM); Optimization and Control (math.OC)

Living organisms exhibit remarkable adaptations across all scales, from molecules to ecosystems. We believe that many of these adaptations correspond to optimal solutions driven by evolution, training, and underlying physical and chemical laws and constraints. While some argue against such optimality principles due to their potential ambiguity, we propose generalized inverse optimal control to infer them directly from data. This novel approach incorporates multi-criteria optimality, nestedness of objective functions on different scales, the presence of active constraints, the possibility of switches of optimality principles during the observed time horizon, maximization of robustness, and minimization of time as important special cases, as well as uncertainties involved with the mathematical modeling of biological systems. This data-driven approach ensures that optimality principles are not merely theoretical constructs but are firmly rooted in experimental observations. Furthermore, the inferred principles can be used in forward optimal control to predict and manipulate biological systems, with possible applications in bio-medicine, biotechnology, and agriculture. As discussed and illustrated, the well-posed problem formulation and the inference are challenging and require a substantial interdisciplinary effort in the development of theory and robust numerical methods.

[8]  arXiv:2405.20863 [pdf, other]
Title: ABodyBuilder3: Improved and scalable antibody structure predictions
Comments: 8 pages, 3 figures, 3 tables, code available at this https URL, weights and data available at this https URL
Subjects: Biomolecules (q-bio.BM); Artificial Intelligence (cs.AI)

Accurate prediction of antibody structure is a central task in the design and development of monoclonal antibodies, notably to understand both their developability and their binding properties. In this article, we introduce ABodyBuilder3, an improved and scalable antibody structure prediction model based on ImmuneBuilder. We achieve a new state-of-the-art accuracy in the modelling of CDR loops by leveraging language model embeddings, and show how predicted structures can be further improved through careful relaxation strategies. Finally, we incorporate a predicted Local Distance Difference Test into the model output to allow for a more accurate estimation of uncertainties.

Cross-lists for Mon, 3 Jun 24

[9]  arXiv:2405.20358 (cross-list from cs.LG) [pdf, other]
Title: Medication Recommendation via Dual Molecular Modalities and Multi-Substructure Distillation
Comments: 14 pages, 9 figures
Subjects: Machine Learning (cs.LG); Quantitative Methods (q-bio.QM)

Medication recommendation combines patient medical history with biomedical knowledge to assist doctors in determining medication combinations more accurately and safely. Existing approaches based on molecular knowledge overlook the atomic geometric structure of molecules, failing to capture the high-dimensional characteristics and intrinsic physical properties of medications, leading to structural confusion and the inability to extract useful substructures from individual patient visits. To address these limitations, we propose BiMoRec, which overcomes the inherent lack of molecular essential information in 2D molecular structures by incorporating 3D molecular structures and atomic properties. To retain the fast response required of recommendation systems, BiMoRec maximizes the mutual information between the two molecular modalities through bimodal graph contrastive learning, achieving the integration of 2D and 3D molecular graphs, and finally distills substructures through interaction with single patient visits. Specifically, we use deep learning networks to construct a pre-training method to obtain representations of 2D and 3D molecular structures and substructures, and we use contrastive learning to derive mutual information. Subsequently, we generate fused molecular representations through a trained GNN module, re-determining the relevance of substructure representations in conjunction with the patient's clinical history information. Finally, we generate the final medication combination based on the extracted substructure sequences. Our implementation on the MIMIC-III and MIMIC-IV datasets demonstrates that our method achieves state-of-the-art performance. Compared to the next best baseline, our model improves accuracy by 1.8% while maintaining the same level of DDI as the baseline.

[10]  arXiv:2405.20573 (cross-list from cs.LG) [pdf, other]
Title: Enhancing Generative Molecular Design via Uncertainty-guided Fine-tuning of Variational Autoencoders
Subjects: Machine Learning (cs.LG); Biomolecules (q-bio.BM); Quantitative Methods (q-bio.QM); Machine Learning (stat.ML)

In recent years, deep generative models have been successfully adopted for various molecular design tasks, particularly in the life and material sciences. A critical challenge for pre-trained generative molecular design (GMD) models is to fine-tune them to be better suited for downstream design tasks aimed at optimizing specific molecular properties. However, redesigning and training an existing effective generative model from scratch for each new design task is impractical. Furthermore, the black-box nature of typical downstream tasks – such as property prediction – makes it nontrivial to optimize the generative model in a task-specific manner. In this work, we propose a novel approach for a model uncertainty-guided fine-tuning of a pre-trained variational autoencoder (VAE)-based GMD model through performance feedback in an active learning setting. The main idea is to quantify model uncertainty in the generative model, which is made efficient by working within a low-dimensional active subspace of the high-dimensional VAE parameters explaining most of the variability in the model's output. The inclusion of model uncertainty expands the space of viable molecules through decoder diversity. We then explore the resulting model uncertainty class via black-box optimization made tractable by low-dimensionality of the active subspace. This enables us to identify and leverage a diverse set of high-performing models to generate enhanced molecules. Empirical results across six target molecular properties, using multiple VAE-based generative models, demonstrate that our uncertainty-guided fine-tuning approach consistently outperforms the original pre-trained models.

[11]  arXiv:2405.20594 (cross-list from cs.LG) [pdf, other]
Title: Deep Learning without Weight Symmetry
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neurons and Cognition (q-bio.NC)

Backpropagation (BP), a foundational algorithm for training artificial neural networks, predominates in contemporary deep learning. Although highly successful, it is often considered biologically implausible. A significant limitation arises from the need for precise symmetry between connections in the backward and forward pathways to backpropagate gradient signals accurately, which is not observed in biological brains. Researchers have proposed several algorithms to alleviate this symmetry constraint, such as feedback alignment and direct feedback alignment. However, their divergence from backpropagation dynamics presents challenges, particularly in deeper networks and convolutional layers. Here we introduce the Product Feedback Alignment (PFA) algorithm. Our findings demonstrate that PFA closely approximates BP and achieves comparable performance in deep convolutional networks while avoiding explicit weight symmetry. Our results offer a novel solution to the longstanding weight symmetry problem, leading to more biologically plausible learning in deep convolutional networks compared to earlier methods.

[12]  arXiv:2405.20658 (cross-list from physics.bio-ph) [pdf, other]
Title: Emergence of a dynamical state of coherent bursting with power-law distributed avalanches from collective stochastic dynamics of adaptive neurons
Subjects: Biological Physics (physics.bio-ph); Neurons and Cognition (q-bio.NC)

Spontaneous brain activity in the absence of external stimuli is not random but contains complex dynamical structures such as neuronal avalanches with power-law duration and size distributions. These experimental observations have been interpreted as supporting evidence for the hypothesis that the brain is operating at criticality and have attracted much attention. Here, we show that an entire state of coherent bursting, with power-law distributed avalanches and features as observed in experiments, emerges in networks of adaptive neurons with stochastic input when excitation is sufficiently strong and balanced by adaptation. We demonstrate that these power-law distributed avalanches are direct consequences of stochasticity and the oscillatory population firing rate arising from coherent bursting, which in turn is the result of the balance between excitation and adaptation, and that criticality does not play a role.

[13]  arXiv:2405.20818 (cross-list from cs.CL) [pdf, other]
Title: An iterated learning model of language change that mixes supervised and unsupervised learning
Subjects: Computation and Language (cs.CL); Adaptation and Self-Organizing Systems (nlin.AO); Populations and Evolution (q-bio.PE)

The iterated learning model is an agent-based model of language change in which language is transmitted from a tutor to a pupil which itself becomes a tutor to a new pupil, and so on. Languages that are stable, expressive, and compositional arise spontaneously as a consequence of a language transmission bottleneck. Previous models have implemented an agent's mapping from signals to meanings using an artificial neural network decoder, but have relied on an unrealistic and computationally expensive process of obversion to implement the associated encoder, mapping from meanings to signals. Here, a new model is presented in which both decoder and encoder are neural networks, trained separately through supervised learning, and trained together through unsupervised learning in the form of an autoencoder. This avoids the substantial computational burden entailed in obversion and introduces a mixture of supervised and unsupervised learning as observed during human development.

[14]  arXiv:2405.21051 (cross-list from cs.SE) [pdf, other]
Title: Good Modelling Software Practices
Comments: 1 Figure
Subjects: Software Engineering (cs.SE); Populations and Evolution (q-bio.PE)

In socio-environmental sciences, models are frequently used as tools to represent, understand, project and predict the behaviour of these complex systems. Along the modelling chain, Good Modelling Practices have been evolving that ensure, amongst others, that models are transparent and replicable. Whenever such models are represented in software, good modelling meets Good Software Practices, such as a tractable development workflow, good code, collaborative development and governance, continuous integration and deployment, and Good Scientific Practices, such as attribution of copyrights and acknowledgement of intellectual property, publication of a software paper and archiving. Too often in existing socio-environmental model software, these practices have been regarded as an add-on to be considered at a later stage only; in fact, many modellers have shied away from publishing their model as open source out of fear that having to add good practices is too demanding. Here we argue for making a habit of following a list of simple and not so simple practices early on in the implementation of the model life cycle. We contextualise cherry-picked and hands-on practices for supporting Good Modelling Practices, and we demonstrate their application in the example context of the Viable North Sea fisheries socio-ecological systems model.

Replacements for Mon, 3 Jun 24

[15]  arXiv:2212.14041 (replaced) [pdf, other]
Title: Deciphering RNA Secondary Structure Prediction: A Probabilistic K-Rook Matching Perspective
Comments: Accepted by ICML 2024
Subjects: Biomolecules (q-bio.BM); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[16]  arXiv:2402.05841 (replaced) [pdf, other]
Title: Dirichlet Flow Matching with Applications to DNA Sequence Design
Comments: Published at ICML 2024. (Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024)
Subjects: Biomolecules (q-bio.BM); Machine Learning (cs.LG)
[17]  arXiv:2402.07103 (replaced) [pdf, other]
Title: Learning protein-ligand unbinding pathways via single-parameter community detection
Comments: This preprint is the unedited version of a manuscript that has been accepted by J. Chem. Theory Comput. and can be downloaded for private use only. Copyright with the journal and its publisher after publication. Bibliographic information will follow shortly
Subjects: Computational Physics (physics.comp-ph); Statistical Mechanics (cond-mat.stat-mech); Data Analysis, Statistics and Probability (physics.data-an); Biomolecules (q-bio.BM)
[18]  arXiv:2402.07111 (replaced) [pdf, other]
Title: A multitype Galton-Watson model for rejuvenating cells
Subjects: Probability (math.PR); Populations and Evolution (q-bio.PE)
[19]  arXiv:2402.11391 (replaced) [pdf, other]
Title: A soluble model for synchronized rhythmic activity in ant colonies
Subjects: Adaptation and Self-Organizing Systems (nlin.AO); Populations and Evolution (q-bio.PE)
[20]  arXiv:2402.14991 (replaced) [pdf, other]
Title: Quantum Theory and Application of Contextual Optimal Transport
Comments: ICML 2024
Subjects: Machine Learning (cs.LG); Emerging Technologies (cs.ET); Quantum Algebra (math.QA); Quantitative Methods (q-bio.QM); Quantum Physics (quant-ph)
[21]  arXiv:2402.17810 (replaced) [pdf, other]
Title: BioT5+: Towards Generalized Biological Understanding with IUPAC Integration and Multi-task Tuning
Comments: Accepted by ACL 2024 (Findings)
Subjects: Quantitative Methods (q-bio.QM); Artificial Intelligence (cs.AI); Computational Engineering, Finance, and Science (cs.CE); Machine Learning (cs.LG); Biomolecules (q-bio.BM)
[22]  arXiv:2403.12995 (replaced) [pdf, other]
Title: ESM All-Atom: Multi-scale Protein Language Model for Unified Molecular Modeling
Comments: ICML2024 camera-ready, update some experimental results, add github url
Subjects: Biomolecules (q-bio.BM); Computational Engineering, Finance, and Science (cs.CE); Machine Learning (cs.LG)
[23]  arXiv:2403.16290 (replaced) [pdf, other]
Title: An Information Theory Treatment of Animal Movement Tracks
Authors: Wayne M Getz
Comments: 21 pages, 2 tables, 1 figure
Subjects: Populations and Evolution (q-bio.PE); Information Theory (cs.IT)
[24]  arXiv:2404.10854 (replaced) [pdf, other]
Title: Methods to Estimate Cryptic Sequence Complexity
Subjects: Populations and Evolution (q-bio.PE); Neural and Evolutionary Computing (cs.NE)
[25]  arXiv:2405.10780 (replaced) [pdf, ps, other]
Title: Intelligent and Miniaturized Neural Interfaces: An Emerging Era in Neurotechnology
Journal-ref: 2024 IEEE Custom Integrated Circuits Conference (CICC), Denver, CO, USA, 2024, pp. 1-7
Subjects: Signal Processing (eess.SP); Hardware Architecture (cs.AR); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Neurons and Cognition (q-bio.NC)
[26]  arXiv:2405.17960 (replaced) [pdf, other]
Title: Elementary Flux Modes as CRN Gears for Free Energy Transduction
Comments: 6 pages, 3 figures
Subjects: Molecular Networks (q-bio.MN); Statistical Mechanics (cond-mat.stat-mech)