Mastodawn

🚨 New paper out!

Novel algorithms for uncertainty quantification in systems biology using conformal inference.
Joint work with Alberto Portela and Marcos Matabuena.

Conformal prediction for uncertainty quantification in dynamic biological systems

Author summary: Uncertainty quantification involves determining how confident we are in the predictions made by mathematical models. This process is vital in the field of systems biology because it helps us understand and predict how these systems behave, despite their complexity. Typically, Bayesian statistics are used for this task. Although powerful, these methods often require specific prior information and make assumptions that may not always hold true for biological systems. Additionally, they struggle when we have limited data, and can be slow for large models. To address these issues, here we have developed two new algorithms based on conformal inference methods. These algorithms offer excellent reliability and scalability. Testing in various scenarios has demonstrated that they outperform traditional Bayesian methods, particularly when applied to large models. Our approach provides a new, general, and flexible method for quantifying uncertainty in dynamic biological models.

Julio R. Banga Sep 5, 2024

📢 New preprint out!
We used conformal prediction to perform uncertainty quantification in dynamic models of biological systems.

https://arxiv.org/abs/2409.02644

Great collaboration with
@MarcosMatabuena https://www.marcosmatabuena.com/

#sysbio #UQ #ML #DynamicModels

Conformal Prediction in Dynamic Biological Systems

Uncertainty quantification (UQ) is the process of systematically determining and characterizing the degree of confidence in computational model predictions. In the context of systems biology, especially with dynamic models, UQ is crucial because it addresses the challenges posed by nonlinearity and parameter sensitivity, allowing us to properly understand and extrapolate the behavior of complex biological systems. Here, we focus on dynamic models represented by deterministic nonlinear ordinary differential equations. Many current UQ approaches in this field rely on Bayesian statistical methods. While powerful, these methods often require strong prior specifications and make parametric assumptions that may not always hold in biological systems. Additionally, these methods face challenges in domains where sample sizes are limited, and statistical inference becomes constrained, with computational speed being a bottleneck in large models of biological systems. As an alternative, we propose the use of conformal inference methods, introducing two novel algorithms that, in some instances, offer non-asymptotic guarantees, enhancing robustness and scalability across various applications. We demonstrate the efficacy of our proposed algorithms through several scenarios, highlighting their advantages over traditional Bayesian approaches. The proposed methods show promising results for diverse biological data structures and scenarios, offering a general framework to quantify uncertainty for dynamic models of biological systems. The software for the methodology and the reproduction of the results is available at https://zenodo.org/doi/10.5281/zenodo.13644870.

arXiv.org
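
For readers new to the idea, here is a minimal sketch of generic split conformal prediction applied to a toy dynamic model. It is not either of the two algorithms from the preprint. The logistic growth model, its parameter values, the noise level, and the helper names `logistic` and `split_conformal_halfwidth` are all illustrative assumptions. The sketch treats the model as already fitted, uses absolute residuals on a held-out calibration set as conformity scores, and takes a finite-sample-corrected quantile as the interval half-width.

```python
import numpy as np

def logistic(t, r=1.0, K=10.0, x0=0.5):
    """Closed-form solution of dx/dt = r*x*(1 - x/K) with x(0) = x0.
    Stands in for an already-calibrated ODE model (hypothetical parameters)."""
    return K / (1.0 + (K - x0) / x0 * np.exp(-r * t))

def split_conformal_halfwidth(residuals, alpha=0.1):
    """Half-width q so that [pred - q, pred + q] covers a new observation with
    probability >= 1 - alpha, assuming calibration and new points are exchangeable."""
    scores = np.sort(np.abs(np.asarray(residuals, dtype=float)))
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # finite-sample corrected rank
    if k > n:
        return np.inf  # too few calibration points for this alpha
    return scores[k - 1]

rng = np.random.default_rng(0)
sigma = 0.3  # assumed measurement noise level

# Calibration set: noisy observations not used for fitting the model.
t_cal = rng.uniform(0.0, 10.0, size=200)
y_cal = logistic(t_cal) + rng.normal(0.0, sigma, size=200)
q = split_conformal_halfwidth(y_cal - logistic(t_cal), alpha=0.1)

# Fresh data to check empirical coverage of the intervals pred +/- q.
t_new = rng.uniform(0.0, 10.0, size=2000)
y_new = logistic(t_new) + rng.normal(0.0, sigma, size=2000)
pred = logistic(t_new)
coverage = np.mean((y_new >= pred - q) & (y_new <= pred + q))
print(f"half-width = {q:.3f}, empirical coverage = {coverage:.3f}")
```

The coverage guarantee in this sketch is marginal (averaged over time points) and relies only on exchangeability, not on a prior or a parametric noise model, which is the contrast with Bayesian approaches drawn in the abstract above.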

Julio R. Banga Jun 3, 2024

We review the literature and introduce a novel approach - generalized inverse optimal control - to infer optimality principles directly from data. #DataDriven

Julio R. Banga Jun 3, 2024

📢 Interested in optimality principles in biology? Our latest preprint is out: “Generalized Inverse Optimal Control and its Application in Biology”, with Sebastian Sager @OVGUpresse
Preprint at: https://arxiv.org/abs/2405.20747

Generalized Inverse Optimal Control and its Application in Biology

Living organisms exhibit remarkable adaptations across all scales, from molecules to ecosystems. We believe that many of these adaptations correspond to optimal solutions driven by evolution, training, and underlying physical and chemical laws and constraints. While some argue against such optimality principles due to their potential ambiguity, we propose generalized inverse optimal control to infer them directly from data. This novel approach incorporates multi-criteria optimality, nestedness of objective functions on different scales, the presence of active constraints, the possibility of switches of optimality principles during the observed time horizon, maximization of robustness, and minimization of time as important special cases, as well as uncertainties involved with the mathematical modeling of biological systems. This data-driven approach ensures that optimality principles are not merely theoretical constructs but are firmly rooted in experimental observations. Furthermore, the inferred principles can be used in forward optimal control to predict and manipulate biological systems, with possible applications in bio-medicine, biotechnology, and agriculture. As discussed and illustrated, the well-posed problem formulation and the inference are challenging and require a substantial interdisciplinary effort in the development of theory and robust numerical methods.

arXiv.org

Julio R. Banga Oct 19, 2023


Interested in data-driven model discovery in biology?

Our latest paper is now out in PLOS Computational Biology:
https://doi.org/10.1371/journal.pcbi.1011014

An integrated methodology for automatic model discovery ensuring identifiability.
Joint work with
@GemmaMassonis
and
@AlexFVillaverde

Distilling identifiable and interpretable dynamic models from biological data

Author summary: Dynamical models provide a quantitative understanding of complex biological systems. Since their development is far from trivial, in recent years many research efforts focus on obtaining these models automatically from data. One of the most effective approaches is based on implicit sparse regression. This technique is able to infer biochemical networks with kinetic functions containing rational nonlinear terms. However, as we show here, one limitation is that it may yield models that are unidentifiable. These features may lead to inaccurate mechanistic interpretations and wrong biological insights. To overcome this limitation, we propose an integrated methodology that applies additional procedures in order to ensure that the discovered models are structurally identifiable, observable, and interpretable. We demonstrate our method with six challenging case studies of increasing model complexity.

Julio R. Banga Mar 15, 2023

📢 New paper out! Discover how our methodology, combined with implicit sparse regression, allows the automatic discovery of identifiable dynamic models for complex biological systems. Joint work with
@GemmaMassonis
and
@AlexFVillaverde

https://www.biorxiv.org/content/10.1101/2023.03.13.532340v2

#computationalbiology #dynamicalmodels

Julio R. Banga Jan 17, 2023

David Basanta Gutierrez Jan 15, 2023

What’s behind the decline of cancer deaths in the USA: https://www.theatlantic.com/newsletters/archive/2023/01/cancer-mortality-death-rate-down-screenings/672724/

The Surprising Reason for the Decline in Cancer Mortality

Behavioral changes and screenings may be just as important as treatments, if not more so.

The Atlantic

Julio R. Banga Dec 15, 2022

Dr. Chris Rackauckas Dec 15, 2022

The new SciML documentation is almost out now. One big piece just came online: the SciML workshop, pre-built exercises for running workshops (and courses) with SciML! Some of the exercises have answers and hints, some do not... so use them as HW problems!

https://docs.sciml.ai/SciMLWorkshop/stable/

#julialang #sciml #differentialequations #machinelearning

SciMLWorkshop: Workshop Materials for Training in Scientific Computing and Scientific Machine Learning (SciML) · SciML Workshop

Julio R. Banga Dec 15, 2022

Michael P.H. Stumpf Dec 15, 2022

This by @GorinGennady@twitter.com, @lpachter, et al. is a tour de force and will help everyone interested in detecting and understanding the signals in single cell data.

https://www.nature.com/articles/s41467-022-34857-7

Interpretable and tractable models of transcriptional noise for the rational design of single-molecule quantification experiments - Nature Communications

Here the authors explore the distributional differences expected from distinct biophysical models of transcription and show how measurements from single-cell genomics experiments can shed light on the underlying biological processes.

Nature

Academic web site	https://www.bangalab.org
ORCID	https://orcid.org/0000-0002-4245-0320
GScholar	https://scholar.google.com/citations?user=ycKDH18AAAAJ&hl=en