Biolearning KULeuven

Research group @ KULeuven #ESAT #STADIUS (Belgium) focused on #machinelearning #deeplearning and #softwaredevelopment in #bioinformatics and #chemoinformatics
Daniele Raimondi from our lab gave a highlight talk at #ECCB2024 on the balance between dataset size and model complexity in genome interpretation. You can read the original work at https://genomebiology.biomedcentral.com/articles/10.1186/s13059-023-03064-y
Large sample size and nonlinear sparse models outline epistatic effects in inflammatory bowel disease - Genome Biology

Background: Despite clear evidence of nonlinear interactions in the molecular architecture of polygenic diseases, linear models have so far appeared optimal in genotype-to-phenotype modeling. A key bottleneck for such modeling is that genetic data intrinsically suffers from underdetermination ($p \gg n$). Millions of variants are present in each individual while the collection of large, homogeneous cohorts is hindered by phenotype incidence, sequencing cost, and batch effects. Results: We demonstrate that when we provide enough training data and control the complexity of nonlinear models, a neural network outperforms additive approaches in whole exome sequencing-based inflammatory bowel disease case–control prediction. To do so, we propose a biologically meaningful sparsified neural network architecture, providing empirical evidence for positive and negative epistatic effects present in the inflammatory bowel disease pathogenesis. Conclusions: In this paper, we show that underdetermination is likely a major driver for the apparent optimality of additive modeling in clinical genetics today.

BioMed Central
Andras Formanek from our lab will present "Model Based Clustering of Time Series Utilizing Expert ODEs" tomorrow, the 20th, at #ICANN2024 (14:50, Aula Magna).
Paper: https://link.springer.com/chapter/10.1007/978-3-031-72341-4_16
Model Based Clustering of Time Series Utilizing Expert ODEs

In practical system identification scenarios, partially observed time series are often acquired from a set of similar dynamical systems forming clusters in the parameter space (e.g., healthy vs. diseased patients). The problem of identifying these clusters and that...

SpringerLink
If you are curious how different #uncertainty estimation methods perform on NN models trained on large-scale industrial #pharma data, visit our poster in room A8 (AI4Science) from 16:15 at #ICML2024. @RosaFriesi @adamgld Emma Svensson @AiddOne
Please join us this afternoon at the #CVPR24 poster session, poster number 297, to hear about our computer vision method for extracting chemical knowledge from papers and patents.
Our paper (accepted at #CVPR2024) is now available as a preprint on @arxiv: https://arxiv.org/abs/2404.01743. We introduce the first model to perform optical chemical structure recognition (OCSR) with atom-level entity detection using only SMILES supervision. Code is available on @github: https://github.com/molden/atomlenz
Atom-Level Optical Chemical Structure Recognition with Limited Supervision

Identifying the chemical structure from a graphical representation, or image, of a molecule is a challenging pattern recognition task that would greatly benefit drug development. Yet, existing methods for chemical structure recognition do not typically generalize well, and show diminished effectiveness when confronted with domains where data is sparse, or costly to generate, such as hand-drawn molecule images. To address this limitation, we propose a new chemical structure recognition tool that delivers state-of-the-art performance and can adapt to new domains with a limited number of data samples and supervision. Unlike previous approaches, our method provides atom-level localization, and can therefore segment the image into the different atoms and bonds. Our model is the first model to perform OCSR with atom-level entity detection with only SMILES supervision. Through rigorous and extensive benchmarking, we demonstrate the preeminence of our chemical structure recognition approach in terms of data efficiency, accuracy, and atom-level entity prediction.

arXiv.org
It was a pleasure to welcome Paula Torren Peraire to Leuven, where she gave a talk about her work on the effect of single-step retrosynthesis models on multi-step planning performance. @AiddOne www.ai-dd.eu