Ian Simpson 🏴󠁧󠁢󠁳󠁣󠁴󠁿

@iansimpson
245 Followers
469 Following
80 Posts

The Ontoverse: Democratising Access to Knowledge Graph-based Data Through a Cartographic Interface

Johannes Zimmermann, Dariusz Wiktorek, Thomas Meusburger, Miquel Monge-Dalmau, Antonio Fabregat, Alexander Jarasch, Günter Schmidt, Jorge S. Reis-Filho, T. Ian Simpson
https://arxiv.org/abs/2408.03339 https://arxiv.org/pdf/2408.03339

arXiv:2408.03339v1 Announce Type: new
Abstract: As the number of scientific publications and preprints is growing exponentially, several attempts have been made to navigate this complex and increasingly detailed landscape. These have almost exclusively taken unsupervised approaches that fail to incorporate domain knowledge and lack the structural organisation required for intuitive interactive human exploration and discovery. Especially in highly interdisciplinary fields, a deep understanding of the connectedness of research works across topics is essential for generating insights. We have developed a unique approach to data navigation that leans on geographical visualisation and uses hierarchically structured domain knowledge to enable end-users to explore knowledge spaces grounded in their desired domains of interest. This can take advantage of existing ontologies, proprietary intelligence schemata, or be directly derived from the underlying data through hierarchical topic modelling. Our approach uses natural language processing techniques to extract named entities from the underlying data and normalise them against relevant domain references and navigational structures. The knowledge is integrated by first calculating similarities between entities based on their shared extracted feature space and then by alignment to the navigational structures. The result is a knowledge graph that allows for full text and semantic graph query and structured topic driven navigation. This allows end-users to identify entities relevant to their needs and access extensive graph analytics. The user interface facilitates graphical interaction with the underlying knowledge graph and mimics a cartographic map to maximise ease of use and widen adoption. We demonstrate an exemplar project using our generalisable and scalable infrastructure for an academic biomedical literature corpus that is grounded against hundreds of different named domain entities.
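The integration step in the abstract above (similarities between entities computed from their shared extracted features) can be illustrated with a minimal sketch. This is not the Ontoverse implementation: the choice of Jaccard similarity, the threshold, the function names, and the toy entity features are all assumptions made for illustration only.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: shared features divided by all features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_edges(features: dict, threshold: float = 0.25) -> list:
    """Return (entity, entity, score) edges for pairs at or above the threshold."""
    edges = []
    for left, right in combinations(sorted(features), 2):
        score = jaccard(features[left], features[right])
        if score >= threshold:
            edges.append((left, right, round(score, 3)))
    return edges


# Hypothetical toy data: entity -> set of features extracted from text.
entity_features = {
    "BRCA1": {"breast cancer", "DNA repair", "tumour suppressor"},
    "TP53": {"DNA repair", "tumour suppressor", "apoptosis"},
    "INS": {"diabetes", "glucose metabolism"},
}

edges = similarity_edges(entity_features)
print(edges)
```

Edges like these could then be loaded into a graph store and aligned to an ontology or topic hierarchy; how the paper does that alignment is not specified in the abstract.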

arXiv.org
OBF » Biopython 1.82 released

Open Bioinformatics Foundation Homepage

Multi-Omic Graph Diagnosis (MOGDx) : A data integration tool to perform classification tasks for heterogeneous diseases https://www.medrxiv.org/content/10.1101/2023.07.09.23292410

Heterogeneity in human diseases presents challenges in diagnosis and treatments due to the broad range of manifestations and symptoms. With the rapid development of labelled multi-omic data, integrative machine learning methods have achieved breakthroughs in treatments by redefining these diseases at a more granular level. These approaches often have limitations in scalability, oversimplification, and handling of missing data. In this study, we introduce Multi-Omic Graph Diagnosis (MOGDx), a flexible command line tool for the integration of multi-omic data to perform classification tasks for heterogeneous diseases. MOGDx is a network integrative method that combines patient similarity networks with a reduced vector representation of genomic data. The reduced vector is derived from the latent embeddings of an auto-encoder and the combined network is fed into a graph convolutional network for classification. MOGDx was evaluated on three datasets from The Cancer Genome Atlas for breast invasive carcinoma, kidney cancer, and low grade glioma. MOGDx demonstrated state-of-the-art performance and an ability to identify relevant multi-omic markers in each task. It did so while integrating more genomic measures with greater patient coverage compared to other network integrative methods. MOGDx is available to download from https://github.com/biomedicalinformaticsgroup/MOGDx. Overall, MOGDx is a promising tool for integrating multi-omic data, classifying heterogeneous diseases, and interpreting genomic markers.

Competing Interest Statement: REM is a scientific advisor to Optima Partners and the Epigenetic Clock Development Foundation.

Funding Statement: This work was supported by the United Kingdom Research and Innovation [grant EP/S02431X/1], UKRI Centre for Doctoral Training in Biomedical AI at the University of Edinburgh, School of Informatics.

medRxiv
Our latest preprint out now ➡️ Omics network integration and GNNs for patient class prediction. Multi-Omic Graph Diagnosis (MOGDx) : A data integration tool to perform classification tasks for heterogeneous diseases - https://www.medrxiv.org/content/10.1101/2023.07.09.23292410v1 #gnn #networks #dataintegration #omics
medRxiv

📢 Lecturer in Engineering Biology, #EdinburghUni. 📆 10 Aug. Please share!

Join my colleagues in the Centre for Engineering Biology, where #interdisciplinary research has been natural for over 15 years. Amazing breadth of collaborators: School of Biological Sciences alone has over 130 research groups, working from atomic to landscape scale. And the #Edinburgh landscapes are great too!

#Biology #research #jobs #SyntheticBiology
โžก๏ธ https://elxw.fa.em3.oraclecloud.com:443/hcmUI/CandidateExperience/en/job/7641/share/300001135910193?utm_medium=jobshare

Lecturer in Engineering Biology

We seek a lecturer in the area of Engineering Biology, broadly defined. The successful applicant will establish a vibrant externally funded research group and will contribute to the teaching mission of the School of Biological Sciences (especially in the area of Engineering Biology/Biotechnology) at the undergraduate and postgraduate level.

University of Edinburgh Jobs
Our faculty of Maths & Computer Science at TU/e @TUEindhoven is recruiting 24 (!) new faculty - open rank. I would _love_ to welcome you as a new colleague! #hiring #tenuretrack #faculty #jobs #academicjobs #newpi. Boosting warmly appreciated! https://www.tue.nl/en/working-at-tue/scientific-staff/ready-for-the-next-step-in-your-academic-career/?utm_source=linkedin&utm_medium=post&utm_campaign=teaser+broad+hire&utm_id=24+positions+M%26CS
Explore your options at TU/e's department of Mathematics and Computer Science

TU/e's department of Mathematics and Computer Science rings in the new year with 24 new tenure track faculty positions. Open soon for all academic levels!

Intro post for this platform: I am an Asst. Prof. of Genome Sciences at the Univ. of Washington 👋

My lab uses bioinfo, mol bio, and advanced microscopy to develop new tech and study spatial patterns of chromosome organization and gene expression at the single-cell level.

This also seems like a good place to mention that @dschweppe and I have an open ad for a postdoc interested in chromatin proteomics 👇

https://www.beliveau.io/chrom-prot-postdoc

chrom-prot-pd | Beliveau Lab

Beliveau Lab
A Multifaceted benchmarking of synthetic electronic health record generation models https://www.nature.com/articles/s41467-022-35295-1 #ehr #datascience #machinelearning #bioinformatics
A Multifaceted benchmarking of synthetic electronic health record generation models - Nature Communications

Synthetic health data have the potential to mitigate privacy concerns when sharing data to support biomedical research and the development of innovative healthcare applications. In this work, the authors introduce a use case oriented benchmarking framework to evaluate data synthesis models through a set of utility and privacy metrics.

Nature

…last but certainly not least, all this is excellent work from the amazing @[email protected].

Well done Emilia!

GitHub - ewysocka/rb_vs_ode_model_of_darpp-32: Repository storing molecular rule-based models in Kappa language and Python-based figures published in PeerJ: https://doi.org/10.7717/peerj.14516

GitHub