| SIB website | https://sib.swiss |
| Lab website | https://lab.dessimoz.org |
Really important points by Alex Bateman (@alexbateman1.bsky.social) on the importance of curation, addressing some of the myths out there.
Deeply resonates with the commentary Paul Thomas and I wrote last year https://www.nature.com/articles/s41597-024-03099-1
Fantastic keynote by @marcrr at the joint EvolCompGen-Function session of #ISMBECCB2025 on the power of (curated & FAIR) gene expression data from @bgeedb to study functional retention and innovation across animals, summarising 15+ yrs of work
Papers listed here: https://www.unil.ch/dee/en/home/menuguid/people/group-leaders/prof-marc-robinson-rechavi.html
Delighted to represent @SIB at the AI-Bioscience Collaborative Summit in Washington DC, organized by the US Department of State & others.
Great to see a growing recognition of curated biodata’s critical role in AI breakthroughs, exemplified by resources like PDB and UniProt enabling AlphaFold.
Now, let’s focus on securing sustainable funding for these invaluable resources to continue advancing AI and bioscience innovation.
Expert curation is essential to capture knowledge of enzyme functions from the scientific literature in FAIR open knowledgebases but cannot keep pace with the rate of new discoveries and new publications. In this work we present EnzChemRED, for Enzyme Chemistry Relation Extraction Dataset, a new training and benchmarking dataset to support the development of Natural Language Processing (NLP) methods such as (large) language models that can assist enzyme curation. EnzChemRED consists of 1,210 expert curated PubMed abstracts where enzymes and the chemical reactions they catalyze are annotated using identifiers from the protein knowledgebase UniProtKB and the chemical ontology ChEBI. We show that fine-tuning language models with EnzChemRED significantly boosts their ability to identify proteins and chemicals in text (86.30% F1 score) and to extract the chemical conversions (86.66% F1 score) and the enzymes that catalyze those conversions (83.79% F1 score). We apply our methods to abstracts at PubMed scale to create a draft map of enzyme functions in literature to guide curation efforts in UniProtKB and the reaction knowledgebase Rhea.
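For readers unfamiliar with the metric, here is a minimal sketch of how an F1 score like those quoted above can be computed. It assumes extractions are scored by exact match between a predicted set and a gold (curated) set. That is an assumption for illustration, since the evaluation protocol is not described in this post. The function name `f1_score` and the example pairs are hypothetical.

```python
def f1_score(predicted: set, gold: set) -> float:
    """Harmonic mean of precision and recall over exact-match sets."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        # No correct extraction: precision and recall are both zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical (substrate, product) pairs, for illustration only.
gold = {("substrate_1", "product_1"),
        ("substrate_2", "product_2"),
        ("substrate_3", "product_3")}
predicted = {("substrate_1", "product_1"),
             ("substrate_2", "product_2"),
             ("substrate_9", "product_9")}

score = f1_score(predicted, gold)
print(f"F1 = {score:.2%}")
```

Here two of three predictions are correct and two of three gold pairs are recovered, so precision and recall are both 2/3 and so is F1.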
For evolutionary biologists interested in tracing genomes back to their ancient ancestors, we’ve developed EdgeHOG—a tool that reconstructs ancestral gene orders at an unprecedented scale and speed. #Genomics #EvolutionaryBiology #EdgeHOG
For instance, we can identify very highly conserved adjacencies corresponding to histone clusters (arrow) or fast-evolving sex chromosomes.
Check out first author Charles Bernard’s blog post here: https://lab.dessimoz.org/blog/2024/08/30/edgehog
Preprint: https://www.biorxiv.org/content/10.1101/2024.08.28.610045v1