Kohulan Rajan

@Kohulan
Cheminformatician, Photographer, Open-Source enthusiast and Human
Visit my website: kohulanr.com

Our new paper, MARCUS (Molecular Annotation and Recognition for Curating Unravelled Structures), has been published in the journal Digital Discovery.

Paper: https://doi.org/10.1039/D5DD00313J

We built MARCUS, a free, open-source AI platform for extracting the millions of chemical structures trapped in PDFs, in minutes rather than hours.

#Opensource #Opendata #OpenScience #CompChem #Cheminformatics #chemistryAI

Our new paper is now published 🎉: Cheminformatics Microservice V3!

Paper: https://doi.org/10.1186/s13321-025-01094-1

🛠️ Web app + API: RDKit, CDK, Open Babel — NO install, 100% open-source.

Try now: app.naturalproducts.net

#Cheminformatics #OpenScience #Innovation #CompChem #Opensource

Cheminformatics Microservice V3: a web portal for chemical structure manipulation and analysis - Journal of Cheminformatics

The widespread adoption of open-source cheminformatics toolkits remains constrained by technical implementation barriers, including complex installation procedures, dependency management, and integration challenges. Here, we present Cheminformatics Microservice V3, a significant update to the existing platform that provides unified programmatic access to cheminformatics libraries, including RDKit, Chemistry Development Kit (CDK), and Open Babel through a RESTful API framework. This latest version features a newly developed, interactive web-based frontend built with React, providing users with an intuitive graphical interface for manipulating and analysing chemical structures. The frontend supports essential cheminformatics operations, including structure editing, PubChem database integration, batch molecular processing, and standardised InChI/RInChI identifier generation. The microservice V3 addresses critical accessibility barriers in computational chemistry by providing researchers with immediate access to analytical tools, eliminating the need for specialised technical expertise or complex software installations. This approach facilitates reproducible research workflows and broadens the utilisation of cheminformatics methodologies across interdisciplinary research communities. The platform is publicly accessible at https://app.naturalproducts.net, and the complete source code and documentation are available on GitHub.

BioMed Central

New Preprint Alert!

We're excited to share our latest work on #ChemRxiv! MARCUS (Molecular Annotation and Recognition for Curating Unravelled Structures) is a web-based platform for extracting chemical information from scientific papers.

📄 Preprint: https://doi.org/10.26434/chemrxiv-2025-9p1q1

🔗 Try it out: https://marcus.decimer.ai

#Cheminformatics #OpenScience #ChemicalDatabases #AIinScience #ScientificSoftware #ResearchTools

MARCUS: Molecular Annotation and Recognition for Curating Unravelled Structures

The exponential growth of chemical literature necessitates the development of automated tools for extracting and curating molecular information from unstructured scientific publications into open-access chemical databases. Current optical chemical structure recognition (OCSR) and named entity recognition solutions operate in isolation, which limits their scalability for comprehensive literature curation. Here we present MARCUS (Molecular Annotation and Recognition for Curating Unravelled Structures), a tool to aid curators in performing literature curation in the field of natural products. This integrated web-based platform combines automated text annotation, multi-engine OCSR, and direct submission capabilities to the COCONUT database. MARCUS employs a fine-tuned GPT-4 model to extract chemical entities and utilises an ensemble approach integrating DECIMER, MolNexTR, and MolScribe for structure recognition. The platform aims to streamline the data extraction workflow from PDF upload to database submission, significantly reducing curation time. MARCUS bridges the gap between unstructured chemical literature and machine-actionable databases, enabling FAIR data principles and facilitating AI-driven chemical discovery. Through open-source code, accessible models, and comprehensive documentation, the web application enhances accessibility and promotes community-driven development. This approach facilitates unrestricted use and encourages the collaborative advancement of automated chemical literature curation tools. We dedicate MARCUS to Dr Marcus Ennis, the longest-serving curator of the ChEBI database, on the occasion of his 75th birthday.

ChemRxiv

Huh. Was using my new #cheminformatics fingerprint generator code generator on the PubChem fingerprints which can be defined by a single match.

It told me bits 472 and 506 are the same.

I pulled up the primary documentation at https://ftp.ncbi.nlm.nih.gov/pubchem/specifications/pubchem_fingerprints.txt and ... indeed they are, with the same patterns in reverse order!

472 C:N:C-C
506 C-C:N:C

The full list of such pairs is:

472/506, 585/657, 589/626, 620/632, 462/537, 581/642, 594/666, 470/520, 584/677, 595/608, 634/668, 490/556, 660/678
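
A check like this can be sketched in a few lines of Python. This is only an illustration, not the code generator mentioned above: the function names are made up for the example, and it assumes each pattern is an unbranched chain of atoms and bonds (no parentheses, ring closures, or lowercase aromatic atoms), so it simply reverses the token order and compares.

```python
import re

# Tokens: bracket atoms like [#1], element symbols like C or Cl, and bond symbols.
# Assumption: patterns are unbranched chains; anything else is rejected.
TOKEN_RE = re.compile(r"\[[^\]]+\]|[A-Z][a-z]?|[-:=#~]")


def reverse_linear_pattern(pattern: str) -> str:
    """Return the same atom/bond chain written from the other end."""
    tokens = TOKEN_RE.findall(pattern)
    if "".join(tokens) != pattern:
        raise ValueError(f"not a simple linear pattern: {pattern!r}")
    return "".join(reversed(tokens))


def find_reversed_duplicates(bits: dict[int, str]) -> list[tuple[int, int]]:
    """Pair up bits whose patterns are identical or exact reversals of each other."""
    seen: dict[str, int] = {}
    pairs: list[tuple[int, int]] = []
    for bit, pattern in sorted(bits.items()):
        # Use the lexicographically smaller of the two spellings as a shared key.
        key = min(pattern, reverse_linear_pattern(pattern))
        if key in seen:
            pairs.append((seen[key], bit))
        else:
            seen[key] = bit
    return pairs


# The two bit definitions quoted above.
example_bits = {472: "C:N:C-C", 506: "C-C:N:C"}
print(find_reversed_duplicates(example_bits))  # [(472, 506)]
```

Reversing tokens only proves equivalence for straight chains; branched or cyclic patterns would need a real substructure comparison from a cheminformatics toolkit.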

new preprint with #opensource #cheminformatics by @Kohulan et al.: "Cheminformatics Microservice V-3: A Web Portal for Chemical Structure Manipulation and Analysis" https://doi.org/10.26434/chemrxiv-2025-xjkxl

"Here, we present Cheminformatics Microservice V3, a significant update to the existing platform that provides unified programmatic access to cheminformatics libraries, including RDKit, Chemistry Development Kit (CDK), and Open Babel through a RESTful API framework."

Cheminformatics Microservice V-3: A Web Portal for Chemical Structure Manipulation and Analysis

ChemRxiv

🎬 Thinking about which movie to watch tonight? Stop Wasting Time Searching for a Movie! 🍿

It usually takes me 10-20 minutes to find a movie I haven't seen and want to watch on one of the many streaming platforms I pay for (currently too many!). The moment I sit down for dinner, I want to hit play—but instead, I get bombarded with recommendations I don’t care about.

That’s why I made https://www.whichmovieto.watch/

#Usefulwebsites #movies #netflix #opensource

Which Movie To Watch - Find Your Next Movie

Discover your next favorite movie! Get personalized movie recommendations based on your streaming services and preferences. Free and paid movie suggestions available.

Which Movie To Watch

New paper alert from the @steinbeck Lab!

Our COCONUT (COlleCtion of Open Natural prodUcTs) 2.0 paper is now officially online in NAR (Nucleic Acids Research).

Paper: https://academic.oup.com/nar/advance-article/doi/10.1093/nar/gkae1063/7908792

#NaturalProducts #OpenScience #ResearchTools #DrugDiscovery #OpenSource

COCONUT 2.0: a comprehensive overhaul and curation of the collection of open natural products database

Abstract. The COCONUT (COlleCtion of Open Natural prodUcTs) database was launched in 2021 as an aggregation of openly available natural product datasets an

OUP Academic

Our most recent research paper, titled 'Cheminformatics Microservice' and authored by the @steinbeck Lab, has been published in the Journal of Cheminformatics as part of the special issue focused on "Improving Reproducibility and Reusability in the Journal of Cheminformatics."

https://lnkd.in/eHCSdxJY
#opensource #opendata #openscience #cheminformatics

Hello from Cambridge! I will be talking about our work on #DECIMER tomorrow at the 6th Artificial Intelligence in Chemistry Symposium #AIChem23

The national #openscience festival #OSF2023NL is very near now... great way to start the academic year!