#SWAT4HCLS is over.

Today, I am at the #KNCV CTC meeting, where we are launching the #Cheminformatics and #DrugDiscovery working group as part of the CTC Section.

JH shows recent work where they use LLMs together with chemical similarity knowledge to do ChEBI ontology classification, the chebifier: https://chebifier.hastingslab.org/ and https://doi.org/10.1039/D3DD00238A

#swat4hcls #cheminformatics

Two papers for the ICCS Collection in the Journal of Cheminformatics have now been tagged as accepted, with another four under review and at least one more paper still to be submitted. The deadline was extended one last time.

But we are looking forward to a nice collection of work presented at the ICCS 2025!

#cheminformatics #noordwijkerhout #ICCS2025

OPSIN 2.9.0 has been released: https://chembl.blogspot.com/2026/03/opsin-v290-released.html

OPSIN is an IUPAC name parser that returns SMILES.

"The release notes describe a mixture of minor bug fixes and improvements:"

#smiles #chemistry #iupac #cheminformatics #openscience

OPSIN v2.9.0 released

Just a quick note to say that Daniel Lowe has released OPSIN v.2.9.0, the first release since Oct 2023. This is now available via the EMBL-...

Here's an #RDKit #cheminformatics quiz for you all. What do you think this code will output?

from rdkit import Chem
mol = Chem.MolFromSmiles("C" + "C(C)(C)" * 50 + "C")
pat = Chem.MolFromSmarts("[$([CD4H0X4](-*)(-*)(-*)-*)]")
print(len(mol.GetSubstructMatches(pat)))

No cheating by actually running the code! :) Feel free to explain your reasoning in the comments.

Poll results:
0: 0%
38: 0%
42: 50%
50: 50%
Poll ended.

Spent most of the week writing an EState count #cheminformatics fingerprint for #chemfp.

It should have been a few hours to build on RDKit's EState code. Perhaps a bit longer to implement a faster version using the same SMARTS patterns.

I then realized the RDKit implementation and patterns had problems, e.g., not matching both atoms in "CC", and unexpected handling of explicit hydrogens, such as the deuterium atom [2H]. See https://git.sr.ht/~dalke/rdkit/log

The hard part was finding good test cases.
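
One rough way to hunt for such test cases is to compare the total of the EState counts against the number of heavy atoms, on the assumption that each heavy atom should land in exactly one atom type. The sketch below uses RDKit's `rdkit.Chem.EState.Fingerprinter.FingerprintMol`, which returns per-type counts and sums; `estate_count_summary` is a hypothetical helper written for illustration, not part of RDKit or chemfp, and the results may differ between RDKit versions.

```python
from rdkit import Chem
from rdkit.Chem.EState import Fingerprinter


def estate_count_summary(smiles):
    """Return (heavy_atom_count, total_estate_count) for a SMILES string.

    Hypothetical helper for illustration only.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    # FingerprintMol returns two NumPy arrays:
    # per-atom-type match counts and per-atom-type EState index sums.
    counts, _sums = Fingerprinter.FingerprintMol(mol)
    return mol.GetNumHeavyAtoms(), int(counts.sum())


if __name__ == "__main__":
    for smiles in ["CC", "CCO"]:
        heavy, total = estate_count_summary(smiles)
        # A mismatch marks a molecule worth a closer look as a test case.
        note = "" if heavy == total else "  <- totals differ"
        print(f"{smiles}: heavy atoms={heavy}, EState count total={total}{note}")
```

Molecules where the two numbers disagree are candidates for a closer look; the check says nothing about whether the assigned types are the right ones.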

SureChEMBL has the banner "By using the site you are agreeing to our Privacy Policy". It links to https://chembl.gitbook.io/surechembl/privacy-notices , then to https://chembl.gitbook.io/surechembl/privacy-notices/surechembl-website , which in turn links to https://1396459327-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Fty0JOfWwPnEbs5wW271w%2Fuploads%2F8dCXtuh9PSd4NKw4f7Cd%2FPrivacy%20Notice%20SureChEMBL%20website.pdf?alt=media&token=bc640502-e42a-4776-92e7-24ea3e871380

They use Google Analytics to track my visit.

I sent email to the #EMBL data controller to highlight issues with Google and the US, and asked that they stop using Google Analytics, or clarify why they couldn't use one of https://european-alternatives.eu/alternatives-to .

You should email them too. #cheminformatics

Privacy notices | SureChEMBL

Find out what personal data is collected by the specific EMBL-EBI services you are requesting and for what purposes, as well as how your data are processed and kept secure.

pyBacting 0.2.16 (with CDK 2.12) is now available at https://pypi.org/project/pybacting/

#python #openscience #cheminformatics

CDK 2.12

Version 2.12 of the Chemistry Development Kit has been released. It is the last release with contributions by our NWO Open Science grant. This release adds some nice new APIs:

chem-bla-ics