📢 Pre-conference workshop:
Open-source chemometrics for real-world NIR handheld spectroscopy

A hands-on course at CAC 2026 (International Conference on Chemometrics in Analytical Chemistry):
https://congressos.urv.cat/cac-2026/preliminary-courses

📅 29 June – 3 July 2026
📍 Tarragona, Spain (sunny and Mediterranean)

Fee: only 75 EUR, so reserve your seat asap

If you work with NIR or chemometrics, or deploy models in practice, this may be relevant.

#Chemometrics #NIRSpectroscopy #OpenScience #AnalyticalChemistry #soil

🚨 #publication alert!

Just a few days into the new year and we already have a published #paper first-authored by our PhD candidate Sarah. 🎉

It is the first study on the volatile emissions of Synchytrium endobioticum (potato wart disease) on potatoes. We identified seven possible marker compounds using TD-GC-MS and #chemometrics.

S. endobioticum is a highly potent fungus classified as a quarantine pathogen in the EU, since it can affect #potato fields for more than 40 years.

Our research offers new insights into the volatilome and may enable future screening methods for potato diseases.

Read more ➡️ https://link.springer.com/article/10.1007/s41348-025-01216-9

#voc #planthealth #gcms #analyticalchemistry #potatowartdisease

Diagnostic volatile organic compounds for potato wart disease: a GC-MS based chemometric approach - Journal of Plant Diseases and Protection

Volatile organic compounds (VOCs) can serve as sensitive indicators of plant health and pathogen infection. In this study, gas chromatography–mass spectrometry combined with multivariate chemometric analysis was applied to identify VOC patterns specific to potato wart disease caused by the pathogen Synchytrium endobioticum. Healthy and artificially infected potato tubers were analyzed under controlled conditions, and the resulting chromatographic data were processed using a Python-based workflow integrating data merging, preprocessing, principal component analysis, and linear discriminant analysis. The chemometric models successfully distinguished infected from healthy tubers. Seven compounds, 1-methoxy-3-methylbutane, 3-methyl-1-butanol, 2-methyl-1-butanol, 2,3-butanediol, prenyl ethyl ether, styrene, and solavetivone, were identified as indicative for infection. In addition, a mass-specific evaluation demonstrated that discrimination is possible using selected ion fragments alone, providing a basis for simplified on-site applications. This study presents the first characterization of a volatile fingerprint for S. endobioticum infection and establishes a robust, time-efficient workflow for non-invasive detection of quarantine pathogens in potato crops.
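To make the "preprocessing, PCA, LDA" chain in the abstract concrete, here is a minimal NumPy sketch on synthetic data. It is not the paper's code or data: the peak table, the class shift, the number of components, and all variable names are assumptions made for illustration, and the preprocessing is reduced to mean-centring.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for a peak table (NOT the paper's data):
# 20 "healthy" and 20 "infected" samples, 50 features,
# with the infected class shifted in the first 7 features.
n_per_class, n_features = 20, 50
healthy = rng.normal(0.0, 1.0, (n_per_class, n_features))
infected = rng.normal(0.0, 1.0, (n_per_class, n_features))
infected[:, :7] += 4.0
X = np.vstack([healthy, infected])
labels = np.array([0] * n_per_class + [1] * n_per_class)

# Preprocessing (assumed): mean-centre each feature
Xc = X - X.mean(axis=0)

# PCA via SVD: keep the scores of the first few components
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
n_components = 5
scores = U[:, :n_components] * S[:n_components]

# Two-class Fisher LDA on the PCA scores
m0 = scores[labels == 0].mean(axis=0)
m1 = scores[labels == 1].mean(axis=0)
Sw = (np.cov(scores[labels == 0], rowvar=False)
      + np.cov(scores[labels == 1], rowvar=False))   # within-class scatter
w = np.linalg.solve(Sw, m1 - m0)                     # discriminant direction
threshold = w @ (m0 + m1) / 2                        # midpoint between class means
predicted = (scores @ w > threshold).astype(int)

accuracy = (predicted == labels).mean()
print(f"training accuracy: {accuracy:.2f}")
```

The accuracy printed here is computed on the training samples, so it is optimistic; a real workflow would validate on held-out samples.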

SpringerLink

It took a while, but I'm finally back to writing my blog 😎

The first installment for 2026 is an easy introduction to calculating information #entropy for optical spectra (or for any signal, really).

In my blog, I focus on #data analysis (#chemometrics, machine learning) applied to optical and near-infrared #spectroscopy. Smoothing, or denoising, is one of the most common steps when working with spectroscopy data, and information entropy can be used as a criterion to guide the smoothing process.

Better still, the entropy of the derivative of a signal can help with that, because it accounts for the shape of the signal more naturally.
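As a rough illustration of the idea, here is a minimal sketch that computes a histogram-based Shannon entropy for a signal and for its first derivative at several smoothing strengths. The synthetic band, the boxcar smoother, the bin count, and the function names are all assumptions for this example, not the method from the blog post.

```python
import numpy as np

def shannon_entropy(signal, bins=64):
    """Shannon entropy (in bits) of the histogram of signal values."""
    counts, _ = np.histogram(signal, bins=bins)
    p = counts[counts > 0] / counts.sum()      # drop empty bins, normalise
    return float(-np.sum(p * np.log2(p)))

def moving_average(signal, window):
    """Boxcar smoother; a stand-in for whatever filter is being tuned."""
    kernel = np.ones(window) / window
    return np.convolve(signal, kernel, mode="valid")

# Synthetic "spectrum": one Gaussian band plus white noise (assumed data)
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 500)
clean = np.exp(-((x - 0.5) ** 2) / 0.01)
noisy = clean + rng.normal(0, 0.05, x.size)

for window in (1, 5, 15, 31):
    smoothed = moving_average(noisy, window)
    h_signal = shannon_entropy(smoothed)
    h_deriv = shannon_entropy(np.diff(smoothed))   # entropy of the first derivative
    print(f"window={window:2d}  H(signal)={h_signal:.2f}  H(derivative)={h_deriv:.2f}")
```

Watching how the two entropies change as the window grows is one way to look for a smoothing setting that removes noise without flattening the band.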

Read more at https://nirpyresearch.com/information-entropy-spectra/

#MachineLearning #NIR #Physics

Information entropy for spectra • NIRPY Research

An introduction to the calculation of the information entropy (Shannon entropy) for NIR spectra, to be used as a criterion for optimal smoothing.

NIRPY Research

I used chembl-downloader to create some nice charts on how the numbers of compounds, assays, activities, and other entities in ChEMBL have grown over time.

📖 https://cthoyt.com/2025/08/26/chembl-history.html

#chembl #chemistry #chemometrics #chemoinformatics #cheminformatics #rdkit #cdk #proteochemometrics

A historical analysis of ChEMBL

I’ve recently submitted an article to the Journal of Open Source Software (JOSS) describing chembl-downloader, a Python package for automating downloading and using ChEMBL data in a reproducible way. In this post, I use chembl-downloader to show how the number of compounds, assays, activities, and other entities in ChEMBL have changed over time.

Biopragmatics

Our first step towards a semi-automated approach for finding relevant VOC profiles of plant pests and pathogens has recently been published in Scientific Reports. 🥼🦠🔬

It's the first publication of our PhD student Sarah! 🎉🥳
She developed a multivariate evaluation method for GC-MS data and found a few more possible VOC markers for ALB infestation in maple trees than a manual evaluation did.

Read more: https://rdcu.be/exF4r

#science #publication #research #volatiles #PlantProtection #ALB #AsianLonghornedBeetle #QuarantinePests #chemometrics

An #introduction: I am a doctoral candidate at Rutgers with a focus on using #spectroscopy, especially #VibrationalSpectroscopy, #Chemometrics and data tools to understand chemical reaction systems.
At home, I am interested in #homelab, #birdphotography, and #3dprinting. I used to have other interests, but the PhD consumed them. Whenever I post, it'll probably be small things for funsies and work I do in #Python and LaTeX.
Development and validation of a new method by MIR-FTIR and chemometrics for the early diagnosis of leprosy and evaluation of the treatment effect.
Chemometrics and Intelligent Laboratory Systems
Volume 254, 15 November 2024, 105248
https://doi.org/10.1016/j.chemolab.2024.105248
#infrared #chemometrics #leprosy

Develop a new method for diagnosing leprosy and monitoring the pharmacological treatment effect of patients. Plasma samples from patients diagnosed wit…

📚 New post from me | Genetic Algorithm for Wavelength Selection Using NumPy

TL;DR A simplified implementation of a genetic algorithm (GA) for wavelength selection, using only @numpy and @sklearn

📝 Key Points:

🔸 The basics, including population, fitness function, crossovers, and mutations.

🔸 A step-by-step implementation of the GA for wavelength selection, with clear Python code examples.

🔸 An example using NIR spectroscopy data to demonstrate the algorithm's application in selecting optimal wavelengths for predicting soil properties.

🔸 Improved regression results after optimization, compared with the performance before it.
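For a feel of the moving parts (population, fitness, crossover, mutation), here is a minimal sketch of a GA for band selection. It is not the code from the post: the data are synthetic, the fitness is a plain least-squares fit via `np.linalg.lstsq` rather than a scikit-learn regressor, and every name and parameter value is an assumption chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for spectra: 80 samples x 40 "wavelengths",
# where only bands 5, 12 and 30 carry information about y.
n_samples, n_bands = 80, 40
X = rng.normal(size=(n_samples, n_bands))
y = 2.0 * X[:, 5] - 1.5 * X[:, 12] + X[:, 30] + rng.normal(0, 0.1, n_samples)
train, test = np.arange(0, 60), np.arange(60, 80)

def fitness(mask):
    """Hold-out RMSE of a least-squares fit on the selected bands (lower is better)."""
    if not mask.any():
        return np.inf
    A_train = np.column_stack([X[train][:, mask], np.ones(train.size)])
    A_test = np.column_stack([X[test][:, mask], np.ones(test.size)])
    coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
    return float(np.sqrt(np.mean((A_test @ coef - y[test]) ** 2)))

def evolve(pop_size=30, generations=40, mutation_rate=0.02):
    # Population: one boolean mask per individual (True = band is used)
    population = rng.random((pop_size, n_bands)) < 0.5
    population[0] = True                      # seed with the full spectrum
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in population])
        order = np.argsort(scores)
        parents = population[order[: pop_size // 2]]     # keep the best half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.choice(len(parents), size=2, replace=False)]
            cut = rng.integers(1, n_bands)               # single-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n_bands) < mutation_rate   # bit-flip mutation
            children.append(np.logical_xor(child, flip))
        population = np.vstack([parents, np.array(children)])
    scores = np.array([fitness(ind) for ind in population])
    return population[np.argmin(scores)], float(scores.min())

best_mask, best_rmse = evolve()
full_rmse = fitness(np.ones(n_bands, dtype=bool))
print("selected bands:", np.flatnonzero(best_mask))
print(f"RMSE all bands: {full_rmse:.3f}, RMSE selected: {best_rmse:.3f}")
```

Because the same hold-out samples drive the selection and the final score, the reported RMSE is optimistic; a separate validation set or cross-validation inside the fitness function would give a fairer number.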

This post may be valuable for spectroscopists, chemometricians, and data scientists looking to optimise feature selection in #spectroscopy datasets using evolutionary algorithms.

🌐 Full post available here
https://nirpyresearch.com/genetic-algorithm-wavelength-selection-numpy/

#chemometrics #python #numpy #MachineLearning

Genetic algorithm for wavelength selection using Numpy • NIRPY Research

An implementation of a genetic algorithm for wavelength selection using basic Numpy functions.

NIRPY Research

New publication: a novel quantitative method to validate accurate scale-up of chemical reactions in real time.

Discover the full methodology at https://oa.eu/E4xb6X

#datascience #chemometrics #cmc #pharma

☕ Here's a bit of technical content from me - today a deep dive on #baseline correction methods.

📈 Baseline correction is a preprocessing technique to remove background signal and isolate peaks in #spectroscopy data.

📝 In my recent post I discuss two methods:
1. Wavelet transform (WT) - Decomposes the signal into components at different frequencies. The lowest-frequency component represents the baseline and can be removed.
2. Asymmetric least squares (ALS) - Fits a smooth baseline function with asymmetric weights: points lying above the fitted baseline (the peaks) count far less than points lying below it, so the fit hugs the bottom of the signal.
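The ALS idea in item 2 can be sketched in a few lines of NumPy. This is a dense-matrix toy version for short signals, not the implementation from the post: the parameter names `lam` (smoothness) and `p` (asymmetry), their values, and the synthetic spectrum are all assumptions.

```python
import numpy as np

def als_baseline(y, lam=1e4, p=0.01, n_iter=10):
    """Asymmetric least squares baseline (dense NumPy sketch for short signals)."""
    n = y.size
    D = np.diff(np.eye(n), n=2, axis=0)     # second-difference operator, shape (n-2, n)
    penalty = lam * D.T @ D                 # smoothness penalty
    w = np.ones(n)
    for _ in range(n_iter):
        # Weighted, penalised least-squares fit of the baseline z
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        # Points above the baseline (peaks) get the small weight p,
        # points below it get the large weight 1 - p.
        w = np.where(y > z, p, 1.0 - p)
    return z

# Synthetic spectrum (assumed data): sloping background plus two peaks
x = np.linspace(0, 100, 300)
background = 0.02 * x + 1.0
peaks = 3.0 * np.exp(-((x - 30) ** 2) / 4) + 2.0 * np.exp(-((x - 70) ** 2) / 9)
spectrum = background + peaks

baseline = als_baseline(spectrum)
corrected = spectrum - baseline
print(f"max |baseline - background| = {np.max(np.abs(baseline - background)):.3f}")
```

For real spectra with thousands of points, a sparse solver would be the more practical choice; the dense solve here just keeps the example dependency-free.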

TL;DR: WT method is intuitive but can distort peaks. ALS produces better results.

🔎 Both methods are applied to a #Raman spectrum and an X-ray fluorescence (#XRF) spectrum. ALS gives a cleaner baseline correction and is effective at removing broad, slowly varying background while preserving sharper spectral features.

#chemometrics #Python #MachineLearning #wavelets #regression

https://nirpyresearch.com/two-methods-baseline-correction-spectral-data/

Two methods for baseline correction of spectral data • NIRPY Research

Worked examples of two methods for baseline correction of spectra applied to Raman and XRF data.

NIRPY Research