From a colleague in my dept.: a preprint on arXiv
Can Agents Price a Reaction? Evaluating LLMs on Chemical Cost Reasoning
https://arxiv.org/html/2605.07251v1
@SRDas Leaving aside the figure issues, there is a reason CAS numbers, SMILES strings, and (I guess) InChIKeys all exist: to avoid their first problem. Even the IUPAC name would do. That "bromo quinoxaline" name is far too ambiguous; anyone actually placing an order knows that and wouldn't write it that way.
(Were they setting it up to fail? I'm obviously not a fan of LLMs, but it's better to see them fail for real than to skew the test and have their defenders point out that the test was skewed.)
@SRDas Ah, nice! I guess I was worried this was one of those out-of-field papers where people have no idea what the field actually does or uses and want to "revolutionize" things in ways that don't make sense.
(Early on after I started at UCINT, a collaborator [a biologist] came to us with a project they had worked on with an "AI drug discovery" company, and the proposed starting structures included things like isobutanol. It was clear the company had some idea about how "AI" could revolutionize drug discovery but hadn't actually hired any comp chemistry folks who had done it before. We had to very gently break it to the biology lab that the AI data set was trash and that someone handing you 26,000 potential hits was not a good thing.)