NEW BIML Bibliography entry

https://arxiv.org/abs/2408.04667v2

LLM Stability: A detailed analysis with some surprises

Berk Atil et al.

This is terrible science (which, ironically, makes it a good example of how not to do it). It walks directly into the baseline bunker. "Benchmarking does not work so we introduce...a benchmark."

#MLsec #ML #DONTBOTHER

https://berryvilleiml.com/references/