PrismML released Bonsai 8B, and the whole model, weights and all, fits in 1.15 GB. For context, the standard FP16 version of a comparable 8B model sits at around 16 GB. By PrismML's own evaluations, Bonsai beats or matches several full-precision 8B models on benchmarks.
https://firethering.com/bonsai-8b-1bit-llm/

#ai #opensource #llm #trending #genai #huggingface

Bonsai 8B: A 1-Bit LLM That Delivers 8B-Class Performance at 1/14th the Size - Firethering

Nobody expected a 1.15 GB model to score competitively against full-precision 8B models. That is not how this usually goes. PrismML released Bonsai 8B last month and the headline number is almost absurd. The whole model, weights and all, fits in 1.15 GB. For context, the standard FP16 version of a comparable 8B model sits at around 16 GB. Bonsai beats or matches several of those full-precision models on benchmarks while being roughly 14 times smaller. It runs on a phone. There is literally an iPhone build. I want to be clear that these numbers come from PrismML's own evaluations, not independent third-party testing. But even with that caveat, this is worth paying attention to.
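
The size figures above can be sanity-checked with some quick arithmetic. This is only a rough sketch: it assumes exactly 8 billion parameters and decimal gigabytes (1 GB = 10^9 bytes), neither of which is confirmed by PrismML, and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the size claims.
# Assumptions (not from PrismML): exactly 8e9 parameters, 1 GB = 1e9 bytes.
PARAMS = 8e9
BYTES_PER_GB = 1e9
BONSAI_GB = 1.15  # size reported in the article


def fp16_size_gb(params: float) -> float:
    """FP16 stores each weight in 2 bytes."""
    return params * 2 / BYTES_PER_GB


def effective_bits_per_weight(size_gb: float, params: float) -> float:
    """Total bits in the file divided by the number of weights."""
    return size_gb * BYTES_PER_GB * 8 / params


fp16_gb = fp16_size_gb(PARAMS)
ratio = fp16_gb / BONSAI_GB
bits = effective_bits_per_weight(BONSAI_GB, PARAMS)

print(f"FP16 size:       {fp16_gb:.1f} GB")
print(f"Size ratio:      {ratio:.1f}x")
print(f"Bits per weight: {bits:.2f}")
```

Under those assumptions the FP16 baseline comes to 16 GB, the ratio lands just under 14x, and the file works out to a little over one bit per weight, which would be consistent with 1-bit weights plus some overhead.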
