Show HN: 1-Bit Bonsai, the First Commercially Viable 1-Bit LLMs

https://prismml.com/

PrismML — Concentrating intelligence

Large models can't fit on smartphones. Datacenters can't sustain them. PrismML is building ultra dense intelligence to solve both.

1 bit per parameter, with an FP16 scale factor for every group of 128 parameters. Fascinating that this works so well.
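For intuition, here's a rough NumPy sketch of how sign-only weights with one FP16 scale per group of 128 parameters could work. This is my own illustration: the function names, and the choice of mean absolute value as the per-group scale, are assumptions, not PrismML's published method.

```python
# Hypothetical sketch of a 1-bit scheme: {-1, +1} signs per weight plus
# one FP16 scale per group of 128 parameters. Not PrismML's actual kernel.
import numpy as np

GROUP = 128

def quantize_1bit(w):
    """Quantize FP32 weights to signs with a per-group FP16 scale."""
    w = w.reshape(-1, GROUP)
    # One scale per 128 params; mean |w| is a common choice (assumption here).
    scales = np.abs(w).mean(axis=1, keepdims=True).astype(np.float16)
    signs = np.where(w >= 0, 1, -1).astype(np.int8)  # 1 bit of info per weight
    return signs, scales

def dequantize_1bit(signs, scales):
    """Reconstruct an FP32 approximation of the original weights."""
    return (signs * scales.astype(np.float32)).reshape(-1)

rng = np.random.default_rng(0)
w = rng.normal(size=1024).astype(np.float32)
signs, scales = quantize_1bit(w)
w_hat = dequantize_1bit(signs, scales)
```

Storage works out to 1 bit per weight plus 16 bits per 128 weights, i.e. about 1.125 bits/param, which is where the "crazy fast" memory-bound inference comes from.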

I tried a few things with it. I got it driving Cursor, which was itself impressive - it handled some tool usage. Via Cursor I had it generate a few web page tests.

On a Monte Carlo simulation of pi, it got the logic correct but failed to build an interface to start the test. Requesting changes mostly worked, but it left behind some stray symbols that caused failures, so a bit of manual editing was required.
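For reference, the core logic the model did get right is small - a minimal version (my own, not the model's output) looks like this: sample points in the unit square and count how many land inside the quarter circle.

```python
# Minimal Monte Carlo estimate of pi: the fraction of random points in the
# unit square that fall inside the quarter circle approaches pi/4.
import random

def estimate_pi(n=100_000, seed=42):
    rng = random.Random(seed)  # seeded for reproducibility
    inside = sum(
        1 for _ in range(n)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4 * inside / n
```

The hard part for the model wasn't this loop but wiring it up to a working UI.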

Tried a Simon Willison pelican-on-a-bicycle as well - very abstract, not recognizable at all as a bird or a bicycle.

Pictures of the results here: https://x.com/pwnies/status/2039122871604441213

There doesn't seem to be a demo link on their webpage, so here's llama.cpp running it on my local desktop if people want to try it out. I'll keep this running for a couple of hours past this post: https://unfarmable-overaffirmatively-euclid.ngrok-free.dev

Jacob Miller (@pwnies) on X: "Played around with PrismML's 1-bit model. https://t.co/mLfSL22gRd It uses 1 bit per parameter, and an FP16 scale factor for each group of 128 params. Cool demo - runs crazy fast. It's able to handle basic tool usage via Cursor, but it's nowhere near usable. I rate it neat / 10"
Here's the Google Colab link, https://colab.research.google.com/drive/1EzyAaQ2nwDv_1X0jaC5... since the ngrok link likely got overwhelmed by the number of people coming along.

Good call. Right now, though, traffic is low (~1 req/min). Given the completion speed I should be able to handle ~100x that, but if the ngrok link doesn't work, definitely use the Google Colab link.
The link didn't work for me personally, but that may be a bandwidth issue on my end, fighting for a connection from the EU.