@riley someone actually made an LM (with an actual transformer architecture) on a C64. It has 25k parameters and it generates one token per minute.
@starsider What if you added an FPU?