Show HN: I built a tiny LLM to demystify how language models work

I built a ~9M-parameter LLM from scratch to understand how language models actually work: a vanilla decoder-only transformer trained on 60K synthetic conversations, in ~130 lines of PyTorch. It trains in about 5 minutes on a free Colab T4. The fish thinks the meaning of life is food.
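For anyone curious what "vanilla transformer" means at this scale, here's a minimal sketch of single-head causal self-attention, the core op in a decoder-only model. It's in NumPy for self-containment; the repo itself is PyTorch, and these names are illustrative, not the repo's.

```python
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head causal self-attention over a (T, d) sequence.

    Illustrative sketch only -- guppylm's actual implementation is
    PyTorch and may differ in details (multi-head, scaling, etc.).
    """
    T, d = x.shape
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(d)
    # Causal mask: position t may only attend to positions <= t.
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores[mask] = -np.inf
    # Row-wise softmax, numerically stabilized.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

With identity projections, position 0 can only attend to itself, so its output equals its input -- a quick sanity check that the mask works.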

Fork it and swap the personality for your own character.
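To give a feel for what swapping the personality involves, here's a hypothetical sketch of generating synthetic persona conversations from templates. The `PERSONA` dict and `make_dataset` helper are my own illustration, not the repo's actual pipeline.

```python
import random

# Hypothetical persona spec -- not the repo's format.
PERSONA = {
    "name": "Guppy",
    "facts": [
        ("What is the meaning of life?", "Food. Definitely food."),
        ("Where do you live?", "In a small tank with a plastic castle."),
        ("What do you do all day?", "Swim in circles and forget things."),
    ],
}

def make_dataset(n, seed=0):
    """Sample n user/assistant pairs from the persona's fact templates."""
    rng = random.Random(seed)
    return [
        {"user": q, "assistant": a}
        for q, a in (rng.choice(PERSONA["facts"]) for _ in range(n))
    ]
```

In practice you'd want far more variety (paraphrases, small talk, refusals) than a handful of templates, but the shape of the data is just user/assistant pairs like these.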

https://github.com/arman-bd/guppylm

How much training data did you end up needing for the fish personality to feel coherent? Curious what the minimum viable dataset looks like for something like this.