🚀 Ah, the never-ending saga of reinventing the wheel, now featuring "trainable self-attention"—because nothing screams cutting-edge like pretending to be the first person to discover attention #spans. 🤖 Meanwhile, the blog's timeline reads like a desperate plea for #relevance through the ages. 📅
https://www.gilesthomas.com/2025/03/llm-from-scratch-8-trainable-self-attention #reinventingthewheel #trainableselfattention #innovation #AIattention #techblog #HackerNews #ngated
Writing an LLM from scratch, part 8 -- trainable self-attention

Moving on from a toy self-attention mechanism, it's time to find out how to build a real, trainable one. Following Sebastian Raschka's book 'Build a Large Language Model (from Scratch)'. Part 8/??

Giles' Blog
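The preview describes replacing a toy (fixed-weight) attention mechanism with a trainable one. The standard form of that idea is scaled dot-product attention with learnable projection matrices; a minimal NumPy sketch of the forward pass is below. All names and sizes here are illustrative, not taken from the linked post, and in a real model the `W_q`/`W_k`/`W_v` matrices would be framework parameters updated by backpropagation rather than fixed random arrays:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, seq_len = 3, 2, 6

# "Trainable" parameters: in a real model these would be nn.Parameter-style
# weights updated by backprop; here they are just random arrays.
W_q = rng.normal(size=(d_in, d_out))
W_k = rng.normal(size=(d_in, d_out))
W_v = rng.normal(size=(d_in, d_out))

def self_attention(x):
    """Scaled dot-product self-attention with learned projections.

    x: (seq_len, d_in) token embeddings.
    Returns (context, weights): context vectors (seq_len, d_out) and the
    attention map (seq_len, seq_len), whose rows each sum to 1.
    """
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    scores = q @ k.T / np.sqrt(d_out)                 # scaled dot products
    # Row-wise softmax (subtract the max for numerical stability).
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

x = rng.normal(size=(seq_len, d_in))                  # 6 token embeddings
context, weights = self_attention(x)
```

The "trainable" part is simply that the projections are parameters with gradients, unlike a toy version that scores raw embeddings directly; everything else is the same weighted-sum machinery.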