What is attention actually paying attention to?
A 10-minute Manim walkthrough of Query, Key, Value, softmax, multi-head attention, and why long context gets expensive.
Watch: https://youtu.be/nFyr1tx2C-E
Mirror: https://attention-mechanism-20260430.vercel.app/attention_mechanism_en_public_720p_small.mp4