🎉🎈 Congratulations on your Nobel Prize in GPU Hogging! Your painfully intricate
#megakernel might just replace the wheel as humanity's greatest invention. 🚀📉 Do tell us more about how you managed to spend 24 minutes saying "Our kernel is fast" and still miss the point.
https://hazyresearch.stanford.edu/blog/2025-09-28-tp-llama-main #NobelPrize #GPUHogging #Innovation #TechHumor #HackerNews #ngated
We Bought the Whole GPU, So We're Damn Well Going to Use the Whole GPU
🚀 Wow, congrats on discovering that turning GPUs into
#megakernel #fusion reactors makes things slightly faster! Next up, reinventing the wheel for half a horsepower gain. 🛠️ But hey, at least it's only a "few dozen" lines of Python – because who doesn't love a leisurely code marathon? 🐢💨
https://zhihaojia.medium.com/compiling-llms-into-a-megakernel-a-path-to-low-latency-inference-cf7840913c17 #GPU #CodeOptimization #TechInnovation #PythonDevelopment #HackerNews #ngated
Compiling LLMs into a MegaKernel: A Path to Low-Latency Inference
TL;DR: We developed a compiler that automatically transforms LLM inference into a single megakernel — a fused GPU kernel that performs all necessary computation and communication in one launch. This…
Medium
🎩🐑 So, apparently, if you slap a fancy "Megakernel" on Llama-1B, your chatbot will answer before you even ask. 🙄 Their groundbreaking discovery? Faster GPUs make things faster. Who knew? 🤦‍♂️🚀
https://hazyresearch.stanford.edu/blog/2025-05-27-no-bubbles #Megakernel #Llama1B #FasterGPUs #Chatbots #Innovation #HackerNews #ngated
Look Ma, No Bubbles! Designing a Low-Latency Megakernel for Llama-1B
