Kevin Hu 🤖

@oldgun
87 Followers
443 Following
436 Posts
Am I a human dreaming to be a robot? Or am I a robot dreaming to be a human?
#Programming #ComputerSystems #Golang #DataScience #AI #ML #Music #PianoNewbie
Blog: https://blog.kevinhu.me

The second part of Grant Sanderson's video interview with myself on the cosmic distance ladder is now out: https://www.youtube.com/watch?v=hFMaT9oRbs4

I wrote a blog post with additional commentary and corrections on both videos at https://terrytao.wordpress.com/2025/02/13/cosmic-distance-ladder-video-with-grant-sanderson-3blue1brown-commentary-and-corrections/

Terence Tao continuing history’s cleverest cosmological measurements

The future is cyberpunk and not in a good way.
If your response to a natural disaster focuses only on politics and shows little empathy, you are a truly horrible human being.
Here we go again.

Last sunset in 2024.

Happy New Year!

Notes on 2024: A California Adventure – Kevin Hu's Blog

Merry Christmas and Happy New Year y’all! 🎄⛄️

#WeekendReads https://hao-ai-lab.github.io/blogs/distserve/

Or rather, it would be better titled An Introduction to LLM Dynamic Batching Inference

Throughput is Not All You Need: Maximizing Goodput in LLM Serving using Prefill-Decode Disaggregation

TL;DR: LLM apps today have diverse latency requirements. For example, a chatbot may require a fast initial response (e.g., under 0.2 seconds) but moderate speed in decoding which only needs to match human reading speed, whereas code completion requires a fast end-to-end generation time for real-time code suggestions. In this blog post, we show existing serving systems that optimize throughput are not optimal under latency criteria. We advocate using goodput, the number of completed requests per second adhering to the Service Level Objectives (SLOs), as an improved measure of LLM serving performance to account for both cost and user satisfaction.
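The goodput idea in the TL;DR can be sketched in a few lines. This is a minimal illustration, not the DistServe implementation: the `Request` fields, the 0.05 s per-token SLO, and the one-second window are my assumptions; only the 0.2 s time-to-first-token target comes from the post's chatbot example.

```python
from dataclasses import dataclass

@dataclass
class Request:
    ttft: float  # time to first token, in seconds
    tpot: float  # time per output token, in seconds

def throughput(reqs, window_s):
    """All completed requests per second, ignoring latency targets."""
    return len(reqs) / window_s

def goodput(reqs, window_s, ttft_slo=0.2, tpot_slo=0.05):
    """Only requests meeting both SLOs count (hypothetical thresholds)."""
    ok = [r for r in reqs if r.ttft <= ttft_slo and r.tpot <= tpot_slo]
    return len(ok) / window_s

reqs = [Request(0.15, 0.04), Request(0.30, 0.04), Request(0.18, 0.06)]
print(throughput(reqs, 1.0))  # 3.0 requests/s completed
print(goodput(reqs, 1.0))     # 1.0 request/s actually meets both SLOs
```

The gap between the two numbers is the post's point: a system can look great on throughput while most of its responses miss the latency targets users care about.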

I was today years old when I learned that sharks don’t have bones.
The world should brace for a Trump win. 😬