Red Hat Launches the llm-d Community, Powering Distributed Gen AI Inference at Scale

Red Hat has launched llm-d, a new open source project, developed in collaboration with industry leaders, that is designed to power distributed generative AI inference at scale. The project combines a Kubernetes-based architecture, vLLM-based distributed inference, and intelligent, AI-aware network routing to enable robust, scalable large language model (LLM) inference clouds.