Mastodawn

RT @vllm_project: 🚀 Glad to see @RedHatAI and the @poolsideai team working together to make Laguna XS.2 faster and cheaper to serve in vLLM. A DFlash speculator built with Speculators drafts 8 tokens per forward pass, giving 2-3x faster decoding with no loss in quality. LLM Compressor provides FP8, NVFP4, and INT4 checkpoints so you can make the most of your hardware budget. 🔗 vllm.ai/blog/2026-05-28-lagu… Red Hat AI (@RedHatAI) has trained Laguna XS.2 from @poolsideai, a 33B MoE model for agentic coding. Red Hat AI built a DFlash speculator for it: 0.6B drafter, 8 tokens per pass, no loss in quality. FP8, NVFP4, and INT4 checkpoints via LLM Compressor. Models in the comments. Speedup with @vllmproject (video): https://nitter.net/RedHatAI/status/2060714281717404005#m

more at Arint.info

#AIOptimization #CodingAI #LLMCompressor #MachineLearning #RedHatAI #vLLM #arint_info

https://x.com/vllm_project/status/2060875400121864266#m

Adam

Apr 23

Red Hat and Tesla engineers tackled a real production problem together.

3x output tokens/sec, 2x faster TTFT on Llama 3.1 70B with KServe + llm-d + vLLM. Fixes pushed upstream to KServe along the way.

This is what open source looks like. 🤝 🚀

https://llm-d.ai/blog/production-grade-llm-inference-at-scale-kserve-llm-d-vllm

#RedHat #Tesla #RedHatAI #vLLM #Pytorch #Kubernetes #OpenShift #KServe #llmd #Llama #OpenSource

Production-Grade LLM Inference at Scale with KServe, llm-d, and vLLM | llm-d

How migrating from a simple vLLM deployment to a robust MLOps platform utilizing KServe, llm-d's intelligent routing, and vLLM solved significant scaling and operational challenges in LLM deployment through deep customization and prefix-cache aware routing to maximize GPU utilization.

llm-d

Adam

Apr 16

233% 3-year return on investment and 13 months to payback with Red Hat AI

https://www.redhat.com/en/blog/233-3-year-return-investment-and-13-months-payback-red-hat-ai

#RedHat #AI #RedHatAI #OpenSource #OpenShift #RHEL #Kubernetes #vLLM

Discover the financial benefits and return on investment (ROI) experienced by customers using Red Hat AI. Learn how organizations turned infrastructure challenges into measurable financial gains with a 3-year ROI of 233% and a 13-month payback period.

Adam

Jan 20

Today we announce the General Availability of AI Quickstarts! Get started quickly with your use case and solve real business problems rapidly using Red Hat AI!

https://docs.redhat.com/en/learn/ai-quickstarts

#RedHat #AI #RedHatAI #OpenShift #OpenShiftAI #RHEL #RHELAI #OpenSource #OpenSourceAI

AI quickstarts | Red Hat Documentation

Adam

Nov 13, 2025

Red Hat AI 3 is GA!

https://docs.redhat.com/en/documentation/red_hat_ai/3

#RedHat #RedHatAI #RHAIIS #OpenShift #OpenShiftAI #vLLM #KServe #Kubeflow #llmd #AI #GenAI #AIPlatform #OpenSource #OpenSourceAI

Red Hat AI | 3 | Red Hat Documentation

Adam

Nov 11, 2025

KServe joins CNCF as an incubating project

https://www.redhat.com/en/blog/kserve-joins-cncf-incubating-project

#RedHat #Kubernetes #OpenShift #OpenShiftAI #RedHatAI #CNCF #KServe #Inference #ModelServing

KServe, the leading standardized AI inference platform on Kubernetes, has been accepted as an incubating project by the Cloud Native Computing Foundation (CNCF).

Adam

Oct 14, 2025

3 things to know about Red Hat AI 3

https://www.youtube.com/watch?v=eztORiJWYMs

#RedHat #AI #RedHatAI #llmd #Agentic #MCP #ModelContextProtocol #LlamaStack #OpenSource #OpenShift #OpenShiftAI

Adam

Oct 14, 2025

Red Hat Brings Distributed AI Inference to Production AI Workloads with Red Hat AI 3

https://www.redhat.com/en/about/press-releases/red-hat-brings-distributed-ai-inference-production-ai-workloads-red-hat-ai-3

#RedHat #AI #OpenSource #RedHatAI #vllm #llamastack #mcp

Adam

Sep 11, 2025

Red Hat OpenShift AI achieves ISO 42001 AI certification, reinforcing Red Hat's leadership in responsible AI

https://www.redhat.com/en/blog/red-hat-openshift-ai-achieves-iso-42001-ai-certification

#RedHat #OpenShift #AI #OpenShiftAI #RedHatAI

Red Hat OpenShift AI achieves ISO 42001 AI certification

Learn how Red Hat OpenShift AI's ISO 42001 certification reinforces Red Hat's leadership in responsible AI, providing enhanced customer data protection, industry standard alignment, and platform maturity.