As LLMs move into production, #Observability is essential for reliability, performance, and responsible AI.

Learn how to deploy an #opensource observability stack on Kubernetes - using Prometheus, Grafana, Tempo, and OpenTelemetry Collectors - and monitor real #AI workloads with #vLLM & #LlamaStack.

🎥 Watch the #InfoQ video (#transcript included): https://bit.ly/4hlKoDa

#Prometheus #Grafana #OpenTelemetry #Kubernetes
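As a concrete taste of what the stack above scrapes: Prometheus collects metrics in a plain-text exposition format served from an HTTP endpoint such as /metrics. The sketch below renders a few gauges in that format using only the standard library; the "vllm:"-prefixed metric names and their values are illustrative assumptions modeled on vLLM's metric naming style, not a definitive list.

```python
# Sketch: render LLM-serving gauges in the Prometheus text exposition format,
# the plain-text format a Prometheus server scrapes from a /metrics endpoint.
# Metric names/values below are illustrative assumptions in vLLM's naming style.
def render_metrics(gauges: dict) -> str:
    lines = []
    for name, (help_text, value) in gauges.items():
        lines.append(f"# HELP {name} {help_text}")  # human-readable description
        lines.append(f"# TYPE {name} gauge")        # metric type declaration
        lines.append(f"{name} {value}")             # the sample itself
    return "\n".join(lines) + "\n"

page = render_metrics({
    "vllm:num_requests_running": ("Requests currently being processed.", 3),
    "vllm:gpu_cache_usage_perc": ("KV-cache utilization (0-1).", 0.42),
})
print(page)
```

In a real deployment you would not hand-roll this format; an instrumentation library (or vLLM's built-in metrics endpoint) produces it, and the OpenTelemetry Collector or Prometheus scrapes it on a schedule.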

3 things to know about Red Hat AI 3

YouTube
Red Hat Brings Distributed AI Inference to Production AI Workloads with Red Hat AI 3

Introducing vLLM Inference Provider in Llama Stack

We are excited to announce that the vLLM inference provider is now available in Llama Stack, through a collaboration between the Red Hat AI Engineering team and the Llama Stack team at Meta. This article introduces the integration and provides a tutorial to help you get started using it locally or deploying it in a Kubernetes cluster.
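Because vLLM serves an OpenAI-compatible HTTP API, a locally running `vllm serve` instance can be reached at /v1/chat/completions. The stdlib-only sketch below builds such a request; the base URL and model name are placeholder assumptions for a local deployment, not values mandated by Llama Stack.

```python
import json
import urllib.request

# Sketch: build a chat-completion request against vLLM's OpenAI-compatible
# endpoint. The base URL and model name are placeholder assumptions for a
# local `vllm serve` instance.
def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("http://localhost:8000",
                         "meta-llama/Llama-3.1-8B-Instruct",
                         "Hello!")
# Sending it (requires a running server): urllib.request.urlopen(req)
```

When the same model is registered as an inference provider in Llama Stack, the stack routes its inference API calls to this vLLM endpoint instead of your client talking to it directly.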

vLLM Blog

🦙 #LlamaStack: Standardizing #GenerativeAI Development

Defines open API specs for #AI application building blocks

Covers full lifecycle: model training, evaluation, production deployment

Includes APIs for inference, safety, memory, agents, and more

Supports multiple environments: local, hosted, and on-device

🛠️ Features:

#OpenSource API providers and distributions

Mix-and-match capabilities (e.g., local small models, cloud-based large models)

Consistent APIs across platforms (server, mobile, etc.)

🤝 Supported implementations:

API Providers: #Meta Reference, #Fireworks, #AWS Bedrock, #Together, #Ollama, TGI, #Chroma, PG Vector, #PyTorch ExecuTorch

Distributions: Meta Reference, Dell-TGI

📦 Easy installation via pip or from source

🖥️ Includes the 'llama' CLI for managing distributions, models, and more

Learn more: https://github.com/meta-llama/llama-stack
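The "mix-and-match" idea above can be sketched as a simple routing decision: send short prompts to a small local model and longer ones to a larger hosted model. The provider/model names and the length threshold below are illustrative assumptions, not part of the Llama Stack API.

```python
# Sketch of mix-and-match routing: small local model for short prompts,
# larger cloud-hosted model otherwise. Names and threshold are
# illustrative assumptions, not Llama Stack API values.
def pick_provider(prompt: str, threshold: int = 200) -> str:
    if len(prompt) <= threshold:
        return "local/llama-3.2-1b"    # small on-device/local model
    return "hosted/llama-3.1-70b"      # larger cloud-hosted model

print(pick_provider("short question"))
```

Because Llama Stack exposes the same inference API across providers, swapping the target model is a configuration change rather than a code rewrite.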
