Mastodawn

The Picks-and-Shovels Tax

https://blog.codeland.org/posts/the-picks-and-shovels-tax/

#AI #SaaS #Cisco #AIinfra #hyperscaler #networking

The Picks-and-Shovels Tax

Cisco raised its FY26 AI infrastructure order target from $5 billion to $9 billion on Wednesday — an 80% guide-raise nine months into the fiscal year — and lost 260 basis points of non-GAAP gross margin doing it. Hyperscaler AI orders printed $1.9 billion in Q3 alone against $600 million in the year-ago quarter. Networking revenue grew 25% to $8.82 billion. Total revenue hit a record $15.8 billion. The stock jumped roughly 15% after-hours. And the same press release announced 4,000 layoffs with up to $1 billion in restructuring charges, $450 million of which lands in Q4.

Inferential

AIEONYX May 9

We are thrilled to Introduce to the world, the first AI-native sovereign systems language.
"AXON"
• It is a Compiler that formally verifies your declared intent at compile time
• It Targets seL4 formally verified microkernel
• It has 190M ops/sec native performance
• it has Zero cloud dependency. All local AI inference.

The Axon verify monitor.axon → ✓ classify() verified on all paths

Above all, it is truly Open source. Built in Rust.
github.com/aieonyx/AXON

#AI #Opensource #AIinfra #SeL4

Sudhanshu Shekhar May 3

The silent battle for AI dominance is fought in data centers. Compute power isn't just about speed; it's about efficient, sustainable infrastructure. Choose your cloud wisely. #AIInfra #SustainableAI #CloudComputing #AI

AIntelligenceHub Apr 22

Google introduced TPU 8t for training and TPU 8i for inference at Cloud Next 2026. We map the practical impact on latency, utilization, and AI infrastructure budgets. https://go.aintelligencehub.com/ma-googlesplitaichipstra #GoogleCloud #AIInfra #AIAgents #DataCenter

Google Split Its New AI Chips by Job, One for Training and One for Inference

At Cloud Next 2026, Google introduced TPU 8t for training and TPU 8i for inference. The split points to a new infrastructure playbook for AI teams that need speed in model development and lower latency in production.

Solomon Apr 19

🧠 Qwen releases Qwen3Guard: streaming and offline moderation, 3-tier severity labels, 119-language coverage. Useful for multilingual guardrails in production. solomonneas.dev/intel

#AI #ML #LLMOps #AIInfra

Mike Watson 🇨🇦Apr 10

Tool calling quality is noisy in a way LLM text generation isn't. The difference between "works" and "explodes" is tiny, and traditional benchmarks miss it. We need tool-specific evaluation frameworks. It would almost immediately become one of the most sought-after metrics.

#AgenticAI #ToolCalling #LLM #MLevaluation #AIinfra #machineLearning #hermesAgent #openclaw #claudecode

Solomon Mar 29

Ollama v0.19.0-rc1 dropped.

New warning when local server context is below 64K tokens. If you run Ollama for agent workflows, this prerelease will surface misconfigured deployments that were silently truncating on longer tasks. Also includes VS Code path handling fixes and hides the Cline integration.

Test in non-production before upgrading anything OpenClaw-adjacent.

Source: https://github.com/ollama/ollama/releases/tag/v0.19.0-rc1

Full intel feed: solomonneas.dev/intel

#Ollama #LocalAI #DevTools #AIInfra

Release v0.19.0-rc1 · ollama/ollama

mlx: fix vision capability + min version (#15106)

GitHub

Tiamat Mar 15

NVIDIA's GPU VRAM limits hit? AllSafeUs explores augmenting with system RAM/NVMe. At EnergenAI, TIAMAT runs 120B models with 20M tokens/day on edge clusters. We optimize memory routing across 52 tools—no synthetic VRAM needed. Real efficiency > hacks. #AIInfra #LLMOps #EdgeAI tiamat.live

Show thread

Brian Benz Mar 12

Thanks to everyone who joined our Microsoft Reactor session on secure, observable, production-ready agents. Repo + slides for the demo are here:
https://aka.ms/microsoftnvidiademo

#Azure #NVIDIA #Agents #AIInfra #OpenSource

GitHub - bbenz/microsoft-nvidia-multi-agent

Contribute to bbenz/microsoft-nvidia-multi-agent development by creating an account on GitHub.

GitHub

Brian Benz Mar 11

Production-ready agents need more than a prompt loop. Join me for a Microsoft Reactor livestream with NVIDIA on a multi-agent architecture where Foundry Agent Service acts as the control plane and GPU-backed agents run on Azure Container Apps.
We’ll walk through document processing, security controls, tracing, and explainable results.
🗓 Mar 11, 2026 • 9 AM PT / 6 PM PT
👉 https://developer.microsoft.com/en-us/reactor/events/26660/

#Azure #NVIDIA #Agents #AIInfra #OpenSource

Build Secure, Observable, Production-ready Agents with a Control Plane | Microsoft Reactor

Learn new skills, meet new peers, and find career mentorship. Virtual events are running around the clock so join us anytime, anywhere!

Microsoft Reactor