#AMD threatens to go medieval on Nvidia with #Epyc and #Instinct: What we know so far
AMD teased its next-generation #AI accelerators at #CES2026, with CEO Lisa Su boasting the #MI500-series will deliver a 1,000x uplift in performance over its two-year-old #MI300X #GPU.
If AMD wants to stay competitive with Nvidia, the MI500 series will need to deliver performance on par with, if not better than, Nvidia's Rubin Ultra Kyber racks.
AMD joins the #rackscale race with #MI455X #Helios racks.
https://www.theregister.com/2026/01/07/mi500x_amd_ai/
AMD boasts 1000x higher AI perf by 2027 and pulls the lid off Helios compute tray ahead of 2H 2026 launch

The Register

AMD and OpenAI team up for massive 6 gigawatt GPU partnership

https://web.brid.gy/r/https://nerds.xyz/2025/10/amd-openai/

Battle of the giants: Nvidia #Blackwell B200 takes the lead in FluidX3D CFD performance

#Nvidia #B200 just launched, and I'm one of the first people to benchmark 8x B200 via Shadeform, in a WhiteFiber server with 2x #Intel #Xeon6 6960P 72-core CPUs. 🖖😋

8x Nvidia B200 go head-to-head with 8x #AMD #MI300X in the #FluidX3D #CFD benchmark, winning overall (with FP16S storage) at 219300 MLUPs/s (~17 TB/s combined VRAM bandwidth), but losing with FP32 and FP16C storage. 8x MI300X achieve 204924 MLUPs/s.
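The ~17 TB/s figure follows directly from the LBM arithmetic: effective bandwidth is the MLUPs/s score times the bytes moved per lattice cell update. A minimal sketch, assuming FluidX3D's D3Q19 lattice with FP16S storage (19 density distribution functions read and 19 written at 2 bytes each, plus a flag byte, roughly 77 bytes per update; this per-update byte count is an assumption based on FluidX3D's documented memory model, not stated in the post):

```python
# Convert an LBM benchmark score (MLUPs/s) into effective memory bandwidth.
# Assumption: D3Q19 lattice, FP16S storage -> 19 DDFs read + 19 written at
# 2 bytes each, plus 1 flag byte = 77 bytes per lattice cell update.
BYTES_PER_LUP_FP16S = 19 * 2 + 19 * 2 + 1  # = 77

def effective_bandwidth_tb_s(mlups: float,
                             bytes_per_lup: int = BYTES_PER_LUP_FP16S) -> float:
    """Effective VRAM bandwidth in TB/s for a given MLUPs/s score."""
    return mlups * 1e6 * bytes_per_lup / 1e12

# 8x B200 at 219300 MLUPs/s works out to roughly 17 TB/s combined bandwidth.
print(f"{effective_bandwidth_tb_s(219300):.1f} TB/s")
```

Under these assumptions the 219300 MLUPs/s score maps to about 16.9 TB/s, consistent with the "~17 TB/s" quoted above.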

Introducing Instella: New State-of-the-art Fully Open 3B Language Models — ROCm Blogs

AMD is excited to announce Instella, a family of fully open, state-of-the-art 3-billion-parameter language models (LMs). In this blog, we explain how the Instella models were trained and how to access them.

#AMD Announces "#Instella" Fully #OpenSource 3B Language Models
AMD Instella represents "fully open state-of-the-art 3-billion-parameter language models (LMs)." These models were trained on AMD Instinct #MI300X #GPUs and, according to AMD's published data, deliver performance competitive with the likes of Llama 3.2 3B, Gemma 2 2B, and Qwen 2.5 3B.
https://www.phoronix.com/news/AMD-Intella-Open-Source-LM
AMD Announces "Instella" Fully Open-Source 3B Language Models

Another announcement from AMD today, beyond the open-source Linux driver fun for the Radeon RX 9070 series, is the open-sourcing of Instella, their new fully open 3B-parameter language models.

Hot Aisle's 8x AMD #MI300X server is the fastest computer I've ever tested in #FluidX3D #CFD, achieving a peak #LBM performance of 205 GLUPs/s, and a combined VRAM bandwidth of 23 TB/s. 🖖🤯
The #RTX 5090 looks like a toy in comparison.

MI300X beats even Nvidia's GH200 94GB. This marks a fascinating inflection point in #GPGPU: #CUDA is no longer the performance leader. 🖖😛
You need a cross-vendor language like #OpenCL to leverage that power.

FluidX3D on #GitHub: https://github.com/ProjectPhysX/FluidX3D

GitHub - ProjectPhysX/FluidX3D: The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.

Sizing up #MI300A’s #GPU
It’s well ahead of #Nvidia’s #H100 PCIe in just about every major category of 32- and 64-bit operations. MI300A can achieve 113.2 TFLOPS of #FP32 throughput, with each FMA counting as two floating-point operations. For comparison, H100 PCIe achieved 49.3 TFLOPS in the same test.
#AMD cut down #MI300X’s GPU to create MI300A. 24 #Zen4 cores is a lot of #CPU power, and they occupy one quadrant of the MI300 chip. But MI300’s main attraction is still the GPU.
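For context on where 113.2 TFLOPS lands, a back-of-the-envelope theoretical peak can be sketched from assumed specs: 228 CDNA 3 compute units, 128 FP32 FMA lanes per CU, and a ~2.1 GHz boost clock (these figures are assumptions for illustration, not from the article), again counting each FMA as two operations:

```python
# Back-of-the-envelope theoretical FP32 peak, counting each FMA as 2 FLOPs.
# Assumed specs (illustrative, not from the article): 228 CUs,
# 128 FP32 FMA lanes per CU, 2.1 GHz boost clock.
def peak_fp32_tflops(cus: int, lanes_per_cu: int, clock_ghz: float) -> float:
    return cus * lanes_per_cu * 2 * clock_ghz * 1e9 / 1e12

theoretical = peak_fp32_tflops(228, 128, 2.1)  # ~122.6 TFLOPS
measured = 113.2                               # figure quoted above
print(f"theoretical ~{theoretical:.1f} TFLOPS, measured {measured} "
      f"({measured / theoretical:.0%} of peak)")
```

Under these assumptions the measured 113.2 TFLOPS would sit at roughly 92% of theoretical peak, which is why the result is notable.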
https://chipsandcheese.com/p/sizing-up-mi300as-gpu
Sizing up MI300A’s GPU

AMD’s Instinct MI300A is a giant APU, created by swapping out two GPU chiplets (XCDs) for three CPU chiplets (CCDs).

Chips and Cheese

MI300X vs H100 vs H200 Benchmark Part 1: Training – CUDA Moat Still Alive – SemiAnalysis

📌 Summary: This article takes a deep look at how AMD's MI300X stacks up against Nvidia's H100 and H200 in training performance, user experience, and total cost of ownership. Although the MI300X appears superior on paper, its real-world performance falls short of expectations, mainly because AMD's public software stack ships with numerous bugs that make for a poor out-of-the-box experience. AMD must improve software quality and its testing process, and deliver a better out-of-the-box experience, to compete effectively. The article also offers concrete recommendations to help AMD become a stronger competitor for AI training workloads.

🎯 Key Points:
- Performance comparison: the MI300X generally trails the H100/H200 in matrix-multiplication (GEMM) performance.
- User experience: MI300X public stable releases ship with numerous bugs, hurting the out-of-the-box experience.
- Total cost of ownership (TCO): although the MI300X's TCO is lower, its training performance on public stable releases falls short.
- Recommended improvements: AMD should increase software development resources, improve its internal development process, and strengthen automated testing to raise product quality.
- Software support: AMD should submit MLPerf training results to improve its market competitiveness and transparency.

🔖 Keywords: #MI300X #H100 #H200 #TrainingPerformance #UserExperience
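The GEMM comparison above ultimately reduces to one formula: multiplying an M×K matrix by a K×N matrix costs 2·M·N·K floating-point operations, and achieved throughput is that count divided by wall-clock time. A minimal sketch (the shape and timing below are illustrative placeholders, not SemiAnalysis's measurements):

```python
# Achieved GEMM throughput: 2*M*N*K FLOPs divided by elapsed seconds.
def gemm_tflops(m: int, n: int, k: int, seconds: float) -> float:
    return 2 * m * n * k / seconds / 1e12

# Hypothetical example: a square 8192^3 GEMM finishing in 2 ms
# would correspond to ~550 TFLOPS of achieved throughput.
print(f"{gemm_tflops(8192, 8192, 8192, 2e-3):.0f} TFLOPS")
```

Benchmarks like the one summarized above run this measurement across many shapes and precisions, then compare the achieved numbers against each GPU's datasheet peak.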

Intro SemiAnalysis has been on a five-month long quest to settle the reality of MI300X. In theory, the MI300X should be at a huge advantage over Nvidia’s H100 and H200 in terms of specifications an…

SemiAnalysis
SemiAnalysis after testing MI300X: “When comparing Nvidia’s GPUs to AMD’s #MI300X, we found that the potential on paper advantage of the MI300X was not realized due to a lack within AMD public release software stack and the lack of testing from AMD” semianalysis.com/2024/12/22/m... #AI #GPU

First #AI #Benchmarks Pitting #AMD Against #Nvidia
Results are good in that they show #MI300X is absolutely competitive with the H100 #GPU on one set of AI inference benchmarks, and based on our estimates of GPU and total system costs it can be competitive with Nvidia's H100 and #H200. But the tests were only done on Meta's #Llama2 #LLM with 70 billion parameters.
A lot will depend on how AMD prices the #MI325 later this year and how many units AMD can get its partners to manufacture.
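The cost-competitiveness argument is a simple normalization: divide benchmark throughput by estimated total system cost. A sketch with placeholder figures (the throughputs and prices below are made-up illustrations, not The Next Platform's estimates):

```python
# Cost-normalized inference throughput: tokens/s per dollar of system cost.
def tokens_per_sec_per_dollar(tokens_per_sec: float,
                              system_cost_usd: float) -> float:
    return tokens_per_sec / system_cost_usd

# Placeholder figures for illustration only.
sys_a = tokens_per_sec_per_dollar(20_000, 250_000)  # hypothetical 8-GPU box A
sys_b = tokens_per_sec_per_dollar(22_000, 300_000)  # hypothetical 8-GPU box B
print(f"A: {sys_a:.4f} tok/s/$, B: {sys_b:.4f} tok/s/$")
```

On this metric a system with lower raw throughput can still win if its total cost is proportionally lower, which is the crux of the pricing question raised above.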
https://www.nextplatform.com/2024/09/03/the-first-ai-benchmarks-pitting-amd-against-nvidia/
The First AI Benchmarks Pitting AMD Against Nvidia

Rated horsepower for a compute engine is an interesting intellectual exercise, but it is where the rubber hits the road that really matters. We finally

The Next Platform