Halving the Cost of Running AI Models: Dropbox on Low-bit Inference Optimization

Low-bit inference can cut the cost of running AI models roughly in half. This post covers the quantization techniques and the practical MXFP-format deployments that Dropbox describes.

https://aisparkup.com/posts/9287
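The core idea behind low-bit inference is quantization: storing weights in a narrow integer format plus a scale factor instead of full-precision floats. A minimal sketch of symmetric per-tensor int8 quantization (an illustration of the general technique, not Dropbox's actual pipeline or the MXFP format):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: map floats onto [-127, 127]."""
    scale = np.max(np.abs(w)) / 127.0   # one shared scale for the tensor
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 codes."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 0.9], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# int8 storage is 4x smaller than float32; rounding error is at most scale/2
```

MXFP-style microscaling formats push the same idea further by sharing a scale over small blocks of elements rather than the whole tensor.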

🌗 NVIDIA/cuda-tile: an MLIR-based tiled intermediate representation and compiler infrastructure for CUDA kernel optimization
➤ Deep hardware co-design: how tile-based computation optimization reshapes peak GPU performance
https://github.com/NVIDIA/cuda-tile
NVIDIA has announced the open-source project "CUDA Tile IR," a compiler infrastructure built on MLIR (Multi-Level Intermediate Representation) and designed to improve the execution efficiency of CUDA kernels. The technology centers on optimizing tile-based computation patterns, with deep tuning targeted at NVIDIA's Tensor Core hardware units. By providing high-level abstractions, developers can more easily manage complex memory hierarchies and tiling patterns, unlocking the GPU's full performance. The project ships alongside the latest CUDA Toolkit 13.1, marking a major step by NVIDIA toward simplifying high-performance computing development at the compiler level.
+ "Bringing MLIR into the CUDA ecosystem was an inevitable move for NVIDIA. With official
#GPUComputing #CompilerTech #NVIDIA #MLIR #ParallelComputing #TensorCore
GitHub - NVIDIA/cuda-tile: CUDA Tile IR is an MLIR-based intermediate representation and compiler infrastructure for CUDA kernel optimization, focusing on tile-based computation patterns and optimizations targeting NVIDIA tensor core units.

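The tile-based pattern that Tile IR targets can be illustrated without CUDA at all: a blocked matrix multiply processes fixed-size tiles so each sub-block is loaded once and reused, which is exactly the memory-hierarchy behavior the compiler automates on the GPU. A NumPy sketch of the access pattern (an illustration of blocking, not Tile IR's actual API):

```python
import numpy as np

def tiled_matmul(A, B, tile=4):
    """Blocked matrix multiply: iterate over tile-sized sub-blocks so each
    block of A and B is reused across an inner accumulation, mirroring how
    tiles are staged in shared memory / registers on a GPU."""
    n, k = A.shape
    k2, m = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((n, m), dtype=np.result_type(A, B))
    for i in range(0, n, tile):          # tile rows of C
        for j in range(0, m, tile):      # tile columns of C
            for p in range(0, k, tile):  # accumulate over the K dimension
                C[i:i+tile, j:j+tile] += A[i:i+tile, p:p+tile] @ B[p:p+tile, j:j+tile]
    return C
```

On hardware, each `tile x tile` partial product is what a Tensor Core MMA instruction computes in one shot; the compiler's job is choosing tile sizes and staging the loads.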

Today's exploration: the GPU, the soul of language models. GPUs are massively parallel processors, ideal for the matrix multiplications in ML thanks to thousands of CUDA and Tensor Cores. Compare the CPU (a few powerful cores, sequential processing) with the GPU (many cores, parallel processing). VRAM matters because it holds the weights and activations; running out causes training failures. FLOPS measures raw compute speed, but real throughput also depends on memory bandwidth and Tensor Core utilization. Understanding the GPU is key to training models efficiently!

#AI #ML #GPU #DeepLearning #VRAM #CUDA #TensorCore #FLOPS
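The point that FLOPS alone is not enough is captured by the roofline model: achieved throughput is the minimum of peak compute and what memory bandwidth can feed. A sketch with hypothetical A100-like numbers (312 FP16 Tensor Core TFLOPS, ~2,039 GB/s HBM; the arithmetic-intensity values are illustrative assumptions):

```python
def attainable_tflops(peak_tflops, bandwidth_gbs, flops_per_byte):
    """Roofline model: throughput is capped by compute (peak_tflops) or by
    memory (bandwidth * arithmetic intensity), whichever binds first."""
    memory_bound = bandwidth_gbs * flops_per_byte / 1000.0  # GB/s * FLOP/B -> TFLOP/s
    return min(peak_tflops, memory_bound)

# Large GEMM, high arithmetic intensity (200 FLOP/byte): compute-bound
# attainable_tflops(312, 2039, 200) -> 312.0 TFLOP/s
# Memory-heavy op at 10 FLOP/byte: bandwidth-limited at ~20.4 TFLOP/s,
# a fraction of peak no matter how fast the Tensor Cores are
```

This is why Tensor Core utilization and bandwidth, not the headline FLOPS number, usually decide real training speed.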

#NVIDIA #TensorCore Evolution: From Volta to Blackwell. Amdahl's Law, Strong Scaling, Asynchronous Execution, Blackwell, Hopper, Ampere, Turing, Volta, TMA
They introduce the core features of each major #datacenter #GPU generation, first explaining important first principles of performance engineering, then tracing the evolution of NVIDIA's Tensor Core architectures and programming model and highlighting the motivations behind each change. The end goal is a resource for understanding NVIDIA's GPU architecture.
https://semianalysis.com/2025/06/23/nvidia-tensor-core-evolution-from-volta-to-blackwell/
NVIDIA Tensor Core Evolution: From Volta To Blackwell

In our AI Scaling Laws article from late last year, we discussed how multiple stacks of AI scaling laws have continued to drive the AI industry forward, enabling greater than Moore’s Law grow…

SemiAnalysis

DLSS 3 on RTX 3000? NVIDIA opens the door to frame generation

NVIDIA is opening up the possibility of bringing DLSS 3 frame generation to RTX 3000 cards! Thanks to the Tensor Cores and a new algorithm, the upgrade is feasible.

CeoTech

When technology meets mathematics: The A100 GPU powers the discovery of the latest giant in the prime number world! 🚀💻✨🔍💡🧮💥🎉

The new largest known prime was announced today (October 21, 2024)! It's a Mersenne prime \((2^p - 1)\), a form that is easier to find because it admits a specialized, faster primality test (known as LLT, i.e., the Lucas-Lehmer test). On October 11, an NVIDIA A100 GPU in Dublin, Ireland, reported that M136279841 is probably prime. On October 12, an NVIDIA H100 in San Antonio, Texas, USA, confirmed primality with a Lucas-Lehmer test.

\[\Huge 2^{136,279,841}-1\]

It took nearly 6 years for the Great Internet Mersenne Prime Search (GIMPS) software to find it after the previous largest known prime. It was also the first Mersenne prime found using GPUs.
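The Lucas-Lehmer test mentioned above is short enough to state in full: \(M_p = 2^p - 1\) is prime iff the recurrence \(s_0 = 4\), \(s_{k+1} = s_k^2 - 2\) satisfies \(s_{p-2} \equiv 0 \pmod{M_p}\). A direct sketch (GIMPS uses FFT-based squaring to make each step fast at 136-million-bit scale; this naive version only works for small exponents):

```python
def lucas_lehmer(p):
    """Lucas-Lehmer test: 2^p - 1 is prime iff s_{p-2} == 0 (mod 2^p - 1),
    where s_0 = 4 and s_{k+1} = s_k^2 - 2. Assumes p is an odd prime."""
    if p == 2:
        return True  # M_2 = 3 is prime; the recurrence needs p > 2
    m = (1 << p) - 1          # the Mersenne number 2^p - 1
    s = 4
    for _ in range(p - 2):    # iterate the recurrence p - 2 times
        s = (s * s - 2) % m
    return s == 0

# lucas_lehmer(7) -> True  (127 is prime)
# lucas_lehmer(11) -> False (2047 = 23 * 89)
```

Each iteration is just one big squaring mod \(M_p\), which is why the test maps so well onto GPU FFT throughput.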

#Prime #PrimeNumber #LargestPrime #LargestKnownPrime #MersennePrime #A100 #TensorCore #LLT #LucasLehmerTest #PrimalityTest #GPGPUs #GIMPS

https://www.mersenne.org/primes/?press=M136279841
https://www.livescience.com/physics-mathematics/mathematics/largest-known-prime-number-spanning-41-million-digits-discovered-by-amateur-mathematician-using-free-software
https://www.popularmechanics.com/science/math/a62695223/biggest-prime-number/
https://www.popsci.com/science/largest-prime-number/
https://www.smithsonianmag.com/smart-news/amateur-mathematician-discovers-the-largest-known-prime-number-with-more-than-41-million-digits-180985321/
https://www.newscientist.com/article/2452686-amateur-sleuth-finds-largest-known-prime-number-with-41-million-digits/
https://www.sciencefocus.com/news/new-prime-number
https://sherwood.news/culture/new-largest-prime-number-discovered-by-former-nvidia-software-engineer/
https://www.tomshardware.com/tech-industry/former-nvidia-engineer-discovers-41-million-digit-prime-largest-prime-number-known-to-man-was-uncovered-and-verified-with-the-help-of-gpus
https://www.washingtonpost.com/science/2024/10/23/nvidia-prime-mersenne-gpu-cloud/
https://gizmodo.com/nvidia-computer-finds-largest-known-prime-blows-past-record-by-16-million-digits-2000514948
https://fermatslibrary.com/p/6883b84b
https://en.wikipedia.org/wiki/Lucas%E2%80%93Lehmer_primality_test
https://en.wikipedia.org/wiki/Great_Internet_Mersenne_Prime_Search
https://en.wikipedia.org/wiki/Probable_prime
https://en.wikipedia.org/wiki/Largest_known_prime_number

Mersenne Prime Discovery - 2^136279841-1 is Prime!

GIMPS has discovered a new Mersenne prime number: 2^136279841-1 is prime! Discovered: 2024 Oct 12

@gomoot

👉 NVIDIA's Instant NeRF: create 3D scenes from simple 2D photos. Detailed three-dimensional scenes in seconds. The software is open-source and available on GitHub.

https://gomoot.com/instant-nerf-di-nvidia-crea-scene-3d-da-semplici-foto-2d

#3D #GeForceRTX #gpu #InstantNeRF #InstantNGP #nvidia #NvidiaRTX #opensource #PC #windows #TensorCore @nvidia

NVIDIA's Instant NeRF creates 3D experiences from simple photos

NVIDIA's Instant NeRF technology turns 2D photos into 3D scenes using AI, with applications in tourism, entertainment, and e-commerce.

Gomoot: technology and lifestyle. Discover the latest news in hardware, technology, and more.


💡 NVIDIA's LATTE3D is the fastest generative AI model for 3D content, able to turn text input into detailed 3D objects in seconds

https://gomoot.com/latte3d-la-generazione-3d-istantanea-di-nvidia

#3D #a100 #ChatGPT #gpu #LATTE3D #nvidia #Omniverse #OpenAI #RTXA6000 #TensorCore #usd

LATTE3D: NVIDIA's instant 3D generation

NVIDIA's LATTE3D model makes the 3D creative process faster, more intuitive, and more efficient, changing the future of design and digital creativity.

Gomoot: technology and lifestyle. Discover the latest news in hardware, technology, and more.