Mixed-precision numerics in scientific applications: survey and perspectives

#GPU #MixedPrecision #Review

https://hgpu.org/?p=30704

The explosive demand for artificial intelligence (AI) workloads has led to a significant increase in silicon area dedicated to lower-precision computations on recent high-performance computing hard…

🎉🌈 Behold, the NumKong 2000—a mind-boggling parade of mixed precision #kernels, designed to make your head spin faster than a washing machine on hyperdrive! 🤯🌀 With a dazzling array of Float6 to #Float118 across 7 languages, it's the Swiss Army knife of numerics—but only if you have 48 spare minutes and a PhD in deciphering technobabble. 📚🔍
https://ashvardanian.com/posts/numkong/ #NumKong2000 #MixedPrecision #TechInnovation #Numerics #HackerNews #ngated
NumKong: 2'000 Mixed Precision Kernels For All 🦍

Around 2'000 SIMD kernels for mixed-precision BLAS-like numerics — dot products, batched GEMMs, distances, geospatial, ColBERT MaxSim, and mesh alignment — from Float6 to Float118, leveraging RISC-V, Intel AMX, Arm SME, and WebAssembly Relaxed SIMD, in 7 languages and 5 MB.

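To make the "mixed-precision" part concrete: the usual trick behind kernels like these is to keep storage narrow and accumulate wide. Below is a minimal NumPy sketch of that idea (float16 inputs, float32 accumulation); it is a generic illustration, not NumKong's actual API.

import numpy as np

def mixed_precision_dot(a, b):
    # Store the operands in float16, but form the products and the running
    # sum in float32, so the accumulation error stays small.
    a16 = np.asarray(a, dtype=np.float16)
    b16 = np.asarray(b, dtype=np.float16)
    return np.dot(a16.astype(np.float32), b16.astype(np.float32))

rng = np.random.default_rng(0)
x = rng.standard_normal(10_000)
y = rng.standard_normal(10_000)
print(mixed_precision_dot(x, y), np.dot(x, y))  # compare against full float64

Real SIMD kernels do the widening per lane in registers; the NumPy version only mimics the numerics.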
I missed the fact that #hpgmp (High-performance GMRES mixed-precision) is now a separate project (finally). https://github.com/hpg-mxp/hpg-mxp #mixedprecision #hpc
GitHub - hpg-mxp/hpg-mxp


Mixed-precision finite element kernels and assembly: Rounding error analysis and hardware acceleration

#Intel #AVX #MixedPrecision #FEM #Package

https://hgpu.org/?p=29481

In this paper we develop the first fine-grained rounding error analysis of finite element (FE) cell kernels and assembly. The theory includes mixed-precision implementations and accounts for hardwa…

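For readers new to the topic, the mixed-precision pattern the abstract alludes to is typically: evaluate the per-cell kernel in a low precision and accumulate the global system in a higher one. A toy 1D sketch of that split (a simplified illustration, not the paper's code):

import numpy as np

def assemble_1d_stiffness(nodes, low=np.float32, high=np.float64):
    # Toy P1 stiffness assembly: local 2x2 cell matrices computed in `low`,
    # global accumulation carried out in `high`.
    nodes = np.asarray(nodes, dtype=low)
    n = len(nodes)
    K = np.zeros((n, n), dtype=high)
    for e, h in enumerate(np.diff(nodes)):
        k_loc = np.array([[1.0, -1.0], [-1.0, 1.0]], dtype=low) / h
        idx = [e, e + 1]
        K[np.ix_(idx, idx)] += k_loc.astype(high)
    return K

print(assemble_1d_stiffness(np.linspace(0.0, 1.0, 5)))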
@freemin7 working in both, I'd say #1. Gamedev has less computational physics and, since the death of SLI, no distributed-computing component, but rendering itself is a flavour of computing, and the brutal runtime and memory ceilings make it nothing less than #HPC. After all, a modern gaming #GPU has the same oomph as an entire supercomputer from two decades ago. There are plenty of optimization techniques originating in gamedev that made it into HPC and vice versa. A prime example is #mixedprecision. 🖖🧐

Are you a European scientist working in climate and weather? Then you may want to check out this hackathon that we are organizing in Amsterdam. We want to help you improve the performance and energy efficiency of your code using Graphics Processing Units, auto-tuning, and mixed-precision techniques!

#Climate #Weather #HPC #GPU #EnergyEfficiency #AutoTuning #MixedPrecision

Help me by reposting this (if you can)

https://www.esiwace.eu/events/2nd-esiwace3-hackathon

2nd ESiWACE3 Hackathon

ESiWACE3 Hackathon on Optimisation and Tuning of Earth-System Models


Oh hey! #mixedprecision! That’s my thing!

What is it, Jensen? OCP? FP8? MXfloat? Death to TF32?

How distributed training works in PyTorch: distributed data-parallel and mixed-precision training - The Triangle Agency

In this tutorial, we will learn how to use nn.parallel.DistributedDataParallel for training our models on multiple GPUs. We will take a minimal example of training an image classifier and see how we can speed up the training. Let's […]

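For reference, the pattern that tutorial walks through (DistributedDataParallel for multi-GPU data parallelism plus torch.cuda.amp for mixed precision) condenses to roughly the following. This is a generic sketch with a dummy model and random data, not the tutorial's exact code.

import os
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def train(local_rank: int):
    # One process per GPU, launched e.g. with torchrun, which sets the
    # RANK / WORLD_SIZE / MASTER_ADDR environment variables.
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(local_rank)

    model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10)).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
    scaler = torch.cuda.amp.GradScaler()   # rescales the loss so fp16 gradients don't underflow
    loss_fn = nn.CrossEntropyLoss()

    for step in range(100):
        data = torch.randn(64, 512, device=f"cuda:{local_rank}")           # stand-in for a real batch
        target = torch.randint(0, 10, (64,), device=f"cuda:{local_rank}")
        optimizer.zero_grad(set_to_none=True)
        with torch.cuda.amp.autocast():     # forward pass runs in mixed precision
            loss = loss_fn(model(data), target)
        scaler.scale(loss).backward()       # DDP all-reduces gradients during backward
        scaler.step(optimizer)
        scaler.update()

    dist.destroy_process_group()

if __name__ == "__main__":
    train(int(os.environ.get("LOCAL_RANK", 0)))

Launched with something like: torchrun --nproc_per_node=4 train.py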
In addition to the direct integration with NumPy, the machine-learning framework introduces a new method for asynchronous parallel model training.
Machine Learning: TensorFlow 2.4 computes with NumPy APIs
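If you are curious what that NumPy integration looks like in practice, here is a minimal sketch using the tf.experimental.numpy module that shipped with TensorFlow 2.4 (assuming a TF 2.4+ install):

import tensorflow.experimental.numpy as tnp

# NumPy-style calls that produce TensorFlow tensors, so the same code can
# run on GPU and mix freely with regular tf ops.
x = tnp.ones((2, 3))
y = tnp.matmul(x, tnp.transpose(x))
print(y)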