I missed the fact that #hpgmp (High-performance GMRES mixed-precision) is now a separate project (finally). https://github.com/hpg-mxp/hpg-mxp #mixedprecision #hpc

Mixed-precision finite element kernels and assembly: Rounding error analysis and hardware acceleration

#Intel #AVX #MixedPrecision #FEM #Package

https://hgpu.org/?p=29481

In this paper we develop the first fine-grained rounding error analysis of finite element (FE) cell kernels and assembly. The theory includes mixed-precision implementations and accounts for hardware…
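The abstract above can be illustrated with a toy sketch (my own, not the paper's code) of why the accumulator's precision matters in mixed-precision kernels: summing many float32 terms, as happens during FE assembly, loses accuracy when the accumulator is also float32, but much less when the running sum is carried in float64.

```python
import numpy as np

# Hypothetical illustration: 100k float32 terms, as might arise when
# assembling many small element contributions.
rng = np.random.default_rng(0)
terms = rng.random(100_000).astype(np.float32)

# Naive: float32 data accumulated in a float32 variable.
acc32 = np.float32(0.0)
for t in terms:
    acc32 += t

# Mixed precision: float32 data, float64 accumulator.
acc_mixed = np.float64(0.0)
for t in terms:
    acc_mixed += np.float64(t)

# Reference sum computed entirely in float64.
ref = np.sum(terms, dtype=np.float64)

err32 = abs(acc32 - ref) / ref
err_mixed = abs(acc_mixed - ref) / ref
print(f"float32 accumulator, relative error:        {err32:.2e}")
print(f"float64 accumulator (mixed), relative error: {err_mixed:.2e}")
```

The float32 accumulator rounds on every addition once the running sum grows large, so its error grows with the number of terms; the float64 accumulator keeps the inputs' float32 rounding as the only significant error source.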
@freemin7 Working in both, I'd say #1. Gamedev has less computational physics, and since the death of SLI no distributed-computing component either, but rendering itself is a flavour of computing, and the brutal runtime and memory ceilings make it nothing less than #HPC. After all, a modern gaming #GPU has the same oomph as an entire supercomputer from two decades ago. There are plenty of optimization techniques originating in gamedev that made it into HPC, and vice versa. A prime example is #mixedprecision. 🖖🧐

Are you a European scientist working in climate and weather? Then you may want to check out this hackathon we are organizing in Amsterdam. We want to help you improve the performance and energy efficiency of your code using Graphics Processing Units, auto-tuning, and mixed-precision techniques!

#Climate #Weather #HPC #GPU #EnergyEfficiency #AutoTuning #MixedPrecision

Help me by reposting this (if you can)

https://www.esiwace.eu/events/2nd-esiwace3-hackathon

2nd ESiWACE3 Hackathon

ESiWACE3 Hackathon on Optimisation and Tuning of Earth-System Models


Oh hey! #mixedprecision! That’s my thing!

What is it Jensen? OCP? FP8? MXfloat? Death to TF32?

How distributed training works in Pytorch: distributed data-parallel and mixed-precision training - The Triangle Agency

In this tutorial, we will learn how to use nn.parallel.DistributedDataParallel for training our models on multiple GPUs. We will take a minimal example of training an image classifier and see how we can speed up the training. Let's […]

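The two techniques the tutorial names can be sketched together in one minimal single-process script (my own sketch, not the tutorial's code; the tiny linear model and random data are placeholders, and everything falls back to CPU when no GPU is present):

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Single-process "distributed" setup so DDP can initialize; real
# multi-GPU runs would launch one process per GPU (e.g. via torchrun).
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("gloo", rank=0, world_size=1)

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

# DDP wraps the model and synchronizes gradients across processes.
model = DDP(torch.nn.Linear(16, 2).to(device))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# GradScaler rescales the loss so small fp16 gradients don't underflow;
# with enabled=False (CPU fallback) it is a transparent no-op.
scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)

# Placeholder batch standing in for the tutorial's image classifier data.
inputs = torch.randn(8, 16, device=device)
targets = torch.randint(0, 2, (8,), device=device)

for _ in range(3):
    optimizer.zero_grad()
    # autocast runs the forward pass in reduced precision where safe.
    with torch.autocast(device_type=device.type, enabled=use_cuda):
        loss = torch.nn.functional.cross_entropy(model(inputs), targets)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()

dist.destroy_process_group()
print(f"final loss: {loss.item():.4f}")
```

The point of combining the two is that DDP handles the data-parallel gradient averaging while autocast/GradScaler handle the mixed-precision arithmetic; they compose without either needing to know about the other.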
Machine Learning: TensorFlow 2.4 computes with NumPy APIs

Alongside the direct integration with NumPy, the machine-learning framework introduces a new method for asynchronous parallel model training.