Mastodawn

Dr. Moritz Lehmann Nov 1, 2023

I've always wondered how much VRAM bandwidth games and applications pull - other tools don't tell you that. So I've written a program that can query/measure it and visualize it properly in CMD! 🖖 🧐
So far it works on Windows with Intel (IGCL) and Nvidia (NVML) #GPUs. Here is Intel i7-8700K and Arc A750 running the game Planetside 2. Cyberpunk is ~60GB/s without and ~100GB/s with #raytracing.

Dr. Moritz Lehmann Jan 13, 2024

My hw-smi program now has partial #Linux support! Still a lot of work to do. 🖖🧐
____________ | Windows | Linux |
CPU / RAM | ✅ | ✅ |
Nvidia GPU | ✅ | ✅ |
Intel GPU | ✅ | 🚧 |
AMD GPU | 🚧 | 🚧 |

Dr. Moritz Lehmann Mar 16

Finally getting back to this fun side project - a portable, universally compatible #CPU / #GPU telemetry monitor. Only #Linux support for #Intel remaining.
#AMD's AMDSMI is an adventure - they have 3 API calls to get the PCIe throughput, and 2 to get the max VRAM bandwidth - none of which work 🖖🤯
____________ | Windows | Linux |
CPU / RAM | ✅️ | ✅️ |
Nvidia GPU | ✅️ | ✅️ |
Intel GPU | ✅️ | 🚧 |
AMD GPU | ✅️ | ✅️ |

Dr. Moritz Lehmann 5d ago

Finally Intel #GPU support on Linux too. Watch all the metrics go brrr in multi-GPU #FluidX3D #CFD workload! Will #opensource soon™️

Hardening against the myriads of broken counters in all those bugged APIs was a long shot. 🖖🫠

____________ | Windows | #Linux |
CPU / RAM | ✅ WinAPI | ✅ /proc |
#Nvidia GPU | ✅ NVML | ✅ NVML |
#Intel GPU | ✅ IGCL | ✅ SYSMAN |
#AMD GPU | ✅ ADLX | ✅ AMDSMI |
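For readers curious what the "/proc" entry for CPU / RAM on Linux can look like in practice, here is a minimal Python sketch. It is not hw-smi's code (that is C++); it only illustrates the general technique of taking two samples of /proc/stat and reading /proc/meminfo. All function names are made up for illustration.

```python
def parse_cpu_times(stat_text):
    """Return (idle, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            # Column 4 is idle, column 5 is iowait; count both as "not busy".
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            # Only sum the first 8 columns: guest time is already part of user/nice.
            total = sum(values[:8])
            return idle, total
    raise ValueError("no aggregate 'cpu' line found")


def cpu_load_percent(sample_a, sample_b):
    """CPU load between two (idle, total) samples, in percent."""
    idle_a, total_a = sample_a
    idle_b, total_b = sample_b
    d_total = total_b - total_a
    if d_total <= 0:
        return 0.0  # no time elapsed or counters went backwards
    return 100.0 * (1.0 - (idle_b - idle_a) / d_total)


def parse_meminfo_kib(meminfo_text):
    """Parse /proc/meminfo into a dict of integer values (kB where a unit is given)."""
    info = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[key.strip()] = int(parts[0])
    return info


def ram_used_mib(info):
    """Used RAM in MiB, defined here as MemTotal minus MemAvailable."""
    return (info["MemTotal"] - info["MemAvailable"]) / 1024.0
```

To use it, read /proc/stat twice with a short sleep in between, pass both texts through parse_cpu_times, and hand the two results to cpu_load_percent. Treating "used" as MemTotal minus MemAvailable is one common convention, not necessarily the one hw-smi uses.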

Wulfy—Speaker to the machines 8h ago

Dr. Moritz Lehmann

It's done and #opensourced on #GitHub! 🖖🥳
https://github.com/ProjectPhysX/hw-smi

A minimal, cross-compatible #CPU/#GPU telemetry monitor with accurate data directly from vendor APIs and beautiful ASCII visualization.

How much #VRAM bandwidth does an application or a game pull? Is the traffic over #PCIe a bottleneck? What's the CPU/GPU load, RAM/VRAM occupation, temps, power draw, clocks? hw-smi works with all CPUs and all #Nvidia/#AMD/#Intel GPUs, on both Windows and #Linux.

Have fun!
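As a rough illustration of what "data directly from vendor APIs" can mean on the Nvidia side, here is a hedged Python sketch using the pynvml bindings for NVML. It is not taken from hw-smi, which is written in C++. NVML reports memory activity as a busy percentage rather than GB/s, so the bandwidth estimate below (busy percentage times a user-supplied peak bandwidth) is purely an assumption for illustration; `peak_gb_s` and both function names are hypothetical.

```python
try:
    import pynvml  # assumption: the nvidia-ml-py package and an Nvidia driver are installed
except ImportError:
    pynvml = None


def estimate_vram_bandwidth_gb_s(mem_busy_percent, peak_gb_s):
    """Assumed estimate: memory-busy percentage scaled by a known peak bandwidth."""
    clamped = max(0.0, min(100.0, float(mem_busy_percent)))
    return peak_gb_s * clamped / 100.0


def query_gpu(index=0):
    """Query a handful of NVML metrics for one GPU and return them as a dict."""
    if pynvml is None:
        raise RuntimeError("pynvml is not installed")
    pynvml.nvmlInit()
    try:
        h = pynvml.nvmlDeviceGetHandleByIndex(index)
        util = pynvml.nvmlDeviceGetUtilizationRates(h)
        mem = pynvml.nvmlDeviceGetMemoryInfo(h)
        return {
            "gpu_load_percent": util.gpu,
            "mem_busy_percent": util.memory,
            "vram_used_mib": mem.used / 2**20,
            "vram_total_mib": mem.total / 2**20,
            "temperature_c": pynvml.nvmlDeviceGetTemperature(
                h, pynvml.NVML_TEMPERATURE_GPU),
            "power_w": pynvml.nvmlDeviceGetPowerUsage(h) / 1000.0,  # NVML reports mW
            "core_clock_mhz": pynvml.nvmlDeviceGetClockInfo(h, pynvml.NVML_CLOCK_SM),
            # NVML reports PCIe throughput in KB/s over a short sampling window.
            "pcie_tx_kb_s": pynvml.nvmlDeviceGetPcieThroughput(
                h, pynvml.NVML_PCIE_UTIL_TX_BYTES),
            "pcie_rx_kb_s": pynvml.nvmlDeviceGetPcieThroughput(
                h, pynvml.NVML_PCIE_UTIL_RX_BYTES),
        }
    finally:
        pynvml.nvmlShutdown()
```

On a machine with an Nvidia GPU, calling query_gpu() and feeding its "mem_busy_percent" into estimate_vram_bandwidth_gb_s together with the card's spec-sheet bandwidth gives a ballpark figure only. How hw-smi actually derives its bandwidth numbers is not stated in this thread.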

Alan Sill 1d ago

@ProjectPhysX I was able to get it to build on one of our GPU nodes but had to add a soft-link for the x86_64-linux-gnu in /usr/lib as our path to the Nvidia driver library is just /usr/lib/libnvidia-ml.so

@AlanSill @ProjectPhysX Just had a very quick look into the project and was wondering about the large *.so binaries. Redistributing proprietary drivers has always been a pain. Curious about their origin, I checked VirusTotal, which usually knows commonly distributed software. In this case it doesn't.

Maybe those files should not have made it into git?

Alan Sill 8h ago

@t0my @ProjectPhysX In my case, I'm linking against the previously installed library we installed on our own. Not sure if the one in the GitHub source is needed at all for such cases.

Dr. Moritz Lehmann 8h ago

@AlanSill @t0my I've decided to ship libs with the repo, as they might be missing on some systems.
I have updated the compile script to check more possible driver install paths - unfortunately they differ depending on the distro.
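To illustrate the idea of checking several possible driver install paths, here is a small Python sketch. It is not the repo's compile script; the function name is made up, the first two directories are the ones mentioned in this thread, and the third is an assumption.

```python
import glob
import os

CANDIDATE_DIRS = [
    "/usr/lib/x86_64-linux-gnu",  # reported in this thread for an Ubuntu container
    "/usr/lib",                   # reported in this thread for a GPU node
    "/usr/lib64",                 # assumption: common on some other distros
]


def find_nvml(dirs=CANDIDATE_DIRS):
    """Return the first libnvidia-ml.so* found in the given directories, or None."""
    for d in dirs:
        # Sorting puts the shortest name first, so an unversioned
        # libnvidia-ml.so is preferred over libnvidia-ml.so.1 if both exist.
        matches = sorted(glob.glob(os.path.join(d, "libnvidia-ml.so*")))
        if matches:
            return matches[0]
    return None
```

The directory of the returned path is what you would pass to the linker with -L. Note that -lnvidia-ml looks for the unversioned libnvidia-ml.so name, so a system that only has a versioned file may still need a symlink.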

Methylzero 1d ago

@ProjectPhysX under linux for CPU power you could use the rapl system no? at least for cpus supported by rapl
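For context on the RAPL suggestion: on Linux the powercap sysfs interface exposes a cumulative energy counter in microjoules, and power is the difference between two readings divided by the elapsed time. Here is a minimal Python sketch, assuming package 0 is exposed at the path below and is readable (on many systems energy_uj is readable by root only). The function names are made up for illustration.

```python
import time

# Assumption: CPU package 0 is exposed here by the kernel's powercap driver.
RAPL_DIR = "/sys/class/powercap/intel-rapl:0"


def read_uj(name, base=RAPL_DIR):
    """Read one integer value (microjoules) from a RAPL sysfs file."""
    with open(f"{base}/{name}") as f:
        return int(f.read().strip())


def rapl_power_watts(e0_uj, e1_uj, dt_s, max_range_uj):
    """Average power in watts between two energy readings taken dt_s apart."""
    if dt_s <= 0:
        raise ValueError("dt_s must be positive")
    delta = e1_uj - e0_uj
    if delta < 0:
        # The counter wrapped around; assume it wrapped exactly once.
        delta += max_range_uj
    return delta / 1e6 / dt_s


def sample_package_power(interval_s=0.5, base=RAPL_DIR):
    """Take two readings interval_s apart and return the average package power."""
    max_range = read_uj("max_energy_range_uj", base)
    e0, t0 = read_uj("energy_uj", base), time.monotonic()
    time.sleep(interval_s)
    e1, t1 = read_uj("energy_uj", base), time.monotonic()
    return rapl_power_watts(e0, e1, t1 - t0, max_range)
```

Calling sample_package_power() on a supported machine returns the average package power over the interval. Handling the wraparound matters because the counter overflows after max_energy_range_uj microjoules.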

@ProjectPhysX @aud ooooh

Asta [AMP] 8h ago

@SnoopJ @ProjectPhysX right?! I wanna try this out...

Asta [AMP] 8h ago

@SnoopJ @ProjectPhysX Ah, a quick check on my work container (which is running Ubuntu on Incus, an LXD fork) has it compiling without GPUs as 'no NVIDIA driver detected'; I'm guessing whatever the check for the GPU is doesn't work inside a container...

Asta [AMP] 8h ago

@SnoopJ @ProjectPhysX ... oh, it also doesn't find my AMD card on bare metal on my main machine, either 😅

EDIT: That makes sense, I don't have ROCm installed currently and it's looking for that.

Dr. Moritz Lehmann 8h ago

@aud @SnoopJ it tries to find libnvidia-ml.so, no idea where that is located in a container. But you can try to compile manually for Nvidia with:

g++ src/main.cpp -o bin/hw-smi -std=c++17 -O3 -D NVIDIA_GPU -L./src/NVML/lib -lnvidia-ml

Asta [AMP] 8h ago

@ProjectPhysX @SnoopJ nice! I'll give that a go, here.

Yeah, things using the nvidia-container-toolkit have the libraries sort of added at the last minute...

In my container's case, it's located at /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.570.172.08; I found it by running nvidia-container-cli list, just in case someone else runs into that issue.

Asta [AMP] 8h ago

@SnoopJ @ProjectPhysX Ah, so, the symlink to that exists at

/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1

Not sure if that's an Ubuntu thing or what...

Asta [AMP] 8h ago

@SnoopJ @ProjectPhysX bwahaha! It works!

Sweet, thank you! I'll probably fire it up during some test runs when I get the chance... and will probably install ROCm on my main machine then spy on video games while they're running.

synlogic4242 8h ago

@ProjectPhysX bookmarking that! I'm writing a book on HPC

Delta Sierra 7h ago

@ProjectPhysX very nice! I was able to build it after installing amdsmi (a quick 'pacman -S amdsmi').

I like the minimal UI, although if you're taking suggestions it would be nice to have some labeling options in the --graph view and configurable update intervals.

Dr. Moritz Lehmann 2h ago

@notthatdelta for now you can set the update frequency here: https://github.com/ProjectPhysX/hw-smi/blob/master/src/main.cpp#L13

Delta Sierra 2h ago

@ProjectPhysX Lovely, thank you! I'll play around with it a bit more tonight.