If I wanted to acquire some #gpu #compute to try running either the #GraphCast or #PanguWeather ML weather forecast models, how could I do that? This is a private project, so it can't cost a lot.
I have tried running #GraphCast on a CPU workstation with 64 GB RAM. The reduced-complexity version works, but takes a long time (14 minutes for a 10-day forecast), and the full version runs out of memory. I also tried #PanguWeather, where I was able to run the full version; it took 8 minutes for a 24-hour forecast, though there may be some tweaks available.
@hansbrenna
Depending on the architecture/framework of the model (torch, tensorflow lite, etc), you could forego a GPU and instead buy an AI/ML accelerator.
They're small ASIC devices, which are more affordable (both to acquire and run) than full GPUs.
I've been thinking about getting a Google Coral device myself, but I haven't shopped around yet.
https://coral.ai/products
@hansbrenna
Also, regarding the out-of-memory issue, I've had some success with simply adding a 100 GB swapfile to my Linux system. Make sure it's on your fastest drive!
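In case it's useful, a minimal sketch of setting that up (the path /mnt/fast-ssd/swapfile is just an example; put it wherever your fastest drive is mounted, and everything here needs root):

```shell
# Create a 100 GB swapfile on the fast drive (path is an example)
sudo fallocate -l 100G /mnt/fast-ssd/swapfile

# Swapfiles must not be world-readable
sudo chmod 600 /mnt/fast-ssd/swapfile

# Format it as swap and activate it
sudo mkswap /mnt/fast-ssd/swapfile
sudo swapon /mnt/fast-ssd/swapfile

# Verify it shows up
swapon --show

# To make it survive reboots, add a line like this to /etc/fstab:
# /mnt/fast-ssd/swapfile none swap sw 0 0
```

Note that on some filesystems (notably btrfs) fallocate'd swapfiles need extra care, so check your distro's docs first.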
Apart from that, #linux #oom problems can be alleviated by compressing RAM contents with #zram, or, in your case where you'll want excess memory to spill over into a backing swapfile, with #zswap.
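Enabling zswap is just a couple of sysfs writes on most distro kernels (built with zswap support); a rough sketch, with tuning values that are examples rather than recommendations:

```shell
# Turn zswap on for the running kernel
echo 1 | sudo tee /sys/module/zswap/parameters/enabled

# Optional tuning: compressor and how much of RAM the
# compressed pool may use before pages spill to the swapfile
echo lz4 | sudo tee /sys/module/zswap/parameters/compressor
echo 20  | sudo tee /sys/module/zswap/parameters/max_pool_percent

# Check the current settings
grep -r . /sys/module/zswap/parameters/
```

To have it enabled at every boot, add zswap.enabled=1 to the kernel command line instead.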