If I wanted to acquire some #gpu #compute to try running either the #GraphCast or #PanguWeather ML weather forecast models, how could I do that? This is a private project, so it can't cost a lot.
I have tried running #GraphCast on a CPU workstation with 64 GB RAM. The reduced-complexity version works, but takes a long time (14 minutes for a 10-day forecast), and the full version runs out of memory. I also tried #PanguWeather, which took 8 minutes to run a 24-hour forecast, though there may be some tweaks available to speed that up. There I was able to run the full version.
@hansbrenna Depending on the architecture/framework of the model (PyTorch, TensorFlow Lite, etc.), you could forgo a GPU and instead buy an AI/ML accelerator. They're small ASIC devices, which are more affordable (both to acquire and to run) than full GPUs. I've been thinking about getting a Google Coral device myself, but I haven't shopped around yet. https://coral.ai/products
@hansbrenna Also, regarding the out-of-memory issue, I've had some success with simply adding a 100 GB swapfile to my Linux system. Make sure it's on your fastest drive! Apart from that, #linux #oom problems can be alleviated by compressing RAM contents with #zram, or, in your case, where you'll want excess memory contents to spill over into a backing swapfile, with #zswap.
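For reference, a minimal sketch of that swapfile setup. The 100 GB size is just what worked for me; the demo below uses a tiny file so it can run unprivileged, and the activation steps (which need root) are left commented out:

```shell
# Create a file to use as swap; scale -l up to 100G on a real system
# and put the file on your fastest drive.
fallocate -l 16M ./demo.swap 2>/dev/null \
  || dd if=/dev/zero of=./demo.swap bs=1M count=16   # fallback if fallocate is unsupported
chmod 600 ./demo.swap          # swap must not be readable by other users
mkswap ./demo.swap             # write the swap signature

# Activation and persistence require root:
# sudo swapon ./demo.swap
# echo '/path/to/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab

# Enable zswap, a compressed cache sitting in front of the swapfile:
# echo Y | sudo tee /sys/module/zswap/parameters/enabled
```

The zswap toggle only helps once real swap is active behind it; without a backing swap device it has nowhere to evict compressed pages to.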