Alibaba Cloud claims to slash Nvidia GPU use by 82% with new pooling system

The new Aegaeon system can serve dozens of large language models using a fraction of the GPUs previously required, potentially reshaping how AI workloads are run.

South China Morning Post