Dynamic Register Allocation on AMD's RDNA 4 GPU Architecture

Modern GPUs often face a difficult tradeoff between occupancy (active thread count) and the number of registers available to each thread.

Chips and Cheese