@GDCPresoReviews it's all good, this stuff could be explained better anyway.
On mobile GPUs it's always the same lesson: bandwidth is lava, so minimise it at all costs. Moving data in and out of tile memory is what kills your perf in 90% of cases. On top of that, ALU throughput has grown by orders of magnitude over the years whereas memory bandwidth has barely managed linear growth, so in real-world use cases bandwidth ends up being your main limiting factor.
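To put rough numbers on the "in and out of tile" cost, here's a back-of-envelope sketch. Everything in it is an assumption for illustration: the function name is made up, the 1080p RGBA8 target at 60 fps is hypothetical, and it only counts full-framebuffer loads and stores of a single colour attachment (no depth, no compression, no texture reads).

```python
def framebuffer_traffic_gb_per_s(width, height, bytes_per_pixel, fps, loads, stores):
    """Approximate DRAM traffic (GB/s, 1 GB = 1e9 bytes) caused by moving a
    full render target between tile memory and main memory each frame.

    loads  = number of times per frame the target is read back into tiles
    stores = number of times per frame the target is written out from tiles
    """
    bytes_per_transfer = width * height * bytes_per_pixel
    return bytes_per_transfer * (loads + stores) * fps / 1e9


# Hypothetical case A: everything rendered in one pass, target cleared
# in-tile (no load) and stored once at the end.
single_pass = framebuffer_traffic_gb_per_s(1920, 1080, 4, 60, loads=0, stores=1)

# Hypothetical case B: the same frame split into three passes, so the
# target is stored three times and loaded back twice.
split_passes = framebuffer_traffic_gb_per_s(1920, 1080, 4, 60, loads=2, stores=3)

print(f"single pass:  {single_pass:.2f} GB/s")
print(f"split passes: {split_passes:.2f} GB/s")
```

Under these assumptions, splitting the frame costs five times the framebuffer traffic for exactly the same ALU work, which is why avoiding needless loads and stores tends to be the first thing to check.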