Tangram: Hiding GPU Heterogeneity for Efficient LLM Parallelization

The scale of LLM training jobs requires parallelization planning over large GPU clusters. Due to different GPU types and interconnects added over time, these GPU clusters are increasingly heterogen…

hgpu.org