➤ Powering the next generation of high-performance sparse computing with MLIR
✤ https://www.osti.gov/biblio/3013883
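This report presents the LAPIS compiler framework. Built on the Multi-Level Intermediate Representation (MLIR), LAPIS targets performance bottlenecks in sparse linear algebra while keeping code portable across computing architectures. Its main innovation, the Kokkos dialect, simplifies lowering code from a high-productivity language to different architectures, and it can also convert lower-level MLIR code to C++ Kokkos code, which eases the integration of scientific machine learning (SciML) models into applications. For distributed-memory architectures, LAPIS adds a partition dialect that manages how sparse tensors are distributed, expresses the communication patterns of sparse linear algebra operations, and applies algorithmic optimizations to minimize communication. The project reports improved performance on different GPUs for both sparse and dense linear algebra kernels. Key applications include sparse linear algebra and graph kernels, TenSQL (a relational database management solution built on GraphBLAS), and subgraph isomorphism and monomorphism kernels, with the framework aiming to combine productivity, performance, and portability.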
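+ Finally, an MLIR-based framework that can handle distributed sparse tensors. For large-scale graph analytics applications in scientific computing, this ...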
#HighPerformanceComputing (HPC) #CompilerArchitecture #SparseLinearAlgebra #MLIR #DistributedComputing
Enabling Efficient Sparse Computations using Linear Algebra Aware Compilers (Technical Report) | OSTI.GOV
This project developed the LAPIS compiler framework, built on the Multilevel Intermediate Representation (MLIR), to optimize sparse linear algebra operations and support performance portability across diverse architectures. The main innovation of LAPIS is the Kokkos dialect, which allows for lowering codes from a high productivity language to different architectures in an elegant way. The dialect also allows the conversion of lower-level MLIR code to C++ Kokkos code, facilitating the integration of scientific machine learning (SciML) models into applications. To extend LAPIS for distributed memory architectures, a new partition dialect was created to manage the distribution of sparse tensors and express communication patterns for sparse linear algebra operations. This dialect also supports the distributed execution of operators and includes algorithmic optimizations to minimize communication to improve performance. The project also demonstrates that MLIR can enable effective linear algebra-level optimizations, improving performance on different GPUs for both sparse and dense linear algebra kernels. Key applications of LAPIS include sparse linear algebra and graph kernels, TenSQL, a relational database management solution built on GraphBLAS, and the development of subgraph isomorphism and monomorphism kernels, showcasing performance portability. In summary, the LAPIS framework supports productivity, performance, portability, and distributed memory execution, while also enabling linear algebra-level optimizations that are challenging in traditional programming languages, with successful applications ranging from simple sparse linear algebra to complex graph kernels.
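For readers unfamiliar with what "distributing a sparse tensor" and "expressing communication patterns" involve, here is a rough sketch in plain Python. It is not LAPIS code and does not use the Kokkos or partition dialects. It assumes a square matrix stored in CSR form, a contiguous row-block partition, and an input vector partitioned the same way as the rows. The function names (`spmv_rows`, `block_ranges`, `remote_columns`) and the 4x4 example matrix are made up for illustration.

```python
from typing import List, Set, Tuple


def spmv_rows(row_ptr: List[int], col_idx: List[int], values: List[float],
              x: List[float], row_start: int, row_end: int) -> List[float]:
    """Compute y[row_start:row_end] of y = A @ x, with A stored in CSR form."""
    y = []
    for i in range(row_start, row_end):
        acc = 0.0
        # Nonzeros of row i live in positions row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def block_ranges(n: int, parts: int) -> List[Tuple[int, int]]:
    """Split n rows into `parts` contiguous blocks of near-equal size."""
    base, extra = divmod(n, parts)
    ranges = []
    start = 0
    for p in range(parts):
        end = start + base + (1 if p < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def remote_columns(row_ptr: List[int], col_idx: List[int],
                   ranges: List[Tuple[int, int]], p: int) -> Set[int]:
    """Indices of x that partition p reads but does not own.

    Assumption: x is split with the same row ranges as the matrix, so any
    column index outside p's own range must be fetched from another partition.
    """
    start, end = ranges[p]
    needed = {col_idx[k] for k in range(row_ptr[start], row_ptr[end])}
    return {j for j in needed if not (start <= j < end)}


# Hypothetical 4x4 example matrix:
# [[2, 0, 0, 1],
#  [0, 3, 0, 0],
#  [0, 0, 4, 0],
#  [5, 0, 0, 6]]
row_ptr = [0, 2, 3, 4, 6]
col_idx = [0, 3, 1, 2, 0, 3]
values = [2.0, 1.0, 3.0, 4.0, 5.0, 6.0]
x = [1.0, 2.0, 3.0, 4.0]

ranges = block_ranges(4, 2)  # two partitions: rows 0-1 and rows 2-3

# Each partition computes only its own slice of y.
y_parts = [spmv_rows(row_ptr, col_idx, values, x, s, e) for s, e in ranges]
y = [v for part in y_parts for v in part]

# The "communication pattern": which remote x entries each partition needs.
comm = [remote_columns(row_ptr, col_idx, ranges, p) for p in range(len(ranges))]
```

In this toy case each partition needs exactly one entry of `x` owned by the other, which is the kind of information a compiler would need in order to schedule and minimize communication. How LAPIS actually represents partitions and communication is described in the report itself, not here.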

