🚀 #PETSc 3.21 was released today. There were a number of new contributors this release; thank you all.
https://lists.mcs.anl.gov/pipermail/petsc-announce/2024/000115.html
A few highlights:
* VecMDot and (optionally) VecMAXPY can identify strided memory and use gemv when applicable. This is faster than hand-rolled kernels on some GPUs.
* GAMG: new filtering and smoothing options for algebraic multigrid.
* BDDC support for small subdomains (many per process).
* Trust region and quasi-Newton trust region improvements.
@gvwilson I mention numerical/scientific software because that's the area I work in, but also, our software lifecycle is long (many packages are over 30 years old) with a cultural appetite for excusing poor user and developer experience (cf. "Firetran" https://jedbrown.org/files/BrownKnepleySmith-RuntimeExtensibilityAndLibrarizationOfSimulationSoftware-2014.pdf).
Ex: For linear algebra, #PETSc represented a radical shift from the BLAS/LAPACK philosophy at the time. But PETSc has its share of baked-in architecture, as do mature packages throughout the ecosystem.
Some good news!
After a little back-and-forth with Intel, I managed to get #PETSc built properly with the Intel #LLVM compilers AND #MPI.
in the specific case of PETSc, you want to use `./configure --with-debugging=0 --with-cc='mpiicc -cc=icx' --with-cxx='mpiicpc -cxx=icpx' --with-fc='mpiifort -fc=ifx'`
NB however that this doesn't work well (at all?) with nested CMake, which PETSc relies on when using `--download-kokkos=1` to mirror your build flags while building Kokkos from source.
#HPC
The feature I miss the most from Twitter on Mastodon is quote tweets, especially quoting my own posts, for the use case of related but divergent technical threads.
I'm currently working on a #GPGPU- and #AVX512-accelerated build of #PETSc to speed up #OpenFOAM, and all the relevant threads have to stay independent, which makes them hard to cross-reference as new tangents pop up.
I'll be speaking during the OneAPI dev summit tomorrow, specifically the panel discussion on accelerated computing.
Partially as a meme/sanity check, this afternoon's challenge:
Can I go from no installs to a GPU-accelerated simulation of OpenFOAM using an Arc A770 on Linux?
Plan is #OpenFOAM 2212, #PETSc 3.19, #OneAPI 2023.1, and #mesa 23 (in case I need to fall back to #OpenCL).
Now is a good time to register for #PETSc 2023 in Chicago, June 5-7.
https://petsc.org/release/community/meetings/2023/#meeting
We'd love to hear what you're doing with PETSc. Submit an abstract before May 1. https://docs.google.com/forms/d/e/1FAIpQLSesh47RGVb9YD9F1qu4obXSe1X6fn7vVmjewllePBDxBItfOw/viewform
#PETSc 3.19 has been released. Some highlights:
* Sparse matrices have good perf w/out user-provided preallocation.
* Optimal simplex quadrature up to 20th order.
* -dm_plex_shape zbox creates a "born parallel" mesh with Z-order partition.
* "Isoperiodicity": late mapping enables simple traversal of manifolds adjacent to periodic connections.
* Scalable CGNS output: high order elements and flexible batching of time series.
* Perf/GPU improvements.
* Coordinate-based SF graph.
@jannem You got me there, as I’m not that acquainted with these topics. So I didn’t remember the exact suggestions given.
Citing from @mhagdorn’s summary (hash signs by me):
“Essentially use #OpenMP or #SYCL or a suitable library such as #PETSc or #pytorch depending on the application.”
The slides and videos are supposed to get uploaded very soon.