Mastodawn

Daniel's talk on the current state of SERGHEI, our #HPC hydrodynamic code that leverages the #kokkos framework:
https://youtu.be/scI8jB2e5UQ

Kokkos Tea time with Daniel Caviedes-Voullieme (FZJ) : SERGHEI: a Kokkos-based framework

HGPU group Oct 19

A Performance Portable Matrix Free Dense MTTKRP in GenTen

#Kokkos #CUDA #OpenMP #Package

https://hgpu.org/?p=30302

We extend the GenTen tensor decomposition package by introducing an accelerated dense matricized tensor times Khatri-Rao product (MTTKRP), the workhorse kernel for canonical polyadic (CP) tensor de…

HGPU group Aug 17

Performant Unified GPU Kernels for Portable Singular Value Computation Across Hardware and Precision

#OpenCL #SYCL #HIP #Kokkos #Julia

https://hgpu.org/?p=30096

This paper presents a portable, GPU-accelerated implementation of a QR-based singular value computation algorithm in Julia. The singular value decomposition (SVD) is a fundamental numerical tool in …

HGPU group Aug 3

Performance Portable Gradient Computations Using Source Transformation

#Kokkos #HIP #CUDA #Performance

https://hgpu.org/?p=30070

Derivative computation is a key component of optimization, sensitivity analysis, uncertainty quantification, and nonlinear solvers. Automatic differentiation (AD) is a powerful technique for evalua…

Wololllooo May 27, 2025

I'm learning #kokkos via the kokkos-tutorials repo on GitHub (https://github.com/kokkos/kokkos-tutorials), and the first module gave me an idea for a meme that best describes what Kokkos is for, so enjoy:
#gpu #hpc #opensource #kokkos #tutorial #code #cuda #sycl

HGPU group Feb 16, 2025

Leveraging LLVM OpenMP GPU Offload Optimizations for Kokkos Applications

#Kokkos #CUDA #HIP #OpenMP #PerformancePortability #Package

https://hgpu.org/?p=29747

OpenMP provides a cross-vendor API for GPU offload that can serve as an implementation layer under performance portability frameworks like the Kokkos C++ library. However, recent work identified so…

ct Feb 13, 2025

https://dl.acm.org/doi/10.1145/3624062.3624230

#risc_v #hpc #supercomputing #kokkos #hpx

Evaluating HPX and Kokkos on RISC-V using an astrophysics application Octo-Tiger | Proceedings of the SC '23 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis

HGPU group Dec 29, 2024

Asynchronous-Many-Task Systems: Challenges and Opportunities – Scaling an AMR Astrophysics Code on Exascale machines using Kokkos and HPX

#CUDA #HIP #Kokkos #Astrophysics #Exascale #Package

https://hgpu.org/?p=29620

Dynamic and adaptive mesh refinement is pivotal in high-resolution, multi-physics, multi-model simulations, necessitating precise physics resolution in localized areas across expansive domains. Tod…

HGPU group Nov 17, 2024

Kokkidio: Fast, expressive, portable code, based on Kokkos and Eigen

#Kokkos #PerformancePortability #Package

https://hgpu.org/?p=29541

Kokkidio is a newly developed C++ template library that combines the performance portability framework Kokkos and its strength in utilising GPUs with the expressive syntax and CPU optimisations of …

Robert Bassett Aug 12, 2024

Working on my first project using the performance portability layer Kokkos. My first impression is mostly good, with the single big exception being their View type. Too many bells and whistles included, in my opinion, for high-performance computing.

Kokkos also supports C-style dynamic allocation, but the docs suggest that Views may map more naturally onto various hardware. Can anyone confirm this? It seems surprising to me that Views, with all their extra overhead, would outperform some malloc equivalent.

Or do folks have any tips for a Kokkos newbie in general?

#hpc #kokkos #computation