cuTile C++ has been released! check it out! https://docs.nvidia.com/cuda/cuda-programming-guide/02-basics/writing-tile-kernels.html

#cuda #cuTile #cpp

2.4. Writing Tile Kernels — CUDA Programming Guide

Dani (Antifascista) 1d ago

@mhoemmen

What's wrong with

import cuda.tile;

in C++? Just asking for a friend 😊
Thanks a lot for doing this! 🎉

Mark Hoemmen 1d ago

@DanielaKEngert this is a reasonable request! i will ask about this!

note that you might see declarations of class templates in the header file, but the compiler knows about those things and doesn't necessarily implement them by compiling C++ code

Excerpts from my talk this year (actual measurements):

Mark Hoemmen 1d ago

@DanielaKEngert EXCELLENT (always glad to see experiments!!!)

Mark Hoemmen 1d ago

@DanielaKEngert btw, NVCC doesn't support modules yet, but we're working on it!

Dani (Antifascista) 1d ago

@mhoemmen
To put the shown slides into perspective: the mentioned small application is using (pretty much) latest Boost, Qt 6, Apache Xerces, (unadorned) Asio, NLohmann.Json, and the hardened C++23 standard library *without* changing *any* of the original sources - i.e. everything still compiles in the original #include form, as checked out from the repositories. Hence the direct comparison, which would otherwise be impossible.

In other words: a full transition to modules would possibly be even faster to build.

The hesitation of people towards big modules is the no. 1 performance killer that people tend to bitch about. And then there is that abomination called CMake that is the guarantee for thwarting such comparisons.

Mark Hoemmen 1d ago

@DanielaKEngert i totally agree! this sort of evidence is most helpful : - )