Mastodawn

kkarhan 3d ago

Luke Wren

> "We have a new JIT backend!"
> Look inside
> Ad-hoc python build system which invokes the system CUDA compiler

Which industry am I talking about 🤔

Andrew Zonenberg 3d ago

@wren6991 slop, i assume.

Although... I have been seriously thinking about one day making a JIT shader compiler in libscopehal that if you e.g. chain two simple math functions end to end, and nothing is using the intermediate results, will be able to concatenate the filter kernels to avoid allocating buffers and storing/loading from memory needlessly.

This would probably be done by having prebuilt sub-kernels that read/write from predefined local variable names or something, then doing simple string concatenation on the blocks, then invoking glslc at run time to generate SPIR-V from the compiled kernel, then feeding it to Vulkan.
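A rough sketch of that concatenation idea, in Python for brevity rather than the C++ libscopehal itself uses. Everything named here is hypothetical: the `SUB_KERNELS` table, the shared local `v`, the buffer layout, and the `fuse` / `compile_to_spirv` helpers are assumptions for illustration, not existing libscopehal code. The only external tool assumed is `glslc` on the PATH, invoked as `glslc <file>.comp -o <file>.spv`.

```python
import subprocess
import tempfile
from pathlib import Path

# Hypothetical prebuilt sub-kernels. Each one reads and writes the same
# predefined local variable `v`, so blocks can be chained by concatenation.
SUB_KERNELS = {
    "scale": "v = v * 2.0;",
    "offset": "v = v + 1.0;",
    "abs": "v = abs(v);",
}

# Wrapper shader: load once, run the fused body, store once.
# No intermediate buffer exists between the chained stages.
TEMPLATE = """#version 450
layout(local_size_x = 64) in;
layout(std430, binding = 0) readonly buffer InBuf { float din[]; };
layout(std430, binding = 1) writeonly buffer OutBuf { float dout[]; };
layout(push_constant) uniform Args { uint count; } args;

void main()
{
    uint i = gl_GlobalInvocationID.x;
    if (i >= args.count)
        return;
    float v = din[i];
%BODY%
    dout[i] = v;
}
"""


def fuse(stages):
    """Concatenate the named sub-kernels into one GLSL compute shader."""
    body = "\n".join("    " + SUB_KERNELS[name] for name in stages)
    return TEMPLATE.replace("%BODY%", body)


def compile_to_spirv(source):
    """Run glslc at run time and return the SPIR-V binary as bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "fused.comp"  # .comp lets glslc infer the stage
        out = Path(tmp) / "fused.spv"
        src.write_text(source)
        subprocess.run(["glslc", str(src), "-o", str(out)], check=True)
        return out.read_bytes()


if __name__ == "__main__":
    print(fuse(["scale", "offset"]))
```

The bytes returned by `compile_to_spirv` would then go into a Vulkan shader module. A real version would presumably also cache the result per filter chain so glslc only runs once for each distinct combination.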

Luke Wren 3d ago

@azonenberg Yes, the sloposystem. You might actually find llama.cpp's (many) GGML backends interesting to look at, because the approaches are quite varied but they all run the same compute graphs.

The CUDA backend does static C++ template specialisation of fused versions of operators at build time (AoT), then at runtime it walks the compute graph and looks for opportunities to replace combinations of basic operators with a single fused one.
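A toy version of that runtime pass, in Python, just to show the shape of the idea. It is not llama.cpp's actual code: the operator names and the `FUSED` table are made up, and the graph is simplified to a linear chain in which each intermediate result has exactly one consumer. The fused operators are assumed to have been built ahead of time.

```python
# Hypothetical table: pairs of basic operators that have a prebuilt fused
# version. In the AoT scheme these would be compiled at build time.
FUSED = {
    ("mul", "add"): "mul_add",
    ("norm", "mul"): "norm_mul",
}


def fuse_ops(ops):
    """Walk a linear op chain and replace known pairs with one fused op."""
    out = []
    i = 0
    while i < len(ops):
        pair = tuple(ops[i:i + 2])
        if pair in FUSED:
            out.append(FUSED[pair])  # one kernel launch instead of two
            i += 2
        else:
            out.append(ops[i])  # no fused variant, keep the basic op
            i += 1
    return out


if __name__ == "__main__":
    print(fuse_ops(["norm", "mul", "add", "softmax"]))
```

A real pass over a general graph would also have to check that nothing else reads the intermediate tensor before dropping it.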

The Metal backend is more like what you described: seems to generate and online-compile a megashader for the whole compute graph. There's a Vulkan backend too, but I have no idea how it works.

Ignas Kiela 3d ago

@azonenberg @wren6991 SPIR-V is not that complicated, it would probably be viable to just do it all in that (so you don't need to ship glslc). Though likely a bit more complicated.

kkarhan 3d ago

@wren6991 AI bros!