I want to power through and get this into a state where I can use it with live instruments, but I am completely exhausted 😴
I reworked some things and now my audio convolving compute shader can convolve ~11 milliseconds worth of audio samples with an average processing time of ~7 milliseconds. That's with one channel at a sample rate of 22050 Hz. At a sample rate of 44100 Hz, the average processing time is a paltry ~8 milliseconds.
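for context, the baseline the shader is accelerating is plain time-domain convolution, which on the CPU is just a nested loop. a minimal sketch (not the actual shader, names made up):

```cpp
#include <cstddef>
#include <vector>

// Direct time-domain convolution: each output sample is a dot product
// of the impulse response (IR) against the most recent input samples.
// This is the O(N*K) workload the compute shader parallelizes.
std::vector<float> convolve(const std::vector<float>& input,
                            const std::vector<float>& ir)
{
    std::vector<float> out(input.size(), 0.0f);
    for (std::size_t n = 0; n < input.size(); ++n)
        for (std::size_t k = 0; k < ir.size() && k <= n; ++k)
            out[n] += ir[k] * input[n - k];
    return out;
}
```

at 22050 Hz, convolving an 11 ms block against a full one-second IR is roughly 243 × 22050 ≈ 5.4 million multiply-adds per block, which is why this wants to live on the GPU.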
also sometime in the last week I made it so it can operate entirely on a live input stream from SDL3 rather than a wave file, so in theory I can incorporate this into a modular setup now, but the results are higher latency than I'd like, and SDL3 doesn't give you much control over audio latency.
Apparently my best frame time can get as low as 3 ms. I think vulkan should let me VK_QUEUE_GLOBAL_PRIORITY_REALTIME this program, but sadly vulkan is being a coward about it.
ok the problem I'm having with latency now is that the audio latency in the system grows over time and I'm not sure why. like it starts snappy and after running for a short while it gets super laggy :/
I'm guessing it's because SDL3 can and will resize buffers as it wants to, whereas I'd rather it just go crazy if it under runs.
What I want to do is have a fixed size buffer for input and output, enough that I can have the output double or triple buffered to smooth over hitches caused by linux. if my program can't keep up I don't want it to quietly allocate more runway, I want it to scream at me LOUDLY and HORRIBLY, so that I'll rejigger my program until it is perfect.
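the fail-loudly policy I'm describing is basically a fixed-capacity queue that refuses to grow. a sketch (hypothetical `FixedAudioQueue`, not the real program):

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Fixed-capacity sample FIFO that never reallocates: overruns and
// underruns abort loudly instead of silently adding latency.
class FixedAudioQueue {
public:
    explicit FixedAudioQueue(std::size_t capacity) : buf(capacity) {}

    void push(float sample) {
        if (count == buf.size()) {
            std::fputs("OVERRUN: producer outran consumer\n", stderr);
            std::abort();
        }
        buf[(head + count++) % buf.size()] = sample;
    }

    float pop() {
        if (count == 0) {
            std::fputs("UNDERRUN: consumer starved\n", stderr);
            std::abort();
        }
        float s = buf[head];
        head = (head + 1) % buf.size();
        --count;
        return s;
    }

    std::size_t size() const { return count; }

private:
    std::vector<float> buf;
    std::size_t head = 0, count = 0;
};
```

the point is the capacity bounds worst-case latency: with a queue sized for two or three output periods, latency can never silently exceed that budget.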
What actually happens is (sdl? poopwire?) just infinitybuffers so it never hitches and I get a second of latency after a little bit
I like that pipewire has an option to not be terrible ("pro audio" mode) and it doesn't work
99% of audio problems on linux these days are just programmers refusing to just fucking use alsa. I'm part of the problem, because I'm using SDL3 instead because the API is simple. SDL3 is part of the problem because when I tell it to just fucking use alsa it uses pipewire instead! and pipewire is part of the problem because it's just completely terrible. like, wayland terrible.
want to have low latency audio on linux? we have a tool for it, it's called STOP PILING LAYERS OF BOILERPLATE ON TOP OF ALSA YOU IDIOTS YOU ABSOLUTE FOOLS
I'm like 30% sure SDL3 is not the problem or at least not the only problem because I tried resetting the streams every frame with SDL_ClearAudioStream and it still accumulates latency (in addition to also now sounding atrocious due to missing samples).
I've also seen this happen with pipewire before in other situations, and it was resolved by bypassing pipewire.
*spaces out* so anyways, this is usually the point where I'd try to cut this down to a simple loopback with as few layers as possible and gradually build back up towards my program until I either find where the fault is or have something working properly. That would mean targeting ALSA directly, except that appears to not be possible without uninstalling pipewire-alsa, which I can't do without uninstalling Steam :/
so abnormally, this means starting with a pipewire loopback instead and seeing if all you brave defenders of the status quo are flickering my lights or not.
this makes me unhappy, but the single silver lining here is pipewire's API docs seem to be a little more newbie friendly than ALSA's
ok I did it. I've got a program that writes a pipewire stream of F64 audio samples where each sample is the total elapsed time since the first frame, expressed in minutes.
I've got a second program that reads that pipewire stream, and checks the offset against its own elapsed time since the first sample processed. This program prints out the calculated drift every second.
The results are interesting.
In the first version of this, both programs just measured the time using std::chrono::steady_clock::time_point. This resulted in an oscillating drift that was well under a millisecond at its peak and nothing to be concerned about.
This is good! That means there's no place whatsoever within pipewire on my computer, for this specific audio setup, where any intermediary buffers might be growing and adding more latency as the programs run.
This is not the interesting case.
In the second version, I changed the first program to instead calculate elapsed time as the frame number * the sampling interval, and left the second program alone.
In this version, the calculated drift is essentially the difference between the progress through the stream vs the amount of time that actually passed from the perspective of the observer. In this version, the amount of drift rises gradually. It seems the stream is advancing just a touch faster than it should.
The samples in the stream are reporting that more time has elapsed in the "recording" than actually has transpired according to the clock. The amount of drift accumulated seems to be a millisecond every few minutes.
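back-of-the-envelope, that rate of drift works out to a tiny clock skew. a sketch of the arithmetic (assuming 1 ms per 3 minutes; `drift_ppm` is a made-up name):

```cpp
// Converts accumulated drift over an elapsed wall-clock interval
// into parts-per-million of clock skew between the two timebases
// (the stream's sample clock vs. the observer's steady_clock).
double drift_ppm(double drift_s, double elapsed_s) {
    return drift_s / elapsed_s * 1e6;
}
// drift_ppm(0.001, 180.0) is about 5.6 ppm, which is well within the
// tolerance of an ordinary crystal oscillator -- consistent with the
// audio clock simply running slightly fast relative to the system clock.
```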
I'm honestly not sure what to make of that.
anyways, for the curious, I put the source code for the experiment up here
https://github.com/Aeva/slowtime

Also interesting: the drift is faster if I have the second program's monitor pin hooked up to my sound card, but there's still drift either way.
I think my conclusions from this are
1. the latency drift I observed with my experiments with pipewire today is probably inconsequential.
2. there is probably nothing sinister about pipewire.
3. if you have a chain of nodes that are a mix of push or pull driven and have different buffering strategies, you are in the Cool Zone
4. my program is probably going to have to handle "leap samples" in some situations. I admit I wasn't expecting that, but it feels obvious in retrospect.
5. the unplayable latency accumulation in my convolution experiment is problematic, but it is unrelated to the latency drift I observed today. It's probably going to be solved by stripping out all the SDL3 audio stuff and talking to pipewire directly. this is thankfully only a minor inconvenience for me.
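one way to handle point 4's "leap samples": track the drift between stream time (frames / rate) and wall time, and drop or duplicate a single sample whenever it exceeds one sample period. a sketch with made-up names:

```cpp
#include <cstdint>

// Tracks drift between stream time and wall-clock time, and reports
// how many samples to drop (+1) or insert (-1) to re-align. One
// "leap sample" fires each time drift exceeds a full sample period.
struct LeapSampleTracker {
    double sample_rate;
    std::int64_t frames_played = 0;

    int correction(double wall_elapsed_s) {
        double stream_elapsed_s = frames_played / sample_rate;
        double drift_s = stream_elapsed_s - wall_elapsed_s;
        // Positive drift: the stream is ahead of the clock, drop a sample.
        if (drift_s > 1.0 / sample_rate) return +1;
        if (drift_s < -1.0 / sample_rate) return -1;
        return 0;
    }
};
```

at ~5 ppm of skew and 22050 Hz this only fires about once every 9 seconds, so single-sample corrections should be inaudible.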
nice, pipewire has some special case stuff for filters
holy smokes I got it working :O!! i got my audio convolver working using the pipewire API directly!! and the latency seems to be very adequate for real time play :D
my revised opinion on pipewire is that I like that the API is wizards only. I'm a wizard, so that makes me feel special.

convolver/src/convolver.cpp at c5d1ca8ec8a4aafd640def16d68e1c84bbc6b240 · Aeva/convolver
god damn this thing is so fucking cool. I've got it hooked up to my drum machine right now and the fm drum in particular is pretty good at isolating parts of the impulse response sample. I'm using a short sample from the Nier Automata song "Alien Manifestation" to convolve the drum machine and it sounds *amazing*. It's a shame I can't modulate the drum parameters on this machine, or I'd be doing some really wild stuff with this right now.
some small problems with this system:
1. I've had to turn down the sampling rate so I can convolve longer samples. 22050 Hz works out ok though for what I've been messing with so far, so maybe it's not that big a deal. longer samples kinda make things muddy anyway
2. now I want to do multiple convolutions at once and layer things and that's probably not happening on this hardware XD
I'll probably have to switch to an FFT-based system for non-realtime convolution to make this practical for designing dynamic soundtracks for games that can run on a variety of hardware; otherwise I'll have to opt for actually recording my songs and stitching them together by some more conventional means
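for scale, the op-count argument for FFT convolution: direct convolution of N samples against a K-tap kernel costs roughly N*K multiplies, while FFT-based (overlap-add) convolution costs on the order of N*log2(N) per block. a rough comparison, with constants hand-waved:

```cpp
#include <cmath>

// Rough multiply counts, ignoring constant factors. Real FFT
// convolution costs c * N * log2(N) for some small constant c,
// but the asymptotic gap is what matters here.
double direct_ops(double n, double k) { return n * k; }
double fft_ops(double n) { return n * std::log2(n); }
```

convolving one second of 22050 Hz audio with a one-second IR is ~486 million multiplies directly, vs. on the order of hundreds of thousands via FFT, a gap of three orders of magnitude.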
this thing is also really good at warming up my laptop XD
idk if I'm done playing around with this prototype yet, but I'd like to explore granular synthesis a bit soon. I think there's probably a lot of cool ways it can be combined with convolution, like having the kernel morph over time.
probably first is reworking this program so i can change out the convolution kernel without restarting it or at least make it so i don't have to recompile it each time
anyways i highly recommend building your own bespoke audio synthesis pipeline from scratch, it's a lot of fun
It occurred to me just now that I might be able to make this faster by rewriting it as a pixel shader. Each pixel in the output is an audio sample. Each PS thread reads a sample from the impulse response and one from the audio stream, multiplies them together, and writes out the result. To perform the dot product, the draw is instanced, and the additive blend op is used to combine the results. I've also got a few ideas for variations that might be worthwhile.
Like, having the vertex shader or a mesh shader read the sample from the audio stream, have the PS read the impulse response, and stagger the draw rect. Main snag there is the render target might have to be 512x1 or something like that, or I'll have to do counter swizzling or something.
Also FP32 RGBA render targets would probably just batch 4 samples together for the sake of keeping the dimensions lower I guess.
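the RGBA batching is just index math: texel x of an FP32 RGBA target would hold samples 4x through 4x+3, so a 512-wide row covers 2048 samples. a sketch (hypothetical names, assuming mono samples packed linearly):

```cpp
#include <cstddef>

// Maps a mono sample index to (texel, channel) for an FP32 RGBA
// render target that batches 4 consecutive samples per texel.
struct TexelAddress {
    std::size_t texel;
    std::size_t channel; // 0=R, 1=G, 2=B, 3=A
};

TexelAddress address_of(std::size_t sample_index) {
    return { sample_index / 4, sample_index % 4 };
}
```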
I think this is likely to be a lot faster, because I've made a 2D convolution kernel a lot slower by rewriting it as compute in the past 😎 but if any of y'all happen to have inside knowledge on whether IHVs are letting raster ops wither and die because AAA graphics programmers think rasterization is passé now or something absurd like that, do let me know.
I figure I should probably start recording my convolution experiments for reference, and this thread seems as good a place as any to post them.
Tonight's first experiment: An excerpt from a The King In Yellow audio book convolved with a short clip from the Chrono Cross OST (Chronopolis)
@aeva are you able to time vary the impulse response? I wonder what it would sound like if you varied it by e.g. resampling it to get shorter and shorter as time goes on
@halcy I was thinking about doing something like that but stretching / compressing the sample history of the audio stream instead, which is functionally the same idea. For a simple reverb IR you'd get a pitch shift in the delayed sound as well as it seeming to get faster or slower. I'm hoping slowing it down would sound like a tape stretching almost.
@halcy that would be non-uniform time mutation which may or may not be what you were thinking
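the stretch/compress idea in code form: resample the IR (or the sample history) by a factor with linear interpolation. a sketch, not from the repo; names are made up:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Resamples an impulse response by `factor` with linear interpolation:
// factor > 1 shortens the IR (faster, pitched up), factor < 1 stretches
// it (slower, pitched down) -- the tape-speed effect described above.
std::vector<float> resample_ir(const std::vector<float>& ir, double factor) {
    std::size_t out_len = static_cast<std::size_t>(ir.size() / factor);
    std::vector<float> out(out_len);
    for (std::size_t i = 0; i < out_len; ++i) {
        double pos = i * factor;
        std::size_t i0 = static_cast<std::size_t>(pos);
        std::size_t i1 = std::min(i0 + 1, ir.size() - 1);
        double t = pos - i0;
        out[i] = static_cast<float>((1.0 - t) * ir[i0] + t * ir[i1]);
    }
    return out;
}
```

re: the aliasing point, this does no band-limiting at all, so compressing the IR will happily fold high frequencies back down, which honestly might be the fun part.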
@aeva yeah no same effect except something something aliasing whatever who cares probably sounds more fun if there's artifacts