99% of audio problems on linux these days are just programmers refusing to just fucking use alsa. I'm part of the problem, because I'm using SDL3 instead because the API is simple. SDL3 is part of the problem because when I tell it to just fucking use alsa it uses pipewire instead! and pipewire is part of the problem because it's just completely terrible. like, wayland terrible.
want to have low latency audio on linux? we have a tool for it, it's called STOP PILING LAYERS OF BOILERPLATE ON TOP OF ALSA YOU IDIOTS YOU ABSOLUTE FOOLS

I'm like 30% sure SDL3 is not the problem or at least not the only problem because I tried resetting the streams every frame with SDL_ClearAudioStream and it still accumulates latency (in addition to also now sounding atrocious due to missing samples).

I've also seen this happen with pipewire before in other situations, and it was resolved by bypassing pipewire.

*spaces out* so anyways, this is usually the point where I'd try to cut this down to a simple loopback with as few layers as possible and gradually build back towards my program until I either find where the fault is or have something working properly. That would mean targeting ALSA directly, except that appears to not be possible without uninstalling pipewire-alsa, which I can't do without uninstalling Steam :/
so abnormally, this means starting with a pipewire loopback instead and seeing if all you brave defenders of the status quo are flickering my lights or not.
this makes me unhappy, but the single silver lining here is pipewire's API docs seem to be a little more newbie friendly than ALSA's

ok I did it. I've got a program that writes a pipewire stream of F64 audio samples where each sample is the total elapsed time since the first frame, expressed in minutes.

I've got a second program that reads that pipewire stream and checks the offset against its own elapsed time since the first sample processed. This program prints out the calculated drift every second.
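For anyone who wants the shape of the reader-side check without opening the repo, here is a minimal sketch. It is not the code from slowtime and it leaves out all of the pipewire plumbing; `DriftMeter` and `Observe` are names made up for illustration, and the only assumption is that each incoming sample carries the writer's elapsed time in minutes as described above.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;
using Minutes = std::chrono::duration<double, std::ratio<60>>;

// Hypothetical helper: compares the time reported by the stream against
// the reader's own clock, both measured from the first sample processed.
struct DriftMeter
{
    bool started = false;
    double firstStreamMinutes = 0.0;
    Clock::time_point firstLocal;

    // Returns the drift in milliseconds. Positive means the stream claims
    // more time has passed than the reader's clock does.
    double Observe(double streamMinutes, Clock::time_point now)
    {
        if (!started)
        {
            started = true;
            firstStreamMinutes = streamMinutes;
            firstLocal = now;
        }
        assert(now >= firstLocal); // steady_clock never runs backwards
        const double localMinutes = Minutes(now - firstLocal).count();
        const double streamElapsed = streamMinutes - firstStreamMinutes;
        return (streamElapsed - localMinutes) * 60.0 * 1000.0;
    }
};
```

In a real reader you would call `Observe(sample, Clock::now())` from the stream's process callback and print the result once a second.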

The results are interesting.

In the first version of this, both programs just measured the time using std::chrono::steady_clock::time_point. This resulted in an oscillating drift that was well under a millisecond at its peak and nothing to be concerned about.

This is good! That means there's no place whatsoever within pipewire on my computer for this specific audio setup where any intermediary buffers might be growing and adding more latency as the programs run.

This is not the interesting case.

In the second version, I changed the first program to instead calculate elapsed time as the frame number * the sampling interval, and left the second program alone.

In this version, the calculated drift is essentially the difference between the progress through the stream vs the amount of time that actually passed from the perspective of the observer. In this version, the amount of drift rises gradually. It seems the stream is advancing just a touch faster than it should.

The samples in the stream are reporting that more time has elapsed in the "recording" than actually has transpired according to the clock. The amount of drift accumulated seems to be a millisecond every few minutes.
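To put a rough number on that, here is a small sketch of the arithmetic. The function names are made up for illustration, and the example figures in the note after the code are assumptions: I'm reading "every few minutes" as 3 minutes and picking 48000 Hz as the sampling rate, neither of which is a measurement.

```cpp
#include <cassert>
#include <cstdint>

// Time the stream claims has passed: frame number * sampling interval.
double StreamSeconds(std::uint64_t frameIndex, double sampleRate)
{
    assert(sampleRate > 0.0);
    return static_cast<double>(frameIndex) / sampleRate;
}

// Drift expressed as parts per million of elapsed wall-clock time.
double DriftPpm(double driftSeconds, double elapsedSeconds)
{
    assert(elapsedSeconds > 0.0);
    return driftSeconds / elapsedSeconds * 1.0e6;
}

// How many seconds pass before the drift adds up to one whole frame.
double SecondsPerLeapFrame(double ppm, double sampleRate)
{
    assert(ppm > 0.0 && sampleRate > 0.0);
    return 1.0 / (ppm * 1.0e-6 * sampleRate);
}
```

Under those assumed numbers, `DriftPpm(0.001, 180.0)` comes out to roughly 5.6 ppm, and at 48000 Hz that is about one extra frame every 3.75 seconds.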

I'm honestly not sure what to make of that.

anyways, for the curious, I put the source code for the experiment up here https://github.com/Aeva/slowtime
also interesting is the drift is faster if I have the second program's monitor pin hooked up to my sound card, but there's still drift either way.

I think my conclusions from this are

1. the latency drift I observed with my experiments with pipewire today is probably inconsequential.

2. there is probably nothing sinister about pipewire.

3. if you have a chain of nodes that are a mix of push or pull driven and have different buffering strategies, you are in the Cool Zone

4. my program is probably going to have to handle "leap samples" in some situations. I admit I wasn't expecting that, but it feels obvious in retrospect.

5. the unplayable latency accumulation in my convolution experiment is problematic, but it is unrelated to the latency drift I observed today. This is probably going to be solved by stripping out all the SDL3 audio stuff and replacing it with using pipewire directly. this is thankfully only a minor inconvenience for me.
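On point 4, one way "leap samples" could be handled is to keep a running tally of how far the stream has drifted, in fractions of a frame, and drop or repeat a frame whenever the tally crosses a whole frame. This is only a sketch of that idea, not something from my program; `LeapTracker`, `Leap`, and `Advance` are hypothetical names, and how the per-frame error gets estimated is left out entirely.

```cpp
#include <cassert>

enum class Leap
{
    None,   // pass the frame through untouched
    Drop,   // stream is running fast: discard one frame
    Repeat  // stream is running slow: play one frame twice
};

// Hypothetical helper that accumulates fractional drift per frame.
struct LeapTracker
{
    double error = 0.0; // accumulated drift, measured in frames

    // errorPerFrame is how much drift each frame contributes, in frames.
    // Positive means the stream is ahead of the local clock.
    Leap Advance(double errorPerFrame)
    {
        // A single frame can never owe more than one leap.
        assert(errorPerFrame > -1.0 && errorPerFrame < 1.0);
        error += errorPerFrame;
        if (error >= 1.0)
        {
            error -= 1.0;
            return Leap::Drop;
        }
        if (error <= -1.0)
        {
            error += 1.0;
            return Leap::Repeat;
        }
        return Leap::None;
    }
};
```

A drift of a few parts per million would mean passing something like `5.6e-6` per frame, so a leap would only come up once every few seconds.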
nice, pipewire has some special case stuff for filters
holy smokes I got it working :O!! i got my audio convolver working using the pipewire API directly!! and the latency seems to be very adequate for real time play :D
my revised opinion on pipewire is that I like that the API is wizards only. I'm a wizard, so that makes me feel special.

that or I'm just good at creating wizard problems for myself. either way I'm in a good mood.

https://github.com/Aeva/convolver/blob/c5d1ca8ec8a4aafd640def16d68e1c84bbc6b240/src/convolver.cpp#L509

god damn this thing is so fucking cool. I've got it hooked up to my drum machine right now and the fm drum in particular is pretty good at isolating parts of the impulse response sample. I'm using a short sample from the Nier Automata song "Alien Manifestation" to convolve the drum machine and it sounds *amazing*. It's a shame I can't modulate the drum parameters on this machine, or I'd be doing some really wild stuff with this right now.

some small problems with this system:

1. I've had to turn down the sampling rate so I can convolve longer samples. 22050 Hz works out ok though for what I've been messing with so far, so maybe it's not that big a deal. longer samples kinda make things muddy anyway

2. now I want to do multiple convolutions at once and layer things and that's probably not happening on this hardware XD

I'll probably have to switch to an fft based system for non-realtime convolution to make this practical for designing dynamic soundtracks for games that can run on a variety of hardware. otherwise I'll probably have to opt for actually recording my songs and stitching them together by some more conventional means
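For context on why the sampling rate matters so much here, this is a minimal sketch of plain time-domain convolution plus the cost estimate that goes with it. It is a textbook version written for illustration, not the code in convolver.cpp, and `Convolve` and `MultiplyAddsPerSecond` are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Direct convolution: every input sample is scaled by every kernel sample
// and accumulated into the output at the matching offset.
std::vector<float> Convolve(const std::vector<float>& signal,
                            const std::vector<float>& kernel)
{
    assert(!signal.empty() && !kernel.empty());
    std::vector<float> out(signal.size() + kernel.size() - 1, 0.0f);
    for (std::size_t n = 0; n < signal.size(); ++n)
    {
        for (std::size_t k = 0; k < kernel.size(); ++k)
        {
            out[n + k] += signal[n] * kernel[k];
        }
    }
    return out;
}

// Real-time cost of the direct method: each output sample needs one
// multiply-add per kernel sample, and the kernel length itself scales
// with the sampling rate, so the cost grows with the rate squared.
double MultiplyAddsPerSecond(double sampleRate, double kernelSeconds)
{
    return sampleRate * (sampleRate * kernelSeconds);
}
```

By this estimate, dropping from 44100 Hz to 22050 Hz cuts the work for the same length of impulse response to a quarter, which lines up with being able to use longer samples at the lower rate.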
this thing is also really good at warming up my laptop XD
idk if I'm done playing around with this prototype yet, but I'd like to explore granular synthesis a bit soon. I think there's probably a lot of cool ways it can be combined with convolution, like having the kernel morph over time.
probably first is reworking this program so i can change out the convolution kernel without restarting it or at least make it so i don't have to recompile it each time
anyways i highly recommend building your own bespoke audio synthesis pipeline from scratch, it's a lot of fun
It occurred to me just now that I might be able to make this faster by rewriting it as a pixel shader. Each pixel in the output is an audio sample. Each PS thread reads a sample from the impulse response and the audio stream, multiplies them together, and writes out the result. To perform the dot product, the draw is instanced, and the add blend op is used to combine the results. I've also got a few ideas for variations that might be worthwhile.
Like, having the vertex shader or a mesh shader read the sample from the audio stream, have the PS read the impulse response, and stagger the draw rect. Main snag there is the render target might have to be 512x1 or something like that, or I'll have to do counter swizzling or something.
Also FP32 RGBA render targets would probably just batch 4 samples together for the sake of keeping the dimensions lower I guess.
I think this is likely to be a lot faster, because I've made a 2D convolution kernel a lot slower by rewriting it to be compute in the past 😎 but if any of y'all happen to have inside knowledge on if IHVs are letting raster ops wither and die because AAA graphics programmers think rasterization is passé now or something absurd like that do let me know.
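To make the pixel shader idea a bit more concrete, here is a CPU-side emulation of how I read it: the outer loop stands in for the instanced draw (one instance per impulse response sample), the inner loop stands in for the PS threads (one per output pixel), and the `+=` stands in for the additive blend. This is plain C++ for illustration, not shader code, and the mapping of instances to impulse response samples is an assumption on my part. `ConvolveByAdditiveBlend` is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

std::vector<float> ConvolveByAdditiveBlend(const std::vector<float>& signal,
                                           const std::vector<float>& kernel)
{
    assert(!signal.empty() && !kernel.empty());
    // The "render target": one pixel per output audio sample, cleared to 0.
    std::vector<float> target(signal.size() + kernel.size() - 1, 0.0f);

    // One draw instance per impulse response sample.
    for (std::size_t instance = 0; instance < kernel.size(); ++instance)
    {
        // One pixel shader thread per output sample.
        for (std::size_t pixel = 0; pixel < target.size(); ++pixel)
        {
            if (pixel < instance)
            {
                continue; // this instance's rect starts further along
            }
            const std::size_t source = pixel - instance;
            if (source >= signal.size())
            {
                continue; // past the end of the audio stream
            }
            // Multiply in the "shader", accumulate via the "add blend op".
            target[pixel] += kernel[instance] * signal[source];
        }
    }
    return target;
}
```

Each pass only does one multiply per pixel, and the dot product falls out of the blending, so the result matches an ordinary direct convolution.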

I figure I should probably start recording my convolution experiments for reference, and this thread seems as good a place as any to post them.

Tonight's first experiment: An excerpt from a The King In Yellow audio book convolved with a short clip from the Chrono Cross OST (Chronopolis)

Tonight's second convolution experiment: The same audio book excerpt, but convolved with a frog instead.

Recordings of speech seem to convolve really well with music and weird samples like this, but it really depends on the voice and what you pick as a kernel.

I should remember to try the inverse of the first experiment later (but not tonight)
I had a really great thing going with the chronopolis sample as the impulse response, and using my drum machine to drive it yesterday. The FM drum is really great for isolating specific sounds from the impulse response. I did try to record it, but I recorded the unfiltered line in on accident instead, so I'll have to redo it later
Experiment 3: Impulse response is a clip from the audio book where the guy is dramatically saying the word "Carcosa". I got a pretty trippy dark ambiance out of it with the drum machine earlier, but I didn't feel like recreating it, so I ran a bunch of songs through it instead and Fire Coming Out Of A Monkey's Head sounded the most interesting with it. The Chrono Cross songs I tried didn't feel distorted enough to bother posting, and this one kinda doesn't either but it's interesting.
Experiment 4: Same impulse response as the previous one, it's the clip from the audio book where the guy is saying "Carcosa", but this time I'm convolving it with VCV Rack. I've got a feedback loop of two sine wave oscillators that are modulating each other's frequency. The output of the one that is functioning as the carrier is being feathered by a pair of low frequency oscillators before applying an envelope.
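If you want to try something in the same spirit outside of VCV Rack, here is a rough sketch of two sine oscillators modulating each other's frequency, with the carrier shaped by an envelope. It is not a recreation of the patch: the LFO feathering is left out, the envelope is just a linear decay, and `CrossFm` and all of its parameters are made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

std::vector<float> CrossFm(int frameCount, double sampleRate,
                           double carrierHz, double modulatorHz,
                           double depthHz)
{
    assert(frameCount > 0 && sampleRate > 0.0);
    const double tau = 6.283185307179586;
    std::vector<float> out(static_cast<std::size_t>(frameCount));
    double carrierPhase = 0.0;
    double modulatorPhase = 0.0;
    double carrierOut = 0.0;
    double modulatorOut = 0.0;
    for (int i = 0; i < frameCount; ++i)
    {
        // Each oscillator's frequency is pushed around by the other
        // oscillator's previous output, which forms the feedback loop.
        const double carrierFreq = carrierHz + depthHz * modulatorOut;
        const double modulatorFreq = modulatorHz + depthHz * carrierOut;
        carrierPhase += tau * carrierFreq / sampleRate;
        modulatorPhase += tau * modulatorFreq / sampleRate;
        carrierOut = std::sin(carrierPhase);
        modulatorOut = std::sin(modulatorPhase);
        // Linear decay envelope from 1 towards 0 over the whole buffer.
        const double envelope = 1.0 - static_cast<double>(i) / frameCount;
        out[static_cast<std::size_t>(i)] =
            static_cast<float>(carrierOut * envelope);
    }
    return out;
}
```

Something like `CrossFm(22050, 22050.0, 110.0, 55.0, 80.0)` would give one second of audio to feed into a convolver; those particular numbers are arbitrary starting points, not settings from the patch.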
I'm really blown away by what I can do with fm synthesis + convolution.
Experiment 4a: here's another with that same impulse response and nearly the same VCV Rack patch, but this time it sounds like a cello or something instead
@aeva what's your convolution thing sound like with this impulse response i generated (wav file is very loud and bright on its own be careful) https://cancel.fm/stuff/share/gen%20IR%20for%20aeva.wav
@cancel I'll give it a try this evening. Is there a particular sort of song or recording you'd like me to convolve?
@aeva i think anything that isn't tonal would work
@cancel should I leave the little bit of leading silence in the sample?
@aeva Shouldn’t matter
BBC Sound Effects


@cancel it might sound interesting with the drum machine, I'll try it out in a little while
@cancel and here it is with the drum machine. doesn't change the sound all that much, but it's pleasant imo. makes it sound a bit muffled but also a bit more fruity, especially with the fm drum.