if you think about it, vulkan logical device implies the existence of vulkan illogical device

I'm slowly working through the vulkan spec writing a compute-only vulkan program from scratch that doesn't render anything, and it's going pretty well because the spec is really well written and I already know more or less exactly what I want to do anyway, but I just want to say just how silly (fun) it feels to write a program like this because you get to just skip over large swaths of the API.

Like, I'm working from the spec because the tutorials all make it more complicated.

also the tutorials I reviewed all did the annoying thing where the tutorial squirrels away the stuff you're trying to learn or reference into abstractions that only serve the needs of the tutorial writer, which given my goal is very specifically to *not draw anything*, there's really not much of a point to any of them lol. I'm really not the intended audience here though :3

I think it's cute that practically every vulkan command has one or more optional args to let you enter Hard Mode

(sorry for the double post, I added this to the wrong thread)

ok even with just the compute-only subset vulkan is a slog D:

I wonder how many people have actually managed to knuckle down and write a complete, useful vulkan program from scratch (no copy pasting from tutorials and stack overflow, no offloading significant parts to 3rd party libraries like VMA)

To think if I power through and get this thing working I could potentially be like the 20th person to bother

oh, update on my little vulkan compute project, last night I got as far as repeatedly dispatching an empty compute shader and allocating some memory 😎 I'm in the home stretch! I think I just need to figure out the resources / resource binding stuff and then I'll be able to start on my DSP experiment :3

which mostly means the next things are figuring out the least effort way of getting audio data into C++ (probably stb_vorbis?) and writing even more boilerplate for alsa...

Success! I got the vulkan compute shader cranking out the fibonacci series and reading it back to the CPU through an 8 byte persistently mapped buffer. Should be smooth sailing from here.
ok *whew* I finally did it! I implemented convolution reverb as a vulkan compute shader, and the results seem to be correct. I have it convolving the audio up front at the moment, but it seems to be reasonably fast and the results seem more or less correct. I'm using SDL3 to verify the output. It doesn't look like it'll be too crazy to rework it such that the stream is generated live.
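
For anyone following along: convolution reverb boils down to summing, for every output sample, the dry signal against the impulse response. Here's a minimal CPU sketch of that sum in C++. It is not the actual shader; `Convolve` and its parameter names are made up for illustration, and a compute shader version would presumably run the outer loop as one invocation per output sample.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Direct (time-domain) convolution: wet[n] = sum over k of dry[n - k] * impulse[k].
// The result is dry.size() + impulse.size() - 1 samples long, so the reverb
// tail keeps ringing after the dry signal ends.
std::vector<float> Convolve(const std::vector<float>& dry,
                            const std::vector<float>& impulse)
{
    assert(!dry.empty() && !impulse.empty());
    std::vector<float> wet(dry.size() + impulse.size() - 1, 0.0f);
    for (std::size_t n = 0; n < wet.size(); ++n)
    {
        // Hypothetically, one GPU thread would own one value of n.
        float sum = 0.0f;
        for (std::size_t k = 0; k < impulse.size(); ++k)
        {
            // Skip taps that would read before the start or past the end of dry.
            if (n >= k && (n - k) < dry.size())
            {
                sum += dry[n - k] * impulse[k];
            }
        }
        wet[n] = sum;
    }
    return wet;
}
```

The cost is one multiply-add per (output sample, impulse sample) pair, which is why longer impulse responses get expensive quickly.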
it turns out the main difficulty working with vulkan is accidentally breaking your laptop in half

I reworked it so the convolution shader processes the audio in tandem with playback, so I'm *very* close to getting this working with live audio streams.

But more importantly, I used this to convolve my song "strange birds" with a choir-ish fanfare sound effect from a game I used to play as a kid and the result is like the grand cosmos opened up before me and I'm awash in the radiant light of the universe. Absolutely incredible.

I want to power through and get this into a state where I can use it with live instruments, but I am completely exhausted 😴
I reworked some things and now my audio convolving compute shader can convolve ~11 milliseconds worth of audio samples with an average processing time of ~7 milliseconds. That's with one channel with a sample rate of 22050. When the sample rate is 44100, the average processing time is a paltry ~8 milliseconds.
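
The real-time condition here is just "processing time per block has to stay under the block's own duration". A small sketch of that arithmetic, treating 22050 and 44100 as samples per second; the function names are invented for illustration and the block sizes in the note below are guesses, since the thread doesn't state them.

```cpp
#include <cassert>
#include <cstddef>

// How long a block of audio lasts when played back, in milliseconds.
double BlockDurationMs(std::size_t frames, double sampleRateHz)
{
    assert(sampleRateHz > 0.0);
    return 1000.0 * static_cast<double>(frames) / sampleRateHz;
}

// Fraction of the real-time budget spent on processing.
// Anything above 1.0 means the processing can't keep up with playback.
double BudgetUsed(double processingMs, std::size_t frames, double sampleRateHz)
{
    return processingMs / BlockDurationMs(frames, sampleRateHz);
}
```

For example, a hypothetical 256-frame block at 22050 Hz or a 512-frame block at 44100 Hz both last about 11.6 ms, so ~7 ms or ~8 ms of processing would leave only a few milliseconds of slack per block.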
also sometime in the last week I made it so it can operate entirely on a live input stream from SDL3 rather than a wave file, so in theory I can incorporate this into a modular setup now, but the results are higher latency than I'd like, and SDL3 doesn't give you much control over audio latency.
Apparently my best frame time can get as low as 3 ms. I think vulkan should let me VK_QUEUE_GLOBAL_PRIORITY_REALTIME this program, but sadly vulkan is being a coward about it.

ok the problem I'm having with latency now is that the audio latency in the system grows over time and I'm not sure why. like it starts snappy and after running for a short while it gets super laggy :/

I'm guessing it's because SDL3 can and will resize buffers as it wants to, whereas I'd rather it just go crazy if it under runs.

What I want to do is have a fixed size buffer for input and output, enough that I can have the output double or triple buffered to smooth over hitches caused by linux. if my program can't keep up I don't want it to quietly allocate more runway, I want it to scream at me LOUDLY and HORRIBLY, but it won't do that because I'll rejigger my program until it is perfect.

What actually happens is (sdl? poopwire?) just infinitybuffers so it never hitches and I get a second of latency after a little bit
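
A rough sketch of the "fixed size, never grows" behavior described above, as opposed to the infinitybuffer. `FixedRing` is a made-up name, not anything from SDL3 or pipewire, and this version is single-threaded for clarity; a real audio callback would want a lock-free variant.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Fixed-capacity ring buffer. It never allocates: a short return value from
// Write means overrun, a short return value from Read means underrun, and the
// caller decides how loudly to complain.
template <std::size_t Capacity>
class FixedRing
{
    static_assert(Capacity > 0, "capacity must be nonzero");

public:
    // Returns the number of samples actually stored.
    std::size_t Write(const float* src, std::size_t count)
    {
        assert(src != nullptr || count == 0);
        std::size_t written = 0;
        while (written < count && size_ < Capacity)
        {
            data_[(head_ + size_) % Capacity] = src[written++];
            ++size_;
        }
        return written;
    }

    // Returns the number of samples actually read.
    std::size_t Read(float* dst, std::size_t count)
    {
        assert(dst != nullptr || count == 0);
        std::size_t read = 0;
        while (read < count && size_ > 0)
        {
            dst[read++] = data_[head_];
            head_ = (head_ + 1) % Capacity;
            --size_;
        }
        return read;
    }

    std::size_t Size() const { return size_; }

private:
    std::array<float, Capacity> data_{};
    std::size_t head_ = 0; // index of the oldest stored sample
    std::size_t size_ = 0; // number of samples currently stored
};
```

Because the capacity is a hard ceiling, the worst-case latency contributed by this buffer is Capacity samples and no more.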

I like that pipewire has an option to not be terrible ("pro audio" mode) and it doesn't work
99% of audio problems on linux these days are just programmers refusing to just fucking use alsa. I'm part of the problem, because I'm using SDL3 instead because the API is simple. SDL3 is part of the problem because when I tell it to just fucking use alsa it uses pipewire instead! and pipewire is part of the problem because it's just completely terrible. like, wayland terrible.
want to have low latency audio on linux? we have a tool for it, it's called STOP PILING LAYERS OF BOILERPLATE ON TOP OF ALSA YOU IDIOTS YOU ABSOLUTE FOOLS

I'm like 30% sure SDL3 is not the problem or at least not the only problem because I tried resetting the streams every frame with SDL_ClearAudioStream and it still accumulates latency (in addition to also now sounding atrocious due to missing samples).

I've also seen this happen with pipewire before in other situations, and it was resolved by bypassing pipewire.

*spaces out* so anyways, this is usually the point where I'd try to cut this down to a simple loopback with as few layers as possible and gradually build back towards my program until I either find where the fault is or have something working properly. That would mean targeting ALSA directly, except that appears to not be possible without uninstalling pipewire-alsa, which I can't without uninstalling Steam :/
so abnormally, this means starting with a pipewire loopback instead and seeing if all you brave defenders of the status quo are flickering my lights or not.
this makes me unhappy, but the single silver lining here is pipewire's API docs seem to be a little more newbie friendly than ALSA's

ok I did it. I've got a program that writes a pipewire stream of F64 audio samples where each sample is the total elapsed time since the first frame, expressed in minutes.

I've got a second program that reads that pipewire stream, and checks the offset against its own elapsed time since the first sample processed. This program prints out the calculated drift every second.

The results are interesting.

In the first version of this, both programs just measured the time using std::chrono::steady_clock::time_point. This resulted in an oscillating drift that was well under a millisecond at its peak and nothing to be concerned about.

This is good! That means there's no place whatsoever within pipewire on my computer for this specific audio setup where any intermediary buffers might be growing and adding more latency as the programs run.

This is not the interesting case.

In the second version, I changed the first program to instead calculate elapsed time as the frame number * the sampling interval, and left the second program alone.

In this version, the calculated drift is essentially the difference between the progress through the stream vs the amount of time that actually passed from the perspective of the observer. In this version, the amount of drift rises gradually. It seems the stream is advancing just a touch faster than it should.

The samples in the stream are reporting that more time has elapsed in the "recording" than actually has transpired according to the clock. The amount of drift accumulated seems to be a millisecond every few minutes.

I'm honestly not sure what to make of that.
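
To put a number on it, here's a sketch of the second version's measurement: stream time is the frame index times the sampling interval, and drift is that minus the observer's wall-clock time. These helpers are illustrative only and are not taken from the slowtime source. A plausible reading (an assumption, nothing in the experiment proves it) is that the stream is paced by a different clock than the one steady_clock reads, and two clocks never tick at exactly the same rate.

```cpp
#include <cassert>
#include <cstdint>

// Time the stream claims has elapsed: frame index times the sampling interval.
double StreamSeconds(std::uint64_t frameIndex, double sampleRateHz)
{
    assert(sampleRateHz > 0.0);
    return static_cast<double>(frameIndex) / sampleRateHz;
}

// Positive drift means the stream is ahead of the observer's clock.
double DriftSeconds(std::uint64_t frameIndex, double sampleRateHz, double wallSeconds)
{
    return StreamSeconds(frameIndex, sampleRateHz) - wallSeconds;
}

// Drift expressed as a rate, in parts per million of elapsed wall time.
double DriftPpm(double driftSeconds, double wallSeconds)
{
    assert(wallSeconds > 0.0);
    return 1.0e6 * driftSeconds / wallSeconds;
}
```

Taking 3 minutes as a stand-in for "every few minutes", 1 ms of drift over 180 s works out to roughly 5.6 ppm.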

anyways, for the curious, I put the source code for the experiment up here https://github.com/Aeva/slowtime
also interesting is the drift is faster if I have the second program's monitor pin hooked up to my sound card, but there's still drift either way.

I think my conclusions from this are

1. the latency drift I observed with my experiments with pipewire today is probably inconsequential.

2. there is probably nothing sinister about pipewire.

3. if you have a chain of nodes that are a mix of push or pull driven and have different buffering strategies, you are in the Cool Zone

4. my program is probably going to have to handle "leap samples" in some situations. I admit I wasn't expecting that, but it feels obvious in retrospect.

5. the unplayable latency accumulation in my convolution experiment is problematic, but it is unrelated to the latency drift I observed today. This is probably going to be solved by stripping out all the SDL3 audio stuff and replacing it with using pipewire directly. this is thankfully only a minor inconvenience for me.
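
One way to picture the "leap samples" from point 4, as a sketch only: track the drift, and whenever it adds up to a whole sample period that hasn't been corrected yet, drop or repeat a sample. `LeapSamplesDue` is a hypothetical helper, not something from the convolver or from pipewire.

```cpp
#include <cassert>
#include <cmath>

// Whole samples of drift that have not been corrected yet.
// Positive: the stream is ahead of the clock, so that many samples should be dropped.
// Negative: the stream is behind, so that many samples should be repeated.
long LeapSamplesDue(double driftSeconds, double sampleRateHz, long alreadyCorrected)
{
    assert(sampleRateHz > 0.0);
    // Truncate toward zero so a correction only fires once a full sample has accumulated.
    const long wholeSamples = static_cast<long>(std::trunc(driftSeconds * sampleRateHz));
    return wholeSamples - alreadyCorrected;
}
```

At 48000 Hz one sample lasts about 21 microseconds, so a millisecond of drift corresponds to 48 leap samples.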
nice, pipewire has some special case stuff for filters
holy smokes I got it working :O!! i got my audio convolver working using the pipewire API directly!! and the latency seems to be very adequate for real time play :D
my revised opinion on pipewire is that I like that the API is wizards only. I'm a wizard, so that makes me feel special.

that or I'm just good at creating wizard problems for myself. either way I'm in a good mood.

https://github.com/Aeva/convolver/blob/c5d1ca8ec8a4aafd640def16d68e1c84bbc6b240/src/convolver.cpp#L509

god damn this thing is so fucking cool. I've got it hooked up to my drum machine right now and the fm drum in particular is pretty good at isolating parts of the impulse response sample. I'm using a short sample from the Nier Automata song "Alien Manifestation" to convolve the drum machine and it sounds *amazing*. It's a shame I can't modulate the drum parameters on this machine, or I'd be doing some really wild stuff with this right now.

some small problems with this system:

1. I've had to turn down the sampling rate so I can convolve longer samples. 22050 hz works out ok though for what I've been messing with so far, so maybe it's not that big a deal. longer samples kinda make things muddy anyway

2. now I want to do multiple convolutions at once and layer things and that's probably not happening on this hardware XD

I'll probably have to switch to an fft based system for non-realtime convolution to make this practical for designing dynamic sound tracks for games that can run on a variety of hardware, otherwise I'll probably have to opt for actually recording my songs and stitching it together by some more conventional means
this thing is also really good at warming up my laptop XD
idk if I'm done playing around with this prototype yet, but I'd like to explore granular synthesis a bit soon. I think there's probably a lot of cool ways it can be combined with convolution, like having the kernel morph over time.
probably first is reworking this program so i can change out the convolution kernel without restarting it or at least make it so i don't have to recompile it each time
anyways i highly recommend building your own bespoke audio synthesis pipeline from scratch, it's a lot of fun
It occurred to me just now that I might be able to make this faster by rewriting it as a pixel shader. Each pixel in the output is an audio sample. Each PS thread reads a sample from the impulse response and the audio stream, multiplies them together, and writes out the result. To perform the dot product, the draw is instanced, and the add blend op is used to combine the results. I've also got a few ideas for variations that might be worthwhile.
Like, having the vertex shader or a mesh shader read the sample from the audio stream, have the PS read the impulse response, and stagger the draw rect. Main snag there is the render target might have to be 512x1 or something like that, or I'll have to do counter swizzling or something.
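
A CPU sketch of the instanced draw plus additive blend idea, assuming the description above is read correctly: the outer loop stands in for draw instances (one per impulse response sample), the inner loop stands in for pixel shader threads, and `+=` stands in for the add blend op. All names here are invented for illustration and no graphics API is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

std::vector<float> ConvolveByBlending(const std::vector<float>& stream,
                                      const std::vector<float>& impulse,
                                      std::size_t outputCount)
{
    assert(!stream.empty() && !impulse.empty());
    // The "render target": one pixel per output sample, cleared to zero.
    std::vector<float> target(outputCount, 0.0f);

    // One draw instance per impulse response sample.
    for (std::size_t instance = 0; instance < impulse.size(); ++instance)
    {
        // One pixel shader thread per output sample.
        for (std::size_t pixel = 0; pixel < outputCount; ++pixel)
        {
            if (pixel >= instance && (pixel - instance) < stream.size())
            {
                // Single multiply in the "shader", additive blend into the target.
                target[pixel] += stream[pixel - instance] * impulse[instance];
            }
        }
    }
    return target;
}
```

The total number of multiplies matches a plain nested-loop convolution; the difference is only in who does the summing, the shader loop or the blend hardware.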
@aeva we've been meaning to, tbh
@ireneista it's very satisfying to make sounds

@aeva I built my own audio system and hate every time I have to work on it, so I guess different strokes and all that.

(fwiw:
https://shirakumo.github.io/libmixed/
https://shirakumo.github.io/cl-libmixed/
https://shirakumo.github.io/harmony/ )

@shinmera mine rewards me with magnificent sounds every time i play with it 😌

@aeva Agreed! My DSP project is the most coding fun I've had in years, with bonus fun sounds too 🥳

The Graphics Programmer to Audio Programmer pipeline is real 😂

Frankly @aeva ? I'd love to if I understood where to start and the involved math, it'd be a pleasure to suck at it as I do with rendering!
@aeva something about scrolling up through this thread and the length of it make me somewhat doubt that statement…
@aeva I have notes on shader reflection if you need them :3

@aeva Yeah, I had much the same thought myself and I'm working towards it (slowly 😅)

Currently I have FFT based morphing for grains (grab two grain chunks, FFT into blocks, then over X frames linearly blend between the two), and FFT based convolution for filtering, so it's only a short hop to mash 'em together 🤔

@bobvodka oh cool, how's it sound?

@aeva Sounds good to me.
It's nice how, over a long sample and a lot of "frames", you can hear the target sample slowly come in at various frequencies.

Should work well on smaller grains too; need to implement it in my 'grain swarm' code at some point.

@aeva I think being good at both creating and solving wizard problems is the defining trait of a wizard
@aeva constantly setting up hurdles for Future You to jump over, so to speak

@aeva also probably the reason why 20th level Wizards aren't running the world

With like say a Monk or Barbarian it's pretty obvious. If you just maintain a large enough distance you're pretty safe.

But Wizards? They got Power Word Kill and Wish and Time Stop and yet somehow those poor border towns are overrun by *Giant Rats* and Goblins? Why do they let this happen?

Answer: on it on it on it, as soon as they figure out why the enchanted quill refuses to draw black runes without magenta ink

@rygorous @aeva any relation to getting hoisted by your own petard? (For a Discworld wizard I'm sure the answer is yes)
@aeva nice work, very clean code 👌
@aeva are there any microphone equalizers for Linux? I want to emulate the Ghub experience in windows. Without Ghub. Or Windows.
@Surlytom @aeva Need to know distro and DE to answer this. KDE has it built in.
@lyrial @aeva
Oh right. Ubuntu 24.04 with Gnome
@Surlytom @aeva Take any ladspa or lv2 equalizer, hook it into a filter-chain node and link that up to your microphone. Or use EasyEffects to automate that.
@scherzog @aeva
Great starting point. Thanks!
@Surlytom @aeva (I wanted to go the "manual" route so I set up the dsp chain the way I wanted in carla and then wrote a small helper program to dump all the plugin parameters from the carla project file into something vaguely reminiscent of a pipewire configuration file.)

@aeva I'm a little late, but if you want to test things with ALSA or JACK without dealing with them directly, we've been pretty happy with PortAudio at work. https://www.portaudio.com/

We use it to sling 32 channels of latency-sensitive audio on Windows. (In our case with WDMKS, which ≈ ALSA on Windows.)

As far as abstractions go it's pretty thin and focused on low-level.

(There's no backend for PipeWire directly, but my understanding is PipeWire provides a JACK-compatible API that works well.)

@aeva Also I don't believe you need to uninstall PipeWire to use ALSA directly. I think you can just stop the PipeWire service.

Also I stumbled upon this when checking if PortAudio supported PipeWire: https://gitlab.freedesktop.org/pipewire/pipewire/-/wikis/Config-PipeWire#setting-buffer-size-quantum
You've maybe already seen it, but that PIPEWIRE_LATENCY environment variable might be helpful.

The linked FAQ also mentions a "Pro Audio Profile" that sounds very useful for what you're doing.

@aeva Also also: I don't think it is, but if this is at all related to your C# project and you end up liking PortAudio, here are the C# bindings I made for that work project: https://github.com/horizongir/PortAudio.NET
(Barely tested on Linux, but should work. If you can confirm it works for you I can find some time to add Linux to our CI and publish NuGet packages.)
@PathogenDavid the project is currently a C++ project but I might check it out in the future
@aeva what is a "monitor pin" in the context of this program?
@aeva
Feels like a rounding or off by one error. Are you sure FramesProcessed = 1, and then (FramesProcessed + frame) is correct? What if FramesProcessed = 0?
@jannem the source is right there
@aeva yep. And no, that is correct.