Mastodawn

Dear #dsp / #mathstodon: I come to you deeply vexed about a subject.

We have well established, in my Teardown talk, that I believe that the Fourier Transform is the most beautiful concept in DSP -- using a simple change of basis into a frequency-orthogonal form unlocks so much optimization. And the Fast Fourier Transform is the most beautiful algorithm, doing this lovely dance of carefully reusing computation that you have already performed to simplify what by all rights ought be an $O(n^2)$ computation into an $O(n log n)$ computation.

However, I am upset, angry, and simply vexed about a different algorithm: the Short Time Fourier Transform. Often, it seems like I would like to get information about a *changing* signal with time granularity that is smaller than just one Fourier transform width. For instance, I would like to know how the cross-correlation of two signals varies over time. Or I would like to evaluate many offsets into a signal to choose which best lines my block up with my Fourier window. The Short Time Fourier Transform is the common solution to this. The algorithm, I fear, is dumb as rocks -- if you would like to split evaluate a signal with $k$ overlapping parts, you just... do $k$ Fourier transforms of your choice.

It feels like there ought be a better way. With all of the beauty and brains of the Cooley-Tukey Fast Fourier Transform, they so very exquisitely reuse computation to form the output. But when you do a STFT on top of that, if you want $k$ overlapping blocks, the cost is... $O(kn log n)$. Even though each of the $k$ may have all but one sample overlapping in there! It feels like we must be throwing away so much valuable information to just repeatedly redo a FT over and over again.

Dear Mathstodon, I beg of you. What am I missing here? If I want to do many closely overlapping Fourier transforms, *is* there a better way? Is this a sign that I am trying to do something fully wrong -- and if so, what's the better way to do, for example, the things I described before? Or have I just stumbled in to the dark shadow of frequency-domain operations that we dare not speak of?