I am looking for a sensible measure of association between two time series of counts that also contain a lot of zeros. For some odd reason I can't seem to find much literature on that. Can anyone point me in a useful direction, preferably with a ready-to-use #RStats function?

If not, I can implement the measure myself as well.

Dr. U Apr 25, 2023

@JorisMeys What about dynamic time warping to estimate similarity?

Joris Meys Apr 25, 2023

@transportationtalk I've been looking into that, but the excess of zeroes makes it less straightforward than I'd hoped. Euclidean distances in the dtw algorithm aren't very meaningful, I'm afraid.
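
One way the zeroes can undermine DTW, sketched in plain Python rather than R: with unconstrained warping, long runs of zeroes can be stretched at no cost, so two series with a single detection at very different times end up at distance zero. This is an illustrative toy, not necessarily the exact problem meant above; `dtw_distance` is a hypothetical helper written for this example, using absolute difference as the local cost.

```python
def dtw_distance(x, y):
    """Unconstrained DTW distance with absolute difference as the local cost."""
    inf = float("inf")
    n, m = len(x), len(y)
    # cost[i][j] = cheapest alignment of x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

n = 50
early = [0] * n
late = [0] * n
early[5] = 3    # single detection early in the series
late[40] = 3    # single detection much later

pointwise = sum(abs(a - b) for a, b in zip(early, late))
print(pointwise)                  # 6
print(dtw_distance(early, late))  # 0.0
```

A warping-window constraint (for example a Sakoe-Chiba band) limits how far the zero runs can be stretched, which is one knob to try before discarding DTW altogether.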

Bob SomeAle Apr 25, 2023

@JorisMeys @transportationtalk Are the zeroes meaningful? That is to say, does a correlation between the zeroes indicate a similarity between the time series?

Joris Meys Apr 25, 2023

@bob_some_ale @transportationtalk Yes, the zeroes are meaningful in the sense that they indicate there was no detection. The problem is that we have very few detections, so classical measures give very high associations regardless of detections, simply because of the many zeroes.
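
A small illustration of that inflation, in plain Python and on simulated data: two independent sparse series agree at almost every time point simply because both are usually zero. The simple matching coefficient stands in here for "classical measures", which is an assumption about what is meant. The Jaccard index is shown only for contrast, since it ignores joint absences entirely and so discards the zeroes that are meaningful in this setting.

```python
import random

def simple_matching(x, y):
    """Share of time points where both series agree on detection vs. no detection."""
    agree = sum((a > 0) == (b > 0) for a, b in zip(x, y))
    return agree / len(x)

def jaccard(x, y):
    """Co-detections divided by time points where at least one series has a detection."""
    both = sum(a > 0 and b > 0 for a, b in zip(x, y))
    either = sum(a > 0 or b > 0 for a, b in zip(x, y))
    return both / either if either else float("nan")

rng = random.Random(1)
n = 1000
# Two independent sparse count series: roughly 2% of time points have a detection.
x = [rng.randint(1, 3) if rng.random() < 0.02 else 0 for _ in range(n)]
y = [rng.randint(1, 3) if rng.random() < 0.02 else 0 for _ in range(n)]

print("simple matching:", simple_matching(x, y))  # dominated by joint zeroes
print("jaccard:", jaccard(x, y))                  # based on detections only
```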

Bob SomeAle Apr 25, 2023

@JorisMeys @transportationtalk What if you turn the absence of detection into your primary variable? Maybe the rolling sum of days (or whatever your time period is) with no detection. Then you could use some basic time series analyses (ARIMA probably) to detect lags and correlation between the two. For example, here are two cumulative series, perfectly correlated but lagged by 4.
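
A minimal sketch of this idea in plain Python, on synthetic data built so that the second series trails the first by 4 steps. It uses a simple scan over lagged Pearson correlations, not ARIMA, and the helper names (`rolling_zero_count`, `best_lag`) and the 14-step window are assumptions made for the example.

```python
import math
import random

def rolling_zero_count(series, window):
    """Number of time points without a detection in each trailing window."""
    is_zero = [1 if v == 0 else 0 for v in series]
    return [sum(is_zero[i - window + 1:i + 1]) for i in range(window - 1, len(series))]

def pearson(a, b):
    """Pearson correlation; NaN if either input has zero variance."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else float("nan")

def best_lag(a, b, max_lag):
    """Lag k in 0..max_lag at which a[t] correlates best with b[t + k]."""
    scores = {k: pearson(a[:len(a) - k], b[k:]) for k in range(max_lag + 1)}
    return max(scores, key=scores.get), scores

rng = random.Random(42)
n, true_lag = 300, 4
# Sparse counts: about 10% of time points have a detection.
x = [rng.randint(1, 3) if rng.random() < 0.1 else 0 for _ in range(n)]
y = [0] * true_lag + x[:-true_lag]   # same series, delayed by true_lag steps

rx = rolling_zero_count(x, 14)
ry = rolling_zero_count(y, 14)
lag, scores = best_lag(rx, ry, 10)
print("estimated lag:", lag)
```

Because the rolling counts are strongly autocorrelated, neighbouring lags also score high; the peak identifies the lag, but the correlation values themselves should not be read as evidence of association without further checks.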

Nicola Rennie Apr 25, 2023

@JorisMeys

Dynamical correlation, maybe?

https://cran.r-project.org/web/packages/dynCorr/dynCorr.pdf

Not sure how well it will work when there are lots of zeros, but I've found it useful in other situations where more common methods have failed.

Joris Meys Apr 25, 2023

@nrennie Interesting approach, thanks! It looks like a correlation between polynomial smoothers fitted through the time series, which might not be that easy with the many zeroes. But we could try a similar approach with a different smoother (e.g. a floating average, or something based on a zero-inflated approach).
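
A rough sketch of the floating-average variant in plain Python. This is not the dynCorr method; it only correlates two centered moving averages. The simulated data, the 21-step window, and the helper names are all assumptions made for the example: both series share periods of raised detection probability, but their detections are drawn independently.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation; NaN if either input has zero variance."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else float("nan")

def moving_average(series, window):
    """Centered moving average (odd window); only full windows are returned."""
    half = window // 2
    return [sum(series[i - half:i + half + 1]) / window
            for i in range(half, len(series) - half)]

rng = random.Random(7)
n = 600
active = [(100, 150), (300, 350), (500, 550)]  # shared high-activity periods

def detection_prob(t):
    return 0.3 if any(lo <= t < hi for lo, hi in active) else 0.01

def simulate():
    """Sparse counts whose detection probability follows the shared periods."""
    return [rng.randint(1, 3) if rng.random() < detection_prob(t) else 0
            for t in range(n)]

x, y = simulate(), simulate()
raw = pearson(x, y)
smoothed = pearson(moving_average(x, 21), moving_average(y, 21))
print("raw:", raw, "smoothed:", smoothed)
```

Smoothing introduces autocorrelation, so the smoothed correlation will tend to be larger in magnitude even for unrelated series; comparing it against a permutation or time-shifted baseline would be a sensible next step.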

Sylvia Wenmackers 🦉🍀 Apr 25, 2023

@JorisMeys Does coarse-graining (binning) help to get rid of the zeros? (And perhaps average over several such measures to smooth out the exact position of the bins?)
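
A minimal sketch of this suggestion in plain Python: sum the counts in bins of a fixed width, correlate the binned series, and average the result over every possible bin offset so that no single placement of the bin edges dominates. The helper names, the bin width of 14, and the simulated data (the second series repeats the first with a timing jitter of up to two steps) are assumptions made for the example.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation; NaN if either input has zero variance."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else float("nan")

def bin_sums(series, width, offset):
    """Sum counts in consecutive bins starting at `offset`; incomplete bins are dropped."""
    return [sum(series[i:i + width])
            for i in range(offset, len(series) - width + 1, width)]

def offset_averaged_correlation(x, y, width):
    """Mean binned correlation over all bin offsets 0..width-1."""
    cors = [pearson(bin_sums(x, width, o), bin_sums(y, width, o)) for o in range(width)]
    return sum(cors) / len(cors)

rng = random.Random(3)
n = 700
x = [rng.randint(1, 3) if rng.random() < 0.05 else 0 for _ in range(n)]
# y repeats x's detections, each moved by a random jitter of -2..2 steps.
y = [0] * n
for t, count in enumerate(x):
    if count:
        shifted = min(max(t + rng.randint(-2, 2), 0), n - 1)
        y[shifted] += count

raw = pearson(x, y)
binned = offset_averaged_correlation(x, y, 14)
print("raw:", raw, "binned:", binned)
```

The bin width sets the timing tolerance: detections that fall within the same bin count as co-occurring, so the result should be checked across a few widths rather than taken from one.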

Joris Meys Apr 25, 2023

@SylviaFysica Definitely something to try, thanks for the suggestion! I might try some rolling averages and see where that leads us.