📊 It's a bit late, but time for our (bi)weekly quiz! Today, a thing I probably* talked about before, but that will almost certainly* come in very handy at some point, even if it does look a bit random*.

"Low-space Chebyshev!" 🎲 Or: randomness, but cheap. #weaeklyquiz

* Bad puns, everywhere.

1/11

Suppose you have an algorithm A, randomized as many algos are nowadays, and somewhere in its analysis you end up having to show that the sum of a bunch of independent things, X=∑ₖ Xₖ, is close to its expected value: X≈𝔼[X].

(That is, you want to prove concentration of X around its expectation)

2/11

Say you want to show this is true with high constant probability: "X = 𝔼[X] ± small deviation" with probability, say, .99.
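
In symbols, writing t for whatever deviation you can afford, the goal reads:

```latex
\Pr\bigl[\,\lvert X - \mathbb{E}[X]\rvert \le t\,\bigr] \;\ge\; 0.99,
\qquad\text{where } X = \sum_{k} X_k .
```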

That happens a lot! Streaming algos, learning algos, pretty much all over the place.

So you want to do that. The Xₖ's are the result of what your #algorithm does (or how you choose to analyze it), so...

3/11

... typically a bit hard to analyze. All you know is that they're independent, and after some excruciating computations you managed to compute the expectation and bound the variance of every Xₖ. That should be enough to get that "high-probability statement"?

Q1: Who you gonna call? 👻

4/11

Wassily Hoeffding... (40.9%)
Pafnuty Chebyshev? (31.8%)
Andrej Markov! (13.6%)
I call a friend 👀 (13.6%)
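
(For reference, and since the title gives it away anyway: the variance-based route here is Chebyshev's inequality, which needs nothing more than the expectation and the variance you just computed:)

```latex
\Pr\bigl[\,\lvert X - \mathbb{E}[X]\rvert \ge t\,\bigr] \;\le\; \frac{\operatorname{Var}[X]}{t^{2}},
\qquad\text{with}\quad
\operatorname{Var}[X] = \sum_{k}\operatorname{Var}[X_k] \ \text{ for independent } X_k.
```

Taking t = 10·√(Var[X]) makes the right-hand side 1/100, i.e., failure probability at most 1%.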

Now, you managed to do that, and lo and behold, it worked! You proved that your algo returns a correct (or accurate enough) solution with probability at least 99%.

Huzzah!

Except that someone points out to you that, unfortunately, those Xₖ's of yours were NOT independent.

5/11

That's a bummer, potentially. Alright, maybe you can still salvage something? Is your whole analysis (and possibly your algorithm) doomed?

Maybe not? 🤞

Q2: What's the minimum thing you'd need the Xₖ's to be for your analysis to still go through?

6/11

Negatively associated (0%)
Negatively correlated (55.6%)
Nothing, all good! (16.7%)
I call a friend, I said 👀 (27.8%)
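
(A quick way to see what's at stake: the variance of a sum decomposes as below, so the variance bound, and with it the whole argument, only cares about the cross terms, not about full independence.)

```latex
\operatorname{Var}\Bigl[\sum_{k} X_k\Bigr]
= \sum_{k}\operatorname{Var}[X_k] \;+\; \sum_{j\neq k}\operatorname{Cov}(X_j, X_k).
```

As long as every Cov(Xⱼ, Xₖ) is zero or non-positive, Var[X] ≤ ∑ₖ Var[Xₖ] still holds and the analysis goes through unchanged.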

Anyways, turns out, it's all good. Your variance-based analysis is safe! Your algo is correct!

You can breathe again.

Or... can you? There's still some small issue... actually, not so small: your algorithm A? It uses a LOT of memory. All those random bits to generate and store! All that space!

7/11

Well, why is that? If each Xₖ-thingy your algo computes requires some number of random bits to be generated, and all the Xₖ's are independent, then that often adds up to a LOT of random bits to generate overall.

And often, you need to keep all those random bits available during the whole computation!

8/11

(Typical example: generate a random matrix M, and multiply it with each data point coming to you, sequentially. Need to store M!)
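
(A toy sketch of that naive approach, in Python with made-up dimensions, just to make the memory cost concrete:)

```python
import numpy as np

# Hypothetical dimensions, purely for illustration.
d, m = 10_000, 200                  # input dimension, sketch dimension

rng = np.random.default_rng()
M = rng.standard_normal((m, d))     # m*d random draws... and m*d numbers to keep around

def sketch_stream(stream):
    """Apply the same random matrix M to every data point arriving online."""
    for x in stream:                # x: a length-d vector from the stream
        yield M @ x                 # M must stay in memory for the whole stream
```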

That's bad. Good random bits 🎲 are hard to come by (💰), and memory is not cheap either.

9/11

Say for simplicity that each Xₖ only requires one (1) random bit to be generated. But you have n of those (1≤k≤n), and for your analysis you needed them to be either independent, or whatever you answered in Q2.

That's n 🎲 bits! Can you use fewer?

Q3: How many random bits does your algo need? Much less than n?

10/11

Not really, Ω(n) (17.6%)
Yes, O(log n) (41.2%)
Yes, O(√n) (29.4%)
I want my phone call (11.8%)
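
One classical trick worth keeping in your back pocket (a hint, not necessarily the full answer to Q3): a short, truly random seed can be stretched into many pairwise-independent bits. A minimal Python sketch of the inner-product-mod-2 construction:

```python
import random

def pairwise_independent_bits(n, rng=random):
    """Stretch ~log2(n) truly random bits into n pairwise-independent bits.

    Draw one m-bit seed s with 2^m - 1 >= n, and output X_k = <s, k> mod 2
    (inner product of s with the binary expansion of k) for k = 1..n.
    Any two of these bits are uniform and independent, even though the
    whole collection is certainly not fully independent.
    """
    m = max(1, n.bit_length())            # m = O(log n) seed bits suffice
    seed = rng.getrandbits(m)             # the only truly random bits used
    return [bin(seed & k).count("1") % 2  # parity of (seed AND k) = <seed, k> mod 2
            for k in range(1, n + 1)]     # k = 0 is skipped: it would always give 0
```

Pairwise independence is exactly what a variance-based argument needs, and the cost drops from n truly random bits to about log₂ n.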

Alright, that's all for today! I want to apologize for not mentioning Herman Chernoff in Q1: I ran out of space for poll options.

Hope you found this #weaeklyquiz enjoyable! As usual, answers and discussion in a few days. Until then, please write comments and questions below!

11/end