Mastodawn

Title: P2: causal inference A/B learning [2024-10-29 Tue]
make this hack automatic.
#dailyreport #abtest #multiarmedbandit #mab

anoncheg Apr 23

Title: P2: P0: causal inference A/B learning [2024-10-29 Tue]
Popular algorithms:
- Upper Confidence Bound (UCB) - deterministic, optimal
- Thompson Sampling - stochastic, optimal
- Epsilon Greedy - stochastic, approximate
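
A minimal sketch of how the three selection rules above could look, assuming 0/1 (Bernoulli) rewards such as click / no click. The function names, the UCB1 form of the confidence bonus, the default epsilon of 0.1 and the uniform Beta(1, 1) prior are all assumptions made for illustration, not something stated in the post. `counts[i]` is how many times arm i was played and `sums[i]` is the total reward it returned.

```python
import math
import random


def epsilon_greedy(counts, sums, epsilon=0.1, rng=random):
    """Stochastic, approximate: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(counts))
    # Unplayed arms get a mean of 0.0 here (an assumption of this sketch).
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return max(range(len(counts)), key=means.__getitem__)


def ucb1(counts, sums):
    """Deterministic: highest empirical mean plus a confidence bonus (UCB1 form)."""
    for arm, c in enumerate(counts):
        if c == 0:
            return arm  # play every arm once before using the bonus
    total = sum(counts)
    scores = [s / c + math.sqrt(2 * math.log(total) / c)
              for s, c in zip(sums, counts)]
    return max(range(len(counts)), key=scores.__getitem__)


def thompson(counts, sums, rng=random):
    """Stochastic: draw from each arm's Beta posterior, pick the largest draw."""
    # Beta(1 + successes, 1 + failures), assuming 0/1 rewards and a uniform prior.
    draws = [rng.betavariate(1 + s, 1 + c - s) for s, c in zip(sums, counts)]
    return max(range(len(counts)), key=draws.__getitem__)
```

Each function only picks the next arm; the caller is expected to play that arm, observe the reward, and update `counts` and `sums`.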

Thompson Sampling and UCB have an asymptotic regret upper #dailyreport #abtest #multiarmedbandit #mab

anoncheg Apr 23

Title: P1: causal inference A/B learning [2024-10-29 Tue]
bound (where N is the number of arms and T is the number
of time steps).
: O(√(N*T*log(T)))

*regret* is the difference between the maximum possible
reward and the reward actually collected. *optimal* means
the algorithm is able to achieve minimal regret as T → ∞.
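
A small simulation sketch of the regret definition above, assuming Bernoulli arms with known true rates (only available in a simulation, never in a live A/B test). It accumulates the expected regret: at each step, the best arm's rate minus the rate of the arm actually chosen. The names `simulate_regret` and `ucb1`, and the UCB1 form of the bonus, are illustrative assumptions.

```python
import math
import random


def ucb1(counts, sums):
    """UCB1 arm choice: empirical mean plus confidence bonus."""
    for arm, c in enumerate(counts):
        if c == 0:
            return arm  # play every arm once first
    total = sum(counts)
    scores = [s / c + math.sqrt(2 * math.log(total) / c)
              for s, c in zip(sums, counts)]
    return max(range(len(counts)), key=scores.__getitem__)


def simulate_regret(true_rates, choose, steps, seed=0):
    """Run `steps` rounds with policy `choose` and return the expected regret."""
    rng = random.Random(seed)
    counts = [0] * len(true_rates)
    sums = [0] * len(true_rates)
    best = max(true_rates)  # reward rate of the best possible arm
    regret = 0.0
    for _ in range(steps):
        arm = choose(counts, sums)
        reward = 1 if rng.random() < true_rates[arm] else 0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - true_rates[arm]  # loss versus always playing the best arm
    return regret


if __name__ == "__main__":
    rates = [0.2, 0.8]  # hypothetical conversion rates of variants A and B
    print("always A:", simulate_regret(rates, lambda c, s: 0, 1000))
    print("UCB1:    ", simulate_regret(rates, ucb1, 1000))
```

A policy that never leaves the worse arm accumulates regret linearly in T, while a bandit policy such as UCB1 should accumulate it much more slowly.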

I hacked my first remote machine. I created a separate
account and cleared logs. I didn't break any
configuration. Now I am going to spend a day or so to #dailyreport #abtest #multiarmedbandit #mab

anoncheg Apr 23

Title: P1: P0: causal inference A/B learning [2024-10-29 Tue]
I have been reading about solving causal inference A/B
tests as a Multi-Armed Bandit problem. These are ML
reinforcement learning algorithms that are applied right
away and improve over time. #dailyreport #abtest #multiarmedbandit #mab