Teaching ChatGPT-4o is a great way to learn.

It's always nice to find you know something ChatGPT doesn't, since it typically means you know something most specialists in the field don't either:
https://chatgpt.com/share/67ac8053-bcf8-8002-951a-89bda381d3ff

#LLM #mathematics #MarkovChains #MDPs

ChatGPT - MDP vs Markov Chain

'Model-Free Representation Learning and Exploration in Low-Rank MDPs', by Aditya Modi, Jinglin Chen, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal.

http://jmlr.org/papers/v25/22-0687.html

#reinforcement #exploration #mdps

'Q-Learning for MDPs with General Spaces: Convergence and Near Optimality via Quantization under Weak Continuity', by Ali Kara, Naci Saldi, Serdar Yüksel.

http://jmlr.org/papers/v24/21-1457.html

#quantization #quantized #mdps

'Provably Sample-Efficient Model-Free Algorithm for MDPs with Peak Constraints', by Qinbo Bai, Vaneet Aggarwal, Ather Gattami.

http://jmlr.org/papers/v24/21-0117.html

#mdps #markov #pcmdp
