How does a #ReinforcementLearning agent decide what to do? Part 3 of my RL series tackles this by defining policies, Markov decision processes (MDPs), and trajectories. We'll keep building toward a full understanding of PPO!

https://shawnhymel.com/3328/reinforcement-learning-part-3-policies-markov-decision-processes-mdps-and-trajectories/?utm_source=mastodon&utm_medium=social&utm_campaign=rl_blog

#AI #MachineLearning #robotics #engineering #education

Reinforcement Learning Part 3: Policies, Markov Decision Processes (MDPs), and Trajectories - Shawn Hymel

In the third part of this reinforcement learning (RL) series, we’re going to give a formal definition for a policy and then conceptualize how actions and

Shawn Hymel