what's the weirdest thing you've stumbled upon in #RL #reinforcementlearning ?
i'll start:
if you're using neural nets,* you don't actually need any rewards to train an agent optimally on episodic CartPole.
* or any positive value initialization
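a minimal sketch of why (on a hypothetical toy episodic task, not CartPole itself, and with a tabular Q instead of a neural net): with all rewards set to zero, terminal transitions bootstrap to 0 while non-terminal ones bootstrap to gamma * max Q > 0 under optimistic initialization, so the action that ends the episode loses value fastest and the greedy policy learns to stay alive.

```python
import random

# Toy illustration (assumed setup, not the original experiment):
# tabular Q-learning with ZERO reward everywhere and optimistic
# initialization (Q starts at +1). Falling off the left edge of a
# small chain ends the episode. Terminal transitions bootstrap to 0,
# live transitions bootstrap to gamma * max Q > 0, so the terminating
# action's value collapses fastest and the greedy policy avoids it.

random.seed(0)
N = 5                      # positions 1..N-1 are "alive"; 0 is terminal ("fallen")
ACTIONS = (-1, +1)         # action 0 = move left, action 1 = move right
gamma, alpha, eps = 0.9, 0.5, 0.3
Q = {(s, a): 1.0 for s in range(N) for a in range(2)}  # optimistic init

for episode in range(200):
    s = N // 2
    for _ in range(50):
        if random.random() < eps:
            a = random.randrange(2)
        else:
            a = max(range(2), key=lambda b: Q[(s, b)])
        s2 = min(s + ACTIONS[a], N - 1)        # right edge just clips
        done = (s2 == 0)                       # left edge terminates
        # reward is always 0; only the bootstrap term carries signal
        target = 0.0 if done else gamma * max(Q[(s2, b)] for b in range(2))
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        if done:
            break
        s = s2

# At position 1 (one step from falling), the greedy action is the one
# that keeps the episode going -- learned with no reward at all.
print(max(range(2), key=lambda a: Q[(1, a)]))
```

note the asymmetry doing the work: every positive Q value decays toward 0, but the terminating action is halved on each terminal update while live actions shrink only by a factor of (1 - alpha + alpha * gamma) per bootstrap, so "stay alive" wins the argmax throughout training.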