Mark Riedl Nov 22, 2022

Is it time to rethink exploration in reinforcement learning to be about more than just finding the best policy for the immediate task at hand?

@balloch and I say yes: https://arxiv.org/abs/2210.06168

@rockt and Grefenstette say yes: https://arxiv.org/abs/2211.07819

The Role of Exploration for Task Transfer in Reinforcement Learning

The exploration–exploitation trade-off in reinforcement learning (RL) is a well-known and much-studied problem that balances greedy action selection with novel experience, and the study of exploration methods is usually only considered in the context of learning the optimal policy for a single learning task. However, in the context of online task transfer, where there is a change to the task during online operation, we hypothesize that exploration strategies that anticipate the need to adapt to future tasks can have a pronounced impact on the efficiency of transfer. As such, we re-examine the exploration–exploitation trade-off in the context of transfer learning. In this work, we review reinforcement learning exploration methods, define a taxonomy with which to organize them, analyze these methods' differences in the context of task transfer, and suggest avenues for future investigation.