This #DeepRL paper from the University of Alberta seems quite cool:
"Deep reinforcement learning without experience replay, target networks, or batch updates"
As the title says, they succeeded in training deep RL networks in a streaming setting, getting rid of replay buffers.
The main tricks that make this work seem to be signal normalization and bounding the step size 🤯
💻Code: http://github.com/mohmdelsayed/streaming-drl
📄Paper: https://openreview.net/pdf?id=yqQJGTDGXN
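
For anyone curious what those two tricks could look like in code, here is a rough sketch. It is my own simplification and not the authors' implementation: `RunningNormalizer`, `bounded_step_size`, and the `kappa` parameter are hypothetical names, and the exact bound used in the paper may differ. The idea is to normalize each incoming signal with running statistics (no stored batch needed) and to shrink the step size whenever the update would otherwise be too large.

```python
import math


class RunningNormalizer:
    """Running mean/variance of a scalar signal (Welford's online algorithm)."""

    def __init__(self, eps=1e-8):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean
        self.eps = eps

    def update(self, x):
        # Incorporate one new sample without storing past samples.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def normalize(self, x):
        # With fewer than two samples the variance is undefined, so only center.
        if self.count < 2:
            return x - self.mean
        std = math.sqrt(self.m2 / (self.count - 1))
        return (x - self.mean) / (std + self.eps)


def bounded_step_size(alpha, td_error, trace, kappa=2.0):
    """Shrink alpha when the update would be large (assumed, simplified rule).

    alpha:    nominal step size
    td_error: current temporal-difference error
    trace:    iterable of eligibility-trace (or gradient) entries
    kappa:    hypothetical safety factor
    """
    trace_l1 = sum(abs(z) for z in trace)
    scale = alpha * kappa * max(abs(td_error), 1.0) * trace_l1
    # Never exceed the nominal step size; divide it down when scale > 1.
    return alpha / max(scale, 1.0)
```

With a small error and trace the nominal step size is kept, while a large TD error or trace norm scales it down, which is roughly what "bounding the step size" means here.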
