Map-based experience replay: a memory-efficient solution to catastrophic forgetting in reinforcement learning
Frontiers in Neurorobotics,
Volume 17,
pages 1127642,
doi: 10.3389/fnbot.2023.1127642
- Jun 2023
Deep reinforcement learning (RL) agents often suffer from catastrophic forgetting: they forget previously found solutions in parts of the input space when training on new data. Replay memories are a common solution to this problem, decorrelating and shuffling old and new training samples. However, they naively store state transitions as they arrive, without regard for redundancy. We introduce a novel cognitively inspired replay memory approach based on the Grow-When-Required (GWR) self-organizing network, which resembles a map-based mental model of the world. Our approach organizes stored transitions into a concise, environment-model-like network of state nodes and transition edges, merging similar samples to reduce the memory size and increase the pairwise distance among samples, which increases the relevance of each sample. Overall, our study shows that map-based experience replay allows for a significant reduction in memory size with only a small decrease in performance.
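To illustrate the idea described in the abstract, the following is a minimal Python sketch of a map-like replay memory in the spirit of GWR: similar states are merged into prototype nodes, and transitions become edges between nodes. The class name, the distance threshold, the merging rule, and the edge-based sampling scheme are illustrative assumptions, not the authors' exact algorithm.

    # Illustrative sketch only: a map-like replay memory inspired by GWR.
    # States are merged into prototype nodes when they are similar enough,
    # and transitions are stored as edges between nodes.
    import random
    import numpy as np

    class MapReplayMemory:
        def __init__(self, insert_threshold=0.5, lr=0.1):
            self.nodes = []   # prototype state vectors
            self.edges = {}   # (i, j) -> list of (action, reward, done) samples
            self.insert_threshold = insert_threshold  # distance beyond which a new node is grown
            self.lr = lr      # how strongly a prototype moves toward a merged state

        def _nearest(self, state):
            dists = [np.linalg.norm(state - n) for n in self.nodes]
            i = int(np.argmin(dists))
            return i, dists[i]

        def _match(self, state):
            # Return the index of the node representing `state`, growing a new one if needed.
            if not self.nodes:
                self.nodes.append(np.asarray(state, dtype=float).copy())
                return 0
            i, d = self._nearest(state)
            if d > self.insert_threshold:   # too far from every prototype: grow a new node
                self.nodes.append(np.asarray(state, dtype=float).copy())
                return len(self.nodes) - 1
            self.nodes[i] += self.lr * (state - self.nodes[i])  # merge: nudge prototype toward sample
            return i

        def add(self, state, action, reward, next_state, done):
            # Store the transition as an edge between the matched state nodes.
            i, j = self._match(state), self._match(next_state)
            self.edges.setdefault((i, j), []).append((action, reward, done))

        def sample(self, batch_size):
            # Sample transitions by picking edges and reconstructing states from prototypes.
            keys = random.choices(list(self.edges), k=batch_size)
            batch = []
            for (i, j) in keys:
                action, reward, done = random.choice(self.edges[(i, j)])
                batch.append((self.nodes[i], action, reward, self.nodes[j], done))
            return batch

Because redundant transitions collapse onto existing nodes and edges, the memory footprint grows with the diversity of visited states rather than with the raw number of stored transitions.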
@Article{HIWW23,
  author  = {Hafez, Burhan and Immisch, Tilman and Weber, Tom and Wermter, Stefan},
  title   = {Map-based experience replay: a memory-efficient solution to catastrophic forgetting in reinforcement learning},
  journal = {Frontiers in Neurorobotics},
  volume  = {17},
  pages   = {1127642},
  year    = {2023},
  month   = {Jun},
  doi     = {10.3389/fnbot.2023.1127642},
}