Mental Modelling of Reinforcement Learning Agents by Language Models

Transactions on Machine Learning Research, doi: 10.48550/arXiv.2406.18505 - Jan 2025 Open Access
Can emergent language models faithfully model the intelligence of decision-making agents? Although modern language models already exhibit some reasoning ability and can, in theory, express any probability distribution over tokens, it remains underexplored how the world knowledge these pre-trained models have memorised can be utilised to comprehend an agent's behaviour in the physical world. This paper empirically examines, for the first time, how well large language models (LLMs) can build a mental model of reinforcement learning (RL) agents, termed agent mental modelling, by reasoning about an agent's behaviour and its effect on states from the agent's interaction history. This research attempts to unveil the potential of leveraging LLMs for elucidating RL agent behaviour, addressing a key challenge in explainable RL. To this end, we propose specific evaluation metrics and test them on selected RL task datasets of varying complexity, reporting findings on agent mental model establishment. Our results show that LLMs are not yet capable of fully realising the mental modelling of agents through inference alone, without further innovations. This work thus provides new insights into the capabilities and limitations of modern LLMs, highlighting that while they show promise in understanding agents given a longer history context, preexisting beliefs within LLMs about behavioural optimum and state complexity limit their ability to fully comprehend an agent's behaviour and action effects.

 

@Article{LZSLW25,
  author  = {Lu, Wenhao and Zhao, Xufeng and Spisak, Josua and Lee, Jae Hee and Wermter, Stefan},
  title   = {Mental Modelling of Reinforcement Learning Agents by Language Models},
  journal = {Transactions on Machine Learning Research},
  year    = {2025},
  month   = {Jan},
  doi     = {10.48550/arXiv.2406.18505},
}