Large Language Models Are Semi-Parametric Reinforcement Learning Agents

Zhang, Danyang; Chen, Lu; Zhang, Situo; Xu, Hongshen; Zhao, Zihan; Yu, Kai

arXiv:2306.07929 (cs)

[Submitted on 9 Jun 2023 (v1), last revised 30 Oct 2023 (this version, v2)]

Abstract: Inspired by insights from cognitive science into human memory and reasoning mechanisms, we propose REMEMBERER, a novel evolvable agent framework based on a Large Language Model (LLM). By equipping the LLM with a long-term experience memory, REMEMBERER can exploit experiences from past episodes, even those collected for different task goals, which gives it an advantage over LLM-based agents that rely on fixed exemplars or a transient working memory. We further introduce Reinforcement Learning with Experience Memory (RLEM) to update the memory. The whole system can thus learn from both successful and failed experiences and improve its capability without fine-tuning the parameters of the LLM. In this way, REMEMBERER constitutes a semi-parametric RL agent. Extensive experiments are conducted on two RL task sets to evaluate the proposed framework. Averaged over different initializations and training sets, the success rates exceed the prior SOTA by 4% and 2% on the two task sets, demonstrating the superiority and robustness of REMEMBERER.
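The abstract does not say how the memory is stored or updated, so the following is only a minimal sketch of one way an external experience memory could be updated by reinforcement learning while the LLM itself stays frozen. It assumes a table keyed by (task goal, observation) and a tabular Q-learning-style update. The class name `ExperienceMemory`, the parameters `alpha` and `gamma`, and the split into encouraged and discouraged actions are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class ExperienceMemory:
    """Hypothetical long-term memory: (goal, observation) -> {action: value estimate}."""
    alpha: float = 0.5   # learning rate (assumed value)
    gamma: float = 0.9   # discount factor (assumed value)
    table: dict = field(default_factory=dict)

    def max_value(self, goal, obs):
        # Best known value at this state; 0.0 when the state was never seen.
        actions = self.table.get((goal, obs), {})
        return max(actions.values(), default=0.0)

    def update(self, goal, obs, action, reward, next_obs):
        # Q-learning-style update (an assumption, not the paper's stated rule).
        actions = self.table.setdefault((goal, obs), {})
        old = actions.get(action, 0.0)
        target = reward + self.gamma * self.max_value(goal, next_obs)
        actions[action] = old + self.alpha * (target - old)

    def exemplars(self, goal, obs):
        # Split remembered actions into ones to encourage and ones to discourage,
        # so that both successes and failures can be shown to the LLM in its prompt.
        actions = self.table.get((goal, obs), {})
        ranked = sorted(actions.items(), key=lambda kv: kv[1], reverse=True)
        encouraged = [a for a, q in ranked if q > 0]
        discouraged = [a for a, q in ranked if q <= 0]
        return encouraged, discouraged
```

In this sketch only the table changes between episodes. An agent loop would call `update` after each environment step and pass the result of `exemplars` to the LLM as in-context examples, which is one reading of the "semi-parametric" description: frozen model parameters plus a memory that keeps growing.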

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2306.07929 [cs.CL]
	(or arXiv:2306.07929v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2306.07929

Submission history

From: Danyang Zhang
[v1] Fri, 9 Jun 2023 08:08:18 UTC (287 KB)
[v2] Mon, 30 Oct 2023 01:52:11 UTC (778 KB)
