Can large language models explore in-context?

Krishnamurthy, Akshay; Harris, Keegan; Foster, Dylan J.; Zhang, Cyril; Slivkins, Aleksandrs

arXiv:2403.15371 (cs)

[Submitted on 22 Mar 2024 (v1), last revised 28 Oct 2024 (this version, v3)]

Abstract: We investigate the extent to which contemporary Large Language Models (LLMs) can engage in exploration, a core capability in reinforcement learning and decision making. We focus on native performance of existing LLMs, without training interventions. We deploy LLMs as agents in simple multi-armed bandit environments, specifying the environment description and interaction history entirely in-context, i.e., within the LLM prompt. We experiment with GPT-3.5, GPT-4, and Llama2, using a variety of prompt designs, and find that the models do not robustly engage in exploration without substantial interventions: i) Across all of our experiments, only one configuration resulted in satisfactory exploratory behavior: GPT-4 with chain-of-thought reasoning and an externally summarized interaction history, presented as sufficient statistics; ii) All other configurations did not result in robust exploratory behavior, including those with chain-of-thought reasoning but unsummarized history. Although these findings can be interpreted positively, they suggest that external summarization -- which may not be possible in more complex settings -- is important for obtaining desirable behavior from LLM agents. We conclude that non-trivial algorithmic interventions, such as fine-tuning or dataset curation, may be required to empower LLM-based decision making agents in complex settings.
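The contrast the abstract draws between an unsummarized interaction history and one "presented as sufficient statistics" can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the authors' experimental setup: the class `BernoulliBandit`, the functions `raw_history`, `summarized_history`, and `run_episode`, the text templates, and the random placeholder policy standing in for an LLM call are all hypothetical names and choices introduced here. For a bandit with 0/1 rewards, per-arm pull counts and average rewards are sufficient statistics of the history.

```python
import random


class BernoulliBandit:
    """Hypothetical K-armed bandit: arm i pays reward 1 with probability means[i]."""

    def __init__(self, means, seed=0):
        self.means = list(means)
        self.rng = random.Random(seed)

    def pull(self, arm):
        return 1 if self.rng.random() < self.means[arm] else 0


def raw_history(history):
    """Unsummarized history: one text line per past (arm, reward) round."""
    return "\n".join(
        f"Round {t}: arm {arm}, reward {reward}"
        for t, (arm, reward) in enumerate(history, start=1)
    )


def summarized_history(history, n_arms):
    """Externally summarized history: per-arm pull count and average reward."""
    counts = [0] * n_arms
    totals = [0] * n_arms
    for arm, reward in history:
        counts[arm] += 1
        totals[arm] += reward
    lines = []
    for arm in range(n_arms):
        if counts[arm] == 0:
            lines.append(f"Arm {arm}: pulled 0 times")
        else:
            avg = totals[arm] / counts[arm]
            lines.append(
                f"Arm {arm}: pulled {counts[arm]} times, average reward {avg:.2f}"
            )
    return "\n".join(lines)


def run_episode(bandit, n_rounds, choose_arm):
    """Interaction loop: build the in-context history text, ask the policy, pull."""
    n_arms = len(bandit.means)
    history = []
    for _ in range(n_rounds):
        prompt_text = summarized_history(history, n_arms)
        arm = choose_arm(prompt_text, n_arms)  # an LLM call would go here
        history.append((arm, bandit.pull(arm)))
    return history


if __name__ == "__main__":
    picker = random.Random(1)
    # Placeholder policy (assumption): picks uniformly at random, ignoring the text.
    episode = run_episode(
        BernoulliBandit([0.2, 0.8]), 10, lambda text, k: picker.randrange(k)
    )
    print(raw_history(episode))
    print(summarized_history(episode, 2))
```

In this sketch the summarized text stays the same length no matter how many rounds have elapsed, whereas the raw text grows by one line per round. The summary is computed outside the model, which is the sense in which the summarization is "external".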

Comments:	Accepted to NeurIPS 2024. This version: added references to related and concurrent work
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2403.15371 [cs.LG]
	(or arXiv:2403.15371v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2403.15371

Submission history

From: Akshay Krishnamurthy
[v1] Fri, 22 Mar 2024 17:50:43 UTC (1,206 KB)
[v2] Fri, 12 Jul 2024 14:52:49 UTC (1,166 KB)
[v3] Mon, 28 Oct 2024 19:55:46 UTC (1,190 KB)
