Can LLMs Simulate Personas with Reversed Performance? A Benchmark for Counterfactual Instruction Following

Kumar, Sai Adith Senthil; Yan, Hao; Perepa, Saipavan; Yue, Murong; Yao, Ziyu

arXiv:2504.06460 (cs)

[Submitted on 8 Apr 2025]

Abstract: Large Language Models (LLMs) are increasingly used to simulate personas in virtual environments, leveraging their instruction-following capability. However, we discovered that even state-of-the-art LLMs cannot simulate personas with reversed performance (e.g., student personas with low proficiency in educational settings), which impairs simulation diversity and limits the practical applications of the simulated environments. In this work, using mathematical reasoning as a representative scenario, we propose the first benchmark dataset for evaluating LLMs on simulating personas with reversed performance, a capability that we dub "counterfactual instruction following". We evaluate both open-weight and closed-source LLMs on this task and find that all of them, including the OpenAI o1 reasoning model, struggle to follow counterfactual instructions for simulating personas with reversed performance. Simulating the performance level and the race population of a persona at the same time (an intersectional setting) worsens the effect even further. These results highlight the challenges of counterfactual instruction following and the need for further research.

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2504.06460 [cs.CL]
	(or arXiv:2504.06460v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2504.06460

Submission history

From: Hao Yan
[v1] Tue, 8 Apr 2025 22:00:32 UTC (337 KB)
