Does Writing with Language Models Reduce Content Diversity?

Padmakumar, Vishakh; He, He


arXiv:2309.05196 (cs)

[Submitted on 11 Sep 2023 (v1), last revised 1 Jul 2024 (this version, v3)]



Abstract: Large language models (LLMs) have led to a surge in collaborative writing with model assistance. As different users incorporate suggestions from the same model, there is a risk of decreased diversity in the produced content, potentially limiting diverse perspectives in public discourse. In this work, we measure the impact of co-writing on diversity via a controlled experiment, where users write argumentative essays in three setups: with a base LLM (GPT3), with a feedback-tuned LLM (InstructGPT), and without model help. We develop a set of diversity metrics and find that writing with InstructGPT (but not GPT3) results in a statistically significant reduction in diversity. Specifically, it increases the similarity between the writings of different authors and reduces the overall lexical and content diversity. We additionally find that this effect is mainly attributable to InstructGPT contributing less diverse text to co-written essays. In contrast, the user-contributed text remains unaffected by model collaboration. This suggests that the recent improvement in generation quality from adapting models to human feedback might come at the cost of more homogeneous and less diverse content.

Comments:	ICLR 2024
Subjects:	Computation and Language (cs.CL); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
Cite as:	arXiv:2309.05196 [cs.CL]
	(or arXiv:2309.05196v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2309.05196

Submission history

From: Vishakh Padmakumar
[v1] Mon, 11 Sep 2023 02:16:47 UTC (1,233 KB)
[v2] Wed, 6 Mar 2024 20:48:40 UTC (1,260 KB)
[v3] Mon, 1 Jul 2024 16:36:30 UTC (1,291 KB)
