Evaluating the Capabilities of LLMs for Supporting Anticipatory Impact Assessment

Allaham, Mowafak; Diakopoulos, Nicholas

arXiv:2401.18028 (cs)

[Submitted on 31 Jan 2024 (v1), last revised 20 May 2024 (this version, v2)]

Abstract: Gaining insight into the potential negative impacts of emerging Artificial Intelligence (AI) technologies in society is a challenge for implementing anticipatory governance approaches. One way to produce such insight is to use Large Language Models (LLMs) to support and guide experts in ideating and exploring the range of undesirable consequences of emerging technologies. However, performance evaluations of LLMs for such tasks are still needed, examining not only the general quality of the generated impacts but also the range of impact types produced and any resulting biases. In this paper, we demonstrate the potential for generating high-quality and diverse impacts of AI in society by fine-tuning completion models (GPT-3 and Mistral-7B) on a diverse sample of articles from news media and comparing those outputs to the impacts generated by instruction-based models (GPT-4 and Mistral-7B-Instruct). We examine the generated impacts for coherence, structure, relevance, and plausibility and find that the impacts generated using Mistral-7B, a small open-source model fine-tuned on impacts from the news media, tend to be qualitatively on par with impacts generated using a more capable and larger-scale model such as GPT-4. Moreover, we find that instruction-based models had gaps in the production of certain categories of impacts in comparison to fine-tuned models. This research highlights a potential bias in the range of impacts generated by state-of-the-art LLMs and the potential of aligning smaller LLMs on news media as a scalable alternative for generating high-quality and more diverse impacts in support of anticipatory governance approaches.

Comments:	10 pages + research ethics and social impact statement, references, and appendix. Under conference review
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
Cite as:	arXiv:2401.18028 [cs.CL]
	(or arXiv:2401.18028v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2401.18028

Submission history

From: Mowafak Allaham
[v1] Wed, 31 Jan 2024 17:43:04 UTC (112 KB)
[v2] Mon, 20 May 2024 23:34:39 UTC (786 KB)
