Word Importance Explains How Prompts Affect Language Model Outputs

Hackmann, Stefan; Mahmoudian, Haniyeh; Steadman, Mark; Schmidt, Michael

arXiv:2403.03028 (cs)

[Submitted on 5 Mar 2024]

Abstract: The emergence of large language models (LLMs) has revolutionized numerous applications across industries. However, their "black box" nature often makes it hard to understand how they reach specific decisions, raising concerns about their transparency, reliability, and ethical use. This study presents a method to improve the explainability of LLMs by varying individual words in prompts to uncover their statistical impact on the model outputs. The approach, inspired by permutation importance for tabular data, masks each word in the system prompt and evaluates its effect on the outputs using the available text scores, aggregated over multiple user inputs. Unlike classical attention, word importance measures the impact of prompt words on arbitrarily defined text scores, which allows the importance of each word to be decomposed into specific measures of interest such as bias, reading level, and verbosity. The procedure also allows impact to be measured when attention weights are not available. To test the fidelity of this approach, we add different suffixes to multiple system prompts and compare the subsequent generations across different large language models. Results show that word importance scores are closely related to the expected suffix importances for multiple scoring functions.
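The masking procedure described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the function names (`word_importance`, `toy_generate`, `word_count`), the mask token `"_"`, whitespace tokenization, and the choice to report importance as the absolute change in the mean score. The stand-in `toy_generate` replaces a real LLM call so the example runs deterministically.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple


def word_importance(
    system_prompt: str,
    user_inputs: Sequence[str],
    generate: Callable[[str, str], str],
    score: Callable[[str], float],
    mask_token: str = "_",  # hypothetical mask symbol, not taken from the paper
) -> List[Tuple[str, float]]:
    """Mask each system-prompt word in turn and measure the change in mean score."""
    words = system_prompt.split()  # assumption: simple whitespace tokenization

    def mean_score(prompt: str) -> float:
        # Aggregate the text score over all user inputs for one system prompt.
        return mean(score(generate(prompt, user_input)) for user_input in user_inputs)

    baseline = mean_score(system_prompt)
    importances = []
    for i, word in enumerate(words):
        masked_prompt = " ".join(words[:i] + [mask_token] + words[i + 1:])
        # Assumption: importance is the absolute shift from the unmasked baseline.
        importances.append((word, abs(baseline - mean_score(masked_prompt))))
    return importances


def toy_generate(system_prompt: str, user_input: str) -> str:
    """Deterministic stand-in for an LLM: short output only if asked to be brief."""
    if "briefly" in system_prompt.split():
        return "Short answer."
    return "This is a much longer answer about " + user_input


def word_count(text: str) -> int:
    """A simple verbosity score: number of whitespace-separated words."""
    return len(text.split())


if __name__ == "__main__":
    result = word_importance(
        "Answer the question briefly", ["cats", "dogs"], toy_generate, word_count
    )
    for word, importance in result:
        print(f"{word}: {importance}")
```

With the toy model, only the word "briefly" changes the verbosity score when masked, so it is the only word that receives a nonzero importance. Swapping `word_count` for another scoring function (for example a reading-level or bias score) would give a separate importance profile per measure, which is the decomposition the abstract refers to.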

Subjects:	Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
ACM classes:	I.2.7; I.5.2
Cite as:	arXiv:2403.03028 [cs.AI]
	(or arXiv:2403.03028v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2403.03028

Submission history

From: Mark Steadman
[v1] Tue, 5 Mar 2024 15:04:18 UTC (2,991 KB)
