Can Input Attributions Interpret the Inductive Reasoning Process Elicited in In-Context Learning?

Ye, Mengyu; Kuribayashi, Tatsuki; Kobayashi, Goro; Suzuki, Jun

arXiv:2412.15628 (cs)

[Submitted on 20 Dec 2024]

Abstract: Elucidating the rationale behind neural models' outputs has long been a challenge in machine learning, and it remains one in the age of large language models (LLMs) and in-context learning (ICL). For estimating input attributions (IA), ICL raises a new question: which example in a prompt made up of several examples contributed to identifying the task or rule to be solved? To address this, we introduce synthetic diagnostic tasks inspired by the poverty-of-the-stimulus design in inductive reasoning: most in-context examples are ambiguous with respect to their underlying rule, and one critical example disambiguates the demonstrated task. We ask whether conventional IA methods can identify that example when interpreting the inductive reasoning process in ICL. Our experiments yield several practical findings; for example, a certain simple IA method works best, and larger models are generally harder to interpret with gradient-based IA methods.
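
The setup described in the abstract can be illustrated with a toy sketch. It is not the paper's actual task or metric: the two candidate rules ("copy the first letter" vs. "copy the last letter"), the prompt format, and the names `build_prompt` and `top1_hit` are all assumptions made for illustration. The per-example attribution scores are assumed to come from some IA method applied elsewhere and aggregated to one score per example.

```python
import random


def build_prompt(n_examples=5, seed=0):
    """Build a toy ICL prompt with exactly one disambiguating example.

    Hypothetical task: the true rule is "output the first letter".
    In ambiguous examples the first and last letters are identical, so
    "output the last letter" fits them equally well. Only the critical
    example (first != last) tells the two rules apart.
    """
    rng = random.Random(seed)
    letters = list("abcdefgh")
    critical_idx = rng.randrange(n_examples)
    lines = []
    for i in range(n_examples):
        if i == critical_idx:
            first, last = rng.sample(letters, 2)  # two distinct letters
        else:
            first = last = rng.choice(letters)    # ambiguous example
        lines.append(f"{first} x {last} -> {first}")
    # The query uses distinct letters, so the answer depends on the rule.
    q_first, q_last = rng.sample(letters, 2)
    lines.append(f"{q_first} x {q_last} ->")
    return "\n".join(lines), critical_idx, q_first


def top1_hit(example_scores, critical_idx):
    """Return True if the highest-scored example is the critical one.

    `example_scores` holds one attribution score per in-context example
    (assumed to be produced by an IA method outside this sketch).
    """
    best = max(range(len(example_scores)), key=lambda i: example_scores[i])
    return best == critical_idx
```

Under these assumptions, an IA method would count as successful on a prompt when `top1_hit(scores, critical_idx)` is true, and averaging that over many seeds gives a simple identification rate.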

Comments:	Preprint
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2412.15628 [cs.CL]
	(or arXiv:2412.15628v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2412.15628

Submission history

From: Mengyu Ye
[v1] Fri, 20 Dec 2024 07:35:42 UTC (296 KB)
