SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning

Miao, Ning; Teh, Yee Whye; Rainforth, Tom

arXiv:2308.00436 (cs)

[Submitted on 1 Aug 2023 (v1), last revised 5 Oct 2023 (this version, v3)]

Abstract: The recent progress in large language models (LLMs), especially the invention of chain-of-thought prompting, has made it possible to automatically answer questions by stepwise reasoning. However, when faced with more complicated problems that require non-linear thinking, even the strongest LLMs make mistakes. To address this, we explore whether LLMs are able to recognize errors in their own step-by-step reasoning, without resorting to external resources. To this end, we propose SelfCheck, a general-purpose zero-shot verification schema for recognizing such errors. We then use the results of these checks to improve question-answering performance by conducting weighted voting on multiple solutions to the question. We test SelfCheck on three datasets (GSM8K, MathQA, and MATH) and find that it successfully recognizes errors and, in turn, increases final answer accuracies.
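
The weighted voting mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `weighted_vote` and the (answer, confidence) input format are assumptions made for illustration, and the abstract does not say how the per-solution check results are turned into weights. The sketch only shows the general idea that each sampled solution votes for its final answer with a weight equal to its check score, so a single well-checked solution can outweigh several poorly-checked ones.

```python
from collections import defaultdict


def weighted_vote(candidates):
    """Return the answer with the largest total confidence.

    `candidates` is a list of (answer, confidence) pairs. The confidence
    values are assumed to come from some checking procedure; how they are
    computed is outside the scope of this sketch.
    """
    totals = defaultdict(float)
    for answer, confidence in candidates:
        # Each solution adds its confidence to the tally of its answer.
        totals[answer] += confidence
    if not totals:
        raise ValueError("no candidate solutions")
    # Pick the answer whose summed confidence is highest.
    return max(totals, key=totals.get)


# Hypothetical example: plain majority voting would pick "41" (two votes),
# but the confidence-weighted tally favors "42" (0.9 versus 0.2 + 0.3).
solutions = [("42", 0.9), ("41", 0.2), ("41", 0.3)]
print(weighted_vote(solutions))
```

Setting every confidence to the same value reduces this to ordinary majority voting, which makes it a convenient baseline for comparison.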

Subjects:	Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2308.00436 [cs.AI]
	(or arXiv:2308.00436v3 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2308.00436

Submission history

From: Ning Miao
[v1] Tue, 1 Aug 2023 10:31:36 UTC (429 KB)
[v2] Wed, 2 Aug 2023 08:45:40 UTC (429 KB)
[v3] Thu, 5 Oct 2023 12:59:59 UTC (376 KB)
