SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning

Miao, Ning; Teh, Yee Whye; Rainforth, Tom

arXiv:2308.00436v2 (cs)

[Submitted on 1 Aug 2023 (v1), revised 2 Aug 2023 (this version, v2), latest version 5 Oct 2023 (v3)]

Abstract: Recent progress in large language models (LLMs), especially the invention of chain-of-thought (CoT) prompting, has made it possible to solve reasoning problems. However, even the strongest LLMs still struggle with more complicated problems that require non-linear thinking and multi-step reasoning. In this work, we explore whether LLMs can recognize their own errors without resorting to external resources. In particular, we investigate whether they can be used to identify individual errors within step-by-step reasoning. To this end, we propose a zero-shot verification scheme to recognize such errors. We then use this verification scheme to improve question-answering performance by performing weighted voting over different generated answers. We test the method on three math datasets (GSM8K, MathQA, and MATH) and find that it successfully recognizes errors and, in turn, increases final predictive performance.
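The weighted-voting idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how verification results are turned into weights, so the sketch simply assumes each generated solution comes with a hypothetical confidence score in [0, 1] and sums those scores per distinct final answer. The function name `weighted_vote` and the sample data are invented for illustration.

```python
from collections import defaultdict


def weighted_vote(candidates):
    """Pick the final answer with the largest total confidence.

    `candidates` is a list of (answer, confidence) pairs. The confidence
    is a hypothetical verification score in [0, 1]; how such a score is
    computed is an assumption, not something the abstract specifies.
    """
    if not candidates:
        raise ValueError("need at least one candidate answer")
    totals = defaultdict(float)
    for answer, confidence in candidates:
        # Each sampled solution votes for its answer with its confidence.
        totals[answer] += confidence
    # Return the answer whose accumulated weight is highest.
    return max(totals, key=totals.get)


# Illustrative data: "15" appears three times but with low confidence,
# "12" appears twice, once with high confidence.
samples = [("12", 0.9), ("15", 0.2), ("15", 0.3), ("12", 0.1), ("15", 0.3)]
print(weighted_vote(samples))
```

With these made-up scores, plain majority voting would choose "15" (three votes against two), while the confidence-weighted vote chooses "12", which shows how a verifier's scores could change the selected answer.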

Subjects:	Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2308.00436 [cs.AI]
	(or arXiv:2308.00436v2 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2308.00436

Submission history

From: Ning Miao
[v1] Tue, 1 Aug 2023 10:31:36 UTC (429 KB)
[v2] Wed, 2 Aug 2023 08:45:40 UTC (429 KB)
[v3] Thu, 5 Oct 2023 12:59:59 UTC (376 KB)
