LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback

Gao, Bofei; Cai, Zefan; Xu, Runxin; Wang, Peiyi; Zheng, Ce; Lin, Runji; Lu, Keming; Lin, Junyang; Zhou, Chang; Xiao, Wen; Hu, Junjie; Liu, Tianyu; Chang, Baobao

arXiv:2406.14024v2 (cs)

[Submitted on 20 Jun 2024 (v1), revised 30 Jun 2024 (this version, v2), latest version 18 Oct 2024 (v4)]

Abstract: Mathematical verifiers achieve success in mathematical reasoning tasks by validating the correctness of solutions. However, existing verifiers are trained with binary classification labels, which are not informative enough for the model to accurately assess the solutions. To mitigate this insufficiency of binary labels, we introduce step-wise natural language feedback as rationale labels (i.e., the correctness of the current step and the explanations). In this paper, we propose Math-Minos, a natural language feedback enhanced verifier built on automatically generated training data and a two-stage training paradigm for effective training and efficient inference. Our experiments reveal that a small set (30k) of natural language feedback can significantly boost the performance of the verifier, improving accuracy by 1.6 percentage points (86.6% → 88.2%) on GSM8K and by 0.8 percentage points (37.8% → 38.6%) on MATH. We have released our code and data for further exploration.
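
To make the contrast between binary labels and step-wise rationale labels concrete, the following is a minimal sketch of what one rationale-labelled solution might look like. The class name `StepFeedback`, its field names, and the rule that a solution counts as correct only when every step is correct are illustrative assumptions, not the released Math-Minos data format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StepFeedback:
    """Hypothetical rationale label for a single solution step."""
    step: str          # text of the reasoning step
    is_correct: bool   # correctness of the current step
    explanation: str   # natural language explanation of the judgment


def binary_label(feedback: List[StepFeedback]) -> int:
    """Collapse step-wise feedback into one binary label.

    Assumed rule (not from the paper): 1 only if every step is correct.
    """
    return int(all(f.is_correct for f in feedback))


def first_error(feedback: List[StepFeedback]) -> Optional[int]:
    """Return the index of the first incorrect step, or None if all are correct."""
    for i, f in enumerate(feedback):
        if not f.is_correct:
            return i
    return None


# A made-up two-step solution with an arithmetic slip in the second step.
example = [
    StepFeedback("3 + 4 = 7", True, "The addition is correct."),
    StepFeedback("7 * 2 = 15", False, "7 * 2 equals 14, not 15."),
]

print(binary_label(example))  # the binary label only says the solution is wrong
print(first_error(example))   # the rationale label also says where it goes wrong
```

The binary label records only that the solution fails, whereas the step-wise record also locates the faulty step and says why, which is the extra signal the abstract argues binary labels lack.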

Comments:	9 pages
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2406.14024 [cs.CL]
	(or arXiv:2406.14024v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2406.14024

Submission history

From: Bofei Gao
[v1] Thu, 20 Jun 2024 06:42:27 UTC (2,184 KB)
[v2] Sun, 30 Jun 2024 13:44:04 UTC (2,184 KB)
[v3] Mon, 8 Jul 2024 08:37:33 UTC (2,184 KB)
[v4] Fri, 18 Oct 2024 06:59:24 UTC (2,324 KB)
