Is Factuality Decoding a Free Lunch for LLMs? Evaluation on Knowledge Editing Benchmark

Bi, Baolong; Liu, Shenghua; Wang, Yiwei; Mei, Lingrui; Cheng, Xueqi

arXiv:2404.00216v1 (cs)

[Submitted on 30 Mar 2024 (this version), latest version 4 Oct 2024 (v2)]

Abstract: The rapid development of large language models (LLMs) enables them to convey factual knowledge in a more human-like fashion. Extensive efforts have been made to reduce factual hallucinations by modifying LLMs with factuality decoding. However, these methods also risk hindering knowledge updates, as they make models overly confident in known facts. In this work, we first revisit current factuality decoding methods and verify their effectiveness in enhancing factual accuracy. We then evaluate several strong factuality decoding methods on a knowledge editing benchmark. All of these decoding methods significantly diminish the performance of Llama 2 models compared to their original decoding, with the largest decrease being 81.3%. This indicates that existing decoding methods still cannot fully address factual hallucinations, as they overlook the importance of preserving flexibility for knowledge editing. Our work therefore suggests that research into factual alignment should also consider the effectiveness of knowledge editing.

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2404.00216 [cs.CL]
	(or arXiv:2404.00216v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2404.00216

Submission history

From: Baolong Bi
[v1] Sat, 30 Mar 2024 02:08:28 UTC (706 KB)
[v2] Fri, 4 Oct 2024 03:30:24 UTC (10,602 KB)
