Probabilistic Consensus through Ensemble Validation: A Framework for LLM Reliability

Naik, Ninad

arXiv:2411.06535 (cs)

[Submitted on 10 Nov 2024]

Abstract: Large Language Models (LLMs) have shown significant advances in text generation but often lack the reliability needed for autonomous deployment in high-stakes domains like healthcare, law, and finance. Existing approaches rely on external knowledge or human oversight, limiting scalability. We introduce a novel framework that repurposes ensemble methods for content validation through model consensus. In tests across 78 complex cases requiring factual accuracy and causal consistency, our framework improved precision from 73.1% to 93.9% with two models (95% CI: 83.5%-97.9%) and to 95.6% with three models (95% CI: 85.2%-98.8%). Statistical analysis indicates strong inter-model agreement ($\kappa > 0.76$) while preserving sufficient independence to catch errors through disagreement. We outline a clear pathway to further enhance precision with additional validators and refinements. Although the current approach is constrained by multiple-choice format requirements and processing latency, it offers immediate value for enabling reliable autonomous AI systems in critical applications.
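
The consensus idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the unanimous-agreement rule, the function names (`consensus_answer`, `precision_on_accepted`, `cohen_kappa`), and the toy labels in the usage note are all assumptions made for illustration. The sketch assumes each validator returns one multiple-choice label per case, accepts a case only when every validator agrees, and measures pairwise agreement with the standard Cohen's kappa formula.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_answer(votes: Sequence[str]) -> Optional[str]:
    """Return the shared label if all validators agree, else None (abstain).

    Unanimity is an assumed acceptance rule, not one taken from the paper.
    """
    if not votes:
        return None
    first = votes[0]
    return first if all(v == first for v in votes) else None


def precision_on_accepted(
    votes_per_case: Sequence[Sequence[str]], truth: Sequence[str]
) -> Optional[float]:
    """Fraction of accepted (unanimous) cases whose label matches the truth.

    Returns None when no case is accepted.
    """
    accepted = correct = 0
    for votes, gold in zip(votes_per_case, truth):
        answer = consensus_answer(votes)
        if answer is None:
            continue  # disagreement: the case is flagged, not counted
        accepted += 1
        correct += answer == gold
    return correct / accepted if accepted else None


def cohen_kappa(a: Sequence[str], b: Sequence[str]) -> float:
    """Cohen's kappa for two raters labelling the same cases."""
    if len(a) != len(b) or not a:
        raise ValueError("both raters need the same non-zero number of labels")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of each rater's marginal label frequencies.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters always use the same single label
    return (observed - expected) / (1.0 - expected)
```

As a usage example with made-up labels, `consensus_answer(["B", "B", "B"])` returns `"B"`, while `consensus_answer(["B", "C", "B"])` returns `None`, so the disagreement is what flags the case for rejection or review. Requiring unanimity trades coverage for precision: fewer cases are accepted, but those that are have passed every validator.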

Comments:	8 pages, 6 tables
Subjects:	Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2411.06535 [cs.AI]
	(or arXiv:2411.06535v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2411.06535

Submission history

From: Ninad Naik
[v1] Sun, 10 Nov 2024 17:32:16 UTC (284 KB)
