Unveiling the Power of Source: Source-based Minimum Bayes Risk Decoding for Neural Machine Translation

Lyu, Boxuan; Kamigaito, Hidetaka; Funakoshi, Kotaro; Okumura, Manabu


arXiv:2406.11632 (cs)

[Submitted on 17 Jun 2024]


Abstract: Maximum a posteriori decoding, a commonly used method for neural machine translation (NMT), aims to maximize the estimated posterior probability. However, high estimated probability does not always lead to high translation quality. Minimum Bayes Risk (MBR) decoding offers an alternative by seeking hypotheses with the highest expected utility.
In this work, we show that Quality Estimation (QE) reranking, which uses a QE model as a reranker, can be viewed as a variant of MBR. Inspired by this, we propose source-based MBR (sMBR) decoding, a novel approach that uses synthetic sources generated by backward translation as "support hypotheses" and a reference-free quality estimation metric as the utility function, making this the first work to use only sources in MBR decoding. Experiments show that sMBR significantly outperforms QE reranking and is competitive with standard MBR decoding. Furthermore, sMBR calls the utility function fewer times than MBR. Our findings suggest that sMBR is a promising approach for high-quality NMT decoding.
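
The three selection rules named in the abstract can be contrasted in a minimal sketch. This is an illustration built only from the abstract's description, not the authors' implementation: the function names, the uniform averaging over support items, and the toy `overlap` scorer (a token-overlap stand-in for a real reference-based or QE metric) are all assumptions.

```python
from typing import Callable, Sequence

# Hypothetical scorer type: takes two strings and returns a quality score.
Scorer = Callable[[str, str], float]


def overlap(a: str, b: str) -> float:
    """Toy stand-in for a learned metric: Jaccard overlap of whitespace tokens."""
    ta, tb = set(a.split()), set(b.split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def mbr_decode(hyps: Sequence[str], utility: Scorer) -> str:
    """Standard MBR: score each candidate against all candidates used as
    pseudo-references and return the one with the highest average utility."""
    def expected(h: str) -> float:
        return sum(utility(h, r) for r in hyps) / len(hyps)
    return max(hyps, key=expected)


def qe_rerank(source: str, hyps: Sequence[str], qe: Scorer) -> str:
    """QE reranking: score each candidate against the single original source."""
    return max(hyps, key=lambda h: qe(source, h))


def smbr_decode(sources: Sequence[str], hyps: Sequence[str], qe: Scorer) -> str:
    """sMBR as read from the abstract: average a reference-free score over
    several (synthetic) sources instead of over pseudo-references."""
    def expected(h: str) -> float:
        return sum(qe(s, h) for s in sources) / len(sources)
    return max(hyps, key=expected)


if __name__ == "__main__":
    # Toy monolingual data; a real QE metric would compare across languages.
    hyps = ["the cat sat", "a cat sat", "the dog ran"]
    print(mbr_decode(hyps, overlap))
    print(qe_rerank("the cat sat down", hyps, overlap))
    print(smbr_decode(["a cat sat down", "a cat sat"], hyps, overlap))
```

In this sketch, `smbr_decode` with a single source reduces to `qe_rerank`, which mirrors the abstract's observation that QE reranking can be viewed as a variant of MBR. With N candidates and S sources, the sketch makes N x S scorer calls for sMBR against N x N for standard MBR, so it needs fewer calls whenever S is smaller than N.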

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2406.11632 [cs.CL]
	(or arXiv:2406.11632v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2406.11632

Submission history

From: Boxuan Lyu
[v1] Mon, 17 Jun 2024 15:13:52 UTC (8,052 KB)
