Alibaba-Translate China's Submission for WMT 2022 Quality Estimation Shared Task

Bao, Keqin; Wan, Yu; Liu, Dayiheng; Yang, Baosong; Lei, Wenqiang; He, Xiangnan; Wong, Derek F.; Xie, Jun

arXiv:2210.10049 (cs)

[Submitted on 18 Oct 2022 (v1), last revised 17 Feb 2023 (this version, v2)]

Abstract: In this paper, we present our submission, named UniTE (Unified Translation Evaluation), to the sentence-level MQM benchmark of the WMT 2022 Quality Estimation Shared Task. Specifically, our systems employ the UniTE framework, which combines three types of input formats during training on top of a pre-trained language model. First, we use pseudo-labeled data examples for the continued pre-training phase. Notably, to reduce the gap between pre-training and fine-tuning, we apply data pruning and a ranking-based score normalization strategy. For the fine-tuning phase, we use both Direct Assessment (DA) and Multidimensional Quality Metrics (MQM) data from past years' WMT competitions. Finally, we collect the source-only evaluation results and ensemble the predictions generated by two UniTE models, whose backbones are XLM-R and InfoXLM, respectively. Results show that our models rank 1st overall in the Multilingual and English-Russian settings and 2nd overall in the English-German and Chinese-English settings, showing relatively strong performance in this year's quality estimation competition.
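
The abstract does not spell out how the ranking-based score normalization or the ensembling is computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that normalization replaces each pseudo-label score with its rank divided by the number of examples, and that the ensemble is a plain per-segment average of the two models' predictions. The function names `rank_normalize` and `ensemble_average` are hypothetical.

```python
from statistics import mean


def rank_normalize(scores):
    """Map raw scores to (0, 1] by rank; tied scores share their average rank.

    Assumption: this is one plausible reading of "ranking-based score
    normalization", not the formula used in the paper.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    normalized = [0.0] * n
    i = 0
    while i < n:
        # Extend j over the group of items tied with the item at position i.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            normalized[order[k]] = avg_rank / n
        i = j + 1
    return normalized


def ensemble_average(*model_predictions):
    """Average per-segment predictions across models (assumed ensembling rule)."""
    return [mean(values) for values in zip(*model_predictions)]


if __name__ == "__main__":
    pseudo_scores = [0.2, 0.9, 0.5, 0.5]
    print(rank_normalize(pseudo_scores))  # [0.25, 1.0, 0.625, 0.625]
    # Two hypothetical prediction lists, e.g. from XLM-R and InfoXLM backbones.
    print(ensemble_average([0.25, 1.0], [0.75, 0.5]))  # [0.5, 0.75]
```

Rank-based scores depend only on the ordering of the inputs, which is one reason such a scheme could help align pseudo-labels with human-annotated scores that live on a different scale.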

Comments:	WMT 2022 QE Shared Task. arXiv admin note: text overlap with arXiv:2210.09683
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2210.10049 [cs.CL]
	(or arXiv:2210.10049v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2210.10049

Submission history

From: Keqin Bao
[v1] Tue, 18 Oct 2022 08:55:27 UTC (47 KB)
[v2] Fri, 17 Feb 2023 15:48:42 UTC (47 KB)
