Explaining Translationese: why are Neural Classifiers Better and what do they Learn?

Amponsah-Kaakyire, Kwabena; Pylypenko, Daria; van Genabith, Josef; España-Bonet, Cristina

arXiv:2210.13391 (cs)

[Submitted on 24 Oct 2022]

Abstract: Recent work has shown that neural feature and representation learning, e.g. with BERT, outperforms traditional approaches based on manual feature engineering, e.g. with SVMs, on translationese classification tasks. Previous research did not show $(i)$ whether the difference is due to the features, the classifiers, or both, and $(ii)$ what the neural classifiers actually learn. To address $(i)$, we carefully design experiments that swap features between BERT- and SVM-based classifiers. We show that an SVM fed with BERT representations performs at the level of the best BERT classifiers, while BERT learning and using handcrafted features performs at the level of an SVM using handcrafted features. This shows that the performance differences are due to the features. To address $(ii)$, we use integrated gradients and find that $(a)$ there are indications that the information captured by handcrafted features is only a subset of what BERT learns, and $(b)$ part of BERT's top performance is due to BERT learning topic differences and spurious correlations with translationese.
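For readers unfamiliar with integrated gradients, the attribution method named in the abstract, the following is a minimal sketch of the general technique. It is not the authors' code: the toy logistic model, its weights, the all-zeros baseline, and the function names are all assumptions chosen for illustration. The attribution for feature $i$ is $(x_i - x'_i)$ times the gradient of the model output averaged along the straight path from the baseline $x'$ to the input $x$, approximated here with a midpoint Riemann sum.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, w, b):
    # Toy stand-in for a classifier: logistic regression on a feature vector.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def model_grad(x, w, b):
    # Analytic gradient of the logistic output with respect to the input x.
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, b, steps=200):
    # Average the gradient along the straight path baseline -> x
    # (midpoint rule), then scale by the input-baseline difference.
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        g = model_grad(point, w, b)
        for i in range(n):
            avg_grad[i] += g[i] / steps
    return [(xi - bi) * gi for xi, bi, gi in zip(x, baseline, avg_grad)]


# Hypothetical weights and input, chosen only for illustration.
w = [2.0, -1.0, 0.0]
b = 0.1
x = [1.0, 0.5, 3.0]
baseline = [0.0, 0.0, 0.0]

attributions = integrated_gradients(x, baseline, w, b)
print(attributions)
```

A useful sanity check is the completeness property: the attributions should sum, up to numerical error, to the difference between the model output at the input and at the baseline. In this toy setup the third feature has zero weight and therefore receives zero attribution.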

Comments: 16 pages, 7 figures, 4 tables. The first 2 authors contributed equally. Accepted to BlackboxNLP 2022 (at EMNLP 2022)
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2210.13391 [cs.CL] (or arXiv:2210.13391v1 [cs.CL] for this version)
DOI: https://doi.org/10.48550/arXiv.2210.13391

Submission history

From: Daria Pylypenko
[v1] Mon, 24 Oct 2022 16:43:28 UTC (3,345 KB)
