Adapting Language Models for Non-Parallel Author-Stylized Rewriting

Syed, Bakhtiyar; Verma, Gaurav; Srinivasan, Balaji Vasan; Natarajan, Anandhavelu; Varma, Vasudeva

arXiv:1909.09962 (cs)

[Submitted on 22 Sep 2019 (v1), last revised 31 Oct 2020 (this version, v3)]

Abstract: Given recent progress in language modeling with Transformer-based neural models and active interest in generating stylized text, we present an approach that leverages the generalization capabilities of a language model to rewrite an input text in a target author's style. Our approach adapts a pre-trained language model to generate author-stylized text by fine-tuning it on an author-specific corpus using a denoising autoencoder (DAE) loss in a cascaded encoder-decoder framework. Optimizing the DAE loss allows our model to learn the nuances of an author's style without relying on parallel data; this reliance has been a severe limitation of prior work in this space. To evaluate the efficacy of our approach, we propose a linguistically motivated framework that quantifies the stylistic alignment of the generated text to the target author at the lexical, syntactic, and surface levels. The evaluation framework is both interpretable, as it leads to several insights about the model, and self-contained, as it does not rely on external classifiers, e.g., sentiment or formality classifiers. Qualitative and quantitative assessments indicate that the proposed approach rewrites the input text with better alignment to the target style while preserving the original content better than state-of-the-art baselines.
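
To make the denoising-autoencoder idea concrete, the following is a minimal sketch of how non-parallel training pairs could be built from an author's corpus alone: each sentence is corrupted, and the model would be trained to reconstruct the clean original. The abstract does not specify the noise function, so the token dropping and masking used here, the `<mask>` placeholder, the probabilities, and the function names `add_noise` and `make_dae_pairs` are all assumptions made for illustration, not the authors' method.

```python
import random

MASK = "<mask>"  # hypothetical placeholder token, not taken from the paper


def add_noise(tokens, drop_prob=0.1, mask_prob=0.1, rng=None):
    """Return a corrupted copy of `tokens`.

    Each token is independently dropped with probability `drop_prob`,
    replaced by MASK with probability `mask_prob`, and kept otherwise.
    The noise scheme is an assumption, chosen only to illustrate DAE training.
    """
    rng = rng or random.Random()
    noisy = []
    for tok in tokens:
        r = rng.random()  # uniform in [0, 1)
        if r < drop_prob:
            continue  # drop the token entirely
        if r < drop_prob + mask_prob:
            noisy.append(MASK)  # hide the token behind a placeholder
        else:
            noisy.append(tok)  # keep the token unchanged
    return noisy


def make_dae_pairs(sentences, seed=0, **noise_kwargs):
    """Build (noisy_input, clean_target) pairs from one author's sentences.

    No parallel data is involved: the target is the author's own sentence,
    and the input is a corrupted version of that same sentence.
    """
    rng = random.Random(seed)
    pairs = []
    for sentence in sentences:
        clean = sentence.split()  # whitespace tokenization, for simplicity
        pairs.append((add_noise(clean, rng=rng, **noise_kwargs), clean))
    return pairs


if __name__ == "__main__":
    corpus = ["It was the best of times, it was the worst of times."]
    for noisy, clean in make_dae_pairs(corpus, drop_prob=0.2, mask_prob=0.2):
        print("input :", " ".join(noisy))
        print("target:", " ".join(clean))
```

In a full system, an encoder-decoder model would be fine-tuned to minimize the reconstruction loss between its output for the noisy input and the clean target; that part is omitted here because it depends on model details the abstract does not give.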

Comments:	Accepted for publication in Main Technical Track at AAAI 20
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:1909.09962 [cs.CL]
	(or arXiv:1909.09962v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1909.09962

Submission history

From: Gaurav Verma
[v1] Sun, 22 Sep 2019 08:13:28 UTC (430 KB)
[v2] Sun, 17 Nov 2019 09:19:29 UTC (436 KB)
[v3] Sat, 31 Oct 2020 06:43:02 UTC (436 KB)
