Controlled Diversity: Length-optimized Natural Language Generation

Schenke, Diana Marie; Baumann, Timo

arXiv:2502.19347 (cs)

[Submitted on 26 Feb 2025]

Abstract: LLMs are not generally able to adjust the length of their outputs based on strict length requirements, a capability that would improve their usefulness in applications that require adherence to diverse user and system requirements. We present an approach to train LLMs to acquire this capability by augmenting existing data and applying existing fine-tuning techniques, which we compare based on the trained models' adherence to the length requirement and overall response quality relative to the baseline model. Our results demonstrate that these techniques can be successfully applied to train LLMs to adhere to length requirements, with the trained models generating texts that align better with the length requirements. Our results indicate that our method may change the response quality when using training data that was not generated by the baseline model. This allows simultaneous alignment to another training objective in certain scenarios, but is undesirable otherwise. Training on a dataset containing the model's own responses eliminates this issue.
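
As a rough illustration of what "augmenting existing data" with a length requirement could look like, the sketch below attaches a word-count instruction to each prompt and measures how far a response deviates from a target length. The abstract does not specify the authors' actual procedure, so everything here is an assumption: the `Example` type, the instruction wording, the use of whitespace-separated words as the length unit, and the helper names `augment_with_length` and `length_deviation` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """A single (prompt, response) training pair. Hypothetical type."""
    prompt: str
    response: str


def word_count(text: str) -> int:
    # Assumption: length is measured in whitespace-separated words.
    return len(text.split())


def augment_with_length(example: Example) -> Example:
    """Append a length instruction derived from the existing response.

    The instruction wording is illustrative only, not taken from the paper.
    """
    n = word_count(example.response)
    instruction = f"Answer in exactly {n} words."
    return Example(
        prompt=f"{example.prompt}\n{instruction}",
        response=example.response,
    )


def length_deviation(target: int, response: str) -> int:
    """Absolute difference between the requested and the actual word count."""
    return abs(word_count(response) - target)


if __name__ == "__main__":
    pair = Example("What is an LLM?", "A large language model.")
    augmented = augment_with_length(pair)
    print(augmented.prompt)
    print(length_deviation(4, augmented.response))
```

Pairs augmented this way could then be passed to any standard fine-tuning pipeline, and a measure such as `length_deviation` is one simple way to compare trained models on adherence to the length requirement.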

Comments:	ISCA/ITG Workshop on Diversity in Large Speech and Language Models
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2502.19347 [cs.CL]
	(or arXiv:2502.19347v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2502.19347

Submission history

From: Diana Marie Schenke
[v1] Wed, 26 Feb 2025 17:38:58 UTC (910 KB)
