On the Trade-off between Redundancy and Local Coherence in Summarization

Cardenas, Ronald; Galle, Matthias; Cohen, Shay B.

doi:10.1613/jair.1.15191

arXiv:2205.10192 (cs)

[Submitted on 20 May 2022 (v1), last revised 6 Jun 2024 (this version, v2)]

Abstract: Extractive summaries are usually presented as lists of sentences with no expected cohesion between them and, unless redundancy is explicitly accounted for, with plenty of redundant information. In this paper, we investigate the trade-offs incurred when aiming to control for inter-sentential cohesion and redundancy in extracted summaries, and the impact of these controls on summary informativeness. As a case study, we focus on the summarization of long, highly redundant documents and consider two optimization scenarios: reward-guided and unsupervised. In the reward-guided scenario, we compare systems that control for redundancy and cohesion during sentence scoring. In the unsupervised scenario, we introduce two systems that aim to control all three properties (informativeness, redundancy, and cohesion) in a principled way. Both systems implement a psycholinguistic theory that simulates how humans keep track of relevant content units and how cohesion and non-redundancy constraints are applied in short-term memory during reading. Extensive automatic and human evaluations reveal that systems optimizing for cohesion, among other properties, organize content in summaries better than systems that optimize only for redundancy, while maintaining comparable informativeness. We find that the proposed unsupervised systems extract highly cohesive summaries across varying levels of document redundancy, although at the cost of informativeness. Finally, we provide evidence as to how simulated cognitive processes impact the trade-off between the analyzed summary properties.

Comments:	Accepted to JAIR
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2205.10192 [cs.CL]
	(or arXiv:2205.10192v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2205.10192
Journal reference:	Journal of Artificial Intelligence Research, 80, 273-326 (2024)
Related DOI:	https://doi.org/10.1613/jair.1.15191

Submission history

From: Ronald Cardenas Acosta
[v1] Fri, 20 May 2022 14:10:28 UTC (448 KB)
[v2] Thu, 6 Jun 2024 13:27:20 UTC (1,394 KB)
