SemEval-2025 Task 4: Unlearning sensitive content from Large Language Models

Ramakrishna, Anil; Wan, Yixin; Jin, Xiaomeng; Chang, Kai-Wei; Bu, Zhiqi; Vinzamuri, Bhanukiran; Cevher, Volkan; Hong, Mingyi; Gupta, Rahul

arXiv:2504.02883 (cs)

[Submitted on 2 Apr 2025]

Abstract: We introduce SemEval-2025 Task 4: unlearning sensitive content from Large Language Models (LLMs). The task features three subtasks for LLM unlearning that span different use cases: (1) unlearn long-form synthetic creative documents spanning different genres; (2) unlearn short-form synthetic biographies containing personally identifiable information (PII), including fake names, phone numbers, SSNs, and email and home addresses; and (3) unlearn real documents sampled from the target model's training dataset. We received over 100 submissions from over 30 institutions, and we summarize the key techniques and lessons in this paper.

Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2504.02883 [cs.CL]
	(or arXiv:2504.02883v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2504.02883

Submission history

From: Anil Ramakrishna
[v1] Wed, 2 Apr 2025 07:24:59 UTC (2,570 KB)
