SemEval-2024 Task 2: Safe Biomedical Natural Language Inference for Clinical Trials

Jullien, Mael; Valentino, Marco; Freitas, André

arXiv:2404.04963 (cs)

[Submitted on 7 Apr 2024]

Abstract: Large Language Models (LLMs) are at the forefront of NLP achievements but fall short in dealing with shortcut learning, factual inconsistency, and vulnerability to adversarial inputs. These shortcomings are especially critical in medical contexts, where they can misrepresent actual model capabilities. Addressing this, we present SemEval-2024 Task 2: Safe Biomedical Natural Language Inference for Clinical Trials. Our contributions include the refined NLI4CT-P dataset (i.e., Natural Language Inference for Clinical Trials - Perturbed), designed to challenge LLMs with interventional and causal reasoning tasks, along with a comprehensive evaluation of methods and results for participant submissions. A total of 106 participants registered for the task, contributing over 1200 individual submissions and 25 system overview papers. This initiative aims to advance the robustness and applicability of NLI models in healthcare, ensuring safer and more dependable AI assistance in clinical decision-making. We anticipate that the dataset, models, and outcomes of this task can support future research in the field of biomedical NLI. The dataset, competition leaderboard, and website are publicly available.

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2404.04963 [cs.CL]
	(or arXiv:2404.04963v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2404.04963

Submission history

From: Mael Jullien
[v1] Sun, 7 Apr 2024 13:58:41 UTC (8,288 KB)
