Accurate and Well-Calibrated ICD Code Assignment Through Attention Over Diverse Label Embeddings

Gomes, Gonçalo; Coutinho, Isabel; Martins, Bruno

arXiv:2402.03172 (cs)

[Submitted on 5 Feb 2024]

Abstract: Although the International Classification of Diseases (ICD) has been adopted worldwide, manually assigning ICD codes to clinical text is time-consuming, error-prone, and expensive, motivating the development of automated approaches. This paper describes a novel approach for automated ICD coding, combining several ideas from previous related work. We specifically employ a strong Transformer-based model as a text encoder and, to handle lengthy clinical narratives, we explore either (a) adapting the base encoder model into a Longformer, or (b) dividing the text into chunks and processing each chunk independently. The representations produced by the encoder are combined with a label embedding mechanism that explores diverse ICD code synonyms. Experiments with different splits of the MIMIC-III dataset show that the proposed approach outperforms the current state-of-the-art models in ICD coding, with the label embeddings significantly contributing to the good performance. Our approach also leads to properly calibrated classification results, which can effectively inform downstream tasks such as quantification.
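The abstract names two mechanisms without detailing them: splitting long text into chunks, and scoring each ICD code through attention over several synonym embeddings. The sketch below is only an illustration of that general idea, not the authors' implementation. All function names are hypothetical, the token vectors stand in for the output of an encoder, and max-pooling over synonyms followed by a sigmoid is an assumption made here for simplicity.

```python
import math
from typing import List

Vector = List[float]


def split_into_chunks(tokens: List[str], chunk_size: int) -> List[List[str]]:
    """Split a token sequence into consecutive chunks of at most chunk_size tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query: Vector, token_vectors: List[Vector]) -> Vector:
    """Return the attention-weighted average of token vectors for one query."""
    weights = softmax([dot(query, t) for t in token_vectors])
    dim = len(token_vectors[0])
    return [sum(w * t[d] for w, t in zip(weights, token_vectors)) for d in range(dim)]


def label_score(synonym_embeddings: List[Vector], token_vectors: List[Vector]) -> float:
    """Score one label from several synonym embeddings.

    Assumption (not from the paper): each synonym attends over the tokens,
    the best-matching synonym is kept, and a sigmoid maps it to (0, 1).
    """
    logits = [dot(q, attend(q, token_vectors)) for q in synonym_embeddings]
    return 1.0 / (1.0 + math.exp(-max(logits)))


# Toy usage: chunk a short note, then score one label against stand-in vectors.
note = "patient admitted with acute chest pain and shortness of breath".split()
chunks = split_into_chunks(note, chunk_size=4)
print(len(chunks))

# Hypothetical encoder output (one 2-d vector per token) and two synonym embeddings.
token_vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
synonyms = [[2.0, 0.0], [0.0, -1.0]]
print(label_score(synonyms, token_vectors))
```

In a chunked setup, each chunk would be encoded independently and the resulting token vectors gathered before the attention step; how the paper actually merges chunk representations is not stated in the abstract.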

Comments:	Accepted to EACL 2024
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2402.03172 [cs.CL]
	(or arXiv:2402.03172v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2402.03172

Submission history

From: Gonçalo Emanuel Cavaco Gomes
[v1] Mon, 5 Feb 2024 16:40:23 UTC (568 KB)
