Medical Data Augmentation via ChatGPT: A Case Study on Medication Identification and Medication Event Classification

Sarker, Shouvon; Qian, Lijun; Dong, Xishuang

arXiv:2306.07297 (cs)

[Submitted on 10 Jun 2023]

Abstract: Identifying key factors such as medications, diseases, and relationships in electronic health records (EHRs) and clinical notes has a wide range of clinical applications. The N2C2 2022 competitions presented several tasks to promote the identification of such factors in EHRs using the Contextualized Medication Event Dataset (CMED). Pretrained large language models (LLMs) demonstrated exceptional performance on these tasks. This study explores the use of LLMs, specifically ChatGPT, for data augmentation to overcome the limited availability of annotated data for identifying key factors in EHRs. In addition, several pretrained BERT models, initially trained on large corpora such as Wikipedia and MIMIC, were fine-tuned on the augmented datasets to identify these key factors. Experimental results on two EHR analysis tasks, medication identification and medication event classification, indicate that ChatGPT-based data augmentation improves performance on both.
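The augmentation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`tokenize`, `bio_tag`, `augment`), the `B-MED`/`I-MED` tag labels, and the stub paraphraser are all assumptions made for illustration. The sketch assumes an LLM rewrites an annotated sentence, and that a rewrite is kept only if every annotated medication still appears verbatim, so that token-level labels can be rebuilt by string matching.

```python
import re
from typing import Callable, List, Optional, Tuple


def tokenize(text: str) -> List[str]:
    # Split into word tokens and single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text)


def bio_tag(tokens: List[str], medications: List[str]) -> Optional[List[str]]:
    """Rebuild BIO tags by case-insensitive matching of medication names.

    Returns None if any medication is missing from the tokens.
    The B-MED / I-MED labels are hypothetical, chosen for this sketch.
    """
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    for med in medications:
        med_tokens = [t.lower() for t in tokenize(med)]
        n = len(med_tokens)
        starts = [
            i for i in range(len(tokens) - n + 1)
            if lowered[i:i + n] == med_tokens
        ]
        if n == 0 or not starts:
            return None  # medication lost in the rewrite: reject it
        for i in starts:
            tags[i] = "B-MED"
            tags[i + 1:i + n] = ["I-MED"] * (n - 1)
    return tags


def augment(
    sentence: str,
    medications: List[str],
    paraphrase: Callable[[str], str],
) -> Optional[Tuple[List[str], List[str]]]:
    """Paraphrase one annotated sentence and re-derive its labels.

    `paraphrase` is a hypothetical hook; a real pipeline would call a
    chat model here with a prompt asking it to keep drug names unchanged.
    """
    tokens = tokenize(paraphrase(sentence))
    tags = bio_tag(tokens, medications)
    if tags is None:
        return None
    return tokens, tags


def stub_paraphrase(_: str) -> str:
    # Stand-in for an LLM call so the sketch runs offline.
    return "The patient was started on metoprolol 25 mg twice daily."


if __name__ == "__main__":
    result = augment(
        "Patient began metoprolol 25 mg BID.", ["metoprolol"], stub_paraphrase
    )
    if result is not None:
        for token, tag in zip(*result):
            print(f"{token}\t{tag}")
```

Rejecting rewrites that drop a medication is one conservative design choice: it trades augmentation volume for label reliability. The accepted token and tag pairs could then be added to the training set used to fine-tune a BERT token classifier.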

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2306.07297 [cs.CL]
	(or arXiv:2306.07297v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2306.07297

Submission history

From: Xishuang Dong
[v1] Sat, 10 Jun 2023 20:55:21 UTC (253 KB)
